The issue: fine-tuning a small language model (SLM) usually requires a dataset of 10k to 100k records, which is a lot to collect.
I created a platform for applying "human in the loop" augmentation techniques to your small dataset, so that you can start with maybe 100 records, quickly build up a large dataset, and launch a training run without prior knowledge.
So far I have implemented two techniques, both based on LLM distillation, with more to come.
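To make the idea concrete, here is a minimal sketch of what a human-in-the-loop augmentation loop could look like. It is an illustration under assumptions, not the platform's actual code: `generate_variants` is a hypothetical stand-in for a teacher-LLM call (here it just fills fixed templates), and `approve` is a hypothetical callback standing in for the human reviewer.

```python
from typing import Callable, Iterable


def generate_variants(record: str, n: int = 3) -> list[str]:
    """Hypothetical stand-in for a teacher-LLM call.

    A real system would prompt a large model for paraphrases or new
    examples; this stub fills fixed templates so the sketch runs offline.
    """
    templates = ["Please {r}", "Could you {r}?", "I need to {r}"]
    return [t.format(r=record) for t in templates[:n]]


def augment(
    seeds: Iterable[str],
    approve: Callable[[str], bool],
    target_size: int,
    max_rounds: int = 10,
) -> list[str]:
    """Grow a seed dataset with generated candidates a reviewer accepts."""
    dataset = list(dict.fromkeys(seeds))  # de-duplicate, keep order
    seen = set(dataset)
    for _ in range(max_rounds):
        if len(dataset) >= target_size:
            break
        added = False
        # Iterate over a snapshot so newly accepted records are only
        # expanded in the next round.
        for record in list(dataset):
            for candidate in generate_variants(record):
                if candidate in seen:
                    continue
                seen.add(candidate)
                if approve(candidate):  # the "human in the loop" step
                    dataset.append(candidate)
                    added = True
                    if len(dataset) >= target_size:
                        return dataset
        if not added:  # nothing new was accepted; stop early
            break
    return dataset
```

In this sketch the reviewer is just a function, so an interactive prompt, a web UI, or an automatic filter could all be plugged in as `approve`.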
HN, what do you think? Do you see value in this idea? Would you prefer a public API or CLI instead?
I appreciate your help.
Kind regards, Pawel