It's a 9M-parameter Conformer-CTC model trained on ~300h of speech (AISHELL + Primewords), quantized to INT8 (11 MB), and it runs 100% in-browser via ONNX Runtime Web.
It grades per-syllable pronunciation and tones using Viterbi forced alignment.
Try it here: https://simedw.com/projects/ear/
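For readers unfamiliar with Viterbi forced alignment over CTC outputs, here is a minimal sketch of the general technique, not the project's actual code. It assumes the model emits per-frame log-probabilities over a vocabulary in which index 0 is the CTC blank; the function name `ctc_forced_align` and the toy 3-token vocabulary are made up for illustration.

```python
import math

def ctc_forced_align(log_probs, targets, blank=0):
    """Viterbi-align frames to a known target sequence under CTC rules.

    log_probs: list of T frames, each a list of per-token log-probabilities.
    targets:   list of token ids the speaker was supposed to say.
    Returns one entry per frame: the index into `targets`, or None for blank.
    """
    # Interleave blanks: [blank, t1, blank, t2, ..., blank]
    ext = [blank]
    for tok in targets:
        ext += [tok, blank]
    T, S = len(log_probs), len(ext)
    NEG = -math.inf
    score = [[NEG] * S for _ in range(T)]
    back = [[0] * S for _ in range(T)]

    # A path may start in the leading blank or in the first token.
    score[0][0] = log_probs[0][ext[0]]
    if S > 1:
        score[0][1] = log_probs[0][ext[1]]

    for t in range(1, T):
        for s in range(S):
            cands = [(score[t - 1][s], s)]            # stay in the same state
            if s >= 1:
                cands.append((score[t - 1][s - 1], s - 1))  # advance by one
            # Skipping a blank is allowed only between two different tokens.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                cands.append((score[t - 1][s - 2], s - 2))
            best, prev = max(cands)
            if best > NEG:
                score[t][s] = best + log_probs[t][ext[s]]
                back[t][s] = prev

    # A path must end in the last token or the trailing blank.
    ends = [S - 1] + ([S - 2] if S > 1 else [])
    s = max(ends, key=lambda i: score[T - 1][i])
    if score[T - 1][s] == NEG:
        raise ValueError("not enough frames to emit every target token")

    # Backtrace from the best final state.
    path = [0] * T
    for t in range(T - 1, -1, -1):
        path[t] = s
        s = back[t][s]
    return [None if i % 2 == 0 else i // 2 for i in path]

# Toy example: vocabulary is 0 = blank, 1 and 2 = two hypothetical syllables.
probs = [
    [0.1, 0.8, 0.1],
    [0.1, 0.8, 0.1],
    [0.8, 0.1, 0.1],
    [0.1, 0.1, 0.8],
    [0.1, 0.1, 0.8],
]
frames = [[math.log(p) for p in row] for row in probs]
alignment = ctc_forced_align(frames, [1, 2])
print(alignment)
```

Once each frame is tied to a target syllable, a grader could, for example, average the model's confidence over each syllable's frames. How the app actually turns the alignment into grades is not stated in the post.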
jellojello•3h ago
simedw•3h ago
I had a quick look at Farsi datasets, and there seem to be a few options. That said, written Farsi doesn’t include short vowels… so can you derive pronunciation from the text using rules?
kranner•3h ago
You can't, but Farsi dictionaries list the missing short vowels/diacritics/"eraab" for every word.
For instance, see this entry: https://vajehyab.com/dehkhoda/%D8%AD%D8%B3%D8%A7%D8%A8?q=%D8...
The linked dictionary entry shows a kasra (ِ) on the first letter, ح.
With that short vowel written out, the word is حِساب (it is normally written as just حساب).
But you would have to disambiguate between homographs that differ only in the eraab.
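A rough sketch of that lookup, assuming you already have a list of fully vocalized dictionary headwords: strip the short-vowel marks to get the everyday spelling, index the vocalized forms under it, and flag any spelling that maps to more than one form as a homograph needing disambiguation. The function names and the tiny word list are illustrative only; the Unicode range U+064B to U+0652 covers the standard Arabic-script vowel marks, shadda and sukun.

```python
from collections import defaultdict

# Arabic-script marks from fathatan (U+064B) through sukun (U+0652).
HARAKAT = {chr(cp) for cp in range(0x064B, 0x0653)}

def strip_harakat(word: str) -> str:
    """Remove short-vowel marks, giving the spelling normally seen in text."""
    return "".join(ch for ch in word if ch not in HARAKAT)

def build_index(vocalized_entries):
    """Map each unvocalized spelling to the set of vocalized forms it may stand for."""
    index = defaultdict(set)
    for entry in vocalized_entries:
        index[strip_harakat(entry)].add(entry)
    return index

def lookup(index, written):
    """Return (candidate vocalized forms, True if the spelling is ambiguous)."""
    candidates = sorted(index.get(strip_harakat(written), ()))
    return candidates, len(candidates) > 1

# Illustrative entries, written with escapes to keep the marks unambiguous.
ENTRIES = [
    "\u062D\u0650\u0633\u0627\u0628",  # hesab, the example from the comment above
    "\u06A9\u0650\u0631\u0652\u0645",  # kerm ("worm")
    "\u06A9\u064E\u0631\u064E\u0645",  # karam ("generosity")
]
index = build_index(ENTRIES)

hesab_forms, hesab_ambiguous = lookup(index, "\u062D\u0633\u0627\u0628")
krm_forms, krm_ambiguous = lookup(index, "\u06A9\u0631\u0645")
print(hesab_forms, hesab_ambiguous)
print(krm_forms, krm_ambiguous)
```

Unambiguous spellings resolve directly; for the ambiguous ones you would still need context (or a language model) to pick the intended reading.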