- Last updated September 16, 2024
The organisation has also revamped its website, making datasets, models, and tools more accessible.
AI4Bharat has launched a series of innovations aimed at enhancing Indian language technology, including speech recognition, data annotation, and expressive text-to-speech (TTS).
One of the key releases is IndicASR, India’s first speech recognition model covering all 22 official languages. Also called IndicConformers, it is a comprehensive set of ASR models designed to accurately convert speech to text in each of these languages. A web demo is available for users to test and provide feedback.
Another major release is Anudesh v0.1, an open-source platform designed to improve LLMs for Indian languages through data annotation. The platform’s first version facilitates conversational data collection through LLM interactions and supports model evaluation workflows.
Additionally, AI4Bharat has introduced Rasa, a dataset for expressive TTS that spans nine languages and features 14 speakers. The dataset includes at least 20 hours of …