Comparative performance analysis of end-to-end ASR models on Indo-Aryan and Dravidian languages within India’s linguistic landscape

Bibliographic Details
Main Authors: Palash Jain, Anirban Bhowmick
Format: Article
Language: English
Published: SpringerOpen 2025-02-01
Series: EURASIP Journal on Audio, Speech, and Music Processing
Online Access: https://doi.org/10.1186/s13636-025-00395-5
Description
Summary: India’s linguistic diversity encompasses multiple language families, including the Indo-Aryan and Dravidian, which represent distinct phonological and morphological characteristics. This study aims to evaluate and compare the performance of end-to-end automatic speech recognition (ASR) systems for three Indo-Aryan languages (Marathi, Odia, and Gujarati) and three Dravidian languages (Tamil, Telugu, and Malayalam). Using four transformer-based pre-trained models (Wav2Vec2.0-base, XLSR-53, W2V2-BERT, and Whisper small), the analysis explores their adaptability to these languages’ linguistic features, with word error rate (WER) and character error rate (CER) serving as evaluation metrics. Results indicate that W2V2-BERT and XLSR-53 outperform other models, achieving lower WER and CER, especially for Indo-Aryan languages. However, higher error rates for Dravidian languages highlight challenges such as complex phonology and agglutinative morphology. This work provides a comparative insight into the strengths and limitations of pre-trained ASR models across India’s diverse linguistic landscape and underscores the need for language-specific adaptations to improve ASR accuracy for underrepresented languages.
ISSN: 1687-4722
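
Note on the metrics: the summary names WER and CER as the evaluation metrics but does not spell out how they are computed. The following is a minimal sketch of the standard definitions (edit distance divided by reference length), not the authors' evaluation code. The function names `edit_distance`, `wer`, and `cer` are illustrative, and the sketch assumes whitespace-delimited words and counts characters as Unicode code points, which for Indic scripts is not the same as counting grapheme clusters.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences.

    Counts the minimum number of substitutions, insertions, and
    deletions needed to turn `ref` into `hyp`.
    """
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference items into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: code-point edit distance over reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    ref = "the cat sat"
    hyp = "the cat sit"
    print(f"WER = {wer(ref, hyp):.3f}")
    print(f"CER = {cer(ref, hyp):.3f}")
```

Because both rates are normalised by the reference length, a single wrong word in a long agglutinated form costs one full word error under WER but only a few character errors under CER, which is one reason the two metrics are usually reported together.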