A syllable-character collaborative model for enhanced Pinyin and Chinese recognition.

In Chinese speech recognition, end-to-end speech recognition models usually use Chinese characters as direct output and perform poorly compared with other language models. The main reason for this phenomenon is that the relationship between Chinese text and pronunciation is more complex. Inspired by...

Bibliographic Details
Main Authors: Zeyuan Chen, Cheng Zhong, Danyang Chen
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2025-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0325045
author Zeyuan Chen
Cheng Zhong
Danyang Chen
collection DOAJ
description In Chinese speech recognition, end-to-end speech recognition models usually use Chinese characters as direct output and perform poorly compared with other language models. The main reason for this phenomenon is that the relationship between Chinese text and pronunciation is more complex. Inspired by the learning process of Chinese beginners, who first master initials, finals, and pinyin before learning characters, we propose the Syllable-Character Collaborative Model (SCCM), which incorporates these phonetic elements into the training process. Additionally, we design a Pinyin-Ensemble module that employs an ensemble learning approach to reduce pinyin recognition errors, which in turn leads to a reduction in text recognition errors. Experiments on AISHELL-1 show that our approach not only reduces pinyin and character error rates compared to a prior end-to-end method using pinyin as auxiliary information, but also achieves a 45.7% relative reduction in Character Error Rate (CER) over the AISHELL-1 baseline.
format Article
id doaj-art-07a7e82a43c04127a8af5901a970cf83
institution OA Journals
issn 1932-6203
language English
publishDate 2025-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling doaj-art-07a7e82a43c04127a8af5901a970cf83 | 2025-08-20T02:36:18Z | eng | Public Library of Science (PLoS) | PLoS ONE | 1932-6203 | 2025-01-01 | vol. 20, no. 7, e0325045 | 10.1371/journal.pone.0325045 | A syllable-character collaborative model for enhanced Pinyin and Chinese recognition. | Zeyuan Chen, Cheng Zhong, Danyang Chen | https://doi.org/10.1371/journal.pone.0325045
title A syllable-character collaborative model for enhanced Pinyin and Chinese recognition.
url https://doi.org/10.1371/journal.pone.0325045