360 IOU Upgraded to Qifu IOU

Qifu Technology Draws Heated Discussion at Top Global Conference Interspeech, with Dialect Recognition Technology in Focus

2024-09-27

Recently, Qifu Technology was invited to attend Interspeech 2024, the top international conference on speech communication and signal processing, held in Greece, where it published a paper titled "Qifusion-Net: Layer-adapted Stream/Non-stream Model for End-to-End Multi-Accent Speech Recognition". Its keynote speech comprehensively demonstrated the company's achievements in speech recognition technology, setting a new benchmark for China's speech technology as it goes global and takes part in international competition.

Figure 1: Qifu Technology gave a keynote speech at the Interspeech 2024 conference

In the speech, Qifu Technology introduced its new-generation speech recognition system "qifree", which can support more than 20 dialects at the same time. On KeSpeech, the authoritative test set for Chinese accent and dialect speech recognition, Qifu Technology drew on its deep experience in automatic speech recognition (ASR) to reach a dialect accent classification accuracy of 79.10%, far exceeding the KeSpeech baseline of 61.13%. This figure directly reflects Qifu Technology's strong performance in recognition accuracy. On the key metric for recognition errors, CER (character error rate), Qifu Technology achieved 8.08%, well below the KeSpeech baseline of 10.38%, demonstrating its efficiency and precision in Chinese dialect recognition.
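For readers unfamiliar with the metric, the sketch below shows how CER is conventionally computed: the character-level edit distance between the reference transcript and the recognized text, divided by the reference length. The article does not describe the exact scoring procedure used on KeSpeech, so this assumes the standard definition; the function names `edit_distance`, `cer`, and `relative_reduction` are illustrative only.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference character
                curr[j - 1] + 1,     # insertion of a hypothesis character
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference transcript must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_reduction(baseline: float, system: float) -> float:
    """Fraction by which `system` lowers an error rate relative to `baseline`."""
    return (baseline - system) / baseline


if __name__ == "__main__":
    # Hypothetical transcript pair with one substituted character out of six.
    print(cer("今天天气很好", "今天天汽很好"))
    # CER figures quoted in the article: baseline 10.38%, system 8.08%.
    print(relative_reduction(10.38, 8.08))
```

Applied to the figures quoted above, `relative_reduction(10.38, 8.08)` comes to roughly 0.22, that is, about a 22% relative reduction in CER against the KeSpeech baseline.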

Table 1: Performance comparison between Qifu Technology's "qifree" and the KeSpeech baseline

Qifu Technology's self-developed Chinese speech recognition system "qifree" overcomes the limitation that a single model can only recognize one specific dialect. Through an innovative layer-adaptive fusion structure and a shared information encoding module, it extracts dialect information more efficiently and achieves instant translation, further enhancing the real-time interaction capabilities of voice robots. "qifree" not only maintains a leading CER in Mandarin recognition, but also performs better in multiple dialect areas such as Ji-Lu, Jiang-Huai, Jiao-Liao, and Lan-Yin, improving on the previous best results by more than 15%.

In comparison with first-class domestic players (a technology giant and the most influential open-source speech recognition community in China), Qifu Technology also showed a clear advantage. Even against opponents with larger parameter counts and richer training data, Qifu Technology still stood out with a lower CER (8.08% vs 15.61% vs 26.55%), demonstrating the strength of its technical architecture and the efficiency of its algorithm optimization.

Table 2: Comparison of key indicators between Qifu Technology's "qifree" and first-class domestic and foreign technology companies

Qifu Technology's latest appearance at Interspeech 2024 is both a comprehensive display of its years of intensive work in speech recognition technology and a declaration to the world of the strong competitiveness and unlimited potential of Chinese enterprises in this field. With its technical strength and innovative spirit, Qifu Technology is leading a new trend in dialect recognition technology and contributing Chinese wisdom and Chinese strength to the advancement of global speech communication and signal processing technology.
