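Basic Information
Yi Jiangyan (易江燕)  Female  Master's Supervisor  Institute of Automation, Chinese Academy of Sciences
Email: jiangyan.yi@nlpr.ia.ac.cn
Address: Room 707, Intelligence Building (智能化大厦), 95 Zhongguancun East Road, Haidian District, Beijing
Postal code: 100190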

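Research Areas

Speech information processing, speech recognition and synthesis, personalized speech generation and detection, small-data modeling, transfer learning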

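Admissions Information

Admission Major
081104 - Pattern Recognition and Intelligent Systems
Admission Research Directions
Speech recognition and synthesis
Speech information security
Human-computer interaction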

Education

2015-09--2018-06   Institute of Automation, Chinese Academy of Sciences   Ph.D.
2007-09--2010-07   Graduate School of Chinese Academy of Social Sciences   Master's

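Work Experience

Employment History
2020-10~present, Institute of Automation, Chinese Academy of Sciences, Associate Researcher
2018-07~2020-10, Institute of Automation, Chinese Academy of Sciences, Assistant Researcher
2011-09~2014-11, Alibaba Group, Senior Algorithm Engineer
Professional Service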
2022-05-07-2022-05-27,ICASSP 2022, Session Chair
2021-12-01-2022-09-24,Interspeech 2022, Area Chairs
2021-10-09-present,IEEE Signal Processing Society Speech and Language Processing Technical Committee (IEEE SLTC), Associate Member
2021-04-30-2021-06-15,ICASSP 2021, Session Chair
2020-11-23-present,APSIPA SLA Technical Committee, TC member
2020-10-25-2020-10-29,Interspeech 2020, Session Co-Chairs
2019-09-15-2020-10-30,Interspeech 2020, Area Co-Chairs
2019-09-15-2019-09-19,Interspeech 2019, Session Chair
2019-08-22-present,National Conference on Man-Machine Speech Communication (NCMMSC) Standing Committee, Member
2019-01-01-2020-04-01,APSIPA 2019, Publication Co-Chairs
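2018-12-01-2020-02-01,NCMMSC 2019 Publication Committee, Publication Co-Chair
2018-10-11-present,China Computer Federation (CCF) Technical Committee on Speech Dialogue and Auditory Processing, Member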

Courses Taught

Speech Information Processing
Speech Interaction

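Patents and Awards

Awards
(1) 12th Wu Wenjun Artificial Intelligence Science and Technology Award (2022), Special Prize, Ministerial level, 2022
(2) ICASSP 2021 (a top international speech conference) Multi-Speaker Multi-Style Voice Cloning Challenge, few-shot track (极少样本赛道), First Prize, Other, 2021
(3) Best Paper, 19th National Annual Conference on Signal Processing, Other, 2019
(4) Best Paper, 13th National Conference on Man-Machine Speech Communication, Other, 2019
(5) Intel AIDC Beijing Best Poster Award, Other, 2018
Patents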
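[1] Tao Jianhua, Ma Haoxin, Yi Jiangyan. A continual learning method for generated speech features that requires no storage of the original data. CN113299315A, 2022-08-24.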

[2] Tao Jianhua, Wang Tao, Yi Jiangyan, Fu Ruibo. A unified training method and system for speech synthesis and voice conversion. CN114495898A, 2022-05-13.

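[3] Fu Ruibo, Tao Jianhua, Yi Jiangyan. Method and apparatus for generating speech adversarial examples, electronic device, and storage medium. CN114267363A, 2022-04-01.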

[4] Tao Jianhua, Wang Shiming, Fu Ruibo, Yi Jiangyan. A speech generation model with fine-grained prosody modeling, device, and storage medium. CN114093342A, 2022-02-25.

[5] Tao Jianhua, Tian Zhengkun, Yi Jiangyan. Training method, system, and apparatus for an end-to-end speech transcription model. CN110689879B, 2022-02-25.

[6] Tao Jianhua, Fu Ruibo, Yi Jiangyan. Method for detecting voice splicing points and storage medium. US17/668,074, 2022-02-09.

[7] Tao Jianhua, Zhang Shuai, Yi Jiangyan. A customizable end-to-end system for Chinese-English code-switching speech recognition. CN113936641A, 2022-01-14.

[8] Tao Jianhua, Tian Zhengkun, Yi Jiangyan. Training method for a speech recognition model, speech recognition method, and system. CN113936647A, 2022-01-14.

[9] Tao Jianhua, Zhang Shuai, Yi Jiangyan. An end-to-end system and device for speech recognition and speech translation. CN113920989A, 2022-01-11.

[10] Yi Jiangyan, Tao Jianhua, Fu Ruibo, Tian Zhengkun. Method and apparatus for detecting generated speech, electronic device, and storage medium. CN113808579A, 2021-12-17.

[11] Tao Jianhua, Wang Tao, Yi Jiangyan. Method and apparatus for editing audio, electronic device, and storage medium. CN113724686A, 2021-11-13.

[12] Tao Jianhua, Zhang Shuai, Yi Jiangyan. An end-to-end system unifying Chinese-English mixed text generation and speech recognition. CN113284485B, 2021-11-09.

[13] Fu Ruibo, Tao Jianhua, Yi Jiangyan. Method for detecting speech splicing points and storage medium. CN113555007A, 2021-10-26.

[14] Yi Jiangyan, Tao Jianhua, Tian Zhengkun, Fu Ruibo. Method and apparatus for detecting tampered regions in tampered audio, and storage medium. CN113555037A, 2021-10-26.

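[15] Yi Jiangyan, Tao Jianhua, Tian Zhengkun, Fu Ruibo. Knowledge-transfer-based method for detecting fake speech over telephone channels, and storage medium. CN113380235A, 2021-09-10.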

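[16] Yi Jiangyan, Tao Jianhua, Tian Zhengkun, Fu Ruibo. A compression method for speech detection models that fuses information from an ensemble of models. CN113362814A, 2021-09-07.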

[17] Yi Jiangyan, Tao Jianhua, Tian Zhengkun, Fu Ruibo. An environment-adversarial robust speech detection method. CN113284486A, 2021-08-20.

[18] Tao Jianhua, Tian Zhengkun, Yi Jiangyan. A generated audio detection system based on hierarchical discrimination. CN113284508A, 2021-08-20.

[19] Tao Jianhua, Tian Zhengkun, Yi Jiangyan. A text augmentation system for speech recognition that fuses multimodal semantic invariance. CN113270086A, 2021-08-17.

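[20] Tao Jianhua, Tian Zhengkun, Yi Jiangyan. A hybrid streaming and non-streaming speech recognition system and a streaming speech recognition method. CN113257248A, 2021-08-13.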

[21] Liang Shan, Nie Shuai, Tao Jianhua, Yi Jiangyan. Digital audio tampering forensics method based on phase offset detection. CN113178199A, 2021-07-27.

[22] Yi Jiangyan. Method and apparatus for detecting fake speech based on phoneme duration features. ZL 202110841276.2, 2021-07-26.

[23] Tao Jianhua, Yi Jiangyan, Wen Zhengqi. Small-data acoustic modeling method for speech recognition. CN108682417A, 2018-10-19.

[24] Tao Jianhua, Yi Jiangyan, Wen Zhengqi, Ni Hao. Acoustic model adaptation method based on accent bottleneck features. CN106875942A, 2017-06-20.

[25] Tao Jianhua, Yi Jiangyan, Wen Zhengqi, Liu Bin. Regularized accent adaptation method for speech recognition. CN106531157A, 2017-03-22.

Publications

Published Papers
[1] Wang, Tao, Fu, Ruibo, Yi, Jiangyan, Tao, Jianhua, Wen, Zhengqi. NeuralDPS: Neural Deterministic Plus Stochastic Model With Multiband Excitation for Noise-Controllable Waveform Generation. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING[J]. 2022, 30: 865-878, http://dx.doi.org/10.1109/TASLP.2022.3140480.

[2] Yi, Jiangyan, Fu, Ruibo, Tao, Jianhua, Nie, Shuai, Ma, Haoxin, Wang, Chenglong, Wang, Tao, Tian, Zhengkun, Bai, Ye, Fan, Cunhang, Liang, Shan, Wang, Shiming, Zhang, Shuai, Yan, Xinrui, Xu, Le, Wen, Zhengqi, Li, Haizhou. ADD 2022: the First Audio Deep Synthesis Detection Challenge. 2022, http://arxiv.org/abs/2202.08433.

[3] Tian, Zhengkun, Yi, Jiangyan, Tao, Jianhua, Zhang, Shuai, Wen, Zhengqi. Hybrid Autoregressive and Non-Autoregressive Transformer Models for Speech Recognition. IEEE SIGNAL PROCESSING LETTERS[J]. 2022, 29: 762-766, http://dx.doi.org/10.1109/LSP.2022.3152128.

[4] Tao Wang, Jiangyan Yi, Ruibo Fu, Jianhua Tao. Context-Aware Mask Prediction Network for End-to-End Text-Based Speech Editing. ICASSP[J]. 2022.
[5] Wang Tao, Yi Jiangyan, Fu Ruibo, Tao Jianhua, Wen Zhengqi. CampNet: Context-Aware Mask Prediction for End-to-End Text-Based Speech Editing. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING[J]. 2022.
[6] Fan, Cunhang, Yi, Jiangyan, Tao, Jianhua, Tian, Zhengkun, Liu, Bin, Wen, Zhengqi. Gated Recurrent Fusion With Joint Training Framework for Robust End-to-End Speech Recognition. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING[J]. 2021, 29(29): 198-209, http://dx.doi.org/10.1109/TASLP.2020.3039600.

[7] Wang, Tao, Fu, Ruibo, Yi, Jiangyan, Tao, Jianhua, Wen, Zhengqi, Qiang, Chunyu, Wang, Shiming. PROSODY AND VOICE FACTORIZATION FOR FEW-SHOT SPEAKER ADAPTATION IN THE CHALLENGE M2VOC 2021. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021). 2021, 8603-8607.
[8] Wang Chenglong, Yi Jiangyan. Speech forgery detection based on a global time-frequency attention network. Journal of Computer Research and Development (计算机研究与发展)[J]. 2021.
[9] Bai, Ye, Yi, Jiangyan, Tao, Jianhua, Tian, Zhengkun, Wen, Zhengqi, Zhang, Shuai. Fast End-to-End Speech Recognition Via Non-Autoregressive Models and Cross-Modal Knowledge Transferring From BERT. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING[J]. 2021, 29(29): 1897-1911.
[10] Fu, Ruibo, Tao, Jianhua, Wen, Zhengqi, Yi, Jiangyan, Wang, Tao, Qiang, Chunyu. BI-LEVEL STYLE AND PROSODY DECOUPLING MODELING FOR PERSONALIZED END-TO-END SPEECH SYNTHESIS. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021). 2021, 6568-6572.
[11] Yi, Jiangyan, Bai, Ye, Tao, Jianhua, Tian, Zhengkun, Wang, Chenglong, Wang, Tao, Fu, Ruibo. Half-Truth: A Partially Fake Audio Detection Dataset. 2021, http://arxiv.org/abs/2104.03617.

[12] Zhang, Shuai, Yi, Jiangyan, Tian, Zhengkun, Bai, Ye, Tao, Jianhua, Wen, Zhengqi. Decoupling Pronunciation and Language for End-to-End Code-Switching Automatic Speech Recognition. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021). 2021, 6249-6253.
[13] Wang, Shiming, Ling, Zhenhua, Fu, Ruibo, Yi, Jiangyan, Tao, Jianhua. PATNET: A PHONEME-LEVEL AUTOREGRESSIVE TRANSFORMER NETWORK FOR SPEECH SYNTHESIS. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021). 2021, 5684-5688.
[14] Bai, Ye, Yi, Jiangyan, Tao, Jianhua, Wen, Zhengqi, Tian, Zhengkun, Zhang, Shuai. Integrating Knowledge Into End-to-End Speech Recognition From External Text-Only Data. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING[J]. 2021, 29: 1340-1351, http://dx.doi.org/10.1109/TASLP.2021.3066274.

[15] Ma Haoxin, Yi Jiangyan, Tao Jianhua, Bai Ye, Tian Zhengkun, Wang Chenglong. Continual Learning for Fake Audio Detection. 2021, http://arxiv.org/abs/2104.07286.

[16] Tao Wang, Ruibo Fu, Jiangyan Yi, Jianhua Tao. Non-autoregressive End-to-End TTS with Coarse-to-Fine Decoding. INTERSPEECH 2020[J]. 2020.
[17] Tian Zhengkun, Yi, Jiangyan, Bai Ye, Tao Jianhua, Zhang Shuai, Wen Zhengqi. Synchronous Transformers for End-to-End Speech Recognition. 2020, http://arxiv.org/abs/1912.02958.

[18] Fan, Cunhang, Tao, Jianhua, Liu, Bin, Yi, Jiangyan, Wen, Zhengqi, Liu, Xuefei. End-to-End Post-Filter for Speech Separation With Deep Attention Fusion Features. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING[J]. 2020, 28: 1303-1314, http://dx.doi.org/10.1109/TASLP.2020.2982029.

[19] Fu, Ruibo, Tao, Jianhua, Wen, Zhengqi, Yi, Jiangyan, Wang, Tao. FOCUSING ON ATTENTION: PROSODY TRANSFER AND ADAPTATIVE OPTIMIZATION STRATEGY FOR MULTI-SPEAKER END-TO-END SPEECH SYNTHESIS. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING. 2020, 6709-6713.
[20] Tian, Zhengkun, Yi, Jiangyan, Bai, Ye, Tao, Jianhua, Zhang, Shuai, Wen, Zhengqi. SYNCHRONOUS TRANSFORMERS FOR END-TO-END SPEECH RECOGNITION. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING. 2020, 7884-7888.
[21] Tao Wang, Ruibo Fu, Jiangyan Yi, Jianhua Tao. Spoken Content and Voice Factorization for Few-shot Speaker Adaptation. INTERSPEECH 2020[J]. 2020.
[22] Yi, Jiangyan, Tao, Jianhua. Focal Loss for Punctuation Prediction. INTERSPEECH[J]. 2020.
[23] Jianhua Tao, Tao Wang, Ruibo Fu, Jiangyan Yi. Dynamic Speaker Representations Adjustment and Decoder Factorization for Speaker Adaptation in End-to-End Speech Synthesis. INTERSPEECH[J]. 2020.
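[24] Tao Jianhua, Fu Ruibo, Yi Jiangyan, Wang Chenglong, Wang Tao. Development and challenges of speech forgery and detection. Journal of Cyber Security (信息安全学报)[J]. 2020, 5(2): 28-38, http://lib.cqvip.com/Qikan/Article/Detail?id=7101732838.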

[25] Bai, Ye, Yi, Jiangyan, Tao, Jianhua, Wen, Zhengqi, Fan, Cunhang. A Public Chinese Dataset for Language Model Adaptation. JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY[J]. 2020, 92(8): 839-851, https://www.webofscience.com/wos/woscc/full-record/WOS:000490530600001.

[26] Fan Cunhang, Tao Jianhua, Liu Bin, Yi Jiangyan, Wen Zhengqi. Joint Training for Simultaneous Speech Denoising and Dereverberation with Deep Embedding Representations. INTERSPEECH[J]. 2020.
[27] Tao Wang, Jianhua Tao, Ruibo Fu, Jiangyan Yi. Bi-level Speaker Supervision for One-shot Speech Synthesis. INTERSPEECH 2020[J]. 2020.
[28] Ruibo Fu, Jianhua Tao, Zhengqi Wen, Jiangyan Yi. Dynamic Soft Windowing and Language Dependent Style Token for Code-Switching End-to-End Speech Synthesis. INTERSPEECH[J]. 2020.
[29] Fan, Cunhang, Yi, Jiangyan, Tao, Jianhua, Tian, Zhengkun, Liu, Bin, Wen, Zhengqi. Gated Recurrent Fusion with Joint Training Framework for Robust End-to-End Speech Recognition. 2020, http://arxiv.org/abs/2011.04249.

[30] Ye Bai, Jiangyan Yi, Jianhua Tao, Zhengkun Tian, Zhengqi Wen, Shuai Zhang. Listen Attentively, and Spell Once: Whole Sentence Generation via a Non-Autoregressive Architecture for Low-Latency Speech Recognition. 2020, http://arxiv.org/abs/2005.04862.

[31] Zhengkun Tian, Jiangyan Yi, Jianhua Tao, Ye Bai, Shuai Zhang, Zhengqi Wen. Spike-Triggered Non-Autoregressive Transformer for End-to-End Speech Recognition. 2020, http://arxiv.org/abs/2005.07903.

[32] Fan Cunhang, Tao Jianhua, Liu Bin, Yi Jiangyan, Wen Zhengqi. Gated Recurrent Fusion of Spatial and Spectral Features for Multi-channel Speech Separation with Deep Embedding Representations. Interspeech[J]. 2020.
[33] Yi, Jiangyan, Tao, Jianhua, Bai, Ye. LANGUAGE-INVARIANT BOTTLENECK FEATURES FROM ADVERSARIAL END-TO-END ACOUSTIC MODELS FOR LOW RESOURCE SPEECH RECOGNITION. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP)[J]. 2019, 6071-6075.
[34] Fan Cunhang, Liu Bin, Tao Jianhua, Yi Jiangyan, Wen Zhengqi. Discriminative Learning for Monaural Speech Separation Using Deep Embedding Features. Interspeech 2019. 2019, http://arxiv.org/abs/1907.09884.

[35] Zheng, Yibin, Tao, Jianhua, Wen, Zhengqi, Yi, Jiangyan. Forward-Backward Decoding Sequence for Regularizing End-to-End TTS. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING[J]. 2019, 27(12): 2067-2079, http://dx.doi.org/10.1109/TASLP.2019.2935807.

[36] Ye Bai, Jiangyan Yi, Jianhua Tao. A Time Delay Neural Network with Shared Weight Self-Attention for Small-Footprint Keyword Spotting. INTERSPEECH[J]. 2019.
[37] Ye Bai, Jiangyan Yi, Jianhua Tao, Zhengkun Tian, Zhengqi Wen. Learn Spelling from Teachers: Transferring Knowledge from Language Models to Sequence-to-Sequence Speech Recognition. 2019, http://arxiv.org/abs/1907.06017.

[38] Zhengkun Tian, Jiangyan Yi, Jianhua Tao, Ye Bai, Zhengqi Wen. Self-Attention Transducers for End-to-End Speech Recognition. 2019, http://arxiv.org/abs/1909.13037.

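[39] Fan Cunhang, Liu Bin, Tao Jianhua, Wen Zhengqi, Yi Jiangyan. An end-to-end speech separation method based on convolutional neural networks. Journal of Signal Processing (信号处理)[J]. 2019, 35(4): 542-548, http://lib.cqvip.com/Qikan/Article/Detail?id=7002150057.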

[40] Jiangyan Yi, Jianhua Tao, Zhengqi Wen, Ye Bai. Language-Adversarial Transfer Learning for Low-Resource Speech Recognition. IEEE / ACM Transactions on Audio, Speech and Language Processing (TASLP). 2019, 27(3): 621-630, http://kns.cnki.net/KCMS/detail/detail.aspx?QueryID=0&CurRec=1&recid=&FileName=SJCM0D850DF295CFFB91AA79EEA520C4B2A5&DbName=WWMERGEJ01&DbCode=WWME&yx=&pr=&URLID=&bsm=.

[41] Yi, Jiangyan, Tao, Jianhua. SELF-ATTENTION BASED MODEL FOR PUNCTUATION PREDICTION USING WORD AND SPEECH EMBEDDINGS. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP)[J]. 2019, 7270-7274.
[42] Yi, Jiangyan, Tao, Jianhua, Wen, Zhengqi, Bai, Ye. Language-Adversarial Transfer Learning for Low-Resource Speech Recognition. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING[J]. 2019, 27(3): 621-630, http://ir.ia.ac.cn/handle/173211/25289.

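[43] Yi Jiangyan, Tao Jianhua, Liu Bin, Wen Zhengqi. Transfer-learning-based acoustic modeling for noise-robust speech recognition. Journal of Tsinghua University (Science and Technology) (清华大学学报:自然科学版)[J]. 2018, 58(1): 55-60, http://lib.cqvip.com/Qikan/Article/Detail?id=674347161.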

[44] Yi, Jiangyan, Wen, Zhengqi, Tao, Jianhua, Ni, Hao, Liu, Bin. CTC Regularized Model Adaptation for Improving LSTM RNN Based Multi-Accent Mandarin Speech Recognition. JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY[J]. 2018, 90(7): 985-997, http://dx.doi.org/10.1007/s11265-017-1291-1.

[45] Huang, Jian, Tao, Jianhua, Li, Ya, Lian, Zheng, Yi, Jiangyan. End-to-End Continuous Emotion Recognition from Video Using 3D Convlstm Networks. ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing. 2018, http://ir.ia.ac.cn/handle/173211/22375.

[46] Yi Jiangyan, Tao Jianhua, Wen Zhengqi, Bai Ye. ADVERSARIAL MULTILINGUAL TRAINING FOR LOW-RESOURCE SPEECH RECOGNITION. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP). 2018, 4899-4903, http://apps.webofknowledge.com/CitedFullRecord.do?product=UA&colName=WOS&SID=5CCFccWmJJRAuMzNPjj&search_mode=CitedFullRecord&isickref=WOS:000446384605014.

[47] Huang, Jian, Tao, Jianhua, Lian, Zheng, Li, Ya, Yi, Jiangyan, Niu, Mingyue. Speech Emotion Recognition Using Semi-supervised Learning with Ladder Networks. 2018 1st Asian Conference on Affective Computing and Intelligent Interaction, ACII Asia 2018. 2018, http://ir.ia.ac.cn/handle/173211/22377.

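[48] Tao Jianhua, Yi Jiangyan, Wen Zhengqi, Liu Bin. Transfer-learning-based acoustic modeling method for robust speech recognition. Journal of Tsinghua University (清华大学学报)[J]. 2018, 58(1): 55-60, http://ir.ia.ac.cn/handle/173211/19898.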

[49] Yi Jiangyan, Ni Hao, Tao Jianhua, Wen Zhengqi. Acoustic Model Compression with Knowledge Transfer. 2017, http://ir.ia.ac.cn/handle/173211/19888.

[50] Jiangyan Yi, Jianhua Tao, Zhengqi Wen, Hao Ni. Distilling Knowledge from an Ensemble of Models for Punctuation Prediction. INTERSPEECH 2017[J]. 2017.
[51] Wuriqiqige (乌日其其格), Tao Jianhua, Baiyinmende (白音门德), Yi Jiangyan, Wen Zhengqi, Bai Ye. Construction of a standard-pronunciation Mongolian speech corpus for speech recognition. 2017, http://ir.ia.ac.cn/handle/173211/19916.

[52] Bai Ye, Yi Jiangyan, Ni Hao, Wen Zhengqi, Liu Bin, Li Ya, Tao Jianhua. End-to-end Keywords Spotting Based on Connectionist Temporal Classification for Mandarin. 2016, http://ir.ia.ac.cn/handle/173211/19742.

[53] Yi Jiangyan, Tao Jianhua, Ni Hao, Wen Zhengqi. Improving BLSTM RNN Based Mandarin Speech Recognition Using Accent Dependent Bottleneck Features. 2016, http://ir.ia.ac.cn/handle/173211/19881.

[54] Yi, Jiangyan, Ni, Hao, Wen, Zhengqi, Tao, Jianhua. Improving BLSTM RNN Based Mandarin Speech Recognition Using Accent Dependent Bottleneck Features. 2016 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA). 2016.
[55] Bai, Ye, Yi, Jiangyan, Ni, Hao, Wen, Zhengqi, Liu, Bin, Li, Ya, Tao, Jianhua, Lee, T, Xie, L, Dang, J, Wang, HM, Wei, J, Feng, H, Hou, Q, Wei, Y. End-to-end Keywords Spotting Based on Connectionist Temporal Classification for Mandarin. 2016 10TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP). 2016, http://apps.webofknowledge.com/CitedFullRecord.do?product=UA&colName=WOS&SID=5CCFccWmJJRAuMzNPjj&search_mode=CitedFullRecord&isickref=WOS:000405610900098.

[56] Yi, Jiangyan, Ni, Hao, Wen, Zhengqi, Liu, Bin, Tao, Jianhua, Lee, T, Xie, L, Dang, J, Wang, HM, Wei, J, Feng, H, Hou, Q, Wei, Y. CTC Regularized Model Adaptation for Improving LSTM RNN Based MultiAccent Mandarin Speech Recognition. 2016 10TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP)[J]. 2016, http://apps.webofknowledge.com/CitedFullRecord.do?product=UA&colName=WOS&SID=5CCFccWmJJRAuMzNPjj&search_mode=CitedFullRecord&isickref=WOS:000405610900058.

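Research Activities

Research Projects
(1) Research on cross-lingual transfer learning for small-data speech modeling, Principal Investigator, National project, 2020-01--2022-12
(2) Research on small-data acoustic models for speech recognition in strong noise, Principal Investigator, Chinese Academy of Sciences program, 2019-01--2021-12
(3) Research on key technologies for online multimedia forgery detection, Principal Investigator, National project, 2020-07--2023-06
(4) Cooperative research project on speech recognition and speech synthesis technology, Principal Investigator, Enterprise-commissioned, 2021-03--2022-03
(5) Key technologies for big-data multimodal interaction and collaboration, Participant, National project, 2017-10--2021-09
(6) Key technologies and applications of big data analysis, Participant, Chinese Academy of Sciences program, 2020-01--2020-12
(7) XXX based on multilingual, multi-channel fusion, Participant, National project, 2018-02--2021-12
(8) Key speech technologies, Principal Investigator, Enterprise-commissioned, 2021-06--2022-09
(9) Key technologies for audio-visual analysis based on continual learning, Principal Investigator, National project, 2021-10--2024-09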

Students Supervised

Current Students

Gu Hao (顾浩)  Master's student  081104 - Pattern Recognition and Intelligent Systems