Basic Information

Cheng Gaofeng (程高峰), male, Institute of Acoustics, Chinese Academy of Sciences
Email: chenggaofeng@hccl.ioa.ac.cn
Mailing address: No. 21 North 4th Ring Road, Haidian Dist
Postal code: 100190
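Admissions Information
Admissions Major
081002 - Signal and Information Processing
Admissions Research Areas
Speech signal processing, speech recognition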
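Education
2014-09--2019-06 University of Chinese Academy of Sciences, Doctor of Engineering
2010-09--2014-06 Beijing University of Posts and Telecommunications, Bachelor of Science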
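Work Experience
Employment History
2023-05~present, Institute of Acoustics, Chinese Academy of Sciences, Associate Researcher
2021-11~2023-05, Institute of Acoustics, Chinese Academy of Sciences, Assistant Researcher
2019-07~2021-11, Institute of Acoustics, Chinese Academy of Sciences, Special Research Assistant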
Professional Service
2023-10-19--2025-10-19, Acta Acustica (《声学学报》), Youth Editorial Board Member
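Patents and Awards
Awards
(1) Science and Technology Award of the Chinese Institute of Electronics, Second Prize, provincial level, 2023
(2) Beijing Science and Technology Award, Second Prize, provincial level, 2019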
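Patents
(1) A federated learning method and system for speech recognition based on private parameters, invention patent, 2022, 1st inventor, patent no.: CN114783425A
(2) A personalized federated learning method and system for speech recognition models, invention patent, 2022, 2nd inventor, patent no.: CN114783443A
(3) A multi-domain adaptive end-to-end speech recognition method, system, and electronic device, invention patent, 2021, 1st inventor, patent no.: CN113436616A
(4) A method and device for speech recognition decoding, invention patent, 2021, 1st inventor, patent no.: CN113436619A
(5) A speech keyword search method, system, and electronic device, invention patent, 2021, 1st inventor, patent no.: CN113192535A
(6) An online speech recognition technique combining connectionist temporal classification and truncated attention, invention patent, 2020, 3rd inventor, patent no.: CN111179918A
(7) An online end-to-end speech transcription method and system, invention patent, 2020, 3rd inventor, patent no.: CN111128191A
(8) A speech recognition method based on a bidirectional recurrent neural network with windowed input, invention patent, 2020, 2nd inventor, patent no.: CN111091817A
(9) A speech recognition system and method based on a hybrid acoustic model, granted patent, 2019, 2nd inventor, patent no.: CN109754790A
(10) A neural network training acceleration method based on the lattice-free maximum mutual information criterion, invention patent, 2018, 3rd inventor, patent no.: CN108629412A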
Publications
Published Papers
(1) Consistency self-supervised learning method for robust automatic speech recognition (面向鲁棒自动语音识别的一致性自监督学习方法), Acta Acustica (声学学报), 2023, 2nd author
(2) Alleviating ASR Long-Tailed Problem by Decoupling the Learning of Representation and Classification, IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 2nd author
(3) Latest Development of Multilingual Speech Recognition Acoustic Model Modeling Methods (多语言语音识别声学模型建模方法最新进展), Computer Science (计算机科学), 2022, 1st author
(4) Study on Keyword Search Framework Based on End-to-End Automatic Speech Recognition (基于端到端语音识别的关键词检索技术研究), Computer Science (计算机科学), 2022, 2nd author
(5) ETEH: Unified Attention-Based End-to-End ASR and KWS Architecture, IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 1st author
(6) Thinking and exploration into intellectualization of speech and medical acoustics (语言声学智能化的思考与探索), Scientia Sinica Physica, Mechanica & Astronomica (中国科学:物理学、力学、天文学), 2022, 2nd author
(7) An E2E-ASR-Based Iteratively-Trained Timestamp Estimator, IEEE SIGNAL PROCESSING LETTERS, 2022, 2nd author
(8) Self-Supervised Pre-Training for Attention-Based Encoder-Decoder ASR Model, IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 2nd author
(9) History Utterance Embedding Transformer LM for Speech Recognition, 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, 2nd author
(10) Keyword search using attention-based end-to-end ASR and frame-synchronous phoneme alignments, IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), 2021, 2nd author
(11) Pre-Training Transformer Decoder for End-to-End ASR Model with Unpaired Text Data, 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, 2nd author
(12) Online Hybrid CTC/Attention Architecture for End-to-End Speech Recognition, PROC. INTERSPEECH 2019, 2019, 2nd author
(13) Speech recognition using long short-term memory recurrent neural networks with highway connections (利用高速通道连接的长短时记忆循环神经网络语音识别), CHINESE JOURNAL OF ELECTRONICS, 2019, 1st author
(14) Automatic Speech Recognition System with Output-Gate Projected Gated Recurrent Unit, IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2019
(15) Using Highway Connections to Enable Deep Small-footprint LSTM-RNNs for Speech Recognition, CHINESE JOURNAL OF ELECTRONICS (电子学报:英文版), 2019
(16) Utterance-level Permutation Invariant Training with Latency-controlled BLSTM for Single-channel Multi-talker Speech Separation, 2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019
(17) Semi-Orthogonal Low-Rank Matrix Factorization for Deep Neural Networks, 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6, 2018
(18) Investigation on the combination of batch normalization and dropout in BLSTM-based acoustic modeling for ASR, 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6, 2018
(19) Bidirectional LSTM with Extended Input Context, 2018 11TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2018
(20) Output-Gate Projected Gated Recurrent Unit for Speech Recognition, 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6, 2018
(21) An exploration of dropout with LSTMs, 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6, 2017
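Research Activities
Research Projects
(1) Research on intelligent underwater XX recognition technology based on XXXX, principal investigator, national task, 2023-09--2025-12
(2) Exploration of audio target classification methods with sustainable unsupervised learning, principal investigator, institute self-deployed project, 2021-01--2023-12
(3) Intelligent speech demonstration system, principal investigator, domestic commissioned project, 2019-12--2025-01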
Conferences Attended
(1) Research on object detection and segmentation for underwater acoustic imaging based on foundation models, 2024-04-14