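Basic Information
Xu Ji  Male  Doctoral supervisor  Institute of Acoustics, Chinese Academy of Sciences
Email: xuji@hccl.ioa.ac.cn
Mailing address: 3rd floor, DSP Building, Institute of Acoustics, Chinese Academy of Sciences
Postal code: 100190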

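Admissions Information

Admissions Major
070206 - Acoustics
Admissions Research Direction
Intelligent underwater acoustic signal processing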

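Education

2011-09--2014-07   Institute of Acoustics, Chinese Academy of Sciences   Ph.D.
2008-09--2011-07   Department of Electronic Engineering, Tsinghua University   Master's
2004-09--2008-07   Department of Electronic Engineering, Tsinghua University   Bachelor's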

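Work Experience

Employment History
2019-01~present, Institute of Acoustics, Chinese Academy of Sciences, Researcher
2016-12~2018-12, Institute of Acoustics, Chinese Academy of Sciences, Associate Researcher
2014-07~2016-12, Institute of Acoustics, Chinese Academy of Sciences, Assistant Researcher
2011-09~2014-07, Institute of Acoustics, Chinese Academy of Sciences, Ph.D.
2008-09~2011-07, Department of Electronic Engineering, Tsinghua University, Master's
2004-09~2008-07, Department of Electronic Engineering, Tsinghua University, Bachelor's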

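Patents and Awards

Awards
(1) Wang Dezhao Youth Science and Technology Award, Second Prize, institute (university) level, 2021
(2) Key Technologies of an Intelligent Speech Capability Platform and Their Application in the Intelligent Customer Service Industry, Second Prize, provincial level, 2019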
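Patents
(1) A method for underwater sound source localization, 2021, first author, patent no.: 2017114540530
(2) A method for underwater target classification, 2021, first author, patent no.: 2017114412369
(3) A deep-learning-based method and system for multi-source direction finding, 2019, first author, patent no.: 2019106611463
(4) A method and system for underwater target data augmentation based on conditional adversarial neural networks, 2019, first author, patent no.: 2019107743883
(5) A method and system for recognizing speech content in multilingual continuous speech streams, 2019, first author, patent no.: 2019107829812
(6) An underwater target recognition method using unsupervised feature optimization based on intra-class and inter-class distances, 2019, first author, patent no.: 201911266932X
(7) A wavelet line-spectrum feature extraction method for deep-learning-based underwater target classification and recognition, 2019, first author, patent no.: 2019113425271
(8) A speech recognition system and method based on a transfer neural network acoustic model, 2018, first author, patent no.: 2018100775569
(9) A speech recognition method using a bidirectional recurrent neural network with windowed input, 2018, first author, patent no.: 2018112423984
(10) A multilingual speech recognition method based on joint classification of language identity and speech content, 2018, first author, patent no.: 2018109740495
(11) A deep-learning-based underwater multi-source localization method and system, 2018, first author, patent no.: 2018115640070
(12) A general-purpose voice wake-up recognition method and system under an all-phoneme framework, 2017, first author, patent no.: 2017100020973
(13) A speech recognition system and method based on a hybrid acoustic model, 2017, first author, patent no.: 2017110595924
(14) An audio-template-based spoken keyword search method, 2015, first author, patent no.: 2015102665536
(15) A method for selecting and optimizing audio keyword templates, 2015, first author, patent no.: 2015108828058
(16) A method for building an acoustic model and a speech decoding method based on that model, 2013, second author, patent no.: 2013105171492
(17) A speech recognition method and system for agglutinative languages, 2012, second author, patent no.: 2012105516760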

Publications

Published Papers
(1) Underwater Acoustic Target Recognition based on Smoothness-inducing Regularization and Spectrogram-based Data Augmentation, Ocean Engineering, 2023, first author
(2) Adaptive Direction-of-Arrival Estimation Using Deep Neural Network in Marine Acoustic Environment, IEEE Sensors Journal, 2023, corresponding author
(3) An end-to-end DOA estimation method based on deep learning for underwater acoustic array, Oceans 2022, 2022, fourth author
(4) Improving CTC-based speech recognition via knowledge transferring from pre-trained language models, 2022, sixth author
(5) Underwater-art: Expanding information perspectives with text templates for underwater acoustic target recognition, JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2022, corresponding author
(6) Adaptive ship-radiated noise recognition with learnable fine-grained wavelet transform, Ocean Engineering, 2022, corresponding author
(7) Open Source MagicData-RAMC: A Rich Annotated Mandarin Conversational (RAMC) Speech Dataset, 2022, seventh author
(8) UALF: A learnable front-end for intelligent underwater acoustic classification system, OCEAN ENGINEERING, 2022, corresponding author
(9) Improving Transformer based End-to-End Code-Switching Speech Recognition using Language Identification information, Applied Sciences, 2021, corresponding author
(10) Tackling long-tail data distribution problem of deep learning based underwater target recognition system, OCEANS 2021 San Diego, 2021, fifth author
(11) Toward Alleviating the Data Sparsity Problem of Deep Learning Based Underwater Target Classification, OCEANS 2021 San Diego, 2021, fourth author
(12) Automated Detection of Marine Mammal Species based on Short-Time Fractional Fourier Transform, OCEANS 2021 San Diego, 2021, fourth author
(13) A unified system for multilingual speech recognition and language identification, SPEECH COMMUNICATION, 2021, corresponding author
(14) Context-dependent Label Smoothing Regularization for Attention-based End-to-End Code-Switching Speech Recognition, International Symposium on Chinese Spoken Language Processing, 2020, third author
(15) End-to-End Multilingual Speech Recognition System with Language Supervision Training, IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2020, corresponding author
(16) A feature optimization approach based on inter-class and intra-class distance for ship type classification, SENSORS, 2020, corresponding author
(17) Multi-channel underwater target recognition using deep learning, Acta Acustica (声学学报), 2020, third author
(18) Data Augmentation using Conditional Generative Adversarial Network for Underwater Target Recognition, IEEE International Conference on Signal, Information and Data Processing, 2019, third author
(19) Multiple Source Localization in a Shallow Water Waveguide Exploiting Subarray Beamforming and Deep Neural Networks, SENSORS, 2019, corresponding author
(20) Investigation of knowledge transfer approaches to improve the acoustic modeling of Vietnamese ASR system, IEEE-CAA JOURNAL OF AUTOMATICA SINICA, 2019, corresponding author
(21) MULTIPLE TEMPORAL SCALES BASED SPEAKER EMBEDDINGS LEARNING FOR TEXT-DEPENDENT SPEAKER RECOGNITION, 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, third author
(22) Advances in Underwater Target Passive Recognition Using Deep Learning, Journal of Signal Processing (信号处理), 2019, first author
(23) Feature Analysis of Passive Underwater Targets Recognition Based on Deep Neural Network, OCEANS 2019 - MARSEILLE, 2019, fifth author
(24) Automatic Speech Recognition System with Output-Gate Projected Gated Recurrent Unit, IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2019, third author
(25) Denoising Autoencoder-Based Language Feature Compensation, Journal of Computer Research and Development (计算机研究与发展), 2019, second author
(26) Identity Vector Extraction Using Shared Mixture of PLDA for Short-Time Speaker Recognition, CHINESE JOURNAL OF ELECTRONICS, 2019, second author
(27) A Deep Neural Network Based Method Of Source Localization In A Shallow Water Environment, 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, second author
(28) Underwater target classification using deep learning, OCEANS 2018 MTS/IEEE CHARLESTON, 2018, third author
(29) Source localization using deep neural networks in a shallow water environment, JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2018, corresponding author
(30) Multilingual Speech Recognition Training and Adaptation with Language-Specific Gate Units, 2018 11TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2018, third author
(31) Structure optimization and computing acceleration for convolutional neural network acoustic models, Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition) (重庆邮电大学学报:自然科学版), 2018, second author
(32) Deep Neural Network for Source Localization Using Underwater Horizontal Circular Array, 2018 OCEANS - MTS/IEEE KOBE TECHNO-OCEANS (OTO), 2018, second author
(33) A Regression Approach to Speech Source Localization Exploiting Deep Neural Network, 2018 IEEE FOURTH INTERNATIONAL CONFERENCE ON MULTIMEDIA BIG DATA (BIGMM), 2018, second author
(34) Output-Gate Projected Gated Recurrent Unit for Speech Recognition, 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6, 2018, fourth author
(35) An Improved Residual LSTM Architecture for Acoustic Modeling, 2017 2ND INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATION SYSTEMS (ICCCS2017), 2017, third author
(36) EFFECTIVE UTILIZATION OF MULTIPLE EXAMPLES IN QUERY-BY-EXAMPLE SPOKEN TERM DETECTION, 2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, first author
(37) Agglutinative Language Speech Recognition Using Automatic Allophone Deriving, CHINESE JOURNAL OF ELECTRONICS, 2016, first author
(38) Multi-lingual Unsupervised Acoustic Modeling Using Multi-task Deep Neural Network Under Mismatch Conditions, PROCEEDINGS OF 2016 8TH IEEE INTERNATIONAL CONFERENCE ON COMMUNICATION SOFTWARE AND NETWORKS (ICCSN 2016), 2016, second author
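(39) A voice wake-up recognition system based on state posterior probabilities (基于状态后验概率的语音唤醒识别系统), Youth Academic Conference of the Acoustical Society of China, 2016, second author
(40) Efficient Acoustic Modeling Method for Unsupervised Speech Recognition using Multi-Task Deep Neural Network, PROCEEDINGS OF THE 2015 4TH NATIONAL CONFERENCE ON ELECTRICAL, ELECTRONICS AND COMPUTER ENGINEERING (NCEECE 2015), 2016, third author
(41) Research on acoustic modeling methods for multilingual speech recognition (面向多语言的语音识别声学模型建模方法研究), Youth Academic Conference of the Acoustical Society of China, 2015, second author
(42) Automatic Text Corpus Generation Algorithm towards Oral Statistical Language Modeling, Acta Automatica Sinica (自动化学报), 2014, third author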
(43) ON SPEEDING UP THE DEEP NEURAL NETWORK BASED SPEECH RECOGNITION SYSTEMS, The 21st International Congress on Sound and Vibration, 2014, third author
(44) An unsupervised adaptation method for deep neural network-based large vocabulary continuous speech recognition, Journal of Information & Computational Science, 2014, third author
(45) RECURRENT NEURAL NETWORK LANGUAGE MODEL WITH VECTOR-SPACE WORD REPRESENTATIONS, The 21st International Congress on Sound and Vibration, 2014, third author
(46) Exploiting articulatory features for pitch accent detection, JOURNAL OF ZHEJIANG UNIVERSITY-SCIENCE C-COMPUTERS & ELECTRONICS, 2013, second author
(47) Automatic Allophone Deriving for Korean Speech Recognition, Ninth International Conference on Computational Intelligence & Security, 2013, first author
(48) Long Mandarin Spoken Term Detection Using Two-Stage Search, APPLIED MECHANICS AND MATERIALS, 2013, second author
(49) IMPROVE LOW-RESOURCE NON-NATIVE MISPRONUNCIATION DETECTION WITH NATIVE SPEECH BY ARTICULATORY-BASED TANDEM FEATURE, IEEE China Summit & International Conference on Signal & Information Processing, 2013, second author
(50) Spoken Term Detection Based on Improved Index Structure, JOURNAL OF SOFTWARE, 2013, second author
(51) Bottleneck Features based on Gammatone Frequency Cepstral Coefficients, 14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, third author
(52) Improving Korean LVCSR with Long-time Temporal Patterns and an Extended Phoneme Set, 2013 FOURTH GLOBAL CONGRESS ON INTELLIGENT SYSTEMS (GCIS), 2013, first author
(53) Multi-Stream Posterior Features and Combining Subspace Gmms for Low Resource LVCSR, CHINESE JOURNAL OF ELECTRONICS, 2013, second author
(54) An improved Mandarin Voice Input System using Recurrent Neural Network Language Model, PROCEEDINGS OF THE 2012 EIGHTH INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND SECURITY (CIS 2012), 2012, second author
(55) Utilizing Auxiliary Data in Phoneme Recognition Based on Articulatory Feature, International Conference on Communication Software and Networks, 2011, first author
(56) A FRAME MAPPING BASED HMM APPROACH TO CROSS-LINGUAL VOICE TRANSFORMATION, 2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, second author
(57) Strategies for Using MLP based Features with Limited Target-Language Training Data, ASRU, 2011, second author
(58) A Bayesian View on the Polynomial Distribution Model in Estimation of Distribution Algorithms, 2008 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION, VOLS 1-8, 2008, third author
(59) Reducing computational complexity of estimating multivariate histogram-based probabilistic model, 2007 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION, VOLS 1-10, PROCEEDINGS, 2007, second author

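Research Activities

Research Projects
(1) Hierarchical and structured acoustic modeling methods and system integration for multiple languages, participant, national project, 2016-01--2020-12
(2) Research on the theory and technology of constructing a “speech map”, participant, national project, 2016-01--2018-12
(3) Research on language-independent keyword search technology, principal investigator, Chinese Academy of Sciences program, 2017-01--2018-12
(4) Research on deep-learning-based underwater target localization and tracking technology, principal investigator, institute-initiated project, 2016-12--2019-12
(5) Research on the architecture of an underwater target localization and recognition system based on big data analysis, participant, institute-initiated project, 2017-07--2018-12
(6) Spoken keyword search technology for the diverse languages of the Beijing area, principal investigator, local project, 2015-06--2016-09
(7) Research on **** technology, participant, national project, 2016-12--2019-11
(8) Multilingual engine construction, principal investigator, national project, 2019-10--2021-10
(9) Research on end-to-end joint recognition of multilingual speech content and language identity, principal investigator, national project, 2020-01--2022-12
(10) Research on deep-learning-based comprehensive analysis technology for passive target localization with horizontal arrays, principal investigator, institute-initiated project, 2020-08--2023-08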
Conferences Attended
(1) Multiple Temporal Scales Based Speaker Embeddings Learning for Text-dependent Speaker Recognition   Wang Wenchao, Zhang Yike, Xu Ji, Yan Yonghong   2019-05-12
(2) Deep neural network for source localization using underwater horizontal circular array   2018-05-31
(3) EFFECTIVE UTILIZATION OF MULTIPLE EXAMPLES IN QUERY-BY-EXAMPLE SPOKEN TERM DETECTION   2016-03-23