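Basic Information

Yan Huang (黄岩)  Male  Doctoral Supervisor  Institute of Automation, Chinese Academy of Sciences

Excellent Young Scientists Fund recipient (National Natural Science Foundation of China)

President's Special Award, Chinese Academy of Sciences

Young Scientist Award, China Society of Image and Graphics

Outstanding Doctoral Dissertation Award, Chinese Association for Artificial Intelligence

Email: yhuang@nlpr.ia.ac.cn
Address: No. 95 Zhongguancun East Road, Haidian District, Beijing
Postal code: 100190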

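Biography

Yan Huang is a recipient of the Excellent Young Scientists Fund of the National Natural Science Foundation of China and an Associate Researcher at the Institute of Automation, Chinese Academy of Sciences. His research interests include vision-language understanding, multimodal robotics, and video analysis. He has published more than 100 papers in domestic and international journals and conferences in these areas, has received 3 best paper awards at domestic and international academic conferences, and has won 4 championships in mainstream domestic and international competitions. He has served as an area chair for CVPR and as co-organizing chair of 3 multimodal-themed workshops at CVPR and ICCV. His honors include the Young Scientist Award of the China Society of Image and Graphics, the President's Special Award of the Chinese Academy of Sciences, the NVIDIA Innovation Research Award, the Outstanding Doctoral Dissertation Award of the Chinese Association for Artificial Intelligence, and the Baidu Scholarship. He was selected for the Young Elite Scientists Sponsorship Program of the China Association for Science and Technology, the Beijing Nova Program, and the Microsoft StarTrack Program.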

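Admissions

I admit 1-2 doctoral students each year. Students with strong self-motivation and programming skills are encouraged to contact me by email (yhuang@nlpr.ia.ac.cn).

I have supervised or co-supervised more than 20 master's and doctoral students. Their honors include the Chinese Academy of Sciences President's Award, Beijing Outstanding Graduate, the Institute of Automation First-Class Scholarship, an ICDAR Best Paper Award nomination, and first place in the ICCV 2019 VOT, ICCV 2019 WIDER, and CVPR 2022 Habitat international challenges.

Admission Major
081104 - Pattern Recognition and Intelligent Systems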

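Education

2012-09--2017-07   University of Chinese Academy of Sciences   Doctorate
2008-09--2012-07   University of Electronic Science and Technology of China   Bachelor's degree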

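Work Experience

Employment
2019-11 to present, Institute of Automation, Chinese Academy of Sciences, Associate Researcher
2017-07 to 2019-10, Institute of Automation, Chinese Academy of Sciences, Assistant Researcher
Professional Service
2020-01-01 to 2024-12-31, Technical Committee on Computer Vision, China Computer Federation, Deputy Secretary-General
2020-01-01 to 2020-07-01, CVPR2020 Workshop on Language & Vision with Applications to Video Understanding, Organizing Chair
2020-01-01 to 2020-07-01, CVPR2020 Workshop on Multimodal Learning, Organizing Chair
2019-05-01 to 2019-11-30, ICCV2019 Workshop Cross-Modal Learning in Real World, Deputy Secretary-General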

Courses Taught

Deep Learning

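Patents and Awards

Awards
(1) Young Scientist Award, Other, 2022
(2) Beijing Outstanding Young Talent, Ministerial level, 2020
(3) Young Elite Scientists Sponsorship Program, China Association for Science and Technology, Ministerial level, 2020
(4) Beijing Nova Program, Ministerial level, 2020
(5) Microsoft StarTrack Program, Other, 2019
(6) NVIDIA Innovation Research Award (NVIDIA创新研究奖), Other, 2018
(7) Outstanding Doctoral Dissertation Award, Chinese Association for Artificial Intelligence, Other, 2018
(8) Outstanding Doctoral Dissertation Award, Chinese Academy of Sciences, Academy level, 2018
(9) President's Special Award, Chinese Academy of Sciences, Academy level, 2017
(10) Baidu Scholarship, Other, 2016
(11) RACV Best Poster Award, Other, 2016
(12) ICPR Best Student Paper Award, Other, 2014
(13) CVPR Workshop Best Paper Award, Other, 2014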
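Patents
[1] Tieniu Tan, Liang Wang, Yan Huang, Zhengxiong Luo, Zikun Liu, Jianxing Zhang. End-to-end multi-frame super-resolution method and system based on correlation dynamic filtering. CN: CN114972038A, 2022-08-30.
[2] Liang Wang, Yan Huang, Zerui Chen. Cross-modal retrieval method, apparatus, device, and computer-readable storage medium. CN: CN112487217A, 2021-03-12.
[3] Liang Wang, Yan Huang, Chunfeng Song. Person re-identification method and system based on segmented silhouettes. CN: CN109101866B, 2020-12-15.
[4] Liang Wang, Yan Huang, Linjiang Huang. Weakly supervised temporal action localization method and apparatus based on a relational prototypical network. CN: CN111783713A, 2020-10-16.
[5] Liang Wang, Yan Huang, Linjiang Huang. Skeleton-based action recognition method and apparatus at the body-part level. CN: CN111783711A, 2020-10-16.
[6] Liang Wang, Yan Huang, Kai Niu. Cross-modal retrieval re-ranking method based on adaptive metric fusion. CN: CN111026935A, 2020-04-17.
[7] Liang Wang, Yan Huang, Chunfeng Song, Tianyu Sun. Frame-rate-enhanced gait recognition method and apparatus based on generative adversarial networks. CN: CN108681689A, 2018-10-19.
[8] Liang Wang, Zhaoxiang Zhang, Yan Huang, Lin Li. Method and apparatus for recognizing the actions and behavior of a target subject. CN: CN108629326A, 2018-10-09.
[9] Liang Wang, Yan Huang, Chunfeng Song, Yanyun Wang. Cross-view gait recognition apparatus and training method based on a two-stream generative adversarial network. CN: CN108596026A, 2018-09-28.
[10] Liang Wang, Yan Huang, Wenlong Cheng. Retrieval method and system for pointing questions in unconstrained visual question answering. CN: CN108446404A, 2018-08-24.
[11] Liang Wang, Wei Wang, Yan Huang. Video super-resolution method and system based on bidirectional recurrent convolutional networks. CN: CN105072373A, 2015-11-18.
[12] Liang Wang, Tieniu Tan, Wei Wang, Yan Huang. Face verification method and system. CN: CN104363981A, 2015-02-18.
[13] Liang Wang, Tieniu Tan, Wei Wang, Yan Huang. Multimodal data fusion method and system based on a discriminative multimodal deep belief network. CN: CN103838836A, 2014-06-04.
[14] Tieniu Tan, Liang Wang, Wei Wang, Yan Huang. Data recognition method and apparatus based on a multi-task deep neural network. CN: CN103345656A, 2013-10-09.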

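Publications

He has published (including accepted papers) more than 80 papers in international journals and conferences in related fields, of which more than 40 appeared in authoritative journals and conferences of the field. As first author, he has published 4 papers in the top journal TPAMI and, at top conferences, 2 papers at CVPR, 2 at ICCV, 2 at NeurIPS, and 1 at AAAI. For a more complete list of papers, see: https://scholar.google.com/citations?user=6nUJrQ0AAAAJ&hl=zh-CN

Books
(1) Deep Cognitive Networks, Springer, 2023-03, first author
Selected Journal Papers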

  1. Dong An, Hanqing Wang, Wenguan Wang, Zun Wang, Yan Huang, Keji He, and Liang Wang, ETPNav: Evolving Topological Planning for Vision-Language Navigation in Continuous Environments, IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI), accepted, 2024.

  2. Yan Huang, Yuming Wang, and Liang Wang, Efficient Image and Sentence Matching, IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI), 45(3): 2970-2983, 2023. 

  3. Chong Liu, Yuqi Zhang, Hongsong Wang, Weihua Chen, Fan Wang, Yan Huang, Yi-Dong Shen, and Liang Wang, Efficient Token-Guided Image-Text Retrieval with Consistent Multimodal Contrastive Training, IEEE Transactions on Image Processing (IEEE TIP), accepted, 2023. 

  4. Zhengxiong Luo, Yan Huang, Shang Li, Liang Wang, and Tieniu Tan, End-to-End Alternating Optimization for Real-World Blind Super Resolution, International Journal of Computer Vision (IJCV), accepted, 2023. 

  5. Yan Huang, Jingdong Wang, and Liang Wang, Few-Shot Image and Sentence Matching via Aligned Cross-Modal Memory, IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI), 44(6): 2968-2983, 2022. 

  6. Jianhua Yang, Yan Huang, Kai Niu, Linjiang Huang, Zhanyu Ma, and Liang Wang, Actor and Action Modular Network for Text-based Video Segmentation, IEEE Transactions on Image Processing (IEEE TIP), 31: 4474-4489, 2022. 

  7. Hongyuan Yu, Houwen Peng, Yan Huang, Hao Du, Jianlong Fu, Liang Wang, and Haibin Ling, Cyclic Differentiable Architecture Search, IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI), 45(1): 211-228, 2022. 

  8. Zerui Chen, Yan Huang, Hongyuan Yu, and Liang Wang, Learning a Robust Part-Aware Monocular 3D Human Pose Estimator via Neural Architecture Search, International Journal of Computer Vision (IJCV), 130: 56–75, 2022. 

  9. Yuchun Fang, Zhengye Xiao, Wei Zhang, Yan Huang, Liang Wang, Nozha Boujemaa, and Donald Geman, Attribute Prototype Learning for Interactive Face Retrieval, IEEE Transactions on Information Forensics and Security (IEEE TIFS), 16: 2593-2607, 2021. 

  10. Linjiang Huang, Yan Huang, Wanli Ouyang, and Liang Wang, Two-Branch Relational Prototypical Network for Weakly Supervised Temporal Action Localization, IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI), 44(9): 5729-5746, 2022. 

  11. Linjiang Huang, Yan Huang, Wanli Ouyang, and Liang Wang, Modeling Sub-Actions for Weakly Supervised Temporal Action Localization, IEEE Transactions on Image Processing (IEEE TIP), 30: 5154-5167, 2021. 

  12. Yan Huang, Qi Wu, Wei Wang, and Liang Wang, Image and Sentence Matching via Semantic Concepts and Order Learning, IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI), 42(3): 636-650, 2020. 

  13. Kai Niu, Yan Huang, Wanli Ouyang, and Liang Wang, Improving Description-based Person Re-identification by Multi-granularity Image-text Alignments, IEEE Transactions on Image Processing (IEEE TIP), 29: 5542-5556, 2020. 

  14. Yan Huang, Wei Wang, and Liang Wang, Video Super-resolution via Bidirectional Recurrent Convolutional Networks, IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI), 40(4): 1015-1028, 2018.

  15. Yan Huang, Wei Wang, Liang Wang, and Tieniu Tan, Conditional High-order Boltzmann Machines for Supervised Relation Learning, IEEE Transactions on Image Processing (IEEE TIP), 26(9):4297-4310, 2017. 

  16. Yan Huang, Wei Wang, and Liang Wang, Unconstrained Multimodal Multi-Label Learning, IEEE Transactions on Multimedia (IEEE TMM), 17(11):1923-1935, 2015.

Selected Conference Papers
  1. Yunan Zeng, Yan Huang, Jinjin Zhang, Zequn Jie, Zhenhua Chai, and Liang Wang, Investigating Compositional Challenges in Vision-Language Models for Visual Grounding, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), accepted, 2024. (Highlight)

  2. Keji He, Chenyang Si, Zhihe Lu, Yan Huang, Liang Wang, and Xinchao Wang, Frequency-Enhanced Data Augmentation for Vision-and-Language Navigation, Neural Information Processing Systems (NeurIPS), 2023.

  3. Dong An, Yuankai Qi, Yangguang Li, Yan Huang, Liang Wang, Tieniu Tan, and Jing Shao, BEVBert: Multimodal Map Pre-training for Language-guided Navigation, IEEE International Conference on Computer Vision (ICCV), pp. 2737-2748, 2023. 

  4. Jilong Wang, Saihui Hou, Yan Huang, Chunshui Cao, Xu Liu, Yongzhen Huang, and Liang Wang, Causal Intervention for Sparse-View Gait Recognition, ACM Conference on Multimedia (MM), accepted, 2023. 

  5. Zhengxiong Luo, Dayou Chen, Yingya Zhang, Yan Huang, Liang Wang, Yujun Shen, Deli Zhao, Jingren Zhou, and Tieniu Tan, VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 10209-10218, 2023. 

  6. Ke Han, Shaogang Gong, Yan Huang, Liang Wang, Tieniu Tan, Clothing-Change Feature Augmentation for Person Re-Identification, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 22066-22075, 2023. 

  7. Weichen Yu, Tianyu Pang, Qian Liu, Chao Du, Bingyi Kang, Yan Huang, Min Lin, and Shuicheng Yan, Bag of Tricks for Training Data Extraction from Language Models, International Conference on Machine Learning (ICML), 2023.

  8. Yan Huang, Yuming Wang, Yunan Zeng, and Liang Wang, MACK: Multimodal Aligned Conceptual Knowledge for Unpaired Image-text Matching, Neural Information Processing Systems (NeurIPS), 2022. 

  9. Kai Niu, Linjiang Huang, Yan Huang, Peng Wang, Liang Wang, and Yanning Zhang, Cross-modal Co-occurrence Attributes Alignments for Person Search by Language, ACM Conference on Multimedia (MM), pp. 4426–4434, 2022. 

  10. Weichen Yu, Hongyuan Yu, Yan Huang, and Liang Wang, Generalized Inter-class Loss for Gait Recognition, ACM Conference on Multimedia (MM), pp. 141–150, 2022. 

  11. Hongyuan Yu, Tian Li, Weichen Yu, Jianguo Li, Yan Huang, Liang Wang, and Alex Liu, Regularized Graph Structure Learning with Semantic Knowledge for Multi-variates Time-Series Forecasting, International Joint Conference on Artificial Intelligence (IJCAI), 2362-2368, 2022. 

  12. Zhengxiong Luo, Yan Huang*, Shang Li, Liang Wang, and Tieniu Tan, Learning the Degradation Distribution for Blind Image Super-Resolution, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), accepted, 2022. 

  13. Ke Han, Chenyang Si, Yan Huang*, Liang Wang, and Tieniu Tan, Generalizable Person Re-Identification via Self-Supervised Batch Norm Test-Time Adaption, AAAI Conference on Artificial Intelligence (AAAI), accepted, 2022. 

  14. Keji He, Yan Huang, Qi Wu, Jianhua Yang, Dong An, Shuanglin Sima, and Liang Wang, Landmark-RxR: Solving Vision-and-Language Navigation with Fine-Grained Alignment Supervision, Neural Information Processing Systems (NeurIPS), 2021. 

  15. Dong An, Yuankai Qi, Yan Huang*, Qi Wu, Liang Wang, and Tieniu Tan, Neighbor-view Enhanced Model for Vision and Language Navigation, ACM Conference on Multimedia (MM), accepted, 2021. (Oral) 

  16. Zhengxiong Luo, Zhicheng Wang, Yan Huang, Shang Li, Liang Wang, Tieniu Tan, and Erjin Zhou, Rethinking the Heatmap Regression for Bottom-Up Human Pose Estimation, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 13264-13273, 2021. 

  17. Zhengxiong Luo, Yan Huang*, Shang Li, Liang Wang, and Tieniu Tan, Unfolding the Alternating Optimization for Blind Super Resolution, Neural Information Processing Systems (NeurIPS), 2020. 

  18. Kai Niu, Yan Huang, and Liang Wang, Textual Dependency Embedding for Person Search by Language, ACM Conference on Multimedia (MM), pp. 4032–4040, 2020. 

  19. Zerui Chen, Yan Huang, Hongyuan Yu, Bin Xue, Ke Han, Yiru Guo, and Liang Wang, Towards Part-aware Monocular 3D Human Pose Estimation: An Architecture Search Approach, European Conference on Computer Vision (ECCV), accepted, 2020. (Spotlight) 

  20. Ke Han, Yan Huang, Zerui Chen, Liang Wang, Tieniu Tan, Prediction, Recovery and Identification: Adaptive Low-Resolution Person Re-Identification, European Conference on Computer Vision (ECCV), accepted, 2020. 

  21. Linjiang Huang, Yan Huang, Wanli Ouyang, and Liang Wang, Relational Prototypical Network for Weakly Supervised Temporal Action Localization, AAAI Conference on Artificial Intelligence (AAAI), accepted, 2020. (Oral) 

  22. Linjiang Huang, Yan Huang, Wanli Ouyang, and Liang Wang, Part-Level Graph Convolutional Network for Skeleton-Based Action Recognition, AAAI Conference on Artificial Intelligence (AAAI), accepted, 2020. (Oral) 

  23. Yan Huang and Liang Wang, ACMM: Aligned Cross-Modal Memory For Few-Shot Image and Sentence Matching, IEEE International Conference on Computer Vision (ICCV), pp. 5774-5783, 2019. 

  24. Yan Huang, Yang Long, and Liang Wang, Few-Shot Image and Sentence Matching via Gated Visual-Semantic Embedding, AAAI Conference on Artificial Intelligence (AAAI), pp. 8489-8496, 2019. (Spotlight) 

  25. Weining Wang, Yan Huang, and Liang Wang, Language-driven Temporal Activity Localization: A Semantic Matching Reinforcement Learning Model, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 334-343, 2019. (Oral) 

  26. Chunfeng Song, Yan Huang, Wanli Ouyang, and Liang Wang, Box-driven Class-wise Region Masking and Filling Rate Guided Loss for Weakly Supervised Semantic Segmentation, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3136-3145, 2019. 

  27. Yan Huang, Qi Wu, Chunfeng Song, and Liang Wang, Learning Semantic Concepts and Order for Image and Sentence Matching, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6163-6171, 2018. (Spotlight) 

  28. Chunfeng Song, Yan Huang, Wanli Ouyang, and Liang Wang, Mask-Guided Contrastive Attention Model for Person Re-Identification, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1179-1188, 2018.

  29. Junbo Wang, Wei Wang, Yan Huang, Liang Wang, and Tieniu Tan, Multimodal Memory Modelling for Video Captioning, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7512-7520, 2018. (Spotlight) 

  30. Junbo Wang, Wei Wang, Yan Huang, Liang Wang, and Tieniu Tan, Hierarchical Memory Modelling for Video Captioning, ACM Conference on Multimedia (MM), pp. 63-71, 2018. 

  31. Chenglong Li, Chengli Zhu, Yan Huang, Jin Tang, and Liang Wang, Cross-Modal Ranking with Soft Consistency and Noisy Labels for Robust RGB-T Tracking, European Conference on Computer Vision (ECCV), pp. 831-847, 2018. 

  32. Yan Huang, Wei Wang, and Liang Wang, Instance-aware Image and Sentence Matching with Selective Multimodal LSTM, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2310-2318, 2017. 

  33. Zhen Zhou, Yan Huang, Wei Wang, Liang Wang, and Tieniu Tan, See the Forest for the Trees: Joint Spatial and Temporal Recurrent Neural Networks for Video-based Person Re-identification, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6776-6785, 2017.

  34. Yan Huang, Wei Wang, and Liang Wang, Bidirectional Recurrent Convolutional Networks for Multi-Frame Super-Resolution, Neural Information Processing Systems (NeurIPS), pp. 235-243, 2015. 

  35. Yan Huang, Wei Wang, and Liang Wang, Conditional High-order Boltzmann Machine: A Supervised Learning Model for Relation Learning, IEEE International Conference on Computer Vision (ICCV), pp. 4265-4273, 2015. 

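Research Activities

Research Projects
(1) Research on Deep Learning Theory and Methods for Visual Cognition, Principal Investigator, Chinese Academy of Sciences program, 2019-09--2024-08
(2) Complex Action and Event Analysis Based on Hierarchical Modeling and Joint Task Learning, Principal Investigator, national project, 2019-01--2021-12
(3) Tracking and Localization of Moving Targets in Sports Videos, Principal Investigator, domestic commissioned project, 2018-12--2024-06
(4) Adaptive Perception for Open Environments, Principal Investigator, national project, 2019-12--2023-12
(5) Beijing Nova Program project, Principal Investigator, local project, 2020-09--2023-08
(6) Youth Innovation Promotion Association of the Chinese Academy of Sciences project, Principal Investigator, Chinese Academy of Sciences program, 2021-01--2025-01
(7) Multimodal Semantic Understanding, Principal Investigator, national project, 2024-01--2026-12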

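(Co-)Supervised Students and Their Destinations

Zhengxiong Luo, Ph.D., graduated 2023, Beijing Academy of Artificial Intelligence

Ke Han, Ph.D., graduated 2023, University of Trento

Weichen Yu, Master's, graduated 2023, Carnegie Mellon University

Hongyuan Yu, Ph.D., graduated 2022, Xiaomi Group

Zerui Chen, Master's, graduated 2021, INRIA

Kai Niu, Ph.D., graduated 2020, Northwestern Polytechnical University

Weining Wang, Ph.D., graduated 2020, Institute of Automation, Chinese Academy of Sciences

Linjiang Huang, Ph.D., graduated 2020, Beihang University

Chunfeng Song, Ph.D., graduated 2020, Shanghai Artificial Intelligence Laboratory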