General

Shuang Yang, Associate Professor, Key Laboratory of Intelligent Information Processing, Institute of Computing Technology (ICT), Chinese Academy of Sciences (CAS).

Email: shuang.yang@ict.ac.cn

Address: No.6 Kexueyuan South Rd., Haidian District, Beijing, China

Postcode: 100190

Personal homepage: @Google Scholar

Team homepage: http://vipl.ict.ac.cn/research/speech/

Research Areas

Computer vision, Pattern Recognition, Audio-Visual Speech Representation Learning and Understanding

Education

  • 2011-2016: Ph.D., Institute of Automation, Chinese Academy of Sciences
  • 2008-2011: M.E., Hunan University
  • 2004-2008: B.E., Hunan University

Experience

Work Experience
2020.10-present: Associate Professor, Institute of Computing Technology, CAS

2016.7-2020.9: Assistant Professor, Institute of Computing Technology, CAS

Publications

  1. Cooperative Dual Attention for Audio-Visual Speech Enhancement with Visual Cues. F Wang, S Yang, S Shan, X Chen. British Machine Vision Conference (BMVC), Aberdeen, UK, Nov. 20-24, 2023.

  2. Learning Separable Hidden Unit Contributions for Speaker-Adaptive Lip-Reading. S Luo, S Yang, S Shan, X Chen. British Machine Vision Conference, Aberdeen, UK, Nov. 20-24, 2023.

  3. UniLip: Learning Visual-Textual Mapping with Uni-Modal Data for Lip Reading. B Xia, S Yang, S Shan, X Chen. British Machine Vision Conference, Aberdeen, UK, Nov. 20-24, 2023.

  4. Audio-Driven Deformation Flow for Effective Lip Reading. D Feng, S Yang, S Shan, X Chen. 26th International Conference on Pattern Recognition (ICPR), pp. 274-280, Aug. 21-25, 2022.

  5. An Efficient Software for Building Lip Reading Models Without Pains. D Feng, S Yang, S Shan. IEEE International Conference on Multimedia & Expo Workshops (ICMEW), pp. 1-2, Virtual Event, Jul. 5-9, 2021.

  6. 《机器学习·应用视角》 (Machine Learning: An Applied Perspective), China Machine Press, co-translator, 2020

  7. UniCon: Unified Context Network for Robust Active Speaker Detection. Y Zhang, S Liang, S Yang, X Liu, Z Wu, S Shan, X Chen. ACM International Conference on Multimedia (ACM Multimedia), pp. 3964-3972, Chengdu, China, Oct. 20-24, 2021.

  8. Mutual Information Maximization for Effective Lip Reading, X Zhao, S Yang, S Shan, X Chen, IEEE International Conference on Automatic Face and Gesture Recognition, 2020

  9. Deformation Flow Based Two-Stream Network for Lip Reading, J Xiao, S Yang, Y Zhang, S Shan, X Chen, IEEE International Conference on Automatic Face and Gesture Recognition, 2020

  10. Pseudo-Convolutional Policy Gradient for Sequence-to-Sequence Lip-Reading, M Luo, S Yang, S Shan, X Chen, IEEE International Conference on Automatic Face and Gesture Recognition, 2020

  11. Can We Read Speech Beyond the Lips? Rethinking RoI Selection for Deep Visual Speech Recognition, Y Zhang, S Yang, J Xiao, S Shan, X Chen, IEEE International Conference on Automatic Face and Gesture Recognition, 2020

  12. A Novel Pseudo Viewpoint based Holoscopic 3D Micro-Gesture Recognition, Y Liu, S Yang, H Meng, MR Swash, S Shan, ACM ICMI, 2020

  13. Synchronous Bidirectional Learning for Multilingual Lip Reading, M Luo, S Yang, X Chen, Z Liu, S Shan, The British Machine Vision Conference (BMVC), 2020

  14. LRW-1000: A naturally-distributed large-scale benchmark for lip reading in the wild, S Yang, Y Zhang, D Feng, M Yang, C Wang, J Xiao, K Long, S Shan, X Chen, IEEE International Conference on Automatic Face & Gesture Recognition, 2019

  15. Multi-Task Learning for Audio-Visual Active Speaker Detection, YH Zhang, J Xiao, S Yang, S Shan, The ActivityNet Large-Scale Activity Recognition Challenge @ CVPR 2019, 2019

  16. TinyPoseNet: A Fast and Compact Deep Network for Robust Head Pose Estimation, S Li, L Wang, S Yang, Y Wang, C Wang, International Conference on Neural Information Processing, 2017

  17. The Class-specific Oriented Attributes for Action Recognition, H Yang, B Wu, S Yang, C Yuan, W Hu, Chinese Association for Artificial Intelligence, 2016

  18. Hierarchical Bayesian Multiple Kernel Learning Based Feature Fusion for Action Recognition, W Sun, C Yuan, P Wang, S Yang, W Hu, Z Cai, Multimodal Pattern Recognition of Social Signals in Human-Computer-Interaction, 2016

  19. Multi-feature max-margin hierarchical bayesian model for action recognition, S Yang, C Yuan, B Wu, W Hu, F Wang, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

  20. Human Action Recognition Based on Oriented Motion Salient Regions, B Wu, S Yang, C Yuan, W Hu, F Wang, Asian Conference on Computer Vision Workshop, 2014

  21. A hierarchical model based on latent dirichlet allocation for action recognition, S Yang, C Yuan, W Hu, X Ding, International Conference on Pattern Recognition, 2014

  22. Learning human actions by combining global dynamics and local appearance, G Luo, S Yang, G Tian, C Yuan, W Hu, SJ Maybank, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014

  23. Combining sparse appearance features and dense motion features via random forest for action detection, S Yang, C Yuan, H Wang, W Hu, IEEE International Conference on Acoustics, Speech and Signal Processing, 2013

  24. Multi-task sparse learning with beta process prior for action recognition, C Yuan, W Hu, G Tian, S Yang, H Wang, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2013

  25. Online Detection and Tracking Method of Foreign Substances in Ampoules in High-speed Pharmaceutical Lines, S Yang, Y Wang, Chinese Journal of Scientific Instrument, 2011

  26. A Detection System for Impurity of Ampoule Injection Based on Machine-vision, S Yang, Y Wang, Opto-Electronic Engineering