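Basic Information

Wang Shuhui (王树徽)  Research Professor  Ph.D. Supervisor  Institute of Computing Technology, Chinese Academy of Sciences
Email: wangshuhui@ict.ac.cn
Address: No. 6 Kexueyuan South Road, Haidian District, Beijing
Postal code: 100190


He received his B.Eng. degree from Tsinghua University in 2006 and his Ph.D. in engineering from the Institute of Computing Technology, Chinese Academy of Sciences, in July 2012. His research covers cross-media understanding and knowledge reasoning, big data theory and methods, and machine learning. He has published or had accepted more than 50 papers in top journals and conferences in multimedia, computer vision, data science, and artificial intelligence, including the IEEE/ACM transactions TPAMI, TIP, TKDE, and TMM, as well as NeurIPS, ICCV, CVPR, ACMMM, SIGMOD, and VLDB, and he holds 4 granted national patents. He has served several times as an Area Chair of ACM Multimedia, has helped organize a number of international conferences, and reviews for dozens of high-level international journals and conferences. He has led or participated in major research programs, including the Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project and projects under the 973 and 863 Programs, and he is a recipient of the Excellent Young Scientists Fund of the National Natural Science Foundation of China. He maintains active research collaborations with several Internet companies.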

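Research Areas

Visual/multimedia analysis: image and video semantic understanding, cross-media analysis and reasoning

Machine learning: metric learning, correlation learning, transfer learning

Data mining: social media information mining, cross-content retrieval, user behavior modeling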

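Admissions

Admission programs
081203 - Computer Application Technology
081202 - Computer Software and Theory
Research directions
Image and video understanding, visual summarization, visual-semantic retrieval
Cross-media knowledge representation, knowledge graph construction and analysis, cross-media knowledge reasoning
Deep learning, nonparametric statistical learning, open-domain and transfer learning, etc.
Notes

Students with a strong interest and a relevant research background in frontier topics such as image and video understanding, image-text retrieval and content transformation and generation, cross-media analysis and reasoning, and cross-media knowledge engineering are welcome to apply for the Ph.D. and master's programs!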


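Work Experience

2020-10 to present,  Institute of Computing Technology, Chinese Academy of Sciences, Research Professor
2015-09 to 2020-09,  Institute of Computing Technology, Chinese Academy of Sciences, Associate Research Professor
2014-10 to 2015-09,  Institute of Computing Technology, Chinese Academy of Sciences, Assistant Research Professor
2012-08 to 2014-10,  Institute of Computing Technology, Chinese Academy of Sciences, Postdoctoral Researcher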

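Patents and Awards

Awards
(1) Wu Wenjun Artificial Intelligence Natural Science Award, First Prize, ministerial level, 2020
(2) Beijing Science and Technology Progress Award, Second Prize, provincial level, 2020
(3) Best Paper Award, 2016 National Conference on Multimedia Technology (NCMT), Grand Prize, other, 2016
(4) China Computer Federation (CCF) Science and Technology Award, other, 2012
(5) Chinese Academy of Sciences President's Award (Excellence Prize), academy level, 2012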
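Patents
[1] Wang Shuhui, Song Guoli, Huang Qingming. A cross-modal retrieval method and system based on semantic conditional correlation learning. CN: CN112100410A, 2020-12-18.

[2] Li Liang, Yang Shijie, Wang Shuhui, Huang Qingming. A method and system for candidate answer sentence generation and natural language selection. CN: CN110727768A, 2020-01-24.

[3] Wang Shuhui, Chen Yangyu, Huang Qingming, Zhang Weigang. A video content description method and system based on frame selection. CN: CN109409221A, 2019-03-01.

[4] Wang Shuhui, Wu Yiling, Huang Qingming. A semantics-preserving cross-modal content retrieval method and system. CN: CN109284414A, 2019-01-29.

[5] Huang Qingming, Zhang Liang, Wang Shuhui. A cross-media training and retrieval method based on deep discriminative ranking learning. CN: CN107657008A, 2018-02-02.

[6] Huang Qingming, Chu Lingyang, Zhang Yanyan, Wang Shuhui, Jiang Shuqiang. A visual dictionary generation method and system based on dense subgraphs. CN: CN104239398B, 2017-11-21.

[7] Huang Qingming, Zhang Yanyan, Chu Lingyang, Li Guorong, Wang Shuhui, Zhang Weigang. A cross-media topic detection method and apparatus based on multimodal information fusion and graph clustering. CN: CN103995804A, 2014-08-20.

[8] Wang Shuhui, Shen Li, Huang Qingming, Jiang Shuqiang. An image classification method and system based on a tree structure. CN: CN103324954A, 2013-09-25.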

Publications

Published Papers
[1] Ding, Guanqi, Han, Xinzhe, Wang, Shuhui, Wu, Shuzhe, Jin, Xin, Tu, Dandan, Huang, Qingming. Attribute Group Editing for Reliable Few-shot Image Generation. CVPR. 2022.
[2] Ye, Hanhua, Li, Guorong, Qi, Yuankai, Wang, Shuhui, Huang, Qingming, Yang, MingHsuan. Hierarchical Modular Network for Video Captioning. CVPR. 2022.
[3] Deng Jincan, Li Liang, Zhang Beichen, Wang Shuhui, Zheng-Jun Zha, Huang, Qingming. Syntax-Guided Hierarchical Attention Network for Video Captioning. IEEE Transactions on Circuits and Systems for Video Technology[J]. 2022, 32(2): 880-892.
[4] Zhang, Jinghao, Zhu, Yanqiao, Liu, Qiang, Wu, Shu, Wang, Shuhui, Wang, Liang. Mining Latent Structures for Multimedia Recommendation. ACM Multimedia 2021 (Oral). 2021, http://arxiv.org/abs/2104.09036.
[5] Yang, Shijie, Li, Liang, Wang, Shuhui, Zhang, Weigang, Huang, Qingming, Tian, Qi. Graph Regularized Encoder-Decoder Networks for Image Representation Learning. IEEE TRANSACTIONS ON MULTIMEDIA[J]. 2021, 23: 3124-3136, http://apps.webofknowledge.com/CitedFullRecord.do?product=UA&colName=WOS&SID=5CCFccWmJJRAuMzNPjj&search_mode=CitedFullRecord&isickref=WOS:000698902000014.
[6] Han, Xinzhe, Wang, Shuhui, Su, Chi, Huang, Qingming, Tian, Qi. Greedy Gradient Ensemble for Robust Visual Question Answering. ICCV. 2021, http://arxiv.org/abs/2107.12651.
[7] Mao, Xiaofeng, Chen, Yuefeng, Wang, Shuhui, Su, Hang, He, Yuan, Xue, Hui. Composite Adversarial Attacks. AAAI. 2021, http://arxiv.org/abs/2012.05434.
[8] Wang Shuhui, Yan Xu, Huang Qingming. A Survey of Cross-Media Analysis and Reasoning Techniques (in Chinese). Computer Science. 2021, 48(3): 79-86, http://lib.cqvip.com/Qikan/Article/Detail?id=7103984849.
[9] Yan, Xu, Fei, Zhengcong, Li, Zekang, Wang, Shuhui, Huang, Qingming, Tian, Qi. Semi-Autoregressive Image Captioning. ACM Multimedia. 2021.
[10] Liu, Xuejing, Li, Liang, Wang, Shuhui, Zha, ZhengJun, Huang, Qingming. Local-binarized very deep residual network for visual categorization. NEUROCOMPUTING[J]. 2021, 430: 82-93, http://dx.doi.org/10.1016/j.neucom.2020.11.041.
[11] Zhaobo Qi, Shuhui Wang, Chi Su, Li Su, Qingming Huang, Qi Tian. Self-Regulated Learning for Egocentric Video Activity Anticipation. IEEE Transactions on Pattern Analysis and Machine Intelligence[J]. 2021, https://ieeexplore.ieee.org/document/9356220.
[12] Li, Xiaodan, Li, Jinfeng, Chen, Yuefeng, Ye, Shaokai, He, Yuan, Wang, Shuhui, Su, Hang, Xue, Hui. QAIR: Practical Query-efficient Black-Box Attacks for Image Retrieval. CVPR. 2021, http://arxiv.org/abs/2103.02927.
[13] Chen Weidong, Li Guorong, Zhang Xinfeng, Yu Hongyang, Wang Shuhui, Huang Qingming. Cascade Cross-modal Attention Network for Video Actor and Action Segmentation from a Sentence. ACM Multimedia. 2021.
[14] Wu, Yiling, Wang, Shuhui, Song, Guoli, Huang, Qingming. Augmented Adversarial Training for Cross-Modal Retrieval. IEEE TRANSACTIONS ON MULTIMEDIA[J]. 2021, 23: 559-571, https://www.webofscience.com/wos/woscc/full-record/WOS:000613560200004.
[15] Liu, Mengyi, Wang, Shuhui, Guo, Yulan, He, Yuan, Xue, Hui. Pano-SfMLearner: Self-Supervised Multi-Task Learning of Depth and Semantics in Panoramic Videos. IEEE SIGNAL PROCESSING LETTERS[J]. 2021, 28: 832-836, http://dx.doi.org/10.1109/LSP.2021.3073627.
[16] Song, Guoli, Wang, Shuhui, Huang, Qingming, Tian, Qi. Harmonized Multimodal Learning with Gaussian Process Latent Variable Models. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE[J]. 2021, 43(3): 858-872, https://www.webofscience.com/wos/woscc/full-record/WOS:000616309900008.
[17] Jingru Gan, Jinchang Luo, Haiwei Wang, Wang Shuhui, Wei He, Huang, Qingming. Multimodal Entity Linking: A New Dataset and A Baseline. ACMMM (CCF-A, oral). 2021.
[18] Han Xinzhe, Wang Shuhui, Chi Su, Zhang, Weigang, Huang, Qingming, Qi Tian. Interpretable Visual Reasoning via Probabilistic Formulation under Natural Supervision. ECCV. 2020.
[19] Wang, Shuhui, Hu, Ling, Li, Liang, Zhang, Weigang, Huang, Qingming. Two-stream deep sparse network for accurate and efficient image restoration. COMPUTER VISION AND IMAGE UNDERSTANDING[J]. 2020, 200: http://dx.doi.org/10.1016/j.cviu.2020.103029.
[20] Qi, Zhaobo, Wang, Shuhui, Su, Chi, Su, Li, Zhang, Weigang, Huang, Qingming. Modeling Temporal Concept Receptive Field Dynamically for Untrimmed Video Analysis. ACMMM. 2020.
[21] Zhang Beichen, Li Liang, Yang Shijie, Wang Shuhui, Zheng-Jun Zha, Huang, Qingming. State-relabeling adversarial active learning. CVPR (CCF-A, oral). 2020.
[22] Cui, Shuhao, Wang Shuhui, Zhuo, Junbao, Li Liang, Huang Qingming, Tian Qi. Towards discriminability and diversity: batch nuclear-norm maximization on output under label insufficient situations. IEEE CVPR. 2020.
[23] Wu, Yiling, Wang, Shuhui, Huang, Qingming. Online Fast Adaptive Low-Rank Similarity Learning for Cross-Modal Retrieval. IEEE TRANSACTIONS ON MULTIMEDIA[J]. 2020, 22(5): 1310-1322, https://www.webofscience.com/wos/woscc/full-record/WOS:000530097200016.
[24] Cui, Shuhao, Wang, Shuhui, Zhuo, Junbao, Li, Liang, Huang, Qingming, Tian, Qi, IEEE. Towards Discriminability and Diversity: Batch Nuclear-norm Maximization under Label Insufficient Situations. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR). 2020, 3940-3949.
[25] Song, Guoli, Wang Shuhui, Huang, Qingming, Tian Qi. Learning Feature Representation and Partial Correlation for Multimodal Multi-Labeled Data. IEEE Transactions on Multimedia (TMM)[J]. 2020.
[26] Meng Dechao, Li Liang, Liu Xuejing, Li Yadong, Yang Shijie, Zha Zhengjun, Gao Xingyu, Wang Shuhui, Huang Qingming. Parsing-based View-aware Embedding Network for Vehicle Re-Identification. 2020, http://arxiv.org/abs/2004.05021.
[27] Li Liang. A structured latent variable recurrent network with stochastic attention for generating Weibo comments. IJCAI. 2020.
[28] Zhuo Junbao, Su Chi, Wang Shuhui, Huang Qingming. Minimum Entropy Transfer Adversarial Hashing Method (in Chinese). Journal of Computer Research and Development[J]. 2020, 57(4): 888-896, http://lib.cqvip.com/Qikan/Article/Detail?id=7101302058.
[29] Cui, Shuhao, Jin, Xuan, Wang, Shuhui, He, Yuan, Huang, Qingming. Heuristic Domain Adaptation. 2020, http://arxiv.org/abs/2011.14540.
[30] Wei Jun, Wang Shuhui, Wu Zhe, Su Chi, Huang Qingming, Tian Qi. Label Decoupling Framework for Salient Object Detection. 2020 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2020, http://arxiv.org/abs/2008.11048.
[31] Su Li. Towards More Explainability: Concept Knowledge Mining Network for Event Recognition. ACMMM (CCF-A). 2020.
[32] Li Xiaodan, Lang Yining, Chen Yuefeng, Mao Xiaofeng, He Yuan, Wang Shuhui, Xue Hui, Lu Quan. Sharp Multiple Instance Learning for DeepFake Video Detection. 2020, http://arxiv.org/abs/2008.04585.
[33] Guo, Dan, Wang, Hui, Wang, Shuhui, Wang, Meng. Textual-Visual Reference-Aware Attention Network for Visual Dialog. IEEE TRANSACTIONS ON IMAGE PROCESSING[J]. 2020, 29: 6655-6666, http://dx.doi.org/10.1109/TIP.2020.2992888.
[34] Zhuo, Junbao, Wang, Shuhui, Cui, Shuhao, Huang, Qingming, IEEE Comp Soc. Unsupervised Open Domain Recognition by Semantic Discrepancy Minimization. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019). 2019, 750-759.
[35] Li, Liang, Zhu, Xinge, Hao, Yiming, Wang, Shuhui, Gao, Xingyu, Huang, Qingming. A Hierarchical CNN-RNN Approach for Visual Emotion Classification. ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS[J]. 2019, 15(3): https://www.webofscience.com/wos/woscc/full-record/WOS:000535718800013.
[36] Yang, Shijie, Li, Liang, Wang, Shuhui, Zhang, Weigang, Huang, Qingming, Tian, Qi. SkeletonNet: A Hybrid Network With a Skeleton-Embedding Process for Multi-View Image Representation Learning. IEEE TRANSACTIONS ON MULTIMEDIA[J]. 2019, 21(11): 2916-2929, http://dx.doi.org/10.1109/TMM.2019.2912735.
[37] Wei Jun, Wang Shuhui, Huang Qingming. F3Net: Fusion, Feedback and Focus for Salient Object Detection. 2019, http://arxiv.org/abs/1911.11445.
[38] Wu, Yiling, Wang, Shuhui, Huang, Qingming. Multi-modal semantic autoencoder for cross-modal retrieval. NEUROCOMPUTING[J]. 2019, 331: 165-175, http://dx.doi.org/10.1016/j.neucom.2018.11.042.
[39] Liu Xuejing, Li Liang, Wang Shuhui, Zha ZhengJun, Meng Dechao, Huang Qingming. Adaptive Reconstruction Network for Weakly Supervised Referring Expression Grounding. 2019, http://arxiv.org/abs/1908.10568.
[40] Wu, Yiling, Wang, Shuhui, Song, Guoli, Huang, Qingming. Online Asymmetric Metric Learning With Multi-Layer Similarity Aggregation for Cross-Modal Retrieval. IEEE TRANSACTIONS ON IMAGE PROCESSING[J]. 2019, 28(9): 4299-4312, http://dx.doi.org/10.1109/TIP.2019.2908774.
[41] Xue, Zhe, Li, Guorong, Wang, Shuhui, Huang, Jun, Zhang, Weigang, Huang, Qingming. Beyond global fusion: A group-aware fusion approach for multi-view image clustering. INFORMATION SCIENCES[J]. 2019, 493: 176-191, http://dx.doi.org/10.1016/j.ins.2019.04.034.
[42] Liu Xuejing, Li Liang, Wang Shuhui, Zha ZhengJun, Su Li, Huang Qingming. Knowledge-guided Pairwise Reconstruction Network for Weakly Supervised Referring Expression Grounding. 2019, http://arxiv.org/abs/1909.02860.
[43] Xin Yongjian, Wang Shuhui, Li Liang, Zhang Weigang, Huang Qingming, Jawahar CV, Li H, Mori G, Schindler K. Reverse Densely Connected Feature Pyramid Network for Object Detection. COMPUTER VISION - ACCV 2018, PT V. 2019, 11365: 530-545.
[44] Liu, Xuejing, Li, Liang, Wang, Shuhui, Zha, ZhengJun, Su, Li, Huang, Qingming, ACM. Knowledge-guided Pairwise Reconstruction Network for Weakly Supervised Referring Expression Grounding. PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19). 2019, 539-547, http://dx.doi.org/10.1145/3343031.3351074.
[45] Yang, Shijie, Li, Liang, Wang, Shuhui, Meng, Dechao, Huang, Qingming, Tian, Qi, ACM. Structured Stochastic Recurrent Network for Linguistic Video Prediction. PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19). 2019, 21-29, http://dx.doi.org/10.1145/3343031.3350859.
[46] Wu, Yiling, Wang, Shuhui, Song, Guoli, Huang, Qingming, ACM. Learning Fragment Self-Attention Embeddings for Image-Text Matching. PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19). 2019, 2088-2096, http://dx.doi.org/10.1145/3343031.3350940.
[47] Wang, Shuhui, Li, Liang, Yang, Chenxue, Huang, Qingming. Regularized topic-aware latent influence propagation in dynamic relational networks. GEOINFORMATICA[J]. 2019, 23(3): 329-352.
[48] Wang Shuhui, Chen Yangyu, Zhuo Junbao, Huang Qingming, Tian Qi, ACM. Joint Global and Co-Attentive Representation Learning for Image-Sentence Retrieval. PROCEEDINGS OF THE 2018 ACM MULTIMEDIA CONFERENCE (MM'18). 2018, 1398-1406, http://dx.doi.org/10.1145/3240508.3240535.
[49] Hu Ling, Wang Shuhui, Li Liang, Huang Qingming, Baozong Y, Qiuqi R, Yao Z, Gaoyun AN. How Functions Evolve in Deep Convolutional Neural Network. PROCEEDINGS OF 2018 14TH IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP). 2018, 1133-1138.
[50] He, Jianfeng, Ma, Bingpeng, Wang, Shuhui, Liu, Yugui, Huang, Qingming. Multi-label double-layer learning for cross-modal retrieval. NEUROCOMPUTING[J]. 2018, 275: 1893-1902, http://dx.doi.org/10.1016/j.neucom.2017.10.032.
[51] Chen, Yangyu, Wang, Shuhui, Zhang, Weigang, Huang, Qingming, Ferrari, V, Hebert, M, Sminchisescu, C, Weiss, Y. Less Is More: Picking Informative Frames for Video Captioning. COMPUTER VISION - ECCV 2018, PT XIII. 2018, 11217: 367-384.
[52] Li Liang, Wang Shuhui, Jiang Shuqiang, Huang Qingming, ACM. Attentive Recurrent Neural Network for Weak-supervised Multi-label Image Classification. PROCEEDINGS OF THE 2018 ACM MULTIMEDIA CONFERENCE (MM'18). 2018, 1092-1100, http://dx.doi.org/10.1145/3240508.3240649.
[53] Wu Yiling, Wang Shuhui, Huang Qingming, ACM. Learning Semantic Structure-preserved Embeddings for Cross-modal Retrieval. PROCEEDINGS OF THE 2018 ACM MULTIMEDIA CONFERENCE (MM'18). 2018, 825-833, http://dx.doi.org/10.1145/3240508.3240521.
[54] Xue, Zhe, Li, Guorong, Wang, Shuhui, Zhang, Weigang, Huang, Qingming. Bilevel Multiview Latent Space Learning. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY[J]. 2018, 28(2): 327-341, https://www.webofscience.com/wos/woscc/full-record/WOS:000425036400005.
[55] Chen, Yangyu, Zhang, Weigang, Wang, Shuhui, Li, Liang, Huang, Qingming, IEEE. Saliency-Based Spatiotemporal Attention for Video Captioning. 2018 IEEE FOURTH INTERNATIONAL CONFERENCE ON MULTIMEDIA BIG DATA (BIGMM). 2018, http://apps.webofknowledge.com/CitedFullRecord.do?product=UA&colName=WOS&SID=5CCFccWmJJRAuMzNPjj&search_mode=CitedFullRecord&isickref=WOS:000630423400053.
[56] Xu, Zijun, Su, Li, Wang, Shuhui, Huang, Qingming, Zhang, Yuan, IEEE. S2L: SINGLE-STREAMLINE FOR COMPLEX VIDEO EVENT DETECTION. 2018 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO WORKSHOPS (ICMEW 2018). 2018.
[57] Liu, Siyuan, Qu, Qiang, Wang, Shuhui. Heterogeneous anomaly detection in social diffusion with discriminative feature discovery. INFORMATION SCIENCES[J]. 2018, 439: 1-18, http://ir.siat.ac.cn:8080/handle/172644/13947.
[58] Mao, Xiaofeng, Wang, Shuhui, Zheng, Liying, Huang, Qingming. Semantic invariant cross-domain image generation with generative adversarial networks. NEUROCOMPUTING[J]. 2018, 293: 55-63, http://dx.doi.org/10.1016/j.neucom.2018.02.092.
[59] Jianfeng He, Qingming Huang, Weigang Zhang, Qiang Qu, Shuhui Wang. Efficient Cross-modal Retrieval Using Social Tag Information Towards Mobile Applications. 2017, http://ir.siat.ac.cn:8080/handle/172644/11930.
[60] Zhuo, Junbao, Wang, Shuhui, Zhang, Weigang, Huang, Qingming, ACM. Deep Unsupervised Convolutional Domain Adaptation. PROCEEDINGS OF THE 2017 ACM MULTIMEDIA CONFERENCE (MM'17). 2017, 261-269, http://dx.doi.org/10.1145/3123266.3123292.
[61] Song, Guoli, Wang, Shuhui, Huang, Qingming, Tian, Qi. Multimodal Similarity Gaussian Process Latent Variable Model. IEEE TRANSACTIONS ON IMAGE PROCESSING[J]. 2017, 26(9): 4168-4181, https://www.webofscience.com/wos/woscc/full-record/WOS:000404288000006.
[62] Yang Shijie, Li Liang, Wang Shuhui, Zhang Weigang, Huang Qingming, IEEE. Multi-view Subspace Learning with Diversity Enforced Skeleton Embedding. 2017 IEEE THIRD INTERNATIONAL CONFERENCE ON MULTIMEDIA BIG DATA (BIGMM 2017). 2017, 121-128, http://dx.doi.org/10.1109/BigMM.2017.33.
[63] Min, Weiqing, Jiang, Shuqiang, Wang, Shuhui, Xu, Ruihan, Cao, Yushan, Herranz, Luis, He, Zhiqiang. A survey on context-aware mobile visual recognition. MULTIMEDIA SYSTEMS[J]. 2017, 23(6): 647-665, https://www.webofscience.com/wos/woscc/full-record/WOS:000415313700002.
[64] Song, Guoli, Wang, Shuhui, Huang, Qingming, Tian, Qi, IEEE. Multimodal Gaussian Process Latent Variable Models with Harmonization. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV). 2017, 5039-5047.
[65] Wu Yiling, Wang Shuhui, Zhang Weigang, Huang Qingming, IEEE. ONLINE LOW-RANK SIMILARITY FUNCTION LEARNING WITH ADAPTIVE RELATIVE MARGIN FOR CROSS-MODAL RETRIEVAL. 2017 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME). 2017, 823-828.
[66] Yang, Shijie, Li, Liang, Wang, Shuhui, Zhang, Weigang, Huang, Qingming, IEEE. A Graph Regularized Deep Neural Network for Unsupervised Image Representation Learning. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017). 2017, 7053-7061.
[67] Wu, Yiling, Wang, Shuhui, Huang, Qingming, IEEE. Online Asymmetric Similarity Learning for Cross-Modal Retrieval. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017). 2017, 3984-3993.
[68] Huang Qingming. Bi-Level Multi-View Latent Space Learning. IEEE TCSVT (CCF-B). 2017.
[69] Liu, Siyuan, Wang, Shuhui. Trajectory Community Discovery and Recommendation by Multi-Source Diffusion Modeling. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING[J]. 2017, 29(4): 898-911, https://www.webofscience.com/wos/woscc/full-record/WOS:000397581000014.
[70] Huang, Jun, Li, Guorong, Wang, Shuhui, Xue, Zhe, Huang, Qingming. Multi-label classification by exploiting local positive and negative pairwise label correlation. NEUROCOMPUTING[J]. 2017, 257: 164-174, http://dx.doi.org/10.1016/j.neucom.2016.12.073.
[71] Zhang, Jiaming, Wang, Shuhui, Huang, Qingming. Location-Based Parallel Tag Completion for Geo-Tagged Social Image Retrieval. ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY[J]. 2017, 8(3): https://www.webofscience.com/wos/woscc/full-record/WOS:000400160800005.
[72] Q Huang, J Zhang, S Wang, Q Qu. JEREMIE: Joint Semantic Feature Learning via Multi-relational Matrix Completion. 2017, http://ir.siat.ac.cn:8080/handle/172644/11929.
[73] Min Weiqing, Jiang Shuqiang, Wang Shuhui, Sang Jitao, Mei Shuhuan, ACM. A Delicious Recipe Analysis Framework for Exploring Multi-Modal Recipes with Various Attributes. PROCEEDINGS OF THE 2017 ACM MULTIMEDIA CONFERENCE (MM'17). 2017, 402-410, http://dx.doi.org/10.1145/3123266.3123272.
[74] Hua, Yan, Wang, Shuhui, Liu, Siyuan, Cai, Anni, Huang, Qingming. Cross-Modal Correlation Learning by Adaptive Hierarchical Semantic Aggregation. IEEE TRANSACTIONS ON MULTIMEDIA[J]. 2016, 18(6): 1201-1216, https://www.webofscience.com/wos/woscc/full-record/WOS:000376107100021.
[75] Hua, Yan, Wang, Shuhui, Liu, Siyuan, Cai, Anni, Huang, Qingming. Cross-Modal Correlation Learning by Adaptive Hierarchical Semantic Aggregation (vol 18, pg 1201, 2016). IEEE TRANSACTIONS ON MULTIMEDIA. 2016, 18(10): 2127-2127, http://www.corc.org.cn/handle/1471x/2375025.
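[76] Jiang Shuqiang, Min Weiqing, Wang Shuhui. A Survey and Outlook on Image Recognition Techniques for Intelligent Interaction (in Chinese). Journal of Computer Research and Development[J]. 2016, 53(1): 113-122, http://lib.cqvip.com/Qikan/Article/Detail?id=667688334.
[77] Wang Zhenjun, Wang Shuhui, Zhang Weigang, Huang Qingming. A Latent Influence Propagation Model Based on Social Content (in Chinese). Chinese Journal of Computers[J]. 2016, 39(8): 1528-1540, http://lib.cqvip.com/Qikan/Article/Detail?id=669627939.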
[78] Chu, Lingyang, Zhang, Yanyan, Li, Guorong, Wang, Shuhui, Zhang, Weigang, Huang, Qingming. Effective Multimodality Fusion Framework for Cross-Media Topic Detection. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY[J]. 2016, 26(3): 556-569, https://www.webofscience.com/wos/woscc/full-record/WOS:000372547400011.
[79] He Jianfeng, Ma Bingpeng, Wang Shuhui, Liu Yugui, Huang Qingming, ACM. Cross-modal Retrieval by Real Label Partial Least Squares. MM'16: PROCEEDINGS OF THE 2016 ACM MULTIMEDIA CONFERENCE. 2016, 227-231, http://dx.doi.org/10.1145/2964284.2967216.
[80] Xue, Zhe, Li, Guorong, Wang, Shuhui, Zhang, Chunjie, Zhang, Weigang, Huang, Qingming, IEEE. GOMES: A GROUP-AWARE MULTI-VIEW FUSION APPROACH TOWARDS REAL-WORLD IMAGE CLUSTERING. 2015 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO (ICME). 2015.
[81] Wang Shuhui. Location-Based Parallel Tag Completion for Geo-tagged Social Photo Retrieval. International Conference on Multimedia Retrieval (ICMR). 2015.
[82] Wang Shuhui. Cluster-Sensitive Structured Correlation Analysis for Web Cross Modality Retrieval. Neurocomputing. 2015.
[83] Shen, Li, Sun, Gang, Huang, Qingming, Wang, Shuhui, Lin, Zhouchen, Wu, Enhua. Multi-Level Discriminative Dictionary Learning With Application to Large Scale Image Classification. IEEE TRANSACTIONS ON IMAGE PROCESSING[J]. 2015, 24(10): 3109-3123, http://www.corc.org.cn/handle/1471x/2376455.
[84] Liu, Siyuan, Wang, Shuhui, Zhu, Feida. Structured Learning from Heterogeneous Behavior for Social Identity Linkage. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING[J]. 2015, 27(7): 2005-2019, https://www.webofscience.com/wos/woscc/full-record/WOS:000355937800019.
[85] Chu, Lingyang, Wang, Shuhui, Liu, Siyuan, Huang, Qingming, Pei, Jian. ALID: Scalable Dominant Cluster Detection. PROCEEDINGS OF THE VLDB ENDOWMENT[J]. 2015, 8(8): 826-837.
[86] Wang Shuhui, Wu Yiling, Huang Qingming, IEEE. IMPROVING CROSS-MODAL CORRELATION LEARNING WITH HYPERLINKS. 2015 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO (ICME). 2015.
[87] Liu, Siyuan, Qu, Qiang, Wang, Shuhui. Rationality Analytics from Trajectories. ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA[J]. 2015, 10(1): http://dx.doi.org/10.1145/2735634.
[88] Liu, Siyuan, Wang, Shuhui, Liu, Ce, Krishnan, Ramayya. Understanding taxi drivers' routing choices from spatial and social traces. FRONTIERS OF COMPUTER SCIENCE[J]. 2015, 9(2): 200-209, https://www.webofscience.com/wos/woscc/full-record/WOS:000351519500003.
[89] Zhang Jiaming, Wang Shuhui, Huang Qingming, ACM. Location-Based Parallel Tag Completion for Geo-tagged Social Image Retrieval. ICMR'15: PROCEEDINGS OF THE 2015 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL. 2015, 355-362, http://dx.doi.org/10.1145/2671188.2749353.
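[90] Wang Shuhui, Huang Qingming. Research Progress on Heterogeneous Media Analysis Techniques (in Chinese). Journal of Integration Technology. 2015, 7-21, http://lib.cqvip.com/Qikan/Article/Detail?id=664231323.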
[91] Wang, Shuhui, Zhuang, Fuzhen, Jiang, Shuqiang, Huang, Qingming, Tian, Qi. Cluster-sensitive Structured Correlation Analysis for Web cross-modal retrieval. NEUROCOMPUTING[J]. 2015, 168: 747-760, http://dx.doi.org/10.1016/j.neucom.2015.05.049.
[92] Li Guorong. GROUP SENSITIVE CLASSIFIER CHAINS FOR MULTI-LABEL CLASSIFICATION. IEEE International Conference on Multimedia and Expo. 2015.
[93] Song, Guoli, Wang, Shuhui, Huang, Qingming, Tian, Qi, IEEE. Similarity Gaussian Process Latent Variable Model for Multi-Modal Data Analysis. 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV). 2015, 4050-4058.
[94] Song, Xinghang, Jiang, Shuqiang, Wang, Shuhui, Li, Liang, Huang, Qingming. Polysemious visual representation based on feature aggregation for large scale image applications. MULTIMEDIA TOOLS AND APPLICATIONS[J]. 2015, 74(2): 595-611, https://www.webofscience.com/wos/woscc/full-record/WOS:000348445300016.
[95] Liu Siyuan, Wang Shuhui, Zhu Feida, Zhang Jinbo, Krishnan Ramayya, ACM SIGMOD. HYDRA: Large-scale Social Identity Linkage via Heterogeneous Behavior Modeling. SIGMOD'14: PROCEEDINGS OF THE 2014 ACM SIGMOD INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA. 2014, 51-62, http://dx.doi.org/10.1145/2588555.2588559.
[96] Huang Jun, Li Guorong, Wang Shuhui, Huang Qingming, Zhou ZH, Wang W, Kumar R, Toivonen H, Pei J, Huang JZ, Wu X. Categorizing Social Multimedia by Neighborhood Decision using Local Pairwise Label Correlation. 2014 IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOP (ICDMW). 2014, 913-920, http://dx.doi.org/10.1109/ICDMW.2014.87.
[97] Wang Shuhui, Wang Zhenjun, Jiang Shuqiang, Huang Qingming, IEEE. CROSS MEDIA TOPIC ANALYTICS BASED ON SYNERGETIC CONTENT AND USER BEHAVIOR MODELING. 2014 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME). 2014.
[98] Hua Yan Tina, Wang Shuhui, Liu Siyuan, Huang Qingming, Cai Anni, Kumar R, Toivonen H, Pei J, Huang JZ, Wu X. TINA: Cross-modal Correlation Learning by Adaptive Hierarchical Semantic Aggregation. 2014 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM). 2014, 190-199.
[99] Chu, Lingyang, Wang, Shuhui, Zhang, Yanyan, Jiang, Shuqiang, Huang, Qingming, IEEE. GRAPH-DENSITY-BASED VISUAL WORD VOCABULARY FOR IMAGE RETRIEVAL. 2014 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME). 2014.
[100] Wang Shuhui. WIKI-CMR: A Web Cross Modality database for Studying and Evaluation of Cross Modality Retrieval Methods. IEEE International Conference on Multimedia and Expo (ICME). 2013.
[101] Xin Jin, Fuzhen Zhuang, Shuhui Wang, Qing He, Zhongzhi Shi. Shared Structure Learning for Multiple Tasks with Multiple Views. ECML/PKDD13, September 23-27, 2013, Prague, Czech Republic. 2013.
[102] Zhang Chunjie, Zhang Yifan. Undo the codebook bias by linear transformation for visual applications. ACM International Conference on Multimedia. 2013, 533-536, http://ir.ia.ac.cn/handle/173211/4670.
[103] Zhang, Chunjie, Wang, Shuhui, Huang, Qingming, Liu, Jing, Liang, Chao, Tian, Qi. Image classification using spatial pyramid robust sparse coding. PATTERN RECOGNITION LETTERS[J]. 2013, 34(9): 1046-1052, http://dx.doi.org/10.1016/j.patrec.2013.02.013.
[104] Zhang, Yanyan, Li, Guorong, Chu, Lingyang, Wang, Shuhui, Zhang, Weigang, Huang, Qingming, IEEE. CROSS-MEDIA TOPIC DETECTION: A MULTI-MODALITY FUSION FRAMEWORK. 2013 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME 2013). 2013.
[105] Sun, Gang, Wang, Shuhui, Liu, Xuehui, Huang, Qingming, Chen, Yanyun, Wu, Enhua. Accurate and efficient cross-domain visual matching leveraging multiple feature representations. VISUAL COMPUTER[J]. 2013, 29(6-8): 565-575, https://www.webofscience.com/wos/woscc/full-record/WOS:000319478400011.
[106] Chu, Lingyang, Jiang, Shuqiang, Wang, Shuhui, Zhang, Yanyan, Huang, Qingming. Robust Spatial Consistency Graph Model for Partial Duplicate Image Retrieval. IEEE TRANSACTIONS ON MULTIMEDIA[J]. 2013, 15(8): 1982-1996, https://www.webofscience.com/wos/woscc/full-record/WOS:000327393900021.
[107] Zhang, Chunjie, Wang, Shuhui, Huang, Qingming, Liang, Chao, Liu, Ting, Tian, Qi. Laplacian affine sparse coding with tilt and orientation consistency for image classification. JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION[J]. 2013, 24(7): 786-793, http://dx.doi.org/10.1016/j.jvcir.2013.05.004.
[108] Wang Shuhui. Cross Concept Local Fisher Discriminant Analysis for Image Classification. Multimedia Modelling (MMM). 2013.
[109] Wang Shuhui. TODMIS: Mining Communities from Trajectories. ACM International Conference on Information and Knowledge Management (CIKM). 2013.
[110] Shen, Li, Wang, Shuhui, Sun, Gang, Jiang, Shuqiang, Huang, Qingming, IEEE. Multi-Level Discriminative Dictionary Learning towards Hierarchical Visual Categorization. 2013 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR). 2013, 383-390.
[111] Zhang Chunjie, Liu Jing. Beyond bag of words: Image representation in sub-semantic space. ACM International Conference on Multimedia. 2013, 497-500, http://ir.ia.ac.cn/handle/173211/4669.
[112] Sun Gang, Wang Shuhui, Liu Xuehui, Huang Qingming, Chen Yanyun, Wu Enhua. Accurate and efficient cross-domain visual matching leveraging multiple feature representations. 2013, 565-575, http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcApp=PARTNER_APP&SrcAuth=LinksAMR&KeyUT=WOS:000319478400011&DestLinkType=FullRecord&DestApp=ALL_WOS&UsrCustomerID=3a85505900f77cc629623c3f2907beab.
[113] Shuhui Wang, Qingming Huang, Shuqiang Jiang, Qi Tian, Lei Qin. Nearest-neighbor method using multiple neighborhood similarities for social media data mining. NEUROCOMPUTING[J]. 2012, 95: 105-116, http://dx.doi.org/10.1016/j.neucom.2011.06.039.
[114] Wang, Shuhui, Huang, Qingming, Jiang, Shuqiang, Tian, Qi. (SMKL)-M-3: Scalable Semi-Supervised Multiple Kernel Learning for Real-World Image Applications. IEEE TRANSACTIONS ON MULTIMEDIA[J]. 2012, 14(4): 1259-1274, http://dx.doi.org/10.1109/TMM.2012.2193120.
[115] Wang Shuhui, Jiang Shuqiang, Huang Qingming, Tian Qi, IEEE. Multi-feature Metric Learning with Knowledge Transfer among Semantics and Social Tagging. 2012 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR). 2012, 2240-2247.
[116] Wang Shuhui. Research on Multi-Feature-Based Large-Scale Multimedia Analysis and Retrieval Techniques (in Chinese). 2012.
[117] Wang, Shuhui, Huang, Qingming, Jiang, Shuqiang, Tian, Qi, Qin, Lei. Nearest-neighbor method using multiple neighborhood similarities for social media data mining. NEUROCOMPUTING[J]. 2012, 95: 105-116, http://dx.doi.org/10.1016/j.neucom.2011.06.039.
[118] Wang, Shuhui, Jiang, Shuqiang, Huang, Qingming, Gao, Wen, IEEE. SHOT CLASSIFICATION FOR ACTION MOVIES BASED ON MOTION CHARACTERISTICS. 2008 15TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOLS 1-5. 2008, 2508-2511.

Academic Activities

1) Area Chair of ACMMM 2019-2022.

2) Senior TPC member of IJCAI 2021 and AAAI 2021.

3) Program Co-chair of the MATES Workshop co-located with VLDB 2017.

4) Program Co-chair of the MASS Workshop co-located with APWEB-WAIM 2017.

5) Publication Chair, PCM 2017.

6) Session Chair, ICME 2015.

7) Publication Chair, ICIMCS 2015.

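Research Projects
(1) Cross-media analysis techniques based on multi-source information fusion and online community behavior modeling, principal investigator, national level, 2014-01 to 2016-12
(2) Correlation and mining of heterogeneous media data, participant, national level, 2014-01 to 2016-12
(3) Cross-media computing theory and methods for public security, participant, national level, 2012-01 to 2016-08
(4) Collaborative structured representation and processing of group image and video data, participant, national level, 2014-01 to 2018-12
(5) Object modeling and cross-domain analysis techniques for heterogeneous big data, principal investigator, national level, 2017-01 to 2020-12
(6) Video coding theory and methods based on visual characteristics, participant, national level, 2015-01 to 2019-12
(7) Intelligent analysis and reasoning for cross-media content management: a cross-media analysis and reasoning engine, principal investigator, national level, 2019-12 to 2023-12
(8) Cross-media understanding and knowledge reasoning, principal investigator, national level, 2021-01 to 2023-12
(9) Image editing and generation methods for data-scarce scenarios, principal investigator, academy level, 2021-04 to 2022-09
(10) Trustworthy reasoning and human-machine adversarial question answering for cross-media knowledge engineering, principal investigator, municipal level, 2021-06 to 2023-05
(11) Collaborative project on UI recommendation techniques for design asset libraries, principal investigator, academy level, 2022-01 to 2022-12
(12) Knowledge-assisted few-shot visual content understanding, principal investigator, academy level, 2021-09 to 2022-08
(13) Collaborative research strategy on psychological assessment and counseling assistance based on multimodal human-computer interaction, principal investigator, academy level, 2021-09 to 2022-08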

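Students Supervised

Former students

Xin Yongjian (辛永健)  Master's student  085211 - Computer Technology

Yu Shenghao (于晟昊)  Master's student  085211 - Computer Technology

Wei Jun (魏军)  Master's student  081203 - Computer Application Technology

Cui Shuhao (崔书豪)  Master's student  081203 - Computer Application Technology

Yan Xu (闫旭)  Master's student  081203 - Computer Application Technology

Xue Zhuangzhuang (薛壮壮)  Master's student  085211 - Computer Technology

Han Huaqiao (韩华侨)  Master's student  085211 - Computer Technology

Deng Wenda (邓文达)  Master's student  085211 - Computer Technology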

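Current students

Wei Hao (魏浩)  Master's student  085400 - Electronic Information

Sun Junshu (孙隽姝)  Master's student  081203 - Computer Application Technology

Cai Shuo (蔡硕)  Master's student  085400 - Electronic Information

Zhu Yan (朱妍)  Master's student  085400 - Electronic Information

Huang Kenan (黄克楠)  Master's student  085400 - Electronic Information

Li Menglian (李梦莲)  Master's student  085400 - Electronic Information

He Xiaoming (何晓铭)  Master's student  085400 - Electronic Information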

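Graduated Ph.D. students

Qi Zhaobo (戚兆波) (co-supervised, 2016-2022, CAS President's Excellence Award)

Zhuo Junbao (卓君宝) (co-supervised, 2014-2020, ICT Director's Excellence Award)

Wu Yiling (吴益灵) (co-supervised, 2013-2019, National Scholarship, CAS President's Excellence Award)

Song Guoli (宋国利) (co-supervised, 2012-2018)

Shen Li (申丽) (co-supervised, 2012-2014, CAS Outstanding Doctoral Dissertation)

Chu Lingyang (褚令洋) (co-supervised, 2012-2015, National Scholarship, ICT Director's Award)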


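Graduated master's students

Hu Ling (胡玲) (co-supervised, graduated 2019)

Chen Yangyu (陈扬羽) (co-supervised, graduated 2018)

Zhang Jiaming (张家明) (co-supervised, graduated 2018)

Zhang Chuan (张川) (co-supervised, graduated 2016)

Wang Zhenjun (王祯骏) (co-supervised, graduated 2015)

Xiong Wei (熊威) (co-supervised, graduated 2014)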


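Current graduate students

Han Xinzhe (韩歆哲) (Ph.D. student, co-supervised, enrolled 2017)

Fang Sheng (方晟) (direct-entry Ph.D. student, co-supervised, enrolled 2018)

Bi Chao (毕超) (Ph.D. student, co-supervised, enrolled 2018)

Gan Jingru (甘婧儒) (direct-entry Ph.D. student, co-supervised, enrolled 2019)

Ding Guanqi (丁冠祺) (Ph.D. student, co-supervised, enrolled 2019)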