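Basic Information
Dongbin Zhao (赵冬斌)  Male  Doctoral Supervisor  Institute of Automation, Chinese Academy of Sciences
Email: dongbin.zhao@ia.ac.cn
Address: Room 1005, Zhinenghua Building (智能化大厦), 95 Zhongguancun East Road, Haidian District
Postal code: 100190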

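Research Areas

Deep reinforcement learning, multi-agent reinforcement learning, foundations of artificial intelligence

Intelligent driving, embodied intelligence, game AI, foundation model training, AI4S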

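Latest Results


Team results are updated monthly; results from the most recent month are highlighted with a yellow background. For detailed introductions to these results, follow the WeChat official account: 深度强化学习@CASIA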

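Personnel Awards

  • 2026, Dongbin Zhao, IEEE Computational Intelligence Society Distinguished Lecture Program
  • 2025, Dongbin Zhao, University of Chinese Academy of Sciences "Shiguang Dedication" Commemorative Award (拾光奉献纪念奖)
  • 2025, Dongbin Zhao, BAAI Scholar, Beijing Academy of Artificial Intelligence
  • 2025, Dongbin Zhao, 2025 Outstanding Mentor of the Chinese Academy of Sciences
  • 2025, Runyu Lu, National Scholarship for Doctoral Students
  • 2025, Xin Liu, National Scholarship for Doctoral Students
  • 2025, Jiajun Chai, Chinese Academy of Sciences President's Special Award (the highest level; the only recipient in the institute that year; spoke as graduate representative at the UCAS and Institute of Automation commencement ceremonies)
  • 2025, Jiajun Chai, Outstanding Graduate of the Institute of Automation, Chinese Academy of Sciences; Outstanding Graduate of Beijing
  • 2025, Runyu Lu, IEEE CIS Student Research Grant (6 to 9 recipients worldwide each year)
  • 2025, Institute of Automation, Chinese Academy of Sciences. Merit Students: Xing Fang, Wenzhang Chen, Songjun Tu; Outstanding Student Leaders: Xueyi Liu, Shuai Tian
  • 2025, School of Artificial Intelligence, Chinese Academy of Sciences. Merit Students: Runyu Lu, Zijie Zhao; Outstanding Student Leader: Kaixuan Xu
  • 2025, University of Chinese Academy of Sciences (awarded during enrollment). Merit Students: Zhennan Jiang, Zebin Xing, Qing Chen; Outstanding Student Leader: Yuxing Qin
  • 2025, Tinghong Chen, Beijing Natural Science Foundation Undergraduate Research Initiation Program
  • 2025, Dongbin Zhao, Chinese Academy of Sciences Li Pei Outstanding Teacher Award
  • 2025, Dongbin Zhao, named in the 2024 Stanford World's Top 2% Scientists list, in both the career-long and the single-year scientific impact rankings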


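Competition Awards

  • 2025, ICCV NAVSIM v2 End-to-End Driving Challenge, 3rd place, Qichao Zhang, Yupeng Zheng, Zebin Xing, Pengxuan Yang.
  • 2025, CVPR NAVSIM v2 End-to-End Driving Challenge, 4th place (1st among academic teams), Qichao Zhang, Yupeng Zheng, Zebin Xing, Pengxuan Yang. https://opendrivelab.com/challenge2025/
  • 2025, ICRA ManiSkill Vitac Challenge, champion, with Yuxing Qin participating.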


Journals: Accepted/Published

  1. Yinfeng Gao, Deqing Liu, Yupeng Zheng, Qichao Zhang*, Da-Wei Ding*, Dongbin Zhao, “SoAD: Safety-oriented Value Estimation for Enhanced Closed-Loop End-to-End Autonomous Driving,” IEEE Transactions on Systems, Man, and Cybernetics: Systems, DOI: 10.1109/TSMC.2026.3688954.
  2. YinFeng Gao#, Qichao Zhang#, Deqing Liu, Zhongpu Xia, Guang Li, Kun Ma, Guang Chen, Hangjun Ye, Long Chen, Dawei Ding*, Dongbin Zhao, “PerlAD: Towards Enhanced Closed-loop End-to-end Autonomous Driving with Pseudo-simulation-based Reinforcement Learning,” IEEE Robotics and Automation Letters (RA-L), vol. 11, no. 5, pp. 5821-5828, May 2026. DOI:10.1109/LRA.2026.3675928.
  3. Yupeng Zheng, Zebin Xing, Qichao Zhang*, Bu Jin, Pengfei Li, Yuhang Zheng, Zhongpu Xia, Kun Zhan, Xianpeng Lang, Yaran Chen, Dongbin Zhao, “PlanAgent: a multi-modal large language agent for closed-loop vehicle motion planning,” IEEE Transactions on Cognitive and Developmental Systems (TCDS), accepted on January 31, 2026.
  4. Deqing Liu, YinFeng Gao, Qichao Zhang*, Yupeng Zheng, Xueyi Liu, Zhongpu Xia, Dongbin Zhao, “TakeAD: Preference-based Post-optimization for End-to-end Autonomous Driving with Expert Takeover Data,” IEEE Robotics and Automation Letters (RA-L), vol. 11, no. 2, pp. 1738–1745, 2026. (SCI Q1, IF 6). DOI: 10.1109/LRA.2025.3643264.
  5. Zebin Xing, Yupeng Zheng, Qichao Zhang*, Zhixing Ding, Pengxuan Yang, Songen Gu, Zhongpu Xia, Dongbin Zhao, “Mimir: Hierarchical Goal-Driven Diffusion With Uncertainty Propagation for End-to-End Autonomous Driving,” IEEE Robotics and Automation Letters (RA-L), vol. 11, no. 2, pp. 2178-2185, Feb. 2026. (SCI Q1, IF 6). DOI: 10.1109/LRA.2025.3641129. https://github.com/ZebinX/Mimir-Uncertainty-Driving
  6. Yuhui Chen, Haoran Li*, Zhennan Jiang, Haowei Wen, Dongbin Zhao, “TeViR: text-to-video reward with diffusion models for efficient reinforcement learning,” IEEE Transactions on Systems, Man, and Cybernetics: Systems (TSMCA), vol. 56, no. 2, pp. 893–905, Feb. 2026. (SCI Q1, IF 9.1). DOI: 10.1109/TSMC.2025.3638818.
  7. Yuqian Fu, Yuanheng Zhu, Haoran Li, Zijie Zhao, Jiajun Chai, Dongbin Zhao*, “CPIG: leveraging consistency policy with intention guidance for multi-agent exploration,” IEEE Transactions on Cognitive and Developmental Systems (TCDS), vol.18, no.1, pp. 154-166, Feb. 2026. (SCI Q1, IF 5.0). DOI: 10.1109/TCDS.2025.3578001. https://github.com/fyqqyf/CPIG 
  8. Ding Li, Qichao Zhang*, Dongfang Yang, Zhi Wang, Ren Fan, Dongbin Zhao, “IP3: Integrated path-guided prediction and planning for safe autonomous driving,” IEEE Transactions on Vehicular Technology (TVT), Vol. 74, No. 11, pp. 16729-16742, Nov. 2025. (SCI Q1, IF 6.1). DOI: 10.1109/TVT.2025.3576204.  https://github.com/ld-av/IP3/.
  9. Yaran Chen, Chenguang Yang, Chaomin Luo, and Dongbin Zhao, “Guest Editorial: Special Issue on Embodied AI in Indoor Robotics: Bridging Perception, Interaction, and Autonomy,” IEEE Transactions on Cognitive and Developmental Systems (TCDS), Vol. 17, No. 5, pp. 1047-1149, Oct. 2025. (SCI Q1, IF 5.0). DOI: 10.1109/TCDS.2025.3595370.
  10. Yaran Chen, Wenbo Cui, Yuanwen Chen, Mining Tan, Xinyao Zhang, Jinrui Liu, Haoran Li, Dongbin Zhao*, and He Wang, “RoboGPT: an LLM-based long-term decision-making embodied agent for instruction following tasks,” IEEE Transactions on Cognitive and Developmental Systems (TCDS). Vol. 17, No. 5, pp. 1163-1174, Oct. 2025. (SCI Q1, IF 5.0). DOI: 10.1109/TCDS.2025.3543364.  https://github.com/Cwb0106/RoboGPT.
  11. Yaran Chen, Xueyu Chen, Yu Han, Haoran Li, Dongbin Zhao, JingZhong Ji, Xu Wang*, Yong Zhou*, “Multimodal learning-based prediction for nonalcohol fatty liver disease,” Machine Intelligence Research (MIR), Vol. 22, No. 5, pp. 871-887, Oct. 2025. (SCI Q1, IF 6.4). DOI: 10.1007/s11633-024-1506-4.
  12. Boyu Li, Haobin Jiang, Ziluo Ding, Xinrun Xu, Haoran Li, Dongbin Zhao, Zongqing Lu*, “SELU: self-learning embodied multimodal large language models in unknown environments,” Transactions on Machine Learning Research (TMLR), 2025
  13. Runyu Lu, Yuanheng Zhu*, Dongbin Zhao, Yu Liu, You He, “Last-Iterate Convergence to Approximate Nash Equilibria in Multiplayer Imperfect Information Games,” IEEE Transactions on Neural Networks and Learning Systems (TNNLS), Vol. 36, No. 8, pp. 13859-13873, Aug. 2025. (SCI Q1, IF 11.1). DOI: 10.1109/TNNLS.2024.3516693. https://github.com/lryforeal/IESL-implementation
  14. Zijie Zhao, Yuanheng Zhu*, Dongbin Zhao*, “Meta learning task representation in multi-agent reinforcement learning: from global inference to local inference,” IEEE Transactions on Neural Networks and Learning Systems (TNNLS), Vol. 36, No. 8, pp. 14908-14921, Aug. 2025. (SCI Q1, IF 11.1). DOI: 10.1109/TNNLS.2025.3540758.  https://github.com/zhaozijie2022/mg2l.
  15. Jiajun Chai, Zijie Zhao, Yuanheng Zhu, Dongbin Zhao*, “A Survey of Cooperative Multi-Agent Reinforcement Learning for Multi-Task Scenarios,” Artificial Intelligence Science and Engineering (AISE), Vol. 1, No. 2, pp. 89-121, 2025. DOI: 10.23919/AISE.2025.000008. Popular Article.
  16. Xin Liu, Yaran Chen*, Dongbin Zhao*, “Learning future representation with synthetic observations for sample-efficient reinforcement learning,” SCIENCE CHINA Information Sciences (SCIS), Vol. 68, No. 5, 150202: 1-18, May 2025. (SCI Q1, IF 7.3). https://doi.org/10.1007/s11432-024-4380-4.
  17. Haoran Li, Guangzheng Hu, Shasha Liu, Mingjun Ma, Yaran Chen, Dongbin Zhao*, “NeuronsGym: a hybrid framework and benchmark for robot tasks with Sim2Real policy learning,” IEEE Transactions on Emerging Topics in Computational Intelligence (TETCI), Vol. 9, No. 3, pp. 2491-2505, May 2025. (SCI Q1, IF 5.3). DOI: 10.1109/TETCI.2024.3488732. https://github.com/DRL-CASIA/NeuronsGym
  18. Xin Liu, Yaran Chen, Haoran Li, Dongbin Zhao*, “Balancing state exploration and skill diversity in unsupervised skill discovery,” IEEE Transactions on Cybernetics (TCyb), Vol. 55, No. 5, pp. 2234-2247, May 2025. (SCI Q1, IF 9.4). DOI: 10.1109/TCYB.2025.3548821.  https://github.com/liuxin0824/ComSD
  19. Xin Liu, Yaran Chen*, Haoran Li, Boyu Li, Dongbin Zhao*, “Cross-domain random pretraining with prototypes for reinforcement learning,” IEEE Transactions on Systems, Man, and Cybernetics: Systems (TSMCA), Vol. 55, No. 5, pp. 3601-3613, May 2025. (SCI Q1, IF 9.1). DOI: 10.1109/TSMC.2025.3541926. Popular Article. https://github.com/liuxin0824/CRPTpro
  20. Yuqian Fu, Yuanheng Zhu*, Jiajun Chai, Dongbin Zhao, “LDR: Learning discrete representation to improve noise robustness in multiagent tasks,” IEEE Transactions on Systems, Man, and Cybernetics: Systems (TSMCA), Vol. 55, No. 1, pp. 513-525, January 2025. (SCI Q1, IF 9.1). DOI: 10.1109/TSMC.2024.3487535.
  21. Nannan Li, Yaran Chen*, Dongbin Zhao, “Adaptive search for broad attention based vision transformers,” Neurocomputing, 611, 2025, 128696. (SCI Q1, IF 5.4). DOI: https://doi.org/10.1016/j.neucom.2024.128696. https://github.com/Bpumpkin/ASB
  22. Yuanheng Zhu, Shangjing Huang, Binbin Zuo, Dongbin Zhao*, Changyin Sun*, “Multi-task multi-agent reinforcement learning with task-entity transformers and value decomposition training,” IEEE Transactions on Automation Science and Engineering (TASE), Vol. 22, pp. 9164-9177, 2025.(SCI Q1, IF 6.4). DOI: 10.1109/TASE.2024.3501580. https://github.com/YuanhengZhu/TETQmix
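  23. Haoran Li, Yuhui Chen, Wenbo Cui, Weiheng Liu, Kai Liu, Mingcai Zhou, Zhengtao Zhang, Dongbin Zhao*, “A survey of vision-language-action models for embodied manipulation” (in Chinese), Acta Automatica Sinica, 2026, 52(1): 18-51. DOI: 10.16383/j.aas.c250394.
  24. Guangzheng Hu, Yuanheng Zhu, Dongbin Zhao*, “Entropy-guided Minimax Factorization for Reinforcement Learning in Two-team Zero-sum Games” (in Chinese), Acta Automatica Sinica, 2025, 51(4): 875-888. DOI: 10.16383/j.aas.c240258.
  25. Minsong Liu, Yuanheng Zhu*, Dongbin Zhao, “State-Action-Reward Prediction Representation Learning Based on Transformer” (in Chinese), Acta Automatica Sinica, 2025, 51(1): 117-132. DOI: 10.16383/j.aas.c240230.
  26. Rongqin Liang, Yuanheng Zhu*, Dongbin Zhao, “Deep reinforcement learning for two-player fighting games based on an opponent pool” (in Chinese), Control Theory & Applications, 2025, 42(2): 226-234. DOI: 10.7641/CTA.2024.30688. https://github.com/zhongqian97/TwoPlayerGameSelfPlayFramework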


Conferences: Accepted/Published

  1. Boyu Li, Chaoyi Xu, Haoqi Yuan, Xinrun Xu, Dongbin Zhao, Haoran Li*, Zongqing Lu*, “X-DiffVLA: X-Embodied Diffusion Action Heads for Vision-Language-Action Models,” RSS 2026.
  2. Jiangran Lyu, Kai Liu, Xuheng Zhang, Wenxuan Zhu, Tingrui Shen, Haoran Liao, Yusen Feng, Jiayi Chen, Jiazhao Zhang, Yifei Dong, Cui Wenbo, Senmao Qi, Shuo Wang, Yixin Zheng, Mi Yan, Xuesong Shi, Haoran Li, Dongbin Zhao, Ming-Yu Liu, Zhizheng Zhang, Li Yi, Yizhou Wang, He Wang*, “LDA-1B: Scaling Latent Dynamics Action Model via Universal Embodied Data Ingestion,” RSS 2026.
  3. Yuan Liu, Haoran Li*, Shuai Tian, Yuxing Qin, Yupeng Zheng, Yongzhen Huang, Dongbin Zhao, “Towards Long-lived Robots: Continual Learning of VLA Models via Reinforcement Fine-tuning,” RSS 2026.
  4. Minghui Jia, Qichao Zhang, Ali Luo, Linjing Li, Shuo Ye, Hailing Lu, Wen Hou, Dongbin Zhao, “Spec-o3: A Tool-Augmented Vision-Language Agent for Rare Celestial Object Candidate Vetting via Automated Spectral Inspection,” ACL 2026 main.
  5. Jingbo Sun#, Wenyue Chong#, Songjun Tu, Qichao Zhang*, Yaocheng Zhang, Jiajun Chai, Xiaohan Wang, Wei Lin, Guojun Yin, Dongbin Zhao, “AutoSearch: Self-Decision-Driven Reinforcement Learning for Adaptive Search Depth in Agentic RAG,” ACL 2026 findings.
  6. Yaocheng Zhang, Haohuan Huang, Zijun Song, Zijie Zhao, Qichao Zhang, Yuanheng Zhu*, Dongbin Zhao, “CriticSearch: Fine-Grained Credit Assignment for Search Agents via a Retrospective Critic,” ACL 2026 findings.
  7. Bo Lv#, Jingbo Sun#, Jianwei Lv, Chen Tang, Shaojie Zhang, Nayu Liu, Guoxin Yu, Zihao Li, Qichao Zhang, Dongbin Zhao, Ping Luo, Yue Yu, “Beyond Query Memorization: Large Language Model Routing with Query Decomposition and Historical Matching,” ACL 2026 main.
  8. Jingbo Sun, Songjun Tu, Xing Fang, Qichao Zhang*, Haoran Li, Ke Chen, Dongbin Zhao, Saliency-Guided Representation with Consistency Policy Learning for Visual Unsupervised Reinforcement Learning, CVPR 2026. https://github.com/bofusun/SRCP
  9. Junli Wang, Yinan Zheng, Xueyi Liu, Zebin Xing, Pengfei Li, Kun Ma, Hangjun Ye, Guang Chen, Guang Li, Long Chen, Zhongpu Xia, Qichao Zhang*, MeanFuser: Fast One-Step Multi-Modal Trajectory Generation and Adaptive Reconstruction via MeanFlow for End-to-End Driving, CVPR 2026. https://github.com/wjl2244/MeanFuser
  10. Boyu Li, Siyuan He, Hang Xu, Haoqi Yuan, Yu Zang, Liwei Hu, Junpeng Yue, Zhenxiong Jiang, Pengbo Hu, Börje F. Karlsson, Dongbin Zhao, Yehui Tang, Zongqing Lu*, “Towards Proprioception-Aware Embodied Planning for Dual-Arm Humanoid Robots,” ICRA 2026.
  11. Yingting Zhou, Wenbo Cui, Weiheng Liu, Guixing Chen, Haoran Li*, Dongbin Zhao, “DiffuDepGrasp: Diffusion-based Depth Noise Modeling Empowers Sim-to-Real Robotic Grasping,” ICRA 2026. (CCF-B).
  12. Qichao Zhang, Xing Fang, Dongbin Zhao*, “ConsistencyPlanner: Real-time Planning with Fast-Sampling Consistency Models,” ICRA 2026. (CCF-B).
  13. Yupeng Zheng, Pengxuan Yang, Zhongpu Xia, Qichao Zhang*, Dongbin Zhao, “Preliminary Investigation into Data Scaling Laws for Imitation Learning-Based End-to-End Autonomous Driving,” ICRA 2026. (CCF-B).
  14. Wenbo Cui, Chengyang Zhao, Yuhui Chen, Haoran Li, Zhizheng Zhang, Dongbin Zhao, He Wang*, “CLAR: Learning 3D Representations for Robotic Manipulation by Fusing Masked Reconstruction with Multi-Level Contrastive Alignment,” ICRA 2026. (CCF-B).
  15. Runyu Lu, Ruochuan Shi, Yuanheng Zhu*, Dongbin Zhao, “R2PS: Worst-Case Robust Real-Time Pursuit Strategies under Partial Observability,” ICLR 2026. (top conference in the field).
  16. Zijie Zhao, Honglei Guo, Shengqian Chen, Kaixuan Xu, Bo Jiang, Yuanheng Zhu*, Dongbin Zhao, “Empowering Multi-Robot Cooperation via Sequential World Models,” ICLR 2026. (top conference in the field).
  17. Yuqian Fu, Tinghong Chen, Jiajun Chai, Xihuai Wang, Songjun Tu, Guojun Yin, Wei Lin, Qichao Zhang, Yuanheng Zhu, Dongbin Zhao, “SRFT: A single-stage method with supervised and reinforcement fine-tuning for reasoning,” ICLR 2026. (top conference in the field). https://arxiv.org/abs/2506.19767.
  18. Yixuan Li, Yuhui Chen, Mingcai Zhou, Haoran Li*, “QDepth-VLA: Quantized Depth Prediction as Auxiliary Supervision for Vision-Language-Action Models,” The 25th International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), Paphos, Cyprus, May 25-29, 2026. (CCF B) https://github.com/ucasmichael/QDepth-VLA.
  19. Jinrui Liu, Bingyan Nie, Boyu Li, Yaran Chen, Yuze Wang, Shunsen He, Haoran Li*, “RoboGPT-R1: Enhancing Robot Task Planning with Reinforcement Learning,” The 25th International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), Paphos, Cyprus, May 25-29, 2026. (CCF B)
  20. Pengxuan Yang, Ben Lu, Zhongpu Xia, Chao Han, Yinfeng Gao, Teng Zhang, Kun Zhan, XianPeng Lang, Yupeng Zheng, Qichao Zhang*, WorldRFT: Latent World Model Planning with Reinforcement Fine-Tuning for Autonomous Driving, The 40th Annual AAAI Conference on Artificial Intelligence (AAAI), Singapore, Jan 20-27, 2026. (CCF A). https://github.com/pengxuanyang/WorldRFT.
  21. Xin Liu, Haoran Li*, Dongbin Zhao, “Videos are Sample-Efficient Supervisions: Behavior Cloning from Videos via Latent Representations,” NeurIPS 2025. (CCF A) https://github.com/liuxin0824/BCV-LR 
  22. Songjun Tu, Jiahao Lin, Qichao Zhang*, Xiangyu Tian, Linjing Li, Xiangyuan Lan, Dongbin Zhao, “Learning When to Think: Shaping Adaptive Reasoning in R1-Style Models via Multi-Stage RL,” NeurIPS 2025. (CCF A), https://github.com/ScienceOne-AI/AutoThink
  23. Runyu Lu, Peng Zhang, Ruochuan Shi, Yuanheng Zhu*, Dongbin Zhao, Yang Liu, Dong Wang, Cesare Alippi, “Equilibrium Policy Generalization: A Reinforcement Learning Framework for Cross-Graph Zero-Shot Generalization in Pursuit-Evasion Games,” NeurIPS 2025. (CCF A) https://github.com/Cahemgco/EPG_code
  24. Zijie Zhao, Zhongyue Zhao, Kaixuan Xu, Yuqian Fu, Jiajun Chai, Yuanheng Zhu*, Dongbin Zhao, “Learning and Planning Multi-Agent Tasks via a MoE-based World Model,” NeurIPS 2025. (CCF A) https://github.com/zhaozijie2022/m3w-marl
  25. Ruochuan Shi, Runyu Lu, Yuanheng Zhu*, Dongbin Zhao*, “ARAC: Adaptive Regularized Multi-Agent Soft Actor-Critic in Graph-Structured Adversarial Games,” DAI 2025 oral.
  26. Yuqian Fu, Yuanheng Zhu*, Jiajun Chai, Guojun Yin, Wei Lin, Qichao Zhang, Dongbin Zhao, “RLAE: Reinforcement Learning-Assisted Ensemble for LLMs,” EMNLP 2025 main (CCF B). https://github.com/fyqqyf/RLAE
  27. Weiheng Liu, Yuxuan Wan, Jilong Wang, Yuxuan Kuang, Haoran Li, Dongbin Zhao, Zhizheng Zhang, He Wang, “FetchBot: Object Fetching in Cluttered Shelves via Zero-Shot Sim2Real,” CoRL 2025 oral. (top conference in the field).
  28. Xueyi Liu, Zuodong Zhong, Qichao Zhang*, Yuxin Guo, Yupeng Zheng, Junli Wang, Dongbin Zhao, “ReasonPlan: Unified Scene Prediction and Decision Reasoning for Closed-loop Autonomous Driving,” CoRL 2025. (top conference in the field). https://github.com/Liuxueyi/ReasonPlan
  29. Shuai Tian, Haoran Li*, Dongbin Zhao, Fast and Accurate Visuomotor Imitation Learning via 2D Consistency Flow Matching Policy, ICONIP 2025. (CCF C)
  30. Songjun Tu, Qichao Zhang*, Linjing Li, Yuqian Fu, Nan Xu, Wei He, Xiangyuan Lan, Dongmei Jiang, Dongbin Zhao, “Enhancing LLM reasoning with iterative DPO: a comprehensive empirical investigation,” COLM 2025. https://github.com/TU2021/DPO-VP
  31. Shugao Liu, Qichao Zhang, Haoran Li*, Dongbin Zhao, “FusionNav: Enhancing Zero-Shot Object-Goal Navigation via 3D Semantic Fusion and Farsight Value Reasoning,” IEEE SMCC 2025. (CCF C)
  32. Yupeng Zheng, Pengxuan Yang, Zebin Xing, Yuhang Zheng, Pengfei Li, Yinfeng Gao, Qichao Zhang*, Teng Zhang, Zhongpu Xia, Peng Jia, XianPeng Lang, Dongbin Zhao, “World4Drive: Hierarchical Latent World Models for Perception-Free End-to-End Autonomous Driving,” ICCV 2025. (CCF A)
  33. Mengying Lin#, Shugao Liu#, Dingxi Zhang, Yaran Chen, Zhaoran Wang, Haoran Li*, Dongbin Zhao, Advancing Object-Goal Navigation through LLM-enhanced Object Affinities Transfer, 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). (CCF C)
  34. Runyu Lu, Yuanheng Zhu*, Dongbin Zhao, “Constrained exploitability descent: finding mixed-strategy Nash equilibrium by offline reinforcement learning,” ICML 2025. (CCF A)
  35. Kaixuan Xu, Jiajun Chai, Sicheng Li, Yuqian Fu, Yuanheng Zhu*, Dongbin Zhao, “DipLLM: Fine-Tuning LLM for Strategic Decision-making in Diplomacy,” ICML 2025. (CCF A) https://github.com/KaiXIIM/dipllm
  36. Yuhui Chen, Shuai Tian, Shugao Liu, Yingting Zhou, Haoran Li*, Dongbin Zhao, “Fine-tuning VLA models via Human-in-the-Loop Consistency Policy,” RSS 2025. https://github.com/cccedric/conrft.
  37. Yuanwen Chen, Haoran Li,  Yaran Chen, Dongbin Zhao, “LeAffordNav: Enhancing open-vocabulary mobile manipulation with LLM-guided exploration and affordance-aware navigation,” ICME 2025. (CCF B)  https://github.com/Cyuanwen/LeAffordNav.
  38. Pengxuan Yang, Yupeng Zheng, Kefei Zhu, Zebin Xing, Pengfei Li, Qichao Zhang*, Zhongpu Xia, Dongbin Zhao, “UncAD: Towards Safe End-to-end Autonomous Driving via Online Map Uncertainty,” ICRA 2025. (CCF B) 
  39. Jiajun Chai, Yuqian Fu, Sicheng Li, Yuanheng Zhu*, Dongbin Zhao, “Empowering LLM Agents with zero-shot optimal decision-making through Q-learning,” ICLR 2025. (top conference in the field) https://github.com/laq2024/MLAQ.
  40. Jingbo Sun, Songjun Tu, Qichao Zhang*, Haoran Li, Xin Liu, Yaran Chen, Ke Chen, Dongbin Zhao, “Unsupervised zero-shot reinforcement learning via dual-value forward-backward representation,” ICLR 2025.  https://github.com/bofusun/DVFB.
  41. Runyu Lu, Yuanheng Zhu*, Dongbin Zhao, “Divergence-regularized discounted aggregation: equilibrium finding in multiplayer partially observable stochastic games,” ICLR 2025. (top conference in the field)
  42. Yuqian Fu, Yuanheng Zhu*, Jian Zhao, Jiajun Chai, Dongbin Zhao, “INS: Interaction-aware synthesis to enhance offline multi-agent reinforcement learning,” ICLR 2025. (top conference in the field). https://github.com/fyqqyf/INS.
  43. Songjun Tu, Jingbo Sun, Qichao Zhang*, Yaocheng Zhang, Jia Liu, Ke Chen, Dongbin Zhao, “In-dataset trajectory return regularization for offline preference-based reinforcement learning,” AAAI 2025. (CCF A). https://github.com/TU2021/DTR.
  44. Jingbo Sun, Songjun Tu, Qichao Zhang*, Ke Chen, Dongbin Zhao*, “Salience-invariant consistent policy learning for generalization in visual reinforcement learning,” AAMAS 2025 oral. (CCF-B) 
  45. Xing Fang, Qichao Zhang*, Haoran Li, Dongbin Zhao, “Consistency policy with categorical critic for autonomous driving,” AAMAS 2025 oral. (CCF-B)
  46. Yaocheng Zhang, Yuanheng Zhu*, Yuqian Fu, Songjun Tu, Dongbin Zhao, “Offline goal-conditioned reinforcement learning with elastic-subgoal diffused policy learning,” AAMAS 2025 oral. (CCF-B)  https://github.com/zhyaoch/ESD.
  47. Songjun Tu, Qichao Zhang*, Dongbin Zhao, “Online preference-based reinforcement learning with self-augmented feedback from large language model,” AAMAS 2025 oral. (CCF-B) https://github.com/TU2021/RL-SaLLM-F.


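Book Chapters

  1. Yaran Chen, Nannan Li, Zixiang Ding, Dongbin Zhao, Neural Architecture Search (in Chinese), Tsinghua University Press, published September 2025.

Talks by Team Members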
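  1. March 13, 2026, Reinforcement learning post-training for embodied VLA models, Zhixingxing (智猩猩) Open Course, online, Haoran Li.
  2. March 21, 2026, A conversation on personal growth and industry insight, Shenlan College (深蓝学院) "Walking with Outstanding People" series, Episode 7, online, Zhongpu Xia.
  3. March 27, 2026, Exploring end-to-end autonomous driving based on world models, Closed-door Seminar on High-Quality Development of Intelligent Connected Vehicles, Dongbin Zhao.
  4. April 11, 2026, End-to-end autonomous driving based on imitation learning and world models, China Society of Automotive Engineers Frontier Seminar on Embodied-Intelligence Electric Vehicles, Beijing, Qichao Zhang.
  5. April 25, 2026, Embodied manipulation models and reinforcement learning methods, 2026 Symposium on Cognitive Systems and Information Processing and Technical Committee Annual Meeting, Fuzhou, Haoran Li.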


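  1. January 6, 2025, Exploration and practice of artificial intelligence methods for high-level autonomous driving, Zhongguancun Intelligent Connected Vehicle Innovation and Development Forum, Beijing, Dongbin Zhao.
  2. January 11, 2025, From reinforcement learning to large models and embodied intelligence, Inaugural Meeting of the IEEE Computational Intelligence Society Zhengzhou Chapter & Computational Intelligence Frontier Forum, Zhengzhou, Dongbin Zhao.
  3. January 14, 2025, Progress and challenges of supervised-learning-based end-to-end autonomous driving, 4th Global Autonomous Driving Summit, Beijing, Qichao Zhang.
  4. Feb. 7, 2025, Reinforcement Learning Assisted Large Models and Embodied Intelligence, 13th International Conference on Intelligent Control and Information Processing (ICICIP 2025), Abu Dhabi, UAE & Muscat, Oman, Dongbin Zhao.
  5. March 22, 2025, Theory and applications of multi-agent reinforcement learning for multi-task settings, 4th Frontier Forum on Intelligent Optimization and Decision-Making, Beijing, Dongbin Zhao.
  6. March 29, 2025, Reinforcement-learning-based post-training of vision-language-action models, China Embodied AI Conference, Beijing, Haoran Li.
  7. April 26, 2025, High-level autonomous driving based on artificial intelligence methods, 2025 Chongqing Jiaotong University Frontier Forum on Neural Networks and Intelligent Control, Chongqing, Dongbin Zhao.
  8. April 27, 2025, Reinforcement learning based on generative models, 2025 Southwest University Frontier Forum on Perception and Control of Intelligent Systems, Chongqing, Dongbin Zhao.
  9. May 9, 2025, Advances in reinforcement learning algorithms and their applications to autonomous driving, Pre-conference workshop on Reinforcement Learning and Adaptive Dynamic Programming, IEEE 14th Data Driven Control and Learning Systems Conference (DDCLS’25), Wuxi, China, Qichao Zhang.
  10. May 14, 2025, Deep reinforcement learning empowering intelligent-industry applications, Launch Forum of the Proof-of-Concept Laboratory for the Aggregated Intelligence Industry (聚合智能产业概念验证实验室), Beijing, Dongbin Zhao.
  11. May 24, 2025, Embodied intelligence for robots based on reinforcement learning, 3rd Shandong Computational Intelligence Conference, Xuzhou, Dongbin Zhao.
  12. June 14, 2025, Multi-agent decision intelligence in open environments, 4th Intelligent Decision-Making Forum: Intelligent Learning and Game Theory Forum, Nanjing, Yuanheng Zhu.
  13. June 14, 2025, Reinforcement-learning-based post-training of vision-language-action models, 4th Intelligent Decision-Making Forum: Frontier Technologies of Embodied Intelligence Forum, Nanjing, Haoran Li.
  14. July 8, 2025, Deep reinforcement learning and embodied intelligence, Symposium on Artificial Intelligence and Learning Systems, Fenghua, Ningbo, Dongbin Zhao.
  15. August 2, 2025, Reinforcement learning in embodied intelligence, 3rd Summit Forum on AI Large Model Technology, Hefei, Dongbin Zhao.
  16. August 31, 2025, Large models for autonomous driving, Jiacheng Startup Banquet (嘉程创业流水席), Session 271, Beijing, Zhongpu Xia.
  17. September 20, 2025, Status and outlook of VLA models for embodied manipulation, 6th China Intelligent Robotics Academic Annual Conference, Nantong, Dongbin Zhao.
  18. September 20, 2025, Exploring the deep-thinking capabilities of large language models, RL China 2025 Scientific Agents Forum, Beijing, Qichao Zhang.
  19. September 21, 2025, Applications of reinforcement learning in multimodal embodied large models, RL China 2025 Multimodal Agents Forum, Beijing, Haoran Li.
  20. September 26, 2025, Multi-agent decision intelligence in open environments, 13th China (Mianyang) Science and Technology City International High-Tech Expo, New Quality Productive Forces Artificial Intelligence Conference and Matchmaking Exchange Meeting, China Association of Productivity Promotion Centers, Mianyang, Yuanheng Zhu.
  21. September 28, 2025, Exploration and practice of end-to-end autonomous driving, 2025 Vehicle-Machine-Human (车机人) Innovation and Development Forum, Beijing, Dongbin Zhao.
  22. September 28, 2025, Reasoning techniques for large language models, “Building on Bedrock (磐石筑基): From AI Theory to System Practice”, Qichao Zhang.
  23. October 23, 2025, Practice and exploration of end-to-end autonomous driving, 32nd Annual Congress of the China Society of Automotive Engineers, Chongqing, Dongbin Zhao.
  24. October 24, 2025, End-to-end autonomous driving: from imitation learning to reinforcement learning, 2025 China Conference on Vehicle Control and Intelligence, Pre-conference Workshop on Trustworthy Autonomous Vehicles, Qingdao, Qichao Zhang.
  25. October 29, 2025, Reinforcement learning empowering embodied intelligence, UCAS Graduate Lecture Series on Frontiers of Science, Fall Semester 2025-2026, Beijing, Dongbin Zhao.
  26. November 6, 2025, Practice and exploration of embodied intelligence, Beijing Software and Information Services Association Artificial Intelligence Applications Lecture Hall, Beijing, Dongbin Zhao.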

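Admissions Information

Admissions program 1: Control Theory and Control Engineering -- Swarm Intelligence and Adversarial Games

Admissions program 2: Pattern Recognition -- Theory and Methods of Artificial Intelligence


Admissions Research Directions
Deep reinforcement learning, intelligent driving, game AI, robotics
Artificial intelligence, deep reinforcement learning, multi-agent games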

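Education

1996-09--2000-04   Harbin Institute of Technology   PhD
1994-09--1996-07   Harbin Institute of Technology   Master's
1990-09--1994-07   Harbin Institute of Technology   Bachelor's
Study and Work Abroad
August 2007 to August 2008, University of Arizona, Visiting Scholar, state-sponsored study-abroad program of the China Scholarship Council.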

Work Experience

   
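Employment History
2014-01~2014-02, Agency for Science, Technology and Research (A*STAR), Singapore, Visiting Scholar
2012-11~present, Institute of Automation, Chinese Academy of Sciences, Professor, Doctoral Supervisor
2002-04~2012-10, Institute of Automation, Chinese Academy of Sciences, Associate Professor; Master's Supervisor, later Doctoral Supervisor
2000-05~2002-01, Tsinghua University, Postdoctoral Researcher
Professional Service
2025-02-28-present, Chinese Association for Artificial Intelligence, Council Member
2022-09-01-present, Chinese Association for Artificial Intelligence Technical Committee on Intelligent Adaptive Cooperative Optimization Control, Secretary-General
2022-01-01-present, Chinese Association of Automation Technical Committee on "Data-Driven, Learning and Optimization", Vice Chair
2022-01-01-present, IEEE Computational Intelligence Magazine, Associate Editor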
2021-09-01-2022-08-31,IEEE Conference on Games, General Chair
2021-01-01-2021-07-22,The International Joint Conference on Neural Networks (IJCNN), Shenzhen, China, July 18-22, 2021, Competition Chair
2020-07-19-2020-07-24,IEEE World Congress on Computational Intelligence (WCCI 2020), Glasgow, UK, July 19 -24, 2020, Awards Chair
2020-03-01-present, IEEE Transactions on Artificial Intelligence, Associate Editor
2020-01-01-2020-12-31,IEEE CIS Distinguished Lectures Program, Chair
2019-12-11-2019-12-16,The 10th International Conference on Intelligent Control and Information Processing (ICICIP 2019), Marrakesh, Morocco, Program Chair
2019-12-06-2019-12-09,IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL 2019), Xiamen, China, Program Chair
2019-07-13-2019-07-18,IEEE International Joint Conference on Neural Networks (IJCNN 2019), Budapest, Hungary, Program Co-Chair
2019-05-04-2019-05-06,IEEE International Conference on Computational Intelligence for Financial Engineering and Economics (CIFEr 2019), Shenzhen, China, General Co-Chair
2019-01-01-2019-12-31,IEEE CIS Technical Activities Strategy Planning Sub-Committee, Chair
2018-12-01-2018-12-04,The 25th International Conference on Neural Information Processing (ICONIP 2018), Siem Reap, Cambodia, Dec 1-4, 2018, Tutorial Chair
2018-11-18-2018-11-21,IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL 2018), Bangalore, India, Nov. 18 -21, 2018, Program Chair
2018-09-01-2019-08-31, IEEE Computational Intelligence Magazine special issue on “Deep Reinforcement Learning and Games”, Lead Guest Chair
2018-06-29-2018-07-06,2018 Eighth International Conference on Information Science and Technology (ICIST 2018), Cordoba, Granada, and Seville, Spain during June 30-July 6, 2018, Program Chair
2018-05-31-2018-12-31,IEEE Transactions on Neural Networks and Learning Systems special issue on “Deep Reinforcement Learning and Adaptive Dynamic Programming”, Lead Guest Editor
2018-03-01-present, IEEE Transactions on Cybernetics, Associate Editor
2017-11-26-2017-11-30,IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL 2017), Honolulu, Hawaii, USA, Program Chair
2017-11-13-2017-11-17,The 24th International Conference on Neural Information Processing (ICONIP 2017), Guangzhou, China, Program Chair
2017-07-05-2017-07-27,2017 IEEE CIS Summer School on Computational and Artificial Intelligence, Chair
2016-12-31-2017-12-31, IEEE Computational Intelligence Society Beijing Chapter, Chair
2016-12-05-2016-12-08,IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL 2016), Athens, Greece, Program Chair
2016-07-25-2016-07-29, IEEE World Congress on Computational Intelligence (WCCI 2016), Vancouver, Canada, Publicity Co-chair
2016-06-11-2016-06-14,The 13th World Congress on Intelligent Control and Automation (WCICA 2016), Guilin, China, Program Co-Chair
2015-10-15-2015-10-18,12th International Symposium on Neural Networks (ISNN 2015), Jeju, Korea, Program Co-Chair
2015-04-24-2015-04-26,The 5th International Conference on Information Science and Technology (ICIST 2015), Changsha, China, Program Chair
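2015-01-01-present, Artificial Intelligence Review, Associate Editor
2014-12-31-2016-12-31, IEEE Computational Intelligence Society Adaptive Dynamic Programming and Reinforcement Learning Technical Committee, Chair
2014-12-31-2015-12-31, IEEE Computational Intelligence Society Travel Grant Committee, Chair
2014-12-31-2016-12-31, IEEE Computational Intelligence Society Multimedia Committee, Chair
2014-12-31-2016-12-31, IEEE Computational Intelligence Society Beijing Chapter, Vice Chair
2014-12-09-2014-12-12, IEEE Symposium Series on Computational Intelligence (SSCI 2014), Atlanta, USA, Poster Chair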
2014-07-06-2014-07-11,IEEE World Congress on Computational Intelligence (WCCI 2014), Beijing, China, Finance Co-Chair
2014-07-06-2014-07-11,IEEE CIS Summer School on Automated Computational Intelligence, Beijing, China, Chair
2013-12-31-2020-12-31, IEEE Computational Intelligence Magazine, Associate Editor
2013-06-09-2013-06-11,The 4th International Conference on Intelligent Control and Information Processing (ICICIP 2013), Beijing, China, Program Chair
2012-12-31-2014-12-30, IEEE CIS Newsletter, Editor
2012-07-11-2012-07-14,International Symposium on Neural Networks (ISNN 2012), Shenyang, China, Registration Chair
2012-07-11-2012-07-14,Brain Inspired Cognitive Systems (BICS 2012), Shenyang, China, Finance Chair
2011-12-31-2021-12-31,IEEE Transactions on Neural Networks and Learning Systems, Associate Editor
2011-11-01-present, Cognitive Computation, Associate Editor
2010-09-30-2019-12-31, IEEE Senior Member

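Courses Taught

Evolutionary Computation
Reinforcement Learning
Computational Intelligence
Undergraduate Graduation Project (Computer Science and Technology)
Intelligent Control
Fundamentals and Applications of Intelligent Control Theory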

Patents and Awards

   
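Awards
(1) Chinese Academy of Sciences Li Pei Outstanding Teacher Award, academy level, 2025
(2) University of Chinese Academy of Sciences Education and Teaching Achievement Award, Second Prize, institute (university) level, 2025
(3) 2025 Outstanding Mentor of the Chinese Academy of Sciences, academy level, 2025
(4) BAAI Scholar (2025 cohort), Beijing Academy of Artificial Intelligence, municipal level, 2025
(5) Tianjin Natural Science Award, Second Prize, provincial level, 2023
(6) Beijing Natural Science Award for “Efficient deep reinforcement learning algorithms and optimality analysis”, Second Prize, provincial level, 2022
(7) “Reinforcement Learning” named a university-level Outstanding Graduate Course of the University of Chinese Academy of Sciences, institute (university) level, 2022
(8) Fellow of the Chinese Association of Automation, ministerial level, 2022
(9) Fellow of Asia-Pacific Artificial Intelligence Association (AAIA), other, 2022
(10) IEEE Transactions on Emerging Topics in Computational Intelligence, the sole Outstanding Paper Award of 2022, national level, 2022
(11) IEEE Transactions on Automation Science and Engineering, the sole Best Paper Award of 2022, national level, 2022
(12) 2022 IEEE 11th Data Driven Control and Learning Systems Conference (DDCLS), Best Paper Finalist, other, 2022
(13) Outstanding Teacher Award, University of Chinese Academy of Sciences “Intelligent Base” (智能基座) Industry-Education Integration Collaborative Education Base Project, institute (university) level, 2022
(14) Outstanding Teacher Award, University of Chinese Academy of Sciences “Intelligent Base” (智能基座) Industry-Education Integration Collaborative Education Base Project, institute (university) level, 2021
(15) “UCAS Cup” Innovation and Entrepreneurship Competition, Creative Group, advisor, First Prize, institute (university) level, 2020
(16) 6th China International “Internet+” College Students Innovation and Entrepreneurship Competition, Beijing regional semifinal, advisor, Second Prize, provincial level, 2020
(17) RoboMaster Global AI Challenge, Navigation and Motion Planning competition, advisor, First Prize, other, 2020
(18) RoboMaster Global AI Challenge, Perception competition, advisor, First Prize, other, 2020
(19) RoboMaster Global AI Challenge, Decision-Making competition, advisor, First Prize, other, 2020
(20) IEEE Conference on Games fighting game champion, advisor, First Prize, other, 2020
(21) Outstanding Associate Editor of 2019, IEEE Transactions on Neural Networks and Learning Systems, other, 2020
(22) “UCAS Cup” Innovation and Entrepreneurship Competition, Outstanding Advisor, institute (university) level, 2020
(23) IEEE Fellow, national level, 2020
(24) IEEE Transactions on Cognitive and Developmental Systems Outstanding Paper Award (sole recipient), other, 2020
(25) 2019 China AI+ Innovation and Entrepreneurship Competition, First Prize, ministerial level, 2019
(26) Chinese Association for Artificial Intelligence Outstanding Doctoral Dissertation Advisor, ministerial level, 2019
(27) Control Theory & Applications, Outstanding Editorial Board Member, other, 2019
(28) Control Theory & Applications, Outstanding Paper Award, other, 2018
(29) IJCNN 2018 Best Student Paper Final List, other, 2018
(30) 1st place in Front Vehicle Detection, 2017 Intelligent Vehicle Future Challenge of China: offline test of basic cognitive ability in complex traffic environments, First Prize, national level, 2017
(31) 1st place in Front Vehicle Distance Monitoring, 2017 Intelligent Vehicle Future Challenge of China: offline test of basic cognitive ability in complex traffic environments, First Prize, national level, 2017
(32) Theory and methods of data-based self-learning optimal control for nonlinear systems, Third Prize, ministerial level, 2015
(33) Chinese Academy of Sciences “Zhu Li Yuehua Outstanding Teacher” Award, academy level, 2014
(34) China Petroleum and Chemical Industry Automation Application Association Science and Technology Progress Award, First Prize, ministerial level, 2012
(35) Beijing Science and Technology Award, Third Prize, provincial level, 2010
(36) China Petroleum and Chemical Industry Association Science and Technology Progress Award, Third Prize, ministerial level, 2009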

Publications


发表论文
(1) Adaptive search for broad attention based vision transformers, NEUROCOMPUTING, 2025, 第 3 作者
(2) 基于Transformer的状态−动作−奖赏预测表征学习, State-Action-Reward Prediction Representation Learning Based on Transformer, 自动化学报, 2025, 第 3 作者
(3) Multimodal Learning-based Prediction for Nonalcoholic Fatty Liver Disease, MACHINE INTELLIGENCE RESEARCH, 2025, 第 5 作者
(4) 两团队零和博弈下熵引导的极小极大值分解强化学习方法, Entropy-guided Minimax Factorization for Reinforcement Learning in Two-team Zero-sum Games, 自动化学报, 2025, 第 3 作者
(5) RoboGPT: an LLM-based Long-term Decision-making Embodied Agent for Instruction Following Tasks, IEEE Transactions on Cognitive and Developmental Systems, 2025, 第 8 作者  通讯作者
(6) Unsupervised Zero-Shot Reinforcement Learning via Dual-Value Forward-Backward Representation, The Thirteenth International Conference on Learning Representations, 2025, 第 8 作者
(7) Deep-Reinforcement-Learning-Based Driving Policy at Intersections Utilizing Lane Graph Networks, IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2024, 第 4 作者
(8) Discretizing Continuous Action Space With Unimodal Probability Distributions for On-Policy Reinforcement Learning, IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 第 5 作者
(9) BViT: Broad Attention-Based Vision Transformer, IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 第 5 作者
(10) NeuronsGym: A Hybrid Framework and Benchmark for Robot Navigation With Sim2Real Policy Learning, IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2024, 第 6 作者  通讯作者
(11) Dynamic-Horizon Model-Based Value Estimation With Latent Imagination, IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 第 3 作者
(12) FM3Q: Factorized Multi-Agent MiniMax Q-Learning for Two-Team Zero-Sum Markov Game, IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2024, 第 4 作者
(13) Data Generation Feedback Relearning Control for Unmodeled Nonlinear Systems, IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2024, 第 3 作者
(14) High-Quality Synthetic Data is Efficient for Model-Based Offline Reinforcement Learning, International Joint Conference on Neural Networks (IJCNN), 2024, 第 5 作者  通讯作者
(15) Prototypical Context-Aware Dynamics for Generalization in Visual Control With Model-Based Reinforcement Learning, IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2024, 第 5 作者
(16) Stabilizing Diffusion Model for Robotic Control with Dynamic Programming and Transition Feasibility, IEEE Transactions on Artificial Intelligence, 2024, 第 5 作者
(17) MAT: Morphological Adaptive Transformer for Universal Morphology Policy Learning, IEEE Transactions on Cognitive and Developmental Systems, 2024, 第 4 作者
(18) Boosting Continuous Control with Consistency Policy, International Conference on Autonomous Agents and Multiagent Systems, 2024, 第 3 作者
(19) Empirical Policy Optimization for n -Player Markov Games, IEEE TRANSACTIONS ON CYBERNETICS, 2023, 第 5 作者  通讯作者
(20) NVIF: Neighboring Variational Information Flow for Cooperative Large-Scale Multiagent Reinforcement Learning, IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 第 3 作者
(21) Soft Contrastive Learning With Q-Irrelevance Abstraction for Reinforcement Learning, IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2023, 第 5 作者  通讯作者
(22) A Hierarchical Deep Reinforcement Learning Framework for 6-DOF UCAV Air-to-Air Combat, IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2023, 第 5 作者
(23) UNMAS: Multiagent Reinforcement Learning for Unshaped Cooperative Scenarios, IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 第 4 作者  通讯作者
(24) BViT: Broad Attention-Based Vision Transformer, IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 第 5 作者
(25) Enhanced Rolling Horizon Evolution Algorithm With Opponent Model Learning: Results for the Fighting Game AI Competition, IEEE TRANSACTIONS ON GAMES, 2023, 第 3 作者  通讯作者
(26) 关键点图对比图像分类方法, Keypoint-based graph contrastive neural network for image classification, 智能系统学报, 2023, 第 3 作者
(27) NeuronsMAE: A Novel Multi-Agent Reinforcement Learning Environment for Cooperative and Competitive Multi-Robot Tasks, 2023 International Joint Conference on Neural Networks (IJCNN), 2023, 第 5 作者
(28) Highway Lane Change Decision-Making via Attention-Based Deep Reinforcement Learning, IEEE-CAA JOURNAL OF AUTOMATICA SINICA, 2022, 第 3 作者
(29) Multi-task safe reinforcement learning for navigating intersections in dense traffic, Journal of the Franklin Institute, 2022, 第 3 作者
(30) Neurons Perception Dataset for RoboMaster AI Challenge, 2022 IEEE World Congress on Computational Intelligence (WCCI), 2022, 第 6 作者
(31) Missile guidance with assisted deep reinforcement learning for head-on interception of maneuvering target, COMPLEX & INTELLIGENT SYSTEMS, 2022, 第 3 作者
(32) BNAS: Efficient Neural Architecture Search Using Broad Scalable Architecture, IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 第 4 作者  通讯作者
(33) TrajGen: Generating Realistic and Diverse Trajectories With Reactive and Feasible Agent Behaviors for Autonomous Driving, IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2022, 第 8 作者  通讯作者
(34) Soft Contrastive Learning with Q-irrelevance Abstraction for Reinforcement Learning, IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2022, 第 5 作者
(35) Dynamic-Horizon Model-Based Value Estimation With Latent Imagination, IEEE Transactions on Neural Networks and Learning Systems, 2022, 第 3 作者
(36) Empirical Policy Optimization for n-Player Markov Games, IEEE TRANSACTIONS ON CYBERNETICS, 2022, 第 5 作者  通讯作者
(37) Online Minimax Q Network Learning for Two-Player Zero-Sum Markov Games, IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 第 2 作者  通讯作者
(38) Heuristic rank selection with progressively searching tensor ring network, COMPLEX & INTELLIGENT SYSTEMS, 2022, 第 5 作者
(39) BiFNet: Bidirectional Fusion Network for Road Segmentation, IEEE TRANSACTIONS ON CYBERNETICS, 2022, 第 4 作者  通讯作者
(40) BNAS-v2: Memory-Efficient and Performance-Collapse-Prevented Broad Neural Architecture Search, IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2022, 第 4 作者  通讯作者
(41) 实时格斗游戏的智能决策方法, Intelligent decision making approaches for real time fighting game, 控制理论与应用, 2022, 第 4 作者
(42) CNN-G: Convolutional Neural Network Combined With Graph for Image Segmentation With Theoretical Analysis, IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2021, 第 3 作者  通讯作者
(43) UNMAS: Multiagent Reinforcement Learning for Unshaped Cooperative Scenarios, IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021, 第 4 作者  通讯作者
(44) BiFNet: Bidirectional Fusion Network for Road Segmentation, IEEE TRANSACTIONS ON CYBERNETICS, 2021, 第 4 作者  通讯作者
(45) Optimal Feedback Control of Pedestrian Flow in Heterogeneous Corridors, IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2021, 第 2 作者
(46) Missile guidance with assisted deep reinforcement learning for head-on interception of maneuvering target, COMPLEX & INTELLIGENT SYSTEMS, 2021, 第 3 作者
(47) MGRL: Graph neural network based inference in a Markov network with reinforcement learning for visual navigation, NEUROCOMPUTING, 2021, 第 3 作者  通讯作者
(48) BNAS: Efficient Neural Architecture Search Using Broad Scalable Architecture, IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021, 第 4 作者  通讯作者
(49) Event-Triggered Communication Network With Limited-Bandwidth Constraint for Multi-Agent Reinforcement Learning, IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021, 第 3 作者
(50) LMI-Based Synthesis of String-Stable Controller for Cooperative Adaptive Cruise Control, IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2020, 第 3 作者  通讯作者
(51) Synthesis of Cooperative Adaptive Cruise Control With Feedforward Strategies, IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2020, 第 2 作者
(52) A spatial-temporal LSTM model for human trajectory prediction, IEEE/CAA Journal of Automatica Sinica, 2020, 第 1 作者
(53) A Spatial-Temporal Attention Model for Human Trajectory Prediction, IEEE-CAA JOURNAL OF AUTOMATICA SINICA, 2020, 第 4 作者
(54) Artificial intelligence in tongue diagnosis: Using deep convolutional neural network for recognizing unhealthy tongue with tooth-mark, COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL, 2020, 第 16 作者
(55) Hierarchical optimal control for input-affine nonlinear systems through the formulation of Stackelberg game, INFORMATION SCIENCES, 2020, 第 4 作者
(56) Deep Reinforcement Learning-Based Automatic Exploration for Navigation in Unknown Environment, IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2020, 第 3 作者  通讯作者
(57) Cooperative Multi-Agent Deep Reinforcement Learning with Counterfactual Reward, 2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020, 第 4 作者
(58) An Improved Minimax-Q Algorithm Based on Generalized Policy Iteration to Solve a Chaser-Invader Game, 2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020, 第 3 作者
(59) Invariant Adaptive Dynamic Programming for Discrete-Time Optimal Control, IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2020, 第 2 作者
(60) RailNet: An Information Aggregation Network for Rail Track Segmentation, 2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020, 第 3 作者
(61) Advances in Deep Neural Information Processing - Editorial, Neurocomputing, 2020, 第 1 作者
(62) Advances in deep neural information processing, NEUROCOMPUTING, 2020, 第 1 作者  通讯作者
(63) Adaptive optimal control of cooperative adaptive cruise control with uncertain heterogeneous vehicles, IEEE Control System Technology, 2019, 第 1 作者
(64) Control-Limited Adaptive Dynamic Programming for Multi-Battery Energy Storage Systems, IEEE TRANSACTIONS ON SMART GRID, 2019, 第 2 作者  通讯作者
(65) Graph-FCN for Image Semantic Segmentation, ADVANCES IN NEURAL NETWORKS - ISNN 2019, PT I, 2019, 第 3 作者
(66) Comparison of Control Methods Based on Imitation Learning for Autonomous Driving, 2019 TENTH INTERNATIONAL CONFERENCE ON INTELLIGENT CONTROL AND INFORMATION PROCESSING (ICICIP), 2019, 第 5 作者
(67) StarCraft Micromanagement With Reinforcement Learning and Curriculum Transfer Learning, IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2019, 
(68) Optimal Pedestrian Evacuation in Building with Consecutive Differential Dynamic Programming, 2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019, 第 3 作者
(69) Real-time multiple object tracking based on optical flow, 2019 9TH INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND TECHNOLOGY (ICIST2019), 2019, 第 4 作者
(70) Deep sparse representation-based mid-level visual elements discovery in fine-grained classification, SOFT COMPUTING, 2019, 第 2 作者  通讯作者
(71) Deep Kalman Filter with Optical Flow for Multiple Object Tracking, 2019 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS (SMC), 2019, 第 2 作者
(72) Reinforcement Learning and Deep Learning based Lateral Control for Autonomous Driving, IEEE COMPUTATIONAL INTELLIGENCE MAGAZINE, 2019, 第 2 作者
(73) Adaptive cruise control via adaptive dynamic programming with experience replay, SOFT COMPUTING, 2019, 第 2 作者
(74) Multi-Objective Neural Architecture Search for Light-Weight Model, 2019, 第 4 作者
(75) Reinforcement Learning based Lane Change Decision-Making with Imaginary Sampling, 2019 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI 2019), 2019, 第 2 作者
(76) Data-based reinforcement learning for nonzero-sum games with unknown drift dynamics, IEEE Transactions on Cybernetics, 2019, 第 2 作者  通讯作者
(77) Lane Change Decision-making through Deep Reinforcement Learning with Rule-based Constraints, 2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019, 第 3 作者
(78) Adaptive Optimal Control of Heterogeneous CACC System With Uncertain Dynamics, IEEE TRANSACTIONS ON CONTROL SYSTEMS TECHNOLOGY, 2019, 第 2 作者  通讯作者
(79) Model-Free Reinforcement Learning based Lateral Control for Lane Keeping, 2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019, 第 3 作者
(80) A Review of Computational Intelligence for StarCraft AI, 8th IEEE Symposium Series on Computational Intelligence (IEEE SSCI), 2018, 第 5 作者
(81) Overview of Image Segmentation and Its Application on Free Space Detection, PROCEEDINGS OF 2018 IEEE 7TH DATA DRIVEN CONTROL AND LEARNING SYSTEMS CONFERENCE (DDCLS), 2018, 第 3 作者
(82) Reinforcement Learning for Build-Order Production in StarCraft II, 8th International Conference on Information Science and Technology (ICIST), 2018, 第 2 作者
(83) Multi-task learning for dangerous object detection in autonomous driving, INFORMATION SCIENCES, 2018, 第 2 作者  通讯作者
(84) Comprehensive comparison of online ADP algorithms for continuous-time optimal control, Artificial Intelligence Review, 2018, 第 2 作者
(85) A Gradient-Based Reinforcement Learning Algorithm for Multiple Cooperative Agents, IEEE ACCESS, 2018, 第 3 作者
(86) A Temporal-based Deep Learning Method for Multiple Objects Detection in Autonomous Driving, 2018, 第 1 作者
(87) A Review of Computational Intelligence for StarCraft AI, 2018 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI), 2018, 第 5 作者
(88) Special Issue on Deep Reinforcement Learning and Adaptive Dynamic Programming, IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 第 1 作者  通讯作者
(89) An Autonomous Driving Experience Platform with Learning-Based Functions, 2018 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI), 2018, 第 2 作者
(90) Learning battles in ViZDoom via deep reinforcement learning, 2018, 第 4 作者
(91) Model-Free Reinforcement Learning for Fully Cooperative Multi-Agent Graphical Games, 2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2018, 第 2 作者
(92) Policy Iteration for H infinity Optimal Control of Polynomial Nonlinear Systems via Sum of Squares Programming, IEEE TRANSACTIONS ON CYBERNETICS, 2018, 第 2 作者
(93) Event-Based Robust Control for Uncertain Nonlinear Systems Using Adaptive Dynamic Programming, IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 第 2 作者
(94) Visual Navigation with Actor-Critic Deep Reinforcement Learning, 2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2018, 第 2 作者
(95) A temporal-based deep learning method for multiple objects detection in autonomous driving, 2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2018, 第 2 作者
(96) Guest Editorial Special Issue on Deep/Reinforcement Learning and Games, IEEE TRANSACTIONS ON GAMES, 2018, 
(97) A pdf-Free Change Detection Test Based on Density Difference Estimation, IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 第 3 作者
(98) Comprehensive comparison of online ADP algorithms for continuous-time optimal control, Artificial Intelligence Review, 2018, 第 1 作者
(99) DeepSign: Deep Learning based Traffic Sign Recognition, 2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2018, 第 2 作者
(100) Hybrid Deep Learning Based Moving Object Detection via Motion prediction, 2018 CHINESE AUTOMATION CONGRESS (CAC), 2018, 第 3 作者
(101) An Autonomous Driving Experience Platform with Learning-Based Functions, 8th IEEE Symposium Series on Computational Intelligence (IEEE SSCI), 2018, 第 3 作者
(102) Visual navigation with Actor-Critic deep reinforcement learning, 2018, 第 3 作者
(103) A Semi-Supervised Predictive Sparse Decomposition Based on Task-Driven Dictionary Learning, COGNITIVE COMPUTATION, 2017, 第 2 作者  通讯作者
(104) Model-free Optimal Control based Intelligent Cruise Control with Hardware-in-the-loop Demonstration, IEEE COMPUTATIONAL INTELLIGENCE MAGAZINE, 2017, 第 1 作者
(105) Event-triggered optimal control for nonlinear constrained-input systems with partially unknown dynamics via adaptive dynamic programming, IEEE Transactions on Industrial Electronics, 2017, 第 1 作者
(106) Event-Triggered Adaptive Dynamic Programming for Uncertain Nonlinear Systems, COGNITIVE SYSTEMS AND SIGNAL PROCESSING, ICCSIP 2016, 2017, 第 2 作者  通讯作者
(107) Data-driven adaptive dynamic programming for two-player nonzero-sum game, 2017, 第 1 作者
(108) Neural Information Processing, NEURAL INFORMATION PROCESSING, LECTURE NOTES IN COMPUTER SCIENCE, 2017, 第 3 作者
(109) Multi-task Learning with Cartesian Product-Based Multi-objective Combination for Dangerous Object Detection, ADVANCES IN NEURAL NETWORKS, PT I, 2017, 第 2 作者  通讯作者
(110) Data-Driven Adaptive Dynamic Programming for Two-Player Nonzero-Sum Game, 2017 29TH CHINESE CONTROL AND DECISION CONFERENCE (CCDC), 2017, 第 2 作者
(111) An Incremental Change Detection Test Based on Density Difference Estimation, IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2017, 第 2 作者
(112) Event-Triggered H-infinity Control for Continuous-Time Nonlinear System via Concurrent Learning, IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2017, 第 2 作者
(113) FMRQ-A Multiagent Reinforcement Learning Algorithm for Fully Cooperative Tasks, IEEE TRANSACTIONS ON CYBERNETICS, 2017, 第 2 作者
(114) Policy Gradient Methods with Gaussian Process Modelling Acceleration, International Joint Conference on Neural Networks (IJCNN), 2017, 第 2 作者
(115) Event-Triggered Optimal Control for Partially Unknown Constrained-Input Systems via Adaptive Dynamic Programming, IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, 2017, 第 2 作者
(116) Cooperative Reinforcement Learning for Multiple Units Combat in StarCraft, 2017 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (SSCI), 2017, 第 3 作者
(117) Image Clustering based on Deep Sparse Representations, 2016 IEEE Symposium Series on Computational Intelligence: SSCI 2016, Athens, Greece, 6-9 December 2016, pages 2037-2712, v.4, 2017, 第 2 作者
(118) A Kolmogorov-Smirnov Test to Detect Changes in Stationarity in Big Data, IFAC PAPERSONLINE, 2017, 第 1 作者
(119) An Incremental Change Detection Test Based on Density Difference Estimation, IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS: SYSTEMS, 2017, 第 2 作者
(120) Building Energy Consumption Prediction: An Extreme Deep Learning Approach, ENERGIES, 2017, 第 3 作者
(121) Event-triggered integral reinforcement learning for nonlinear continuous-time systems, 2017 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (SSCI), 2017, 第 2 作者
(122) Data-driven adaptive dynamic programming for continuous-time fully cooperative games with partially constrained inputs, NEUROCOMPUTING, 2017, 第 2 作者  通讯作者
(123) Deep Reinforcement Learning With Visual Attention for Vehicle Classification, IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2017, 第 1 作者  通讯作者
(125) Comparison of methods to efficient graph SLAM under general optimization framework, YAC 2017, 2017, 第 3 作者
(126) Editorial: new developments in neural network structures for signal processing, autonomous decision, and adaptive control, IEEE Transactions on Neural Networks and Learning Systems, 2017, 第 1 作者
(127) Comparison of Methods to Efficient Graph SLAM Under General Optimization Framework, 2017 32ND YOUTH ACADEMIC ANNUAL CONFERENCE OF CHINESE ASSOCIATION OF AUTOMATION (YAC), 2017, 第 3 作者
(128) ADP with MCTS algorithm for Gomoku, 2017, 第 4 作者
(129) Cooperative Reinforcement Learning for Multiple Units Combat in StarCraft, 2017, 第 2 作者
(130) 深度强化学习进展: 从 AlphaGo 到 AlphaGo Zero, Recent progress of deep reinforcement learning: from AlphaGo to AlphaGo Zero, 控制理论与应用, 2017, 第 3 作者
(131) Iterative Adaptive Dynamic Programming for Solving Unknown Nonlinear Zero-Sum Game Based on Online Data, IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2017, 第 2 作者
(132) A Kolmogorov-Smirnov test to detect changes in stationarity in big data, 20th World Congress of the International-Federation-of-Automatic-Control (IFAC), 2017, 第 1 作者
(133) Event-Triggered $H_\infty$ Control for Continuous-Time Nonlinear System via Concurrent Learning, IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS: SYSTEMS, 2017, 第 2 作者
(134) Data-Based Adaptive Critic Designs for Nonlinear Robust Optimal Control With Uncertain Dynamics, IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2016, 第 4 作者
(135) Fuzzy-Based Goal Representation Adaptive Dynamic Programming, IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2016, 第 5 作者
(136) Using reinforcement learning techniques to solve continuous-time non-linear optimal tracking problem without system dynamics, IET CONTROL THEORY AND APPLICATIONS, 2016, 第 2 作者  通讯作者
(137) Model-free reinforcement learning for nonlinear zero-sum games with simultaneous explorations, 2016, 第 3 作者
(138) A perturbed Gaussian process regression with chunk sparsification for tracking non-stationary systems, 28th Chinese Control and Decision Conference, 2016, 第 3 作者
(139) Ensemble LSDD-based change detection tests, 2016, 第 1 作者
(140) 深度强化学习综述:兼论计算机围棋的发展, 控制理论与应用, 2016, 第 9 作者
(141) Deep Reinforcement Learning with Experience Replay Based on SARSA, PROCEEDINGS OF 2016 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (SSCI), 2016, 第 1 作者  通讯作者
(142) “机器智能、系统优化与最优决策”专刊前言, 控制理论与应用, 2016, 
(143) Move Prediction in Gomoku Using Deep Learning, 2016 31ST YOUTH ACADEMIC ANNUAL CONFERENCE OF CHINESE ASSOCIATION OF AUTOMATION (YAC), 2016, 第 2 作者
(144) Model-free iterative adaptive dynamic programming solving unknown nonlinear zero-sum game based on online measurement, IEEE Transactions on Neural Networks and Learning Systems, 2016, 第 1 作者
(145) Deep reinforcement learning with Experience Replay based on SARSA, 2016, 第 4 作者
(146) Image clustering based on the deep sparse representations, COMPUTATIONAL INTELLIGENCE (SSCI), 2016 IEEE SYMPOSIUM SERIES ON, 2016, 第 1 作者
(147) Experience Replay for Optimal Control of Nonzero-Sum Game Systems With Unknown Dynamics, IEEE TRANSACTIONS ON CYBERNETICS, 2016, 第 1 作者  通讯作者
(148) A Visual Attention based Convolutional Neural Network for Image Classification, 12th World Congress on Intelligent Control and Automation (WCICA), 2016, 第 2 作者
(149) Ensemble LSDD-based Change Detection Tests, 2016 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2016, 第 3 作者
(150) Online reinforcement learning control by Bayesian inference, IET CONTROL THEORY AND APPLICATIONS, 2016, 第 2 作者  通讯作者
(151) 概率近似正确的强化学习算法解决连续状态空间控制问题, Probably approximately correct reinforcement learning solving continuous-state control problem, 控制理论与应用, 2016, 第 1 作者
(153) GrDHP: A General Utility Function Representation for Dual Heuristic Dynamic Programming, IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2015, 第 3 作者
(154) Consensus of Heterogeneous Multi-agent Systems With Switching Topologies Using Input-output Feedback Linearization, 2015 34th Chinese control conference: CCC 2015, Hangzhou, China, 28-30 July 2015, pages 6414-7296, v.8, 2015, 第 1 作者
(155) Consensus of Heterogeneous Multi-agent Systems With Switching Topologies Using Input-output Feedback Linearization, 2015 34TH CHINESE CONTROL CONFERENCE (CCC), 2015, 第 2 作者
(156) Computational Energy Management in Smart Grids, NEUROCOMPUTING, 2015, 第 4 作者
(157) Convergence Proof of Approximate Policy Iteration for Undiscounted Optimal Control of Discrete-Time Systems, COGNITIVE COMPUTATION, 2015, 第 2 作者  通讯作者
(158) 能源存储:一种新的方法, 能源存储:一种新的方法, 2015, 第 5 作者
(159) MEC-A Near-Optimal Online Reinforcement Learning Algorithm for Continuous Deterministic Systems, IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2015, 第 1 作者  通讯作者
(160) A data-based online reinforcement learning algorithm satisfying probably approximately correct principle, NEURAL COMPUTING & APPLICATIONS, 2015, 第 2 作者  通讯作者
(161) Model-Free Adaptive Algorithm for Optimal Control of Continuous-Time Nonlinear System, 2015, 第 1 作者
(162) Event-triggered H-infinity control for continuous-time nonlinear system, 12th International Symposium on Neural Networks (ISNN), 2015, 第 1 作者  通讯作者
(163) Machine Learning with Applications to Autonomous Systems, MATHEMATICAL PROBLEMS IN ENGINEERING, 2015, 第 3 作者
(164) Thermal Comfort Control Based on MEC Algorithm for HVAC Systems, 2015 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2015, 第 2 作者
(165) Online Reinforcement Learning by Bayesian Inference, PROCEEDINGS OF INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS 2015, 2015, 第 1 作者
(166) 智能小区商业模式及运营策略分析, Analysis of intelligent community business model and operation mode, 电力系统保护与控制, 2015, 第 3 作者
(167) Clique-based cooperative multiagent reinforcement learning using factor graphs, IEEE/CAA JOURNAL OF AUTOMATICA SINICA, 2015, 第 1 作者
(168) Model-Free Optimal Control for Affine Nonlinear Systems With Convergence Analysis, IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2015, 第 1 作者  通讯作者
(169) Model-free optimal control for affine nonlinear systems based on action dependent heuristic dynamic programming with convergency analysis, IEEE Transactions on Automation Science and Engineering, 2015, 第 1 作者
(170) Convergence analysis and application of fuzzy-HDP for nonlinear discrete-time HJB systems, NEUROCOMPUTING, 2015, 第 2 作者  通讯作者
(171) Online Synchronous Policy Iteration Based on Concurrent Learning to Solve Continuous-time Optimal Control Problem, 2015 5TH INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND TECHNOLOGY (ICIST), 2015, 第 2 作者
(172) Thermal Comfort Control Based on MEC Algorithm for HVAC System, 2015, 第 4 作者
(173) Online Synchronous Policy Iteration Based on Concurrent Learning to Solve Continuous-time Optimal Control Problem, PROCEEDINGS OF INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND TECHNOLOGY, 2015, 第 3 作者
(174) 带有储能设备的智能电网电能迭代自适应动态规划最优控制, Iterative Adaptive Dynamic Programming Approach to Power Optimal Control for Smart Grid with Energy Storage Devices, 自动化学报, 2014, 第 4 作者
(175) Model-free Adaptive Dynamic Programming for Optimal Control of Discrete-time Affine Nonlinear System, IFAC PROCEEDINGS VOLUMES, 2014, 第 2 作者
(176) Detecting and Reacting to Changes in Sensing Units: The Active Classifier Case, IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2014, 第 3 作者
(177) A hierarchical classification algorithm for evaluating energy consumption behaviors, International Joint Conference on Neural Networks (IJCNN 2014), 2014, 第 2 作者
(178) Full-range adaptive cruise control based on supervised adaptive dynamic programming, NEUROCOMPUTING, 2014, 第 1 作者
(179) A Kalman filter-based actor-critic learning approach, International Joint Conference on Neural Networks (IJCNN 2014), 2014, 第 2 作者
(180) Event-triggered reinforcement learning approach for unknown nonlinear continuous-time system, International Joint Conference on Neural Networks (IJCNN 2014), 2014, 第 1 作者
(181) An high-efficient online reinforcement learning algorithm for continuous-state systems, World Congress on Intelligent Control and Automation (WCICA 2014), 2014, 第 2 作者
(182) Online reinforcement learning for continuous-state systems, FRONTIERS OF INTELLIGENT CONTROL AND INFORMATION PROCESSING, 2014, 第 2 作者
(183) Dual Heuristic dynamic Programming for nonlinear discrete-time uncertain systems with state delay, NEUROCOMPUTING, 2014, 第 2 作者
(184) Model-free adaptive dynamic programming for optimal control of discrete-time affine nonlinear system, PROCEEDINGS OF INTERNATIONAL FEDERATION OF AUTOMATIC CONTROL 2014, 2014, 第 1 作者
(185) Cheating Behavior Detection based-on Pictorial Structure Model, 2014, 第 1 作者
(186) 基于数据的智能电网电能自适应优化调控, Data-Based Adaptive Optimal Control of Smart Grid Power, 控制工程, 2014, 第 3 作者
(187) A supervised Actor-Critic approach for adaptive cruise control, SOFT COMPUTING, 2013, 第 1 作者
(188) Neural sliding-mode load frequency controller design of power systems, NEURAL COMPUTING & APPLICATIONS, 2013, 第 2 作者
(189) Online Model-Free RLSPI Algorithm for Nonlinear Discrete-Time Non-affine Systems, 2013, 第 1 作者
(190) A Prior-Free Encode-Decode Change Detection Test to Inspect Datastreams for Concept Drift, 2013 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2013, 第 3 作者
(191) How to automatically set an initial angle for balance control of a cart-pole system: an education case, INTERNATIONAL JOURNAL OF ELECTRICAL ENGINEERING EDUCATION, 2013, 第 3 作者
(192) A neural-network-based iterative GDHP approach for solving a class of nonlinear optimal control problems with control constraints, NEURAL COMPUTING AND APPLICATIONS, 2013, 第 3 作者
(193) Special issue on intelligent control and information processing, SOFT COMPUTING, 2013, 第 1 作者  通讯作者
(194) Data-based control, optimization, modeling and applications, NEURAL COMPUTING & APPLICATIONS, 2013, 第 1 作者  通讯作者
(195) A prior-free encode-decode change detection test to inspect datastreams for concept drift, International Joint Conference on Neural Networks (IJCNN 2013), 2013, 第 3 作者
(196) A neural-network-based iterative GDHP approach for solving a class of nonlinear optimal control problems with control constraints, NEURAL COMPUTING & APPLICATIONS, 2013, 第 3 作者
(197) Optimal control of unknown nonaffine nonlinear discrete-time systems based on adaptive dynamic programming, AUTOMATICA, 2012, 第 4 作者
(198) 复杂系统的平行控制理论及应用, Parallel Control Theory of Complex Systems and Applications, 复杂系统与复杂性科学, 2012, 第 5 作者
(199) Integration of fuzzy controller with adaptive dynamic programming, 10th World Congress on Intelligent Control and Automation (WCICA 2012), 2012, 第 2 作者
(200) Hybrid feedback control of vehicle longitudinal acceleration, PROCEEDING OF CHINESE CONTROL CONFERENCE, 2012, 第 1 作者
(201) Reinforcement learning control based on multi-goal representation using hierarchical heuristic dynamic programming, 2012 IEEE International Joint Conference on Neural Networks (IJCNN 2012), 2012, 第 1 作者
(202) Data-driven optimal algorithms and their applications to pattern recognition, NEUROCOMPUTING, 2012, 第 3 作者
(203) Self-teaching adaptive dynamic programming for Gomoku, NEUROCOMPUTING, 2012, 第 1 作者
(204) 基于OGRE的车辆自适应巡航控制三维仿真, 3D Simulation of Adaptive Cruise Control Based on OGRE, 交通运输系统工程与信息, 2012, 第 2 作者
(205) Data-driven learning and control with multiple critic networks, 10th World Congress on Intelligent Control and Automation (WCICA 2012), 2012, 第 1 作者
(206) SVM-based just-in-time adaptive classifiers, 2012, 第 1 作者
(207) Neural-Network-Based Optimal Control for a Class of Unknown Discrete-Time Nonlinear Systems Using Globalized Dual Heuristic Programming, IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2012, 第 3 作者
(208) Computational Intelligence in Urban Traffic Signal Control: A Survey, IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART C-APPLICATIONS AND REVIEWS, 2012, 第 1 作者  通讯作者
(209) Neural and Fuzzy Dynamic Programming for Under-actuated Systems, 2012 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2012, 第 1 作者
(210) Adaptive Cruise Control Based on Reinforcement Learning with Shaping Rewards, JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS, 2011, 第 2 作者
(211) Neural network based online traffic signal controller design with reinforcement training, IEEE Conference on Intelligent Transportation Systems, Proceedings, ITSC, 2011, 第 1 作者
(212) 高超声速飞行器轨迹跟踪控制仿真研究, Simulation Research on Tracking Control for Hypersonic Aircraft, 系统仿真学报, 2011, 第 3 作者
(213) Special Section on Data-Based Control, Modeling, and Optimization, IEEE TRANSACTIONS ON NEURAL NETWORKS, 2011, 第 5 作者
(214) Supervised adaptive dynamic programming based adaptive cruise control, IEEE SSCI 2011: Symposium Series on Computational Intelligence - ADPRL 2011: 2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, 2011, 第 1 作者
(215) DHP for coordinated freeway ramp metering, IEEE Transactions on Intelligent Transportation Systems, 2011, 第 1 作者
(216) DHP Method for Ramp Metering of Freeway Traffic, IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2011, 第 1 作者  通讯作者
(217) Control of Overhead Crane Systems by Combining Sliding Mode with Fuzzy Regulator, IFAC PROCEEDINGS VOLUMES, 2011, 第 3 作者
(218) Adaptive dynamic programming for optimal control of unknown nonlinear discrete-time systems, IEEE SSCI 2011: Symposium Series on Computational Intelligence - ADPRL 2011: 2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, 2011, 第 2 作者
(219) Reinforcement learning for multi-agent patrol policy, the 9th IEEE International Conference on Cognitive Informatics, ICCI 2010, 2010, 第 1 作者
(220) Inverse Control of Cable-driven Parallel Mechanism Using Type-2 Fuzzy Neural Network, 自动化学报, 2010, 第 4 作者
(221) A traffic signal control algorithm for isolated intersections based on adaptive dynamic programming, 2010 International Conference on Networking, Sensing and Control, ICNSC 2010, 2010, 第 1 作者
(222) Inverse control of cable-driven parallel mechanism using type-2 fuzzy neural network, ZIDONGHUA XUEBAO/ ACTA AUTOMATICA SINICA, 2010, 第 1 作者
(223) Fuzzy Logic Based Adjustment Control of a Cable-driven Auto-leveling Parallel Robot, 2009 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, 2009, 第 4 作者
(224) Coordinated control of multiple ramps metering based on ADHDP (λ) Controller, International Journal of Innovative Computing, Information and Control, 2009, 第 1 作者
(225) Trajectory Tracking Control of Omnidirectional Wheeled Mobile Manipulators: Robust Neural Network-Based Sliding Mode Approach, IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2009, 第 2 作者
(226) GENETIC ALGORITHM-BASED FUZZY CONTROLLER TO AVOID NETWORK CONGESTION, INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2009, 第 3 作者
(227) THE APPLICATION OF ADHDP(lambda) METHOD TO COORDINATED MULTIPLE RAMPS METERING, INTERNATIONAL JOURNAL OF INNOVATIVE COMPUTING INFORMATION AND CONTROL, 2009, 第 2 作者
(228) Design of interval type-2 fuzzy logic system using sampled data and prior knowledge, ICIC EXPRESS LETTERS, 2009, 第 3 作者
(229) 绳索牵引自动水平调节机器人综合控制策略, Synthetic control strategy for a cable-driven auto-leveling robot, 电机与控制学报, 2009, 第 4 作者
(230) An overview on the adaptive dynamic programming based urban city traffic signal optimal control, ZIDONGHUA XUEBAO/ ACTA AUTOMATICA SINICA, 2009, 第 1 作者
(231) 全方位移动机械手运动控制Ⅱ——鲁棒控制, Motion Control of Omnidirectional Mobile Manipulators(Part Ⅱ) --Robust Control, 机械工程学报, 2009, 第 2 作者
(232) 基于自适应动态规划的城市交通信号优化控制方法综述, An Overview on the Adaptive Dynamic Programming Based Urban City Traffic Signal Optimal Control, 自动化学报, 2009, 第 1 作者
(233) 全方位移动机械手运动控制Ⅰ——建模与控制, Motion Control of Omnidirectional Mobile Manipulators (Part I) —Modeling and Control, 机械工程学报, 2009, 第 2 作者
(234) Motion control of omnidirectional mobile manipulators (Part II) - Robust control, JIXIE GONGCHENG XUEBAO/JOURNAL OF MECHANICAL ENGINEERING, 2009, 第 1 作者
(235) Motion and Internal Force Control for Omnidirectional Wheeled Mobile Robots, IEEE-ASME TRANSACTIONS ON MECHATRONICS, 2009, 第 1 作者  通讯作者
(236) Motion control of omnidirectional mobile manipulators (Part I) - Modeling and control, JIXIE GONGCHENG XUEBAO/JOURNAL OF MECHANICAL ENGINEERING, 2009, 第 3 作者
(237) Adaptive Dynamic Neuro-fuzzy System for Traffic Signal Control, 2008 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-8, 2008, 2nd author
(238) 绳索牵引自动水平调节机器人的设计, Design of a cable-driven self-leveling robot, 华中科技大学学报:自然科学版, 2008, 3rd author
(239) DynaCAS: Computational Experiments and Decision Support for ITS, IEEE INTELLIGENT SYSTEMS, 2008, 4th author
(240) 一种新的针对平移振荡器系统的模糊控制方法, A Novel Fuzzy Logic Control Scheme for TORA System, 重庆工学院学报:自然科学版, 2008, 3rd author
(241) Hierarchical sliding mode control for a class of SIMO under-actuated systems, CONTROL AND CYBERNETICS, 2008, 3rd author
(242) Motion regulation of redundantly actuated omni-directional wheeled mobile robots with internal force control, 2007 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, VOLS 1-9, 2007, 1st author, corresponding author
(243) Improved mean shift segmentation approach for natural images, APPLIED MATHEMATICS AND COMPUTATION, 2007, 3rd author
(244) Select informative symptoms combination for diagnosing syndrome, JOURNAL OF BIOLOGICAL SYSTEMS, 2007, 4th author
(245) Application of ADP to intersection signal control, ADVANCES IN NEURAL NETWORKS - ISNN 2007, PT 1, PROCEEDINGS, 2007, 2nd author
(246) 一种末端任务给定的移动机械手动态路径规划方法, A dynamic path planning approach for mobile manipulators along given end effector paths, 控制与决策, 2007, 2nd author
(247) 一种新的基于神经模糊推理网络的复杂系统模糊辨识方法, A New Fuzzy Identification Approach for Complex Systems Based on Neural-Fuzzy Inference Network, 自动化学报, 2006, 3rd author
(248) 一种全方位移动机械手的可操作度分析, Manipulability Analysis for Omnidirectional Mobile Manipulators, 中国机械工程, 2006, 2nd author
(249) 一种改进的自然图像分割方法, Improved Approach of Natural Image Segmentation, 计算机应用研究, 2006, 3rd author
(250) A particle swarm optimized fuzzy neural network control for acrobot, ADVANCES IN NEURAL NETWORKS - ISNN 2006, PT 2, PROCEEDINGS, 2006, 1st author, corresponding author
(251) Analysis of infectious disease data based on evolutionary computation, INTELLIGENCE AND SECURITY INFORMATICS, PROCEEDINGS, 2006, 1st author, corresponding author
(252) 一类非确定欠驱动系统的串级模糊滑模控制, Cascade fuzzy sliding mode control for a class of uncertain underactuated systems, 控制理论与应用, 2006, 3rd author
(253) 基于OpenGL的移动机械手路径规划仿真, Path Planning Simulation for Mobile Manipulators Based on OpenGL, 系统仿真学报, 2006, 2nd author
(254) Motion and squeeze force control for omnidirectional wheeled mobile robots, 2006 AMERICAN CONTROL CONFERENCE, VOLS 1-12, 2006, 2nd author
(255) Robot planning with artificial potential field guided ant colony optimization algorithm, ADVANCES IN NATURAL COMPUTATION, PT 2, 2006, 1st author, corresponding author
(256) 自然图像分割方法及其在目标检测中的应用, Segmentation Approach for Natural Images and Application to Object Detection in View-Based Navigation, 模式识别与人工智能, 2006, 3rd author
(257) 基于视觉的机器人定位精度提高方法, Improvement Method of Localization Precision for Vision based Robot, 计算机测量与控制, 2005, 3rd author
(258) 一种鲁棒的只需两帧图像的姿态估计方法, Robust Pose Estimation from Only Two Frames, 模式识别与人工智能, 2005, 3rd author
(259) 一种基于点对的相机几何标定方法, A Geometric Approach for Camera Calibration Based on Point Pairs, 机器人, 2005, 3rd author
(260) Pendubot的一种分层滑模控制方法, Hierarchical sliding-mode control of Pendubot, 控制理论与应用, 2005, 3rd author
(261) 一种移动机械手分级协调路径规划方法, A coordinated and hierarchical path planning approach for mobile manipulator, 制造业自动化, 2005, 2nd author
(262) 实时卡片字符识别与校验系统的设计与实现, The Design and Realization of A Real-time Card Character Recognition and Verification System, 计算机工程与应用, 2005, 2nd author
(263) 基于最小二乘支持向量机的自适应逆扰动消除控制系统, Adaptive Inverse Disturbance Canceling Control Systems Based on Least Squares Support Vector Machine, 控制与决策, 2005, 3rd author
(264) 一种基于点对的深度和运动估计方法, Depth and Motion Estimation from Point Pairs, 机器人, 2005, 3rd author
(265) 一种全方位移动机器人的控制方法, A control approach to an omnidirectional mobile robot, 电机与控制学报, 2005, 3rd author
(266) 基于模型的人体运动参数检测, Measure for Human Motion Parameters Based on Model, 生物医学工程学杂志, 2005, 2nd author
(267) 一种基于强化学习的在线神经模糊控制系统, Reinforcement-Learning-Based On-Line Neural-Fuzzy Control System, 中国科学院研究生院学报, 2005, 3rd author
(268) 基于Lyapunov稳定理论设计MRAC系统的简单方法, Simple Scheme for MRAC System Using Lyapunov Theory, 系统仿真学报, 2005, 3rd author
(269) 煤气化炉的仿真系统开发, Development of Simulation System for Coal Gasifier, 系统仿真学报, 2005, 1st author
(270) A Robust Two Feature Points Based Depth Estimation Method, ACTA AUTOMATICA SINICA, 2005, 3rd author
(271) 门牌识别系统中的鲁棒性分割方法, Robust Segmentation Method for Doorplate Recognition System, 自动化学报, 2005, 3rd author
(272) 一种鲁棒的基于两个特征点的深度估计方法, A Robust Two Feature Points Based Depth Estimation Method, 自动化学报, 2005, 3rd author
(273) 一种新型神经网络滑模控制器的设计, Design of a new type of neural network sliding-mode controller, 电机与控制学报, 2005, 3rd author
(274) 基于稳定性分析的一类欠驱动系统的滑模控制器设计, Design of Sliding mode Controller Based on Stable Analysis for a Class of Underactuated Systems, 信息与控制, 2005, 3rd author
(275) 基于神经网络的一类非线性系统自适应滑模控制, An adaptive sliding mode control for a class of uncertain nonlinear systems based on neural networks, 电机与控制学报, 2005, 3rd author
(276) 一类连续状态与动作空间下的加权Q学习, A kind of weighted Q-learning for continuous state and action spaces, 电机与控制学报, 2005, 4th author
(277) 移动机械手结构设计 [Structural Design of a Mobile Manipulator], 可编程控制器与工厂自动化(PLC FA), 2004, 2nd author
(278) 桥式吊车系统的分级滑模控制方法, Hierarchical Sliding-Mode Control Method for Overhead Cranes, 自动化学报, 2004, 3rd author
(279) 机器人行为协调机制研究进展, The Progress of the Behavior Coordination Mechanism for Robots, 机器人, 2004, 3rd author
(280) 基于运动视的移动机器人定位方法, Mobile Robot Localization Based on Motion Vision, 机器人, 2004, 3rd author
(281) 一类网络控制系统的建模及分析, Modeling and Analysis of a Class of Networked Control Systems, 控制工程, 2004, 3rd author
(282) 基于MATLAB的非完整动力学系统跟踪控制的动态仿真, MATLAB-Based Dynamic Simulation of Tracking Control of Nonholonomic Dynamic Systems, 系统仿真学报, 2004, 3rd author
(283) 一种全方位移动机器人的运动学分析, KINEMATIC ANALYSIS OF AN OMNIDIRECTIONAL MOBILE ROBOT, 机器人, 2004, 3rd author
(284) DC-DC变换器的模糊神经网络控制方法研究, Fuzzy Neural Network Control Method for DC-DC Converter, 系统仿真学报, 2004, 3rd author
(285) 一种基于RBF网络的非线性自适应逆控制系统, A kind of nonlinear adaptive inverse control system based on RBF networks, 控制与决策, 2004, 3rd author
(286) 基于滑模方法的桥式吊车系统的抗摆控制, Anti-swing control of overhead cranes based on sliding-mode method, 控制与决策, 2004, 3rd author
(287) 具有形状自适应的欠驱动拟人机器人手指, UNDER-ACTUATED HUMANOID ROBOT FINGER WITH SHAPE ADAPTATION, 机械工程学报, 2004, 4th author
(288) 一种全方位移动机械手的体系结构设计与分析, Architecture Design and Analysis of an Omni-directional Mobile Manipulator, 机器人, 2004, 3rd author
(289) 移动机器人导航研究现状及其发展趋势展望 [Mobile Robot Navigation: Current Status and Future Trends], 可编程控制器与工厂自动化(PLC FA), 2004, 4th author
(290) 一种基于目标识别的运动视定位方法, A LOCALIZATION METHOD USING MOTION VISION, 模式识别与人工智能, 2004, 3rd author
(291) 两种小巧的远距离传输抗干扰电路的设计与比较, Design and Compare of Two Kinds of Small Noise Immune Circuits for Long-distance Transferring, 电气自动化, 2003, 4th author
(292) 拟人机器人上肢运动检测系统的研制, Development of the detecting system for arm motion of humanoid robot, 传感器技术, 2003, 3rd author
(293) 全方位移动机器人结构和运动分析, STRUCTURE AND KINEMATIC ANALYSIS OF OMNI-DIRECTIONAL MOBILE ROBOTS, 机器人, 2003, 1st author
(294) 脉冲GTAW焊缝成形智能控制方法, Intelligent Control of Weld Seam Molding in Pulsed GTAW, 自动化学报, 2003, 2nd author
(295) 变抓取力的欠驱动拟人机器人手, Under-actuated humanoid robot hand with changeable grasping force, 清华大学学报:自然科学版, 2003, 5th author
(296) 移动机械手控制研究进展, SURVEY OF THE CONTROL FOR MOBILE MANIPULATORS, 机器人, 2003, 3rd author
(297) 拟人机器人TH-1手臂运动学, ARMS KINEMATICS ON A HUMANOID ROBOT TH-1, 机器人, 2002, 1st author
(298) GMRL: Graph neural network based inference in a Markov network with Reinforcement Learning for visual navigation, NEUROCOMPUTING, 3rd author, corresponding author
Published Books
(1) 游戏人工智能方法 [Game Artificial Intelligence Methods], 科学出版社, 2024-02, 1st author
(2) 智能网联汽车关键技术与应用—智能网联汽车决策控制技术 [Key Technologies and Applications of Intelligent Connected Vehicles: Decision and Control Technologies], 人民交通出版社, 2023-03, 1st author
(3) Neural Information Processing, Lecture Notes in Computer Science 10636, 10637, 10638, 10639, Springer Heidelberg Dordrecht London New York, 2017-10, 4th author
(4) Advances in Neural Networks – ISNN 2015, Springer Heidelberg Dordrecht London New York, 2015-04, 4th author
(5) Frontiers of Intelligent Control and Information Processing, World Scientific Publishing, 2014-11, 3rd author
(6) Advances in Brain Inspired Cognitive Systems, Springer Heidelberg Dordrecht London New York, 2013-06, 3rd author
(7) 机器人手册,第26章-面向操作任务的运动, Springer Handbook of Robotics, Chapter 26 - Motion for Manipulation Tasks, 机械工业出版社, 2013-01, 1st author
(8) 机器人手册,第51章-智能车辆, Springer Handbook of Robotics, Chapter 51 - Intelligent Vehicles, 机械工业出版社, 2013-01, 1st author
(9) 全方位移动机器人导论, An Introduction to Omnidirectional Mobile Robots, 科学出版社, 2010-05, 1st author

Research Activities

Research Projects
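( 1 ) Research on On-Vehicle Vision-Language-Action (VLA) Model Architecture, principal investigator, domestic commissioned project, 2025-09--2027-03
( 2 ) Closed-Loop End-to-End Autonomous Driving Based on VLA and Its Reinforcement Learning Post-Training, principal investigator, local government project, 2025-07--2027-06
( 3 ) Novel Reinforcement Learning Algorithms and Applications in Changing Environments, principal investigator, national project, 2022-12--2027-11
( 4 ) Theory and Applications of Multi-Task Multi-Agent Deep Reinforcement Learning, principal investigator, national project, 2022-01--2025-12
( 5 ) Robot Perception and Object Recognition under Complex Outdoor Visual Conditions, principal investigator, national project, 2022-01--2024-12
( 6 ) Intelligent Game Technology and Applications with Virtual-Real Fusion, principal investigator, Chinese Academy of Sciences program, 2021-01--2023-12
( 7 ) Chinese Academy of Sciences Strategic Priority Research Program (Category A): Multi-Agent Deep Reinforcement Learning, principal investigator, Chinese Academy of Sciences program, 2020-07--2021-06
( 8 ) 银河水滴科技公司 Project, Phase II: Core Intelligent Perception Technologies for Subway Operation Scenarios, principal investigator, domestic commissioned project, 2020-06--2021-05
( 9 ) Game Decision-Making under Incomplete Information: Deep Reinforcement Learning Algorithms Jointly Driven by Knowledge and Data, principal investigator, national project, 2020-01--2022-12
( 10 ) Hardware-Adapted Operator Structure Optimization and Automatic Parallel Partitioning Techniques, principal investigator, domestic commissioned project, 2019-08--2020-05
( 11 ) Intelligent Decision-Making Technology for Electric Vehicles in Complex Urban Interactive Scenarios, principal investigator, local government project, 2019-07--2020-06
( 12 ) Research on Reinforcement Learning and Hardware Implementation Techniques, principal investigator, domestic commissioned project, 2018-09--2019-06
( 13 ) Deep Reinforcement Learning Methods for Intelligent Driving, principal investigator, domestic commissioned project, 2018-09--2019-08
( 14 ) Core Intelligent Perception Technologies for Subway Operation Scenarios, principal investigator, domestic commissioned project, 2018-09--2019-08
( 15 ) Key Technology R&D and Validation Platform Development for Highly Automated (Level 4) Electric Vehicles: Deep Reinforcement Learning Applications, principal investigator, local government project, 2018-01--2019-12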
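( 16 ) New Methods for Traditional Chinese Medicine (TCM) Syndrome Differentiation, under "Systematic Study of Differentiation Criteria for Qi Deficiency Syndrome": AI for TCM, principal investigator, national project, 2018-01--2020-12
( 17 ) AI-Based Intelligent Driving Experience Exhibit for Science Popularization, principal investigator, local government project, 2018-01--2018-12
( 18 ) Collective Collaborative Neurodynamic Methods for Deep Neural Network Optimization, principal investigator, Chinese Academy of Sciences program, 2018-01--2020-12
( 19 ) Deep Reinforcement Learning Methods for Dangerous Object Detection in Intelligent Driving, principal investigator, local government project, 2018-01--2019-12
( 20 ) Optimal Decision-Making in Dynamic Games with Incomplete Information, principal investigator, national project, 2017-03--2018-12
( 21 ) Key Technology Research and Product Development for Intelligent Driver-Assistance Control Systems, principal investigator, national project, 2016-07--2019-06
( 22 ) Theory, Methods, and Applications of Deep Adaptive Dynamic Programming, principal investigator, national project, 2016-01--2019-12
( 23 ) Data-Based Integrated Modeling and Self-Learning Optimal Control of Building Clusters and Distributed Energy Systems, participant, national project, 2016-01--2020-12
( 24 ) Theory and Methods of Supervised Reinforcement Learning Control with Human-Machine Interaction, principal investigator, other international cooperation project, 2015-01--2016-12
( 25 ) Chinese Academy of Sciences Overseas Review Expert (何海波), principal investigator, Chinese Academy of Sciences program, 2015-01--2016-12
( 26 ) Development of a Toolkit for Building Energy Consumption Data Mining and Analysis, participant, local government project, 2013-12--2014-12
( 27 ) Automotive Adaptive Cruise Control (ACC) Systems and Methods, principal investigator, local government project, 2013-09--2016-05
( 28 ) Energy-Saving Technology Based on Parallel Control for Energy Management and Control Centers, participant, local government project, 2013-04--2014-12
( 29 ) Intelligent Cruise Control for Vehicles Based on Supervised ADP, principal investigator, national project, 2013-01--2016-12
( 30 ) Intelligent Stop-and-Go Cruise Control for Vehicles, principal investigator, local government project, 2012-01--2014-12
( 31 ) Data-Based Analysis and Design of Nonlinear Control Systems, participant, national project, 2011-01--2014-12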
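Conference Participation
(1) Deep Reinforcement Learning Empowers AI Specialists and Generalists   Chinese Association of Automation Fellows Face-to-Face Talk   2024-03-12
(2) Deep Reinforcement Learning Helps to Shape Generalists   2023-11-22
(3) Deep reinforcement learning for Virtual Games and Real Robots   2023-01-29
(4) Lecture 1: Perception and Decision-Making for Intelligent Driving Based on Deep Reinforcement Learning   Chinese Association for Artificial Intelligence Intelligent Driving Lecture Series   2022-09-14
(5) Deep reinforcement learning for decision making of autonomous vehicles   2022-08-03
(6) Deep Reinforcement Learning based Game Decision Making   2022-07-24
(7) Deep reinforcement learning for perception and control of autonomous vehicles   2022-02-25
(8) Broad Neural Architecture Search   China National Computer Congress   2021-12-18
(9) Deep reinforcement learning for games and robotic applications   2021-07-16
(10) Recent Progress of Deep Reinforcement Learning   2021-05-22
(11) Artificial intelligence methods for real-time fighting game   2020-10-25
(12) Deep Reinforcement Learning: Game AI and Beyond   Chinese Association of Automation, 3rd Frontier Workshop on Hybrid-Augmented Intelligence   2020-10-11
(13) Deep reinforcement learning algorithms and applications   2020-07-26
(14) Deep Reinforcement Learning: From Simulation to Physical Systems   2020 Beijing Academy of Artificial Intelligence (BAAI) Conference   2020-06-23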
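(15) Progress in Deep Reinforcement Learning Algorithms and Game AI   Chinese Association for Artificial Intelligence, Technical Committee on Computer Games   2019-10-11
(16) Deep Reinforcement Learning: Algorithms, Theory, and Applications   National Natural Science Foundation of China, Information Sciences Division III, 14th Five-Year Plan Workshop   2019-08-31
(17) Deep Reinforcement Learning Algorithms and Their Applications in Game AI   Meeting of the Technical Committee on Visual Perception and Intelligent Systems   2019-08-12
(18) Deep Reinforcement Learning Algorithms in Games   2nd National Conference on Big Data and Artificial Intelligence Science   2019-07-06
(19) Deep Reinforcement Learning for Video Game   Huawei Workshop on Multi-Agent Reinforcement Learning   2019-04-25
(20) Deep Reinforcement Learning Algorithms and Medical Applications   3rd Annual Academic Conference of the Clinical Research Branch, China Information Association for Traditional Chinese Medicine and Pharmacy   2018-09-08
(21) Deep Reinforcement Learning Algorithms and Applications   Chinese Association of Automation Frontier Forum on "Deep and Broad Reinforcement Learning"   2018-05-30
(22) Game AI with RL and DL   2018-05-21
(23) Progress in Deep Reinforcement Learning: From AlphaGo to AlphaGo Zero   2nd World Intelligence Congress   2018-05-17
(24) Game AIs with RL and DL   2018-05-16
(25) Recent Progress on Deep Reinforcement Learning -- from AlphaGo to AlphaGo Zero   Samsung Frontier Workshop on Machine Learning   2018-01-15
(26) Deep Reinforcement Learning Algorithms and Applications   China Electric Power Research Institute 2017 "208" Science Conference: Research and Application Directions and Key Technologies of Artificial Intelligence in the Power Sector   2017-12-06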
(27) Cooperative reinforcement learning for multiple units combat in StarCraft   Kun Shao, Yuanheng Zhu, Dongbin Zhao   2017-11-28
(28) Event-triggered integral reinforcement learning for nonlinear continuous-time systems   Qichao Zhang, Dongbin Zhao   2017-11-28
(29) Progress in Deep Reinforcement Learning: From AlphaGo to AlphaGo Zero   Meeting of the China Simulation Federation Technical Committee on Intelligent Internet of Things   2017-11-17
(30) Off-Policy reinforcement learning for partially unknown nonzero-sum games   2017-11-16
(31) FMR-GA -- A cooperative multi-agent reinforcement learning algorithm based on gradient ascent   2017-11-16
(32) Artificial Intelligence Methods and Their Applications in Smart Cities   Taishan Science and Technology Forum: Applications of Artificial Intelligence in Smart City Construction   2017-11-08
(33) A Kolmogorov-Smirnov test to detect changes in stationarity in big data   2017-07-06
(34) Multi-task learning with Cartesian product-based multi-objective combination for dangerous object detection   2017-06-10
(35) Data-driven adaptive dynamic programming for two-player nonzero-sum game   2017-05-29
(36) Comparison of methods to efficient graph SLAM under general optimization framework   2017-05-19
(37) Policy gradient methods with gaussian process modelling acceleration   2017-05-16

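Student Supervision

Former Students

田艺  Master's student  081101-Control Theory and Control Engineering

胡朝辉  Master's student  081101-Control Theory and Control Engineering

戴钰桀  PhD student  081101-Control Theory and Control Engineering

苏永生  Master's student  081101-Control Theory and Control Engineering

张震  PhD student  081101-Control Theory and Control Engineering

王滨  PhD student  081101-Control Theory and Control Engineering

朱圆恒  PhD student  081101-Control Theory and Control Engineering

王海涛  Master's student  081101-Control Theory and Control Engineering

夏中谱  PhD student  081101-Control Theory and Control Engineering

张启超  PhD student  081101-Control Theory and Control Engineering

吕乐  PhD student  081101-Control Theory and Control Engineering

卜丽  PhD student  081101-Control Theory and Control Engineering

陈亚冉  PhD student  081101-Control Theory and Control Engineering

唐振韬  PhD student  081101-Control Theory and Control Engineering

邵坤  PhD student  081101-Control Theory and Control Engineering

李栋  PhD student  081101-Control Theory and Control Engineering

卢毅  PhD student  081101-Control Theory and Control Engineering

李浩然  PhD student  081101-Control Theory and Control Engineering

丁子祥  PhD student  081203-Computer Application Technology

刘育琦  PhD student  081101-Control Theory and Control Engineering

李伟凡  PhD student  081104-Pattern Recognition and Intelligent Systems

胡光政  PhD student  081203-Computer Application Technology

李楠楠  PhD student  081101-Control Theory and Control Engineering

王俊杰  PhD student  081101-Control Theory and Control Engineering

李丁  PhD student  081203-Computer Application Technology

刘民颂  PhD student  081101-Control Theory and Control Engineering

刘莎莎  Master's student  085410-Artificial Intelligence

马名骏  Master's student  085410-Artificial Intelligence

郭又天  Master's student  085211-Computer Technology

柴嘉骏  PhD student  081101-Control Theory and Control Engineering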

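Current Students

陆润宇  PhD student  081203-Computer Application Technology

范昌易  Master's student  085410-Artificial Intelligence

赵子杰  PhD student  081203-Computer Application Technology

傅宇千  PhD student  081104-Pattern Recognition and Intelligent Systems

徐凯旋  PhD student  081203-Computer Application Technology

田帅  Master's student  081104-Pattern Recognition and Intelligent Systems

刘鑫  PhD student  081104-Pattern Recognition and Intelligent Systems

江震南  PhD student  081203-Computer Application Technology

孙敬博  PhD student  081101-Control Theory and Control Engineering

陈宇辉  PhD student  081101-Control Theory and Control Engineering

陈庆  Master's student  085410-Artificial Intelligence

凃崧峻  PhD student  081101-Control Theory and Control Engineering

刘学义  PhD student  081101-Control Theory and Control Engineering

崔文博  PhD student  081101-Control Theory and Control Engineering

李博宇  PhD student  081101-Control Theory and Control Engineering

刘卫恒  PhD student  081104-Pattern Recognition and Intelligent Systems

李思成  PhD student  081101-Control Theory and Control Engineering

郑宇鹏  PhD student  081101-Control Theory and Control Engineering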