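Basic Information

Zhao Dongbin (赵冬斌), male, Professor, PhD supervisor, IEEE Fellow, Institute of Automation, Chinese Academy of Sciences
Email: dongbin.zhao@ia.ac.cn
Address: Room 1005, Intelligence Building, 95 Zhongguancun East Road, Haidian District
Postal code: 100190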

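Research Areas

Intelligent learning control: deep reinforcement learning, adaptive dynamic programming, reinforcement learning, evolutionary computation, intelligent games, automated machine learning
Intelligent transportation: intelligent driving, traffic signal control, vehicle-infrastructure cooperation
Robotics: perception and learning control for mobile robots, mechatronic systems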

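Admissions Information

Admissions program 1: Control Theory and Control Engineering -- Swarm Intelligence and Adversarial Games

Admissions program 2: Pattern Recognition -- Artificial Intelligence Theory and Methods


Research directions for applicants
Deep reinforcement learning, adaptive dynamic programming, reinforcement learning, intelligent control
Intelligent driving, intelligent games, robotics, intelligent transportation, energy management and control
Neural architecture search, automated machine learning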

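Education

1996-09--2000-04   Harbin Institute of Technology   PhD
1994-09--1996-07   Harbin Institute of Technology   Master's
1990-09--1994-07   Harbin Institute of Technology   Bachelor's
Study and Work Abroad
August 2007 - August 2008, University of Arizona, Visiting Scholar, state-sponsored study-abroad program of the China Scholarship Council.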

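Work Experience


Employment History
2014-01~2014-02, Agency for Science, Technology and Research (A*STAR), Singapore, Visiting Scholar
2012-11~present, Institute of Automation, Chinese Academy of Sciences, Professor, PhD supervisor
2002-04~2012-10, Institute of Automation, Chinese Academy of Sciences, Associate Professor; master's supervisor, later PhD supervisor
2000-05~2002-01, Tsinghua University, Postdoctoral Fellow
Professional Service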
2019-12-11-2019-12-16,The 10th International Conference on Intelligent Control and Information Processing (ICICIP 2019), Marrakesh, Morocco, Program Chair
2019-12-06-2019-12-09,IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL 2019), Xiamen, China, Program Chair
2019-07-13-2019-07-18,IEEE International Joint Conference on Neural Networks (IJCNN 2019), Budapest, Hungary, Program Co-Chair
2019-05-04-2019-05-06,IEEE International Conference on Computational Intelligence for Financial Engineering and Economics (CIFEr 2019), Shenzhen, China, General Co-Chair
2019-01-01-2019-12-31,IEEE CIS Technical Activities Strategy Planning Sub-Committee, Chair
2018-12-01-2018-12-04,The 25th International Conference on Neural Information Processing (ICONIP 2018), Siem Reap, Cambodia, Tutorial Chair
2018-11-18-2018-11-21,IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL 2018), Bangalore, India, Program Chair
2018-09-01-2019-08-31,IEEE Computational Intelligence Magazine special issue on “Deep Reinforcement Learning and Games”, Lead Guest Chair
2018-06-29-2018-07-06,2018 Eighth International Conference on Information Science and Technology (ICIST 2018), Cordoba, Granada, and Seville, Spain, June 30-July 6, 2018, Program Chair
2018-06-01-present,IEEE Transactions on Neural Networks and Learning Systems special issue on “Deep Reinforcement Learning and Adaptive Dynamic Programming”, Lead Guest Editor
2018-03-01-present,IEEE Transactions on Cybernetics, Associate Editor
2017-11-26-2017-11-30,IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL 2017), Honolulu, Hawaii, USA, Program Chair
2017-11-13-2017-11-17,The 24th International Conference on Neural Information Processing (ICONIP 2017), Guangzhou, China, Program Chair
2017-07-05-2017-07-27,2017 IEEE CIS Summer School on Computational and Artificial Intelligence, Chair
2017-01-01-present,IEEE Computational Intelligence Society Beijing Chapter, Chair
2016-12-05-2016-12-08,IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL 2016), Athens, Greece, Program Chair
2016-07-25-2016-07-29,IEEE World Congress on Computational Intelligence (WCCI 2016), Vancouver, Canada, Publicity Co-chair
2016-06-11-2016-06-14,The 13th World Congress on Intelligent Control and Automation (WCICA 2016), Guilin, China, Program Co-Chair
2015-10-15-2015-10-18,12th International Symposium on Neural Networks (ISNN 2015), Jeju, Korea, Program Co-Chair
2015-04-24-2015-04-26,The 5th International Conference on Information Science and Technology (ICIST 2015), Changsha, China, Program Chair
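2015-01-01-present,Artificial Intelligence Review, Associate Editor
2014-12-31-2016-12-31,IEEE Computational Intelligence Society Adaptive Dynamic Programming and Reinforcement Learning Technical Committee, Chair
2014-12-31-2015-12-31,IEEE Computational Intelligence Society Travel Grant Committee, Chair
2014-12-31-2016-12-31,IEEE Computational Intelligence Society Multimedia Committee, Chair
2014-12-31-2016-12-31,IEEE Computational Intelligence Society Beijing Chapter, Vice Chair
2014-12-09-2014-12-12,IEEE Symposium Series on Computational Intelligence (SSCI 2014), Atlanta, USA, Poster Chair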
2014-07-06-2014-07-11,IEEE World Congress on Computational Intelligence (WCCI 2014), Beijing, China, Finance Co-Chair
2014-07-06-2014-07-11,IEEE CIS Summer School on Automated Computational Intelligence, Beijing, China, Chair
2014-01-01-present,IEEE Computational Intelligence Magazine, Associate Editor
2013-06-09-2013-06-11,The 4th International Conference on Intelligent Control and Information Processing (ICICIP 2013), Beijing, China, Program Chair
2012-12-31-2014-12-30,IEEE CIS Newsletter, Editor
2012-07-11-2012-07-14,International Symposium on Neural Networks (ISNN 2012), Shenyang, China, Registration Chair
2012-07-11-2012-07-14,Brain Inspired Cognitive Systems (BICS 2012), Shenyang, China, Finance Chair
2012-01-01-present,IEEE Transactions on Neural Networks and Learning Systems, Associate Editor
2011-11-01-present,Cognitive Computation, Associate Editor
2010-10-01-present,IEEE Senior Member

Courses Taught

Reinforcement Learning
Intelligent Control
Fundamentals and Applications of Intelligent Control Theory

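Patents and Awards


Awards
(1) 2017 IEEE Transactions on Cognitive and Developmental Systems Outstanding Paper Award (sole recipient), Other, 2019
(2) IEEE Fellow, Other, 2019
(3) Outstanding Editorial Board Member, Control Theory & Applications (《控制理论与应用》), Other, 2019
(4) Outstanding Doctoral Dissertation Supervisor, Chinese Association for Artificial Intelligence, Ministerial level, 2019
(5) 2019 China AI+ Innovation and Entrepreneurship Competition, First Prize, Ministerial level, 2019
(6) IJCNN 2018 Best Student Paper Final List, Other, 2018
(7) Outstanding Paper Award, Control Theory & Applications (《控制理论与应用》), Other, 2018
(8) 1st place in preceding-vehicle distance monitoring, 2017 Intelligent Vehicle Future Challenge of China, offline test of basic cognitive ability in complex traffic environments, First Prize, National level, 2017
(9) 1st place in preceding-vehicle detection, 2017 Intelligent Vehicle Future Challenge of China, offline test of basic cognitive ability in complex traffic environments, First Prize, National level, 2017
(10) Data-based self-learning optimal control theory and methods for nonlinear systems, Third Prize, Ministerial level, 2015
(11) Chinese Academy of Sciences "Zhu Li Yuehua Outstanding Teacher" Award, Academy level, 2014
(12) Science and Technology Progress Award of the China Petroleum and Chemical Industry Automation Application Association, First Prize, Ministerial level, 2012
(13) Beijing Science and Technology Award, Third Prize, Provincial level, 2010
(14) Science and Technology Progress Award of the China Petroleum and Chemical Industry Association, Third Prize, Ministerial level, 2009
Patents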
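[1] Zhu Yuanheng, Li Weifan, Xiong Hua, Zhao Dongbin. Missile guidance method and apparatus based on reinforcement learning. CN: CN113239472A, 2021-08-10.

[2] Zhu Yuanheng, Zhao Dongbin. Cooperative adaptive cruise control method for heterogeneous vehicle platoons based on acceleration feedforward. CN: CN110888322B, 2021-04-13.

[3] Zhang Qichao, Wang Junjie, Zhao Dongbin. Lateral lane-change decision-making method, system, and apparatus for intelligent driving. CN: CN110304045B, 2020-12-15.

[4] Li Haoran, Zhang Qichao, Zhao Dongbin. Track detection method and system for subway visual images. CN: CN111611956A, 2020-09-01.

[5] Chen Yaran, Zhao Xiaodong, Zhao Dongbin. Moving-target trajectory prediction method, system, and apparatus for intelligent driving. CN: CN111597961A, 2020-08-28.

[6] Zhao Dongbin, Li Dong, Zhang Qichao, Chen Yaran, Zhu Yuanheng. Lane-keeping method and system for intelligent driving. CN: CN109466552B, 2020-07-28.

[7] Zhao Dongbin, Shao Kun, Zhu Yuanheng. Multi-agent deep reinforcement learning method and system based on counterfactual reward. CN: CN111105034A, 2020-05-05.

[8] Zhao Dongbin, Zhang Qichao, Xia Zhongpu. Method for calculating the desired following distance in driver car-following behavior analysis. CN: CN107016193B, 2020-02-14.

[9] Chen Yaran, Zhao Dongbin, Zhang Qichao. Multi-object tracking method, system, and apparatus based on optical flow and Kalman filtering. CN: CN110415277A, 2019-11-05.

[10] Zhao Dongbin, Bu Li, Zhu Yuanheng, Li Xiangjun. Method and system for detecting abnormal charging/discharging behavior of energy-storage batteries. CN: CN106154180B, 2019-02-05.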

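[11] Zhao Dongbin, Chen Yaran. Dangerous-object detection method and apparatus for driver-assistance systems. CN: CN107609483A, 2018-01-19.

[12] Zhao Dongbin, Zhang Zhen, Liu Derong. Sensor network optimization method based on sparse reinforcement learning. CN: CN103702349A, 2014-04-02.

[13] Zhao Dongbin, Wang Bin, Liu Derong. Optimal control method based on supervised reinforcement learning. CN: CN103324085A, 2013-09-25.

[14] Zhao Dongbin, Zhu Yuanheng, Liu Derong. Data-based Q-function adaptive dynamic programming method. CN: CN103217899A, 2013-07-24.

[15] Liu Derong, Wei Qinglai, Huang Yuzhu, Zhao Dongbin. Control method for a shift converter. CN: CN102830628A, 2012-12-19.

[16] Zhao Dongbin, Wang Bin, Liu Derong, Wei Qinglai, Zhu Yuanheng, Su Yongsheng. Control method for a coal gasifier. CN: CN102799748A, 2012-11-28.

[17] Zhao Dongbin, Zhu Yuanheng. Fuzzy adaptive dynamic programming method. CN: CN102645894A, 2012-08-22.

[18] Zhao Dongbin. Vehicle adaptive cruise control system and method. CN: CN102109821A, 2011-06-29.

[19] Zhao Dongbin, Li Tao, Yi Jianqiang. Optimal traffic-signal control method for city-block intersections. CN: CN101789178A, 2010-07-28.

[20] Zhao Dongbin, Li Tao, Yi Jianqiang, Zhang Jianhong. Single-counterweight automatic leveling lifting device and method of use. CN: CN101759092A, 2010-06-30.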

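[21] Yi Jianqiang, Yu Yi, Zhao Dongbin, Zhang Jianhong. Cable-driven automatic leveling lifting device and method. CN: CN101633478, 2010-01-27.

[22] Yi Jianqiang, Xiang Yanping, Zhao Dongbin. Dual-rotating-counterweight automatic leveling lifting device system and control method. CN: CN101468776, 2009-07-01.

[23] Zhao Dongbin, Xu Dong, Yi Jianqiang, Zhang Xiaocheng. Polar-coordinate automatic leveling lifting device and method. CN: CN101450767, 2009-06-10.

[24] Yi Jianqiang, Zhao Dongbin, Cheng Jin. Track-keeping autopilot control system and method. CN: CN100494898, 2009-06-03.

[25] Yi Jianqiang, Zhao Dongbin, Cheng Jin. Autopilot heading control system and method. CN: CN100491915, 2009-05-27.

[26] Yi Jianqiang, Xiang Yanping, Zhao Dongbin. Frame structure for an automated high-bay warehouse. CN: CN101407271, 2009-04-15.

[27] Tan Xiangmin, Yi Jianqiang, Zhao Dongbin. Pose sensing system and method for a mobile robot. CN: CN100478142, 2009-04-15.

[28] Yi Jianqiang, Zhang Xiaocheng, Zhao Dongbin, Xu Dong. Orthogonal automatic leveling lifting device and method. CN: CN101397114, 2009-04-01.

[29] Yi Jianqiang, Zhao Dongbin. Automatic wire-rod counting machine. CN: CN100465073, 2009-03-04.

[30] Liu Weirong, Yi Jianqiang, Zhao Dongbin. Network congestion control system and method for the Internet. CN: CN101166140, 2008-04-23.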

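[31] Yi Jianqiang, Zhong Zhiguang, Zhao Dongbin. Radio-frequency card access control system combined with sensing technology. CN: CN100444186, 2008-12-17.

[32] Zhao Dongbin, Yi Jianqiang. Table hockey robot system. CN: CN100389845, 2008-05-28.

[33] Yi Jianqiang, Hong Yiping, Zhao Dongbin. Robust natural image segmentation method. CN: CN100378752, 2008-04-02.

[34] Yi Jianqiang, Zhao Dongbin, Li Xinchun, Deng Xuyue, Li Jianing. Mobile manipulator control system. CN: CN100361792, 2008-01-16.

[35] Zhao Dongbin, Yi Jianqiang. Fire rescue robot system and method. CN: CN1994495, 2007-07-11.

[36] Zhao Dongbin, Yi Jianqiang. Fire emergency-response robot system and method. CN: CN1978004, 2007-06-13.

[37] Zhao Dongbin, Yi Jianqiang, Song Zuoshi, Deng Xuyue. Mobile manipulator system. CN: CN1319702, 2007-06-06.

[38] Yi Jianqiang, Hong Yiping, Zhao Dongbin. Automatic house-number recognition system and method. CN: CN1316418, 2007-05-16.

[39] Zhao Dongbin, Yi Jianqiang. Fire rescue robot system. CN: CN2889642, 2007-04-18.

[40] Yi Jianqiang, Zhong Zhiguang, Zhao Dongbin. Pathology monitoring system based on radio-frequency technology. CN: CN1853556, 2006-11-01.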

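[41] Yi Jianqiang, Hong Yiping, Zhao Dongbin. Natural target detection method for robot visual navigation. CN: CN1873656, 2006-12-06.

[42] Zhao Dongbin, Yi Jianqiang. Rotating-ball washing machine and method. CN: CN1869315, 2006-11-29.

[43] Yi Jianqiang, Zhao Dongbin. Automatic wire-rod counting machine. CN: CN2743236, 2005-11-30.

[44] Yi Jianqiang, Zhao Dongbin, Li Xinchun, Deng Xuyue, Li Jianing. Mobile manipulator control system. CN: CN2747031, 2005-12-21.

[45] Yi Jianqiang, Hong Yiping, Zhao Dongbin. Real-time IC-card digit character recognition and verification system and method. CN: CN1684097, 2005-10-19.

[46] Zhong Zhiguang, Yi Jianqiang, Zhao Dongbin. Multipurpose fingerprint-recognition storage cabinet system. CN: CN1614621, 2005-05-11.

[47] Yi Jianqiang, Liu Diantong, Zhao Dongbin. Fully automatic crane control system. CN: CN1613747, 2005-05-11.

[48] Yi Jianqiang, Liu Diantong, Zhao Dongbin. Semi-automatic crane control system. CN: CN1613746, 2005-05-11.

[49] Zhong Zhiguang, Yi Jianqiang, Zhao Dongbin. Networked fingerprint access control system with a fingerprint-recognition key cabinet. CN: CN1612150, 2005-05-04.

[50] Yi Jianqiang, Zhong Zhiguang, Zhao Dongbin. Bank safe deposit box system. CN: CN1570340, 2005-01-26.

[51] Yi Jianqiang, Liu Diantong, Zhao Dongbin. Fully automatic crane control system. CN: CN2663387, 2004-12-15.

[52] Yi Jianqiang, Liu Diantong, Zhao Dongbin. Semi-automatic crane control system. CN: CN2659859, 2004-12-01.

[53] Zhao Dongbin, Yi Jianqiang. Table hockey robot system. CN: CN2649274, 2004-10-20.

[54] Zhao Dongbin, Yi Jianqiang, Song Zuoshi, Deng Xuyue. Mobile manipulator system. CN: CN2645862, 2004-10-06.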

Publications


Papers
[1] Lu, Yi, Chen, Yaran, Zhao, Dongbin, Liu, Bao, Lai, Zhichao, Chen, Jianxin. CNN-G: Convolutional Neural Network Combined With Graph for Image Segmentation With Theoretical Analysis. IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS[J]. 2021, 13(3): 631-644, http://dx.doi.org/10.1109/TCDS.2020.2998497.
[2] Li, Nannan, Pan, Yu, Chen, Yaran, Ding, Zixiang, Zhao, Dongbin, Xu, Zenglin. Heuristic rank selection with progressively searching tensor ring network. COMPLEX & INTELLIGENT SYSTEMS. 2021, http://dx.doi.org/10.1007/s40747-021-00308-x.
[3] Zhu, Yuanheng, Zhao, Dongbin, He, Haibo. Optimal Feedback Control of Pedestrian Flow in Heterogeneous Corridors. IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING[J]. 2021, 18(3): 1097-1108, http://dx.doi.org/10.1109/TASE.2020.2996018.
[4] Lu, Yi, Chen, Yaran, Zhao, Dongbin, Li, Dong. MGRL: Graph neural network based inference in a Markov network with reinforcement learning for visual navigation. NEUROCOMPUTING[J]. 2021, 421: 140-150, http://dx.doi.org/10.1016/j.neucom.2020.07.091.
[5] Zhu, Yuanheng, He, Haibo, Zhao, Dongbin. LMI-Based Synthesis of String-Stable Controller for Cooperative Adaptive Cruise Control. IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS[J]. 2020, 21(11): 4516-4525, https://www.webofscience.com/wos/woscc/full-record/WOS:000587709700003.
[6] Zhu, Yuanheng, Zhao, Dongbin, He, Haibo. Synthesis of Cooperative Adaptive Cruise Control With Feedforward Strategies. IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY[J]. 2020, 69(4): 3615-3627, https://www.webofscience.com/wos/woscc/full-record/WOS:000530284400009.
[7] Zhao Dongbin. A spatial-temporal LSTM model for human trajectory prediction. IEEE/CAA Journal of Automatica Sinica. 2020.
[8] Zhao, Xiaodong, Chen, Yaran, Guo, Jin, Zhao, Dongbin. A spatial-temporal attention model for human trajectory prediction. IEEE-CAA JOURNAL OF AUTOMATICA SINICA[J]. 2020, 7(4): 965-974, http://dx.doi.org/10.1109/JAS.2020.1003228.
[9] Wang, Xu, Liu, Jingwei, Wu, Chaoyong, Liu, Junhong, Li, Qianqian, Chen, Yufeng, Wang, Xinrong, Chen, Xinli, Pang, Xiaohan, Chang, Binglong, Lin, Jiaying, Zhao, Shifeng, Li, Zhihong, Deng, Qingqiong, Lu, Yi, Zhao, Dongbin, Chen, Jianxin. Artificial intelligence in tongue diagnosis: Using deep convolutional neural network for recognizing unhealthy tongue with tooth-mark. COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL[J]. 2020, 18: 973-980, http://dx.doi.org/10.1016/j.csbj.2020.04.002.
[10] Mu, Chaoxu, Wang, Ke, Zhang, Qichao, Zhao, Dongbin. Hierarchical optimal control for input-affine nonlinear systems through the formulation of Stackelberg game. INFORMATION SCIENCES[J]. 2020, 517: 1-17, http://dx.doi.org/10.1016/j.ins.2019.12.078.
[11] Li, Haoran, Zhang, Qichao, Zhao, Dongbin. Deep Reinforcement Learning-Based Automatic Exploration for Navigation in Unknown Environment. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS[J]. 2020, 31(6): 2064-2076, http://dx.doi.org/10.1109/TNNLS.2019.2927869.
[12] Shao, Kun, Zhu, Yuanheng, Tang, Zhentao, Zhao, Dongbin. Cooperative Multi-Agent Deep Reinforcement Learning with Counterfactual Reward. 2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN). 2020.
[13] Liu, Minsong, Zhu, Yuanheng, Zhao, Dongbin. An Improved Minimax-Q Algorithm Based on Generalized Policy Iteration to Solve a Chaser-Invader Game. 2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN). 2020.
[14] Zhu, Yuanheng, Zhao, Dongbin, He, Haibo. Invariant Adaptive Dynamic Programming for Discrete-Time Optimal Control. IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS[J]. 2020, 50(11): 3959-3971, https://www.webofscience.com/wos/woscc/full-record/WOS:000578826300003.
[15] Zhao Dongbin. Advances in Deep Neural Information Processing - Editorial. Neurocomputing. 2020.
[16] Zhao, Dongbin, Duan, Shukai, Yan, Zheng, Alippi, Cesare. Advances in deep neural information processing. NEUROCOMPUTING. 2020, 408: 80-81, http://dx.doi.org/10.1016/j.neucom.2020.01.001.
[17] Zhao Dongbin. Adaptive optimal control of cooperative adaptive cruise control with uncertain heterogeneous vehicles. IEEE Control System Technology. 2019.
[18] Lu, Yi, Chen, Yaran, Zhao, Dongbin, Chen, Jianxin, Lu, H, Tang, H, Wang, Z. Graph-FCN for Image Semantic Segmentation. ADVANCES IN NEURAL NETWORKS - ISNN 2019, PT I. 2019, 11554: 97-105.
[19] Zhu, Yuanheng, Zhao, Dongbin, Li, Xiangjun, Wang, Ding. Control-Limited Adaptive Dynamic Programming for Multi-Battery Energy Storage Systems. IEEE TRANSACTIONS ON SMART GRID[J]. 2019, 10(4): 4235-4244, https://www.webofscience.com/wos/woscc/full-record/WOS:000472577500065.
[20] Gao, Yinfeng, Liu, Yuqi, Zhang, Qichao, Wang, Yu, Zhao, Dongbin, Ding, Dawei, Pang, Zhonghua, Zhang, Yueming. Comparison of Control Methods Based on Imitation Learning for Autonomous Driving. 2019 TENTH INTERNATIONAL CONFERENCE ON INTELLIGENT CONTROL AND INFORMATION PROCESSING (ICICIP). 2019, 274-281, http://apps.webofknowledge.com/CitedFullRecord.do?product=UA&colName=WOS&SID=5CCFccWmJJRAuMzNPjj&search_mode=CitedFullRecord&isickref=WOS:000613247000046.
[21] Shao, Kun, Zhu, Yuanheng, Zhao, Dongbin. StarCraft Micromanagement With Reinforcement Learning and Curriculum Transfer Learning. IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE[J]. 2019, 3(1): 73-84, http://dx.doi.org/10.1109/TETCI.2018.2823329.
[22] Zhu, Yuanheng, He, Haibo, Zhao, Dongbin, Hou, Zhongsheng. Optimal Pedestrian Evacuation in Building with Consecutive Differential Dynamic Programming. 2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN). 2019.
[23] Su, Hao, Chen, Yaran, Tong, Shiwen, Zhao, Dongbin. Real-time multiple object tracking based on optical flow. 2019 9TH INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND TECHNOLOGY (ICIST2019). 2019, 350-356.
[24] Lv, Le, Zhao, Dongbin, Shao, Kun. Deep sparse representation-based mid-level visual elements discovery in fine-grained classification. SOFT COMPUTING[J]. 2019, 23(18): 8711-8722, http://dx.doi.org/10.1007/s00500-018-3468-3.
[25] Chen, Yaran, Zhao, Dongbin, Li, Haoran. Deep Kalman Filter with Optical Flow for Multiple Object Tracking. 2019 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS (SMC). 2019, 3036-3041.
[26] Li, Dong, Zhao, Dongbin, Zhang, Qichao, Chen, Yaran. Reinforcement Learning and Deep Learning Based Lateral Control for Autonomous Driving. IEEE COMPUTATIONAL INTELLIGENCE MAGAZINE[J]. 2019, 14(2): 83-98, http://ir.ia.ac.cn/handle/173211/23517.
[27] Wang, Bin, Zhao, Dongbin, Cheng, Jin. Adaptive cruise control via adaptive dynamic programming with experience replay. SOFT COMPUTING[J]. 2019, 23(12): 4131-4144, http://ir.ia.ac.cn/handle/173211/24396.
[28] Zhang, Qichao, Zhao, Dongbin. Data-Based Reinforcement Learning for Nonzero-Sum Games With Unknown Drift Dynamics. IEEE TRANSACTIONS ON CYBERNETICS[J]. 2019, 49(8): 2874-2885, http://ir.ia.ac.cn/handle/173211/24567.
[29] Wang, Junjie, Zhang, Qichao, Zhao, Dongbin, Chen, Yaran. Lane Change Decision-making through Deep Reinforcement Learning with Rule-based Constraints. 2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN). 2019, http://apps.webofknowledge.com/CitedFullRecord.do?product=UA&colName=WOS&SID=5CCFccWmJJRAuMzNPjj&search_mode=CitedFullRecord&isickref=WOS:000530893803042.
[30] Zhu, Yuanheng, Zhao, Dongbin, Zhong, Zhiguang. Adaptive Optimal Control of Heterogeneous CACC System With Uncertain Dynamics. IEEE TRANSACTIONS ON CONTROL SYSTEMS TECHNOLOGY[J]. 2019, 27(4): 1772-1779.
[31] Zhang, Qichao, Luo, Rui, Zhao, Dongbin, Luo, Chaomin, Qian, Dianwei. Model-Free Reinforcement Learning based Lateral Control for Lane Keeping. 2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN). 2019.
[32] Tang Zhentao, Shao Kun, Zhu Yuanheng, Li Dong, Zhao Dongbin, Huang Tingwen, Sundaram S. A Review of Computational Intelligence for StarCraft AI. 2018 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI). 2018, 1167-1173, http://apps.webofknowledge.com/CitedFullRecord.do?product=UA&colName=WOS&SID=5CCFccWmJJRAuMzNPjj&search_mode=CitedFullRecord&isickref=WOS:000459238800159.
[33] Zhao, Xiaodong, Zhang, Qichao, Zhao, Dongbin, Pang, Zhonghua, Sun, MX, Zhang, HG. Overview of Image Segmentation and Its Application on Free Space Detection. PROCEEDINGS OF 2018 IEEE 7TH DATA DRIVEN CONTROL AND LEARNING SYSTEMS CONFERENCE (DDCLS). 2018, 1164-1169, http://apps.webofknowledge.com/CitedFullRecord.do?product=UA&colName=WOS&SID=5CCFccWmJJRAuMzNPjj&search_mode=CitedFullRecord&isickref=WOS:000450645900210.
[34] Chen, Yaran, Zhao, Dongbin, Lv, Le, Zhang, Qichao. Multi-task learning for dangerous object detection in autonomous driving. INFORMATION SCIENCES[J]. 2018, 432: 559-571, http://dx.doi.org/10.1016/j.ins.2017.08.035.
[35] Zhang, Zhen, Wang, Dongqing, Zhao, Dongbin, Han, Qiaoni, Song, Tingting. A Gradient-Based Reinforcement Learning Algorithm for Multiple Cooperative Agents. IEEE ACCESS[J]. 2018, 6: 70223-70235, http://ir.ia.ac.cn/handle/173211/25665.
[36] Zhao Dongbin, Li Haoran, Li Dong, Guo Ping, Chen Yaran. A Temporal-based Deep Learning Method for Multiple Objects Detection in Autonomous Driving. 2018, http://ir.ia.ac.cn/handle/173211/23521.
[37] Zhu, Yuanheng, Zhao, Dongbin. Comprehensive comparison of online ADP algorithms for continuous-time optimal control. ARTIFICIAL INTELLIGENCE REVIEW[J]. 2018, 49(4): 531-547, https://www.webofscience.com/wos/woscc/full-record/WOS:000426912500004.
[38] Zhao, Dongbin, Liu, Derong, Lewis, F L, Principe, Jose C, Squartini, Stefano. Special Issue on Deep Reinforcement Learning and Adaptive Dynamic Programming. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS. 2018, 29(6): 2038-2041, https://www.webofscience.com/wos/woscc/full-record/WOS:000432398300001.
[39] Li Dong, Zhao Dongbin, Zhang Qichao, Zhu Yuanheng, Sundaram S. An Autonomous Driving Experience Platform with Learning-Based Functions. 2018 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI). 2018, 1174-1179, http://apps.webofknowledge.com/CitedFullRecord.do?product=UA&colName=WOS&SID=5CCFccWmJJRAuMzNPjj&search_mode=CitedFullRecord&isickref=WOS:000459238800160.
[40] Zhu, Yuanheng, Zhao, Dongbin, Yang, Xiong, Zhang, Qichao. Policy Iteration for H infinity Optimal Control of Polynomial Nonlinear Systems via Sum of Squares Programming. IEEE TRANSACTIONS ON CYBERNETICS[J]. 2018, 48(2): 500-509, https://www.webofscience.com/wos/woscc/full-record/WOS:000422925700005.
[41] Yuanheng Zhu, Nannan Li, Kun Shao, Dongbin Zhao. Learning battles in ViZDoom via deep reinforcement learning. 2018, http://ir.ia.ac.cn/handle/173211/23364.
[42] Zhang, Qichao, Zhao, Dongbin, Lewis, Frank L. Model-Free Reinforcement Learning for Fully Cooperative Multi-Agent Graphical Games. 2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN). 2018.
[43] Zhang, Qichao, Zhao, Dongbin, Wang, Ding. Event-Based Robust Control for Uncertain Nonlinear Systems Using Adaptive Dynamic Programming. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS[J]. 2018, 29(1): 37-50, https://www.webofscience.com/wos/woscc/full-record/WOS:000419558900004.
[44] Chen, Yaran, Zhao, Dongbin, Li, Haoran, Li, Dong, Guo, Ping. A temporal-based deep learning method for multiple objects detection in autonomous driving. 2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN). 2018.
[45] Shao, Kun, Zhao, Dongbin, Zhu, Yuanheng, Zhang, Qichao. Visual Navigation with Actor-Critic Deep Reinforcement Learning. 2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN). 2018.
[46] Bu, Li, Alippi, Cesare, Zhao, Dongbin. A pdf-Free Change Detection Test Based on Density Difference Estimation. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS[J]. 2018, 29(2): 324-334, https://www.webofscience.com/wos/woscc/full-record/WOS:000422952400007.
[47] Wu, I C, Lee, C S, Tian, Y, Mueller, M. Guest Editorial Special Issue on Deep/Reinforcement Learning and Games. IEEE TRANSACTIONS ON GAMES. 2018, 10(4): 333-335, https://www.webofscience.com/wos/woscc/full-record/WOS:000453577300001.
[48] Zhao Dongbin. Comprehensive comparison of online ADP algorithms for continuous-time optimal control. Artificial Intelligence Review. 2018.
[49] Lu, Yi, Chen, Yaran, Zhao, Dongbin, Li, Haoran. Hybrid Deep Learning Based Moving Object Detection via Motion prediction. 2018 CHINESE AUTOMATION CONGRESS (CAC). 2018, 1442-1447, http://apps.webofknowledge.com/CitedFullRecord.do?product=UA&colName=WOS&SID=5CCFccWmJJRAuMzNPjj&search_mode=CitedFullRecord&isickref=WOS:000459239501089.
[50] Li, Dong, Zhao, Dongbin, Chen, Yaran, Zhang, Qichao. DeepSign: Deep Learning based Traffic Sign Recognition. 2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN). 2018.
[51] Zhu Yuanheng, Zhang Qichao, Zhao Dongbin, Li Dong. An Autonomous Driving Experience Platform with Learning-Based Functions. 2018 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI). 2018, 1174-1179, http://apps.webofknowledge.com/CitedFullRecord.do?product=UA&colName=WOS&SID=5CCFccWmJJRAuMzNPjj&search_mode=CitedFullRecord&isickref=WOS:000459238800160.
[52] Yuanheng Zhu, Qichao Zhang, Dongbin Zhao, Kun Shao. Visual navigation with Actor-Critic deep reinforcement learning. 2018, http://ir.ia.ac.cn/handle/173211/23365.
[53] Zhu, Yuanheng, Zhao, Dongbin, He, Haibo, Ji, Junhong. Event-Triggered Optimal Control for Partially Unknown Constrained-Input Systems via Adaptive Dynamic Programming. IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS[J]. 2017, 64(5): 4101-4109, https://www.webofscience.com/wos/woscc/full-record/WOS:000399674000064.
[54] Deng QingQiong, Zhao Dongbin, Lv Le. Image Clustering based on Deep Sparse Representations. 2016 IEEE Symposium Series on Computational Intelligence: SSCI 2016, Athens, Greece, 6-9 December 2016, pages 2037-2712, v.4. 2017, 2108-2113, http://ir.ia.ac.cn/handle/173211/14471.
[55] Bu Li, Zhao Dongbin, Alippi Cesare. An Incremental Change Detection Test Based on Density Difference Estimation. IEEE Transactions on Systems, Man, and Cybernetics: Systems[J]. 2017.
[56] Li, Chengdong, Ding, Zixiang, Zhao, Dongbin, Yi, Jianqiang, Zhang, Guiqing. Building Energy Consumption Prediction: An Extreme Deep Learning Approach. ENERGIES[J]. 2017, 10(10), https://doaj.org/article/97e10cd1f86645f384b67cc9b9f33881.
[57] Zhang, Qichao, Zhao, Dongbin, Zhu, Yuanheng. Data-driven adaptive dynamic programming for continuous-time fully cooperative games with partially constrained inputs. NEUROCOMPUTING[J]. 2017, 238: 377-386, http://dx.doi.org/10.1016/j.neucom.2017.01.076.
[58] Zhao, Dongbin, Chen, Yaran, Lv, Le. Deep Reinforcement Learning With Visual Attention for Vehicle Classification. IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS[J]. 2017, 9(4): 356-367, http://dx.doi.org/10.1109/TCDS.2016.2614675.
[59] Li Dong, Zhao Dongbin, Zhang Qichao, Luo Chaomin. Policy Gradient Methods with Gaussian Process Modelling Acceleration. 2017 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN). 2017, 1774-1779.
[60] Qichao Zhang, Haoran Li, Dongbin Zhao. Comparison of methods to efficient graph SLAM under general optimization framework. YAC 2017. 2017, http://ir.ia.ac.cn/handle/173211/19422.
[61] Zhao Dongbin. Editorial: new developments in neural network structures for signal processing, autonomous decision, and adaptive control. IEEE Transactions on Neural Networks and Learning Systems. 2017.
[62] Tang Zhentao, Shao Kun, Zhao Dongbin, Zhu Yuanheng. Recent progress in deep reinforcement learning: from AlphaGo to AlphaGo Zero. Control Theory & Applications (控制理论与应用)[J]. 2017, 34(12): 1529-1546, http://lib.cqvip.com/Qikan/Article/Detail?id=7000480876.
[63] Tang Zhentao, Lv Le, Shao Kun, Zhao Dongbin. ADP with MCTS algorithm for Gomoku. 2017, http://ir.ia.ac.cn/handle/173211/14475.
[64] Zhao Dongbin, Wei Qinglai, Alippi Cesare, Bu Li. A Kolmogorov-Smirnov Test to Detect Changes in Stationarity in Big Data. IFAC PAPERSONLINE. 2017, 50(1): 14260-14265, http://dx.doi.org/10.1016/j.ifacol.2017.08.1821.
[65] Zhu Yuanheng, Zhao Dongbin, Shao Kun. Cooperative Reinforcement Learning for Multiple Units Combat in StarCraft. 2017, http://ir.ia.ac.cn/handle/173211/15399.
[66] Zhu, Yuanheng, Zhao, Dongbin, Li, Xiangjun. Iterative Adaptive Dynamic Programming for Solving Unknown Nonlinear Zero-Sum Game Based on Online Data. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS[J]. 2017, 28(3): 714-725, https://www.webofscience.com/wos/woscc/full-record/WOS:000395980500020.
[67] Zhang Qichao, Zhao Dongbin, Zhu Yuanheng. Event-Triggered $H_\infty$ Control for Continuous-Time Nonlinear System via Concurrent Learning. IEEE Transactions on Systems, Man, and Cybernetics: Systems[J]. 2017.
[68] Zhao, Dongbin, Xia, Zhongpu, Zhang, Qichao. Model-free Optimal Control based Intelligent Cruise Control with Hardware-in-the-loop Demonstration. IEEE COMPUTATIONAL INTELLIGENCE MAGAZINE[J]. 2017, 12(2): 56-69, https://www.webofscience.com/wos/woscc/full-record/WOS:000399714900005.
[69] Lv, Le, Zhao, Dongbin, Deng, Qingqiong. A Semi-Supervised Predictive Sparse Decomposition Based on Task-Driven Dictionary Learning. COGNITIVE COMPUTATION[J]. 2017, 9(1): 115-124, https://www.webofscience.com/wos/woscc/full-record/WOS:000394418100008.
[70] Zhao Dongbin. Event-triggered optimal control for nonlinear constrained-input systems with partially unknown dynamics via adaptive dynamic programming. IEEE Transactions on Industrial Electronics. 2017.
[71] Shengli Xie, Derong Liu, Dongbin Zhao, ElSayed M ElAlfy, Yuanqing Li. Neural Information Processing. Neural Information Processing, Lecture Notes in Computer Science. 2017, 10636, 10637, 10638, 10639, http://ir.ia.ac.cn/handle/173211/19892.
[72] Zhao Dongbin, Zhang Qichao. Data-driven adaptive dynamic programming for two-player nonzero-sum game. 2017, http://ir.ia.ac.cn/handle/173211/14342.
[73] Chen, Yaran, Zhao, Dongbin, Cong, F, Leung, A, Wei, Q. Multi-task Learning with Cartesian Product-Based Multi-objective Combination for Dangerous Object Detection. ADVANCES IN NEURAL NETWORKS, PT I. 2017, 10261: 28-35.
[74] Bu, Li, Zhao, Dongbin, Alippi, Cesare. An Incremental Change Detection Test Based on Density Difference Estimation. IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS[J]. 2017, 47(10): 2714-2726, https://www.webofscience.com/wos/woscc/full-record/WOS:000411098200009.
[75] Zhang, Qichao, Zhao, Dongbin, Zhu, Yuanheng. Event-Triggered H-infinity Control for Continuous-Time Nonlinear System via Concurrent Learning. IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS[J]. 2017, 47(7): 1071-1081, https://www.webofscience.com/wos/woscc/full-record/WOS:000404354600004.
[76] Zhang, Zhen, Zhao, Dongbin, Gao, Junwei, Wang, Dongqing, Dai, Yujie. FMRQ-A Multiagent Reinforcement Learning Algorithm for Fully Cooperative Tasks. IEEE TRANSACTIONS ON CYBERNETICS[J]. 2017, 47(6): 1367-1379, https://www.webofscience.com/wos/woscc/full-record/WOS:000401950400002.
[77] Li Dong, Zhao Dongbin, Zhang Qichao, Luo Chaomin. Policy Gradient Methods with Gaussian Process Modelling Acceleration. 2017 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN). 2017, 1774-1779.
[78] Wang, Ding, Liu, Derong, Zhang, Qichao, Zhao, Dongbin. Data-Based Adaptive Critic Designs for Nonlinear Robust Optimal Control With Uncertain Dynamics. IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS[J]. 2016, 46(11): 1544-1555, https://www.webofscience.com/wos/woscc/full-record/WOS:000386225800006.
[79] Tang, Yufei, He, Haibo, Ni, Zhen, Zhong, Xiangnan, Zhao, Dongbin, Xu, Xin. Fuzzy-Based Goal Representation Adaptive Dynamic Programming. IEEE TRANSACTIONS ON FUZZY SYSTEMS[J]. 2016, 24(5): 1159-1175, https://www.webofscience.com/wos/woscc/full-record/WOS:000386076600013.
[80] Zhu, Yuanheng, Zhao, Dongbin, Li, Xiangjun. Using reinforcement learning techniques to solve continuous-time non-linear optimal tracking problem without system dynamics. IET CONTROL THEORY AND APPLICATIONS[J]. 2016, 10(12): 1339-1347.
[81] Zhu Yuanheng, Chen Xi, Zhao Dongbin, Zhang Qichao. Model-free reinforcement learning for nonlinear zero-sum games with simultaneous explorations. 2016, http://ir.ia.ac.cn/handle/173211/14340.
[82] Li Dong, Xia Zhongpu, Zhao Dongbin. A Perturbed Gaussian Process Regression with Chunk Sparsification for Tracking Non-stationary Systems. PROCEEDINGS OF THE 28TH CHINESE CONTROL AND DECISION CONFERENCE (2016 CCDC). 2016, 6639-6644.
[83] Zhou Tong, Li Dong, Zhu Yuanheng, Wang Chenghong, Liu Derong, Wang Haitao, Chen Yaran, Shao Kun, Zhao Dongbin. A review of deep reinforcement learning, with a discussion of the development of computer Go. Control Theory & Applications (控制理论与应用)[J]. 2016, 33(6): 701-717.
[84] Zhao Dongbin, Alippi Cesare, Bu Li. Ensemble LSDD-based change detection tests. 2016, http://ir.ia.ac.cn/handle/173211/14332.
[85] Sun Changyin, Wang Chenghong, Hu Yueming, Zhao Dongbin, Zhou Tong, Su Jianbo. Preface to the special issue on "Machine Intelligence, System Optimization, and Optimal Decision-Making". Control Theory & Applications (控制理论与应用). 2016, 33(12): 1553-1554, http://lib.cqvip.com/Qikan/Article/Detail?id=7000119650.
[86] Zhao Dongbin. Model-free iterative adaptive dynamic programming solving unknown nonlinear zero-sum game based on online measurement. IEEE Transactions on Neural Networks and Learning Systems. 2016.
[87] Zhao, Dongbin, Zhang, Qichao, Wang, Ding, Zhu, Yuanheng. Experience Replay for Optimal Control of Nonzero-Sum Game Systems With Unknown Dynamics. IEEE TRANSACTIONS ON CYBERNETICS[J]. 2016, 46(3): 854-865, https://www.webofscience.com/wos/woscc/full-record/WOS:000370963500023.
[88] Dongbin Zhao, Le Lv, Qingqiong Deng. Image clustering based on the deep sparse representations. Computational Intelligence (SSCI), 2016 IEEE Symposium Series on. 2016, 1-6, http://ir.ia.ac.cn/handle/173211/19423.
[89] Zhu Yuanheng, Shao Kun, Wang Haitao, Zhao Dongbin. Deep reinforcement learning with Experience Replay based on SARSA. 2016, http://ir.ia.ac.cn/handle/173211/19877.
[90] Chen, Yaran, Zhao, Dongbin, Lv, Le, Li, Chengdong. A Visual Attention Based Convolutional Neural Network for Image Classification. PROCEEDINGS OF THE 2016 12TH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION (WCICA). 2016, 764-769, http://apps.webofknowledge.com/CitedFullRecord.do?product=UA&colName=WOS&SID=5CCFccWmJJRAuMzNPjj&search_mode=CitedFullRecord&isickref=WOS:000388373802067.
[91] Xia, Zhongpu, Zhao, Dongbin. Online reinforcement learning control by Bayesian inference. IET CONTROL THEORY AND APPLICATIONS[J]. 2016, 10(12): 1331-1338, https://www.webofscience.com/wos/woscc/full-record/WOS:000381410000003.
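[92] Zhao Dongbin, Zhu Yuanheng. A probably approximately correct reinforcement learning algorithm for control problems with continuous state spaces. Control Theory & Applications (控制理论与应用). 2016, 33(12): 1603-1613, http://lib.cqvip.com/Qikan/Article/Detail?id=7000119656.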
[93] Wang, Ding, Liu, Derong, Zhang, Qichao, Zhao, Dongbin. Data-Based Adaptive Critic Designs for Nonlinear Robust Optimal Control With Uncertain Dynamics. IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS[J]. 2016, 46(11): 1544-1555, https://www.webofscience.com/wos/woscc/full-record/WOS:000386225800006.
[94] Zhao, Dongbin, Zhu, Yuanheng. MEC-A Near-Optimal Online Reinforcement Learning Algorithm for Continuous Deterministic Systems. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS[J]. 2015, 26(2): 346-356, http://www.irgrid.ac.cn/handle/1471x/980893.
[95] Ni, Zhen, He, Haibo, Zhao, Dongbin, Xu, Xin, Prokhorov, Danil V. GrDHP: A General Utility Function Representation for Dual Heuristic Dynamic Programming. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS[J]. 2015, 26(3): 614-627, https://www.webofscience.com/wos/woscc/full-record/WOS:000351834400016.
[96] Zhao Dongbin, Zhang Qichao, Li Chengdong, Wei Qinglai. Consensus of Heterogeneous Multi-agent Systems With Switching Topologies Using Input-output Feedback Linearization. 2015 34th Chinese Control Conference: CCC 2015, Hangzhou, China, 28-30 July 2015, pages 6414-7296, v.8. 2015, 6872-6877, http://ir.ia.ac.cn/handle/173211/14338.
[97] Zhang, Qichao, Zhao, Dongbin, Wei, Qinglai, Li, Chengdong, Zhao, Q, Liu, S. Consensus of Heterogeneous Multi-agent Systems With Switching Topologies Using Input-output Feedback Linearization. 2015 34TH CHINESE CONTROL CONFERENCE (CCC). 2015, 6872-6877.
[98] Zhu, Yuanheng, Zhao, Dongbin, He, Haibo, Ji, Junhong. Convergence Proof of Approximate Policy Iteration for Undiscounted Optimal Control of Discrete-Time Systems. COGNITIVE COMPUTATION[J]. 2015, 7(6): 763-771, http://ir.ia.ac.cn/handle/173211/10525.
[99] Squartini, Stefano, Liu, Derong, Piazza, Francesco, Zhao, Dongbin, He, Haibo. Computational Energy Management in Smart Grids. NEUROCOMPUTING. 2015, 170: 267-269, http://dx.doi.org/10.1016/j.neucom.2015.05.110.
[100] Wang Ge (王革), Liu Guangtian (刘广天), Wang Haihong (汪海洪), Gong Kexin (巩可欣), Zhao Dongbin (赵冬斌). Energy Storage: A New Approach (能源存储:一种新的方法). 2015, http://ir.ia.ac.cn/handle/173211/19889.
Published Books
(1) An Introduction to Omnidirectional Mobile Robots (全方位移动机器人导论), Science Press, 2010-05, 1st author
(2) Springer Handbook of Robotics, Chapter 26 - Motion for Manipulation Tasks (机器人手册,第26章-面向操作任务的运动), China Machine Press, 2013-01, 1st author
(3) Springer Handbook of Robotics, Chapter 51 - Intelligent Vehicles (机器人手册,第51章-智能车辆), China Machine Press, 2013-01, 1st author
(4) Advances in Brain Inspired Cognitive Systems, Springer Heidelberg Dordrecht London New York, 2013-06, 3rd author
(5) Frontiers of Intelligent Control and Information Processing, World Scientific Publishing, 2014-11, 3rd author
(6) Advances in Neural Networks – ISNN 2015, Springer Heidelberg Dordrecht London New York, 2015-04, 4th author
(7) Neural Information Processing, Lecture Notes in Computer Science 10636, 10637, 10638, 10639, Springer Heidelberg Dordrecht London New York, 2017-10, 4th author

Research Activities

   
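Research Projects
( 1 ) Data-based analysis and design of nonlinear control systems, participant, national level, 2011-01--2014-12
( 2 ) Intelligent stop-and-go cruise control for vehicles, principal investigator, provincial level, 2012-01--2014-12
( 3 ) Supervised-ADP-based intelligent cruise control for vehicles, principal investigator, national level, 2013-01--2016-12
( 4 ) Research on parallel-control energy-saving technology for energy management and control centers, participant, provincial level, 2013-04--2014-12
( 5 ) Development of a data mining and analysis toolkit for building energy consumption, participant, provincial level, 2013-12--2014-12
( 6 ) Vehicle adaptive cruise control (ACC) systems and methods, principal investigator, provincial level, 2013-09--2016-05
( 7 ) Theory and methods of supervised reinforcement learning control with human-machine interaction, principal investigator, institute (university) level, 2015-01--2016-12
( 8 ) Theory, methods, and applications of deep adaptive dynamic programming, principal investigator, national level, 2016-01--2019-12
( 9 ) Data-based integrated modeling and self-learning optimal control of building clusters and distributed energy systems, participant, national level, 2016-01--2020-12
( 10 ) Chinese Academy of Sciences overseas review expert (Haibo He), principal investigator, ministry level, 2015-01--2016-12
( 11 ) Key technology research and product development for intelligent driver-assistance control systems, principal investigator, national level, 2016-07--2019-06
( 12 ) Optimal decision-making in dynamic games with incomplete information, principal investigator, national level, 2017-03--2018-12
( 13 ) Collective cooperative neurodynamic methods for deep neural network optimization, principal investigator, ministry level, 2018-01--2020-12
( 14 ) Deep reinforcement learning methods for dangerous object detection in intelligent driving, principal investigator, provincial level, 2018-01--2019-12
( 15 ) Key technology R&D and validation platform development for highly automated (Level 4) electric vehicles: deep reinforcement learning applications, principal investigator, provincial level, 2018-01--2019-12
( 16 ) Research on new methods of traditional Chinese medicine (TCM) syndrome differentiation under the project "Systematic study of differentiation criteria for qi deficiency syndrome": TCM AI, principal investigator, national level, 2018-01--2020-12
( 17 ) AI-based intelligent driving experience exhibit for science popularization, principal investigator, provincial level, 2018-01--2018-12
( 18 ) Research on reinforcement learning and hardware implementation technologies, principal investigator, academy level, 2018-09--2019-06
( 19 ) Research on deep reinforcement learning methods for intelligent driving, principal investigator, academy level, 2018-09--2019-08
( 20 ) Research on core intelligent perception technologies for metro operation scenarios, principal investigator, academy level, 2018-09--2019-08
( 21 ) Game decision-making under incomplete information: deep reinforcement learning algorithms jointly driven by knowledge and data, principal investigator, national level, 2020-01--2022-12
( 22 ) Intelligent decision-making technology for electric vehicles in complex urban interactive scenarios, principal investigator, provincial level, 2019-07--2020-06
( 23 ) Research on hardware-adapted operator structure optimization and automatic parallel partitioning, principal investigator, academy level, 2019-08--2020-05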
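Conference Participation
(1)Deep Reinforcement Learning for Video Game   Huawei Multi-Agent Reinforcement Learning Workshop   2019-04-25
(2)Deep Reinforcement Learning Algorithms and Medical Applications   3rd Annual Academic Conference of the Clinical Research Branch, China Information Association for Traditional Chinese Medicine and Pharmacy   2018-09-08
(3)Deep Reinforcement Learning Algorithms and Applications   Chinese Association of Automation Frontier Forum on "Deep and Broad Reinforcement Learning"   2018-05-30
(4)Game AI with RL and DL   2018-05-21
(5)Progress in Deep Reinforcement Learning: From AlphaGo to AlphaGo Zero   2nd World Intelligence Congress   2018-05-17
(6)Game AIs with RL and DL   2018-05-16
(7)Recent Progress on Deep Reinforcement Learning-- from AlphaGo to AlphaGo Zero   Samsung Frontier Workshop on Machine Learning   2018-01-15
(8)Deep Reinforcement Learning Algorithms and Applications   China Electric Power Research Institute 2017 "208" Science Conference: Research and Application Directions and Key Technologies of Artificial Intelligence in the Power Sector   2017-12-06
(9)Cooperative reinforcement learning for multiple units combat in StarCraft   Kun Shao, Yuanheng Zhu, Dongbin Zhao   2017-11-28
(10)Event-triggered integral reinforcement learning for nonlinear continuous-time systems   Qichao Zhang, Dongbin Zhao   2017-11-28
(11)Progress in Deep Reinforcement Learning: From AlphaGo to AlphaGo Zero   Meeting of the Intelligent Internet of Things Technical Committee, China Simulation Federation   2017-11-17
(12)Off-Policy reinforcement learning for partially unknown nonzero-sum games   2017-11-16
(13)FMR-GA -- A cooperative multi-agent reinforcement learning algorithm based on gradient ascent   2017-11-16
(14)Artificial Intelligence Methods and Their Applications in Smart Cities   Taishan Science and Technology Forum: Applied Research on Artificial Intelligence in Smart City Construction   2017-11-08
(15)A Kolmogorov-Smirnov test to detect changes in stationarity in big data   2017-07-06
(16)Multi-task learning with Cartesian product-based multi-objective combination for dangerous object detection   2017-06-10
(17)Data-driven adaptive dynamic programming for two-player nonzero-sum game   2017-05-29
(18)Comparison of methods to efficient graph SLAM under general optimization framework   2017-05-19
(19)Policy gradient methods with Gaussian process modelling acceleration   2017-05-16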

Students Supervised

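Former Students

Tian Yi (田艺)  Master's student  081101 - Control Theory and Control Engineering

Hu Zhaohui (胡朝辉)  Master's student  081101 - Control Theory and Control Engineering

Dai Yujie (戴钰桀)  Ph.D. student  081101 - Control Theory and Control Engineering

Su Yongsheng (苏永生)  Master's student  081101 - Control Theory and Control Engineering

Zhang Zhen (张震)  Ph.D. student  081101 - Control Theory and Control Engineering

Wang Bin (王滨)  Ph.D. student  081101 - Control Theory and Control Engineering

Zhu Yuanheng (朱圆恒)  Ph.D. student  081101 - Control Theory and Control Engineering

Wang Haitao (王海涛)  Master's student  081101 - Control Theory and Control Engineering

Xia Zhongpu (夏中谱)  Ph.D. student  081101 - Control Theory and Control Engineering

Zhang Qichao (张启超)  Ph.D. student  081101 - Control Theory and Control Engineering

Lv Le (吕乐)  Ph.D. student  081101 - Control Theory and Control Engineering

Bu Li (卜丽)  Ph.D. student  081101 - Control Theory and Control Engineering

Chen Yaran (陈亚冉)  Ph.D. student  081101 - Control Theory and Control Engineering

Shao Kun (邵坤)  Ph.D. student  081101 - Control Theory and Control Engineering

Li Dong (李栋)  Ph.D. student  081101 - Control Theory and Control Engineering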

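Current Students

Tang Zhentao (唐振韬)  Ph.D. student  081101 - Control Theory and Control Engineering

Lu Yi (卢毅)  Ph.D. student  081101 - Control Theory and Control Engineering

Li Haoran (李浩然)  Ph.D. student  081101 - Control Theory and Control Engineering

Liu Minsong (刘民颂)  Master's student  081101 - Control Theory and Control Engineering

Ding Zixiang (丁子祥)  Ph.D. student  081203 - Computer Application Technology

Liu Yuqi (刘育琦)  Ph.D. student  081101 - Control Theory and Control Engineering

Li Weifan (李伟凡)  Ph.D. student  081104 - Pattern Recognition and Intelligent Systems

Hu Guangzheng (胡光政)  Ph.D. student  081203 - Computer Application Technology

Li Nannan (李楠楠)  Ph.D. student  081101 - Control Theory and Control Engineering

Wang Junjie (王俊杰)  Ph.D. student  081101 - Control Theory and Control Engineering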