Basic Information

Dongbin Zhao (赵冬斌), male, doctoral supervisor; Professor at the Institute of Automation, Chinese Academy of Sciences, and Professor with the University of Chinese Academy of Sciences

Email: dongbin.zhao@ia.ac.cn
Mailing address: Room 1005, Intelligence Building, No. 95 Zhongguancun East Road, Haidian District, Beijing
Postal code: 100190

Department/Laboratory: The State Key Laboratory of Management and Control for Complex Systems

Research Fields

Intelligent learning control: deep reinforcement learning, adaptive dynamic programming, reinforcement learning, evolutionary computation, intelligent games
Intelligent transportation: intelligent driving, traffic signal control
Robotics: mobile robots, robot learning control, mechatronic systems
Energy system management and control: chemical process simulation, energy-efficiency optimization and control, smart grid, building energy saving

Admissions Information

Majors
081101 - Control Theory and Control Engineering
Research Directions
Deep reinforcement learning, adaptive dynamic programming, reinforcement learning, intelligent control
Intelligent driving, intelligent games, robotics, intelligent transportation, energy management and control

Education

1996-09--2000-04   Harbin Institute of Technology   Ph.D.
1994-09--1996-07   Harbin Institute of Technology   M.S.
1990-09--1994-07   Harbin Institute of Technology   B.S.
Overseas Study and Work
2007-08--2008-08, University of Arizona, Visiting Scholar, state-sponsored study-abroad program of the China Scholarship Council

Work Experience

Employment History
2014-01~2014-02, Agency for Science, Technology and Research (A*STAR), Singapore, Visiting Scholar
2012-11~present, Institute of Automation, Chinese Academy of Sciences, Professor and doctoral supervisor
2002-04~2012-10, Institute of Automation, Chinese Academy of Sciences, Associate Professor, master's and later doctoral supervisor
2000-05~2002-01, Tsinghua University, Postdoctoral Fellow
Professional Service
2017-11-26-2017-11-30,IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL 2017), Honolulu, Hawaii, USA, Program Chair
2017-11-13-2017-11-17,The 24th International Conference on Neural Information Processing (ICONIP 2017), Guangzhou, China, Program Chair
2017-01-01-present, IEEE Computational Intelligence Society Beijing Chapter, Chair
2016-12-05-2016-12-08,IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL 2016), Athens, Greece, Program Chair
2016-07-25-2016-07-29, IEEE World Congress on Computational Intelligence (WCCI 2016), Vancouver, Canada, Publicity Co-chair
2016-06-11-2016-06-14,The 13th World Congress on Intelligent Control and Automation (WCICA 2016), Guilin, China, Program Co-Chair
2015-10-15-2015-10-18,12th International Symposium on Neural Networks (ISNN 2015), Jeju, Korea, Program Co-Chair
2015-04-24-2015-04-26,The 5th International Conference on Information Science and Technology (ICIST 2015), Changsha, China, Program Chair
2015-01-01-present, Artificial Intelligence Review, Associate Editor
2014-12-31-2016-12-31, IEEE Computational Intelligence Society Adaptive Dynamic Programming and Reinforcement Learning Technical Committee, Chair
2014-12-31-2015-12-31, IEEE Computational Intelligence Society Travel Grants Committee, Chair
2014-12-31-2016-12-31, IEEE Computational Intelligence Society Multimedia Committee, Chair
2014-12-31-2016-12-31, IEEE Computational Intelligence Society Beijing Chapter, Vice Chair
2014-12-09-2014-12-12, IEEE Symposium Series on Computational Intelligence (SSCI 2014), Atlanta, USA, Poster Chair
2014-07-06-2014-07-11,IEEE World Congress on Computational Intelligence (WCCI 2014), Beijing, China, Finance Co-Chair
2014-07-06-2014-07-11,IEEE CIS Summer School on Automated Computational Intelligence, Beijing, China, Chair
2014-01-01-present, IEEE Computational Intelligence Magazine, Associate Editor
2013-06-09-2013-06-11,The 4th International Conference on Intelligent Control and Information Processing (ICICIP 2013), Beijing, China, Program Chair
2012-12-31-2014-12-30, IEEE CIS Newsletter, Editor
2012-07-11-2012-07-14,International Symposium on Neural Networks (ISNN 2012), Shenyang, China, Registration Chair
2012-07-11-2012-07-14,Brain Inspired Cognitive Systems (BICS 2012), Shenyang, China, Finance Chair
2012-01-01-present, IEEE Transactions on Neural Networks and Learning Systems, Associate Editor
2011-11-01-present, Cognitive Computation, Associate Editor
2010-10-01-present, IEEE Senior Member

Courses Taught

Intelligent Control
Fundamentals and Applications of Intelligent Control Theory

Patents and Awards

Awards
(1) Data-based self-learning optimal control theory and methods for nonlinear systems, Third Prize, ministerial level, 2015
(2) Chinese Academy of Sciences "Zhu Li Yuehua Excellent Teacher" Award, CAS level, 2014
(3) Science and Technology Progress Award of the China Petroleum and Chemical Industry Automation Application Association, First Prize, ministerial level, 2012
(4) Beijing Science and Technology Award, Third Prize, provincial level, 2010
(5) Science and Technology Progress Award of the China Petroleum and Chemical Industry Association, Third Prize, ministerial level, 2009
Patents
(1) Polar-coordinate self-leveling spreader system and method, invention, 2010, 1st inventor, Patent No. ZL200710178782.8
(2) Rotating-ball washing machine and method, invention, 2010, 1st inventor, Patent No. ZL200510011787.2
(3) A network congestion control system and method for the Internet, invention, 2010, 3rd inventor, Patent No. ZL200610113821.1
(4) Fire emergency rescue robot system and method, invention, 2010, 1st inventor, Patent No. ZL200510126236.0
(5) An orthogonal self-leveling spreader and method, invention, 2010, 3rd inventor, Patent No. ZL200710122474.3
(6) Fire rescue robot system and method, invention, 2010, 1st inventor, Patent No. ZL200510130759.2
(7) Optimization control method for traffic signals at block intersections, invention, 2011, 1st inventor, Patent No. ZL200910076851.3
(8) Single-counterweight self-leveling spreader system and method, invention, 2012, 1st inventor, Patent No. ZL200810240941.7
(9) Adaptive cruise control system and method for vehicles, invention, 2013, 1st inventor, Patent No. ZL201010615914.0
(10) Coal gasifier simulation method, invention, 2014, 1st inventor, Patent No. ZL201210291386.7
(11) Fuzzy adaptive dynamic programming method, invention, 2014, 1st inventor, Patent No. ZL201210118982.5
(12) Adaptive Cruise Control System and Method for Vehicle, invention, 2016, 1st inventor, Patent No. PAT 9266533
(13) Optimal control method based on supervised reinforcement learning, invention, 2016, 1st inventor, Patent No. ZL103324085A
(14) Data-based Q-function adaptive dynamic programming method, invention, 2016, 1st inventor, Patent No. ZL103217899A
(15) A sensor network optimization method based on sparse reinforcement learning, invention, 2017, 1st inventor, Patent No. ZL201310739109.2

Publications

IEEE Computational Intelligence Magazine

Special Issue on

Deep Reinforcement Learning and Games

Aims and Scope

Recently, there has been tremendous progress in artificial intelligence (AI) and computational intelligence (CI) for games. In 2015, Google DeepMind published the paper "Human-level control through deep reinforcement learning" in Nature, showing the power of AI&CI in learning to play Atari video games directly from the screen capture. In 2016, it published the Nature cover paper "Mastering the game of Go with deep neural networks and tree search", introducing the computer Go program AlphaGo. In March 2016, AlphaGo beat the world-class Go player Lee Sedol 4:1. In early 2017, Master, a variant of AlphaGo, won 60 straight matches against top Go players. In late 2017, AlphaGo Zero, which learned only from self-play, beat the original AlphaGo without a single loss (Nature 2017). This marks a new milestone in AI&CI history, at the core of which is the algorithm of deep reinforcement learning (DRL). Further achievements of DRL in games have followed. In 2017, AIs beat expert players at Texas Hold'em poker (Science 2017), OpenAI developed an AI that outperformed a champion in the 1v1 Dota 2 game, Facebook released a large database of StarCraft I replays, and Blizzard and DeepMind opened StarCraft II to AI research with a more accessible interface. In these games, DRL also plays an important role.

The first great achievements of DRL have been obtained in the domain of games, so it is timely to report major advances in a special issue of IEEE Computational Intelligence Magazine. IEEE Transactions on Neural Networks and Learning Systems and IEEE Transactions on Computational Intelligence and AI in Games organized similar special issues in 2017.

DRL can output control signals directly from input images, integrating the perception capability of deep learning (DL) with the decision-making capability of reinforcement learning (RL); a minimal code sketch of this mechanism follows the topic list below. This mechanism has many similarities to human modes of thinking. However, much work remains. The theoretical analysis of DRL, e.g., its convergence, stability, and optimality, is still in its early days. Learning efficiency needs to be improved by proposing new algorithms or by combining DRL with other methods. DRL algorithms also still need to be demonstrated in more diverse practical settings. Therefore, the aim of this special issue is to publish the most advanced research and state-of-the-art contributions in the field of DRL and its application in games. We expect this special issue to provide a platform for international researchers to exchange ideas and to present their latest research on relevant topics. Specific topics of interest include, but are not limited to:

·       Survey on DRL and games;

·       New AI&CI algorithms in games;

·       Learning forward models from experience;

·       New algorithms of DL, RL and DRL;

·       Theoretical foundation of DL, RL and DRL;

·       DRL combined with search algorithms or other learning methods;

·       Challenges of AI&CI in games, such as limitations in strategy learning;

·       Applications of DRL or AI&CI game methods to realistic and complicated systems.
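As a concrete illustration of the DL-perception plus RL-decision mechanism described above, the following is a minimal, self-contained sketch of deep Q-learning with experience replay in PyTorch. All names and numbers here (obs_dim, q_net, the tiny fully connected network in place of a convolutional one, the hyperparameters, and the omission of a target network) are illustrative assumptions for this sketch, not the setup of any paper cited above.

```python
# Minimal deep Q-learning sketch: a deep network maps observations to
# action values, and Q-learning over replayed experience trains it from
# reward alone. Sizes and hyperparameters are illustrative placeholders.
import random
from collections import deque

import torch
import torch.nn as nn

obs_dim, n_actions = 8, 4          # hypothetical observation/action sizes
gamma, batch_size = 0.99, 32

# Deep network: perception (feature extraction) plus a Q-value head.
q_net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                      nn.Linear(64, n_actions))
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
replay = deque(maxlen=10_000)      # buffer of (s, a, r, s', done) transitions

def act(obs, epsilon=0.1):
    # Epsilon-greedy: explore randomly, otherwise pick the argmax-Q action.
    if random.random() < epsilon:
        return random.randrange(n_actions)
    with torch.no_grad():
        return q_net(torch.as_tensor(obs, dtype=torch.float32)).argmax().item()

def train_step():
    # One Q-learning update on a random minibatch of stored transitions.
    if len(replay) < batch_size:
        return
    s, a, r, s2, done = zip(*random.sample(replay, batch_size))
    s = torch.as_tensor(s, dtype=torch.float32)
    s2 = torch.as_tensor(s2, dtype=torch.float32)
    a = torch.as_tensor(a, dtype=torch.int64)
    r = torch.as_tensor(r, dtype=torch.float32)
    done = torch.as_tensor(done, dtype=torch.float32)
    with torch.no_grad():
        # Bellman target: r + gamma * max_a' Q(s', a'), zeroed at terminal states.
        target = r + gamma * (1.0 - done) * q_net(s2).max(dim=1).values
    q = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In a complete agent, transitions gathered with act() would be appended to replay during interaction with the game, and stabilizers such as a separate target network, as used in the Atari work cited above, would be added.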

Important Dates

Submission Deadline: October 1st, 2018

Notification of Review Results: December 10th, 2018

Submission of Revised Manuscripts: January 31st, 2019

Submission of Final Manuscript: March 15th, 2019

Special Issue Publication: August 2019 Issue

Guest Editors

D. Zhao, Institute of Automation, Chinese Academy of Sciences, China, Dongbin.zhao@ia.ac.cn

Dr. Zhao is a Professor at the Institute of Automation, Chinese Academy of Sciences, and also a Professor with the University of Chinese Academy of Sciences, China. His current research interests include deep reinforcement learning, computational intelligence, adaptive dynamic programming, games, and robotics. Dr. Zhao is an Associate Editor of IEEE Transactions on Neural Networks and Learning Systems and IEEE Computational Intelligence Magazine, among others. He is the Chair of the Beijing Chapter and a past Chair of the Adaptive Dynamic Programming and Reinforcement Learning Technical Committee of the IEEE Computational Intelligence Society (CIS). He has served as a guest editor for several renowned international journals, including as the leading guest editor of the IEEE Transactions on Neural Networks and Learning Systems special issue on Deep Reinforcement Learning and Adaptive Dynamic Programming.

S. Lucas, Queen Mary University of London, UK, simon.lucas@qmul.ac.uk

Dr. Lucas was a full professor of computer science in the School of Computer Science and Electronic Engineering at the University of Essex until July 31, 2017, and is now Professor and Head of the School of Electronic Engineering and Computer Science at Queen Mary University of London. He was the founding Editor-in-Chief of the IEEE Transactions on Computational Intelligence and AI in Games, and co-founded the IEEE Conference on Computational Intelligence and Games, first held at the University of Essex in 2005. He is the Vice President for Education of the IEEE Computational Intelligence Society. His research has gravitated toward game AI: games provide an ideal arena for AI research and also make an excellent application area.

J. Togelius, New York University, USA, julian.togelius@nyu.edu

Julian Togelius is an Associate Professor in the Department of Computer Science and Engineering, New York University, USA. He works on all aspects of computational intelligence and games and on selected topics in evolutionary computation and evolutionary reinforcement learning. His current main research directions involve search-based procedural content generation in games, general video game playing, player modeling, and fair and relevant benchmarking of AI through game-based competitions. He is the Editor-in-Chief of IEEE Transactions on Computational Intelligence and AI in Games, and a past chair of the IEEE CIS Technical Committee on Games.

Submission Instructions

1.     IEEE CIM requires all prospective authors to submit their manuscripts electronically, as a PDF file. The maximum length for papers is typically 20 double-spaced typed pages in 12-point font, including figures and references. Submitted manuscripts must be written in English in single-column format. Authors should list up to 5 keywords on the first page of the manuscript. Additional information about submission guidelines and information for authors is provided on the IEEE CIM website. Submissions will be made via https://easychair.org/conferences/?conf=ieeecimcitbb2018.

2.     Also send an email to guest editor D. Zhao (dongbin.zhao@ia.ac.cn) with the subject “IEEE CIM special issue submission” to give notice of your submission.

3.      Early submissions are welcome. We will start the review process as soon as we receive your contribution.


Published Papers
(1) Iterative adaptive dynamic programming solving unknown nonlinear zero-sum game based on online measurement, IEEE Transactions on Neural Networks and Learning Systems, 2017, 2nd author
(2) A semi-supervised predictive sparse decomposition based on the task-driven dictionary learning, Cognitive Computation, 2017, 2nd author
(3) Model-free optimal control based intelligent cruise control with hardware-in-the-loop demonstration, IEEE Computational Intelligence Magazine, 2017, 1st author
(4) A survey of deep reinforcement learning, with remarks on the development of computer Go (in Chinese), Control Theory & Applications, 2016, 1st author
(5) Experience replay for optimal control of nonzero-sum game systems with unknown dynamics, IEEE Transactions on Cybernetics, 2016, 1st author
(6) Online reinforcement learning control by Bayesian inference, IET Control Theory & Applications, 2016, 2nd author
(7) Using reinforcement learning techniques to solve continuous-time nonlinear optimal tracking problem without system dynamics, IET Control Theory & Applications, 2016, 2nd author
(8) Event-triggered H∞ control for continuous-time nonlinear system via concurrent learning, IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2016, 2nd author
(9) FMRQ - a multiagent reinforcement learning algorithm for fully cooperative tasks, IEEE Transactions on Cybernetics, 2016, 2nd author
(10) Model-free iterative adaptive dynamic programming solving unknown nonlinear zero-sum game based on online measurement, IEEE Transactions on Neural Networks and Learning Systems, 2016, 2nd author
(11) MEC - a near-optimal online reinforcement learning algorithm for continuous deterministic systems, IEEE Transactions on Neural Networks and Learning Systems, 2015, 1st author
(12) Convergence analysis and application of fuzzy-HDP for nonlinear discrete-time HJB systems, Neurocomputing, 2015, 2nd author
(13) Model-free optimal control for affine nonlinear systems based on action dependent heuristic dynamic programming with convergency analysis, IEEE Transactions on Automation Science and Engineering, 2015, 1st author
(14) A data-based online reinforcement learning algorithm satisfying probably approximately correct principle, Neural Computing and Applications, 2015, 1st author
(15) Full range adaptive cruise control based on supervised adaptive dynamic programming, Neurocomputing, 2014, 1st author
(16) Detecting and reacting to changes in sensing units: the active classifier case, IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2014, 3rd author
(17) Dual heuristic dynamic programming for nonlinear discrete-time uncertain systems with state delay, Neurocomputing, 2014, 2nd author
(18) A supervised actor-critic approach for adaptive cruise control, Soft Computing, 2013, 1st author
(19) A neural-network-based iterative GDHP approach for solving a class of nonlinear optimal control problems with control constraints, Neural Computing and Applications, 2013, 3rd author
(20) Computational intelligence in urban traffic signal control: a survey, IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 2012, 1st author
(21) Self-teaching adaptive dynamic programming for Go-Moku, Neurocomputing, 2012, 1st author
(22) DHP for coordinated freeway ramp metering, IEEE Transactions on Intelligent Transportation Systems, 2011, 1st author
(23) Adaptive cruise control based on reinforcement learning with shaping rewards, Journal of Advanced Computational Intelligence and Intelligent Informatics, 2011, 2nd author
(24) Motion and internal force control for omni-directional wheeled mobile robots, IEEE/ASME Transactions on Mechatronics, 2009, 1st author
(25) Trajectory tracking control of omnidirectional wheeled mobile manipulators: robust neural network based sliding mode approach, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 2009, 2nd author
(26) Coordinated control of multiple ramps metering based on ADHDP(λ) controller, International Journal of Innovative Computing, Information and Control, 2009, 2nd author
Published Books
(1) An Introduction to Omnidirectional Mobile Robots (全方位移动机器人导论), Science Press, 2010-05, 1st author
(2) Springer Handbook of Robotics, Chapter 26 - Motion for Manipulation Tasks, Chinese translation (机器人手册), China Machine Press, 2013-01, 1st author
(3) Springer Handbook of Robotics, Chapter 51 - Intelligent Vehicles, Chinese translation (机器人手册), China Machine Press, 2013-01, 1st author
(4) Advances in Brain Inspired Cognitive Systems, Springer, 2013-06, 3rd author
(5) Frontiers of Intelligent Control and Information Processing, World Scientific Publishing, 2014-11, 3rd author

Research Activities

Research Projects
(1) Data-based analysis and design of nonlinear control systems, participant, national level, 2011-01--2014-12
(2) Intelligent stop-and-go cruise control for automobiles, principal investigator, provincial level, 2012-01--2014-12
(3) Intelligent cruise control for automobiles based on supervised ADP, principal investigator, national level, 2013-01--2016-12
(4) Research on parallel-control energy-saving technology for energy management and control centers, participant, provincial level, 2013-04--2014-12
(5) Development of a data mining and analysis toolkit for building energy consumption, participant, provincial level, 2013-12--2014-12
(6) Automotive adaptive cruise control (ACC) system and method, principal investigator, provincial level, 2013-09--2016-05
(7) Supervised reinforcement learning control theory and methods for human-machine interaction, principal investigator, institute level, 2015-01--2016-12
(8) Theory, methods, and applications of deep adaptive dynamic programming, principal investigator, national level, 2016-01--2016-12
(9) Data-based integrated modeling and self-learning optimal control of building clusters and distributed energy systems, participant, national level, 2016-01--2016-12
(10) CAS Overseas Evaluation Expert (Haibo He), principal investigator, ministerial level, 2015-01--2016-12
(11) Research and product development of key technologies for intelligent driver-assistance control systems, principal investigator, national level, 2016-07--2019-06
(12) Optimal decision-making in dynamic games with incomplete information, principal investigator, national level, 2017-03--2020-03

Students Supervised

Graduated Students
田艺, 胡朝辉, 苏永生, 戴钰桀, 张震, 王滨, 朱圆恒, 王海涛, 夏中谱

Current Students
张启超, 吕乐, 卜丽, 李浩然, 陈亚冉, 唐振韬, 邵坤, 李栋