State Key Laboratory for Management and Control of Complex Systems
Institute of Automation, Chinese Academy of Sciences
Beijing 100190, China
Phone: +86-130-0118-1922; Fax: +86-10-82544799
Multi-agent reinforcement learning
Deep reinforcement learning
Cooperation and competition
09/2010--07/2015, Institute of Automation, Chinese Academy of Sciences, PhD
09/2006--07/2010, Nanjing University, B.S.
07/2015--present, Institute of Automation, Chinese Academy of Sciences, Assistant Researcher, Associate Researcher
12/2017--12/2018, University of Rhode Island, Visiting Scholar
2018/2019, University of Chinese Academy of Sciences, Reinforcement Learning (with Prof Dongbin Zhao)
2019/2020, 2020/2021, University of Chinese Academy of Sciences, Reinforcement Learning (with Profs Dongbin Zhao and Qichao Zhang)
 Synthesis of Cooperative Adaptive Cruise Control with Feedforward Strategies, IEEE Transactions on Vehicular Technology, 2020-02, First Author.
 Vision-based control in the open racing car simulator with deep and reinforcement learning, Journal of Ambient Intelligence and Humanized Computing, 2019-09, First Author.
 LMI-Based Synthesis of String-Stable Controller for Cooperative Adaptive Cruise Control, IEEE Transactions on Intelligent Transportation Systems, 2019-08, First Author.
 Control-limited adaptive dynamic programming for multi-battery energy storage systems, IEEE Transactions on Smart Grid, 2019-07, First Author.
 Adaptive optimal control of heterogeneous CACC system with uncertain dynamics, IEEE Transactions on Control Systems Technology, 2019-07, First Author.
 Invariant Adaptive Dynamic Programming for Discrete-Time Optimal Control, IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2019-04, First Author.
 Micromanagement With Reinforcement Learning and Curriculum Transfer Learning, IEEE Transactions on Emerging Topics in Computational Intelligence, 2019-02, Second Author.
 comparison of online ADP algorithms for continuous-time optimal control, Artificial Intelligence Review, 2018-04, First Author.
 Recent progress of deep reinforcement learning: from AlphaGo to AlphaGo Zero (深度强化学习进展: 从 AlphaGo 到 AlphaGo Zero), Control Theory & Applications (控制理论与应用), 2017-12, Fourth Author.
 Adaptive dynamic programming for robust neural control of unknown continuous-time non-linear systems, IET Control Theory & Applications, 2017-09, Fourth Author.
 Event-Triggered Optimal Control for Partially Unknown Constrained-Input Systems via Adaptive Dynamic Programming, IEEE Transactions on Industrial Electronics, 2017-05, First Author.
 Iterative Adaptive Dynamic Programming for Solving Unknown Nonlinear Zero-Sum Game Based on Online Data, IEEE Transactions on Neural Networks and Learning Systems, 2017-03, First Author.
 Policy iteration for H∞ optimal control of polynomial nonlinear systems via sum of squares programming, IEEE Transactions on Cybernetics, 2017-02, First Author.
 Data-driven adaptive dynamic programming for continuous-time fully cooperative games with partially constrained inputs, Neurocomputing, 2017-02, Third Author.
 Probably approximately correct reinforcement learning solving continuous-state control problem, Control Theory & Applications (控制理论与应用), 2016-12, First Author.
 Using reinforcement learning techniques to solve continuous-time non-linear optimal tracking problem without system dynamics, IET Control Theory & Applications, 2016-07, First Author.
 Convergence Proof of Approximate Policy Iteration for Undiscounted Optimal Control of Discrete-Time Systems, Cognitive Computation, 2015-06, First Author.
 A data-based online reinforcement learning algorithm satisfying probably approximately correct principle, Neural Computing and Applications, 2015-04, First Author.
 MEC-A Near-Optimal Online Reinforcement Learning Algorithm for Continuous Deterministic Systems, IEEE Transactions on Neural Networks and Learning Systems, 2015-02, Second Author.
 Convergence analysis and application of fuzzy-HDP for nonlinear discrete-time HJB systems, Neurocomputing, 2015-02, First Author.
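 Optimal control method, system, and storage medium for multi-battery energy storage systems, Invention, 2020, First Author, Patent No.: 201810967603.7
 Lane-keeping method and system for intelligent driving, Invention, 2018, Fifth Author, Patent No.: 201811260601.0
 Robust tracking control method for a spring-mass-damper, Invention, 2018, Third Author, Patent No.: 201810004181.3
 Data-based Q-function adaptive dynamic programming method, Invention, 2013, Second Author, Patent No.: 201310036976.X
 Method and system for detecting abnormal charge/discharge behavior of energy storage batteries, Invention, 2016, Third Author, Patent No.: 201610687158.X
 Multi-agent deep reinforcement learning method and system based on counterfactual return, Invention, 2020, Third Author, Patent No.: 201911343902.4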