Qichao Zhang - University of Chinese Academy of Sciences - UCAS

Basic Information

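Qichao Zhang (张启超), male, Master's supervisor, Institute of Automation, Chinese Academy of Sciences
Email: zhangqichao2014@ia.ac.cn
Address: No. 95 Zhongguancun East Road, Haidian District, Beijing
Postal code: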

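Research Interests

In recent years, my research has centered on reinforcement learning for decision-making and reasoning, with a current focus on RL post-training of large models and AI4Science. I recruit Master's students and interns with backgrounds in computer science, artificial intelligence, or AI4S (math/material) interdisciplinary fields (undergraduates may be given priority for recommended admission to graduate study).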

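On "efficient online reinforcement learning algorithms", I proposed event-driven reinforcement learning methods (IEEE TCYB 19', IEEE TNNLS 18' (ESI highly cited), IEEE TSMC:A 17' (ESI highly cited)) and received the Beijing Natural Science Award, Second Prize (3/10), the Chinese Academy of Sciences President's Award for Excellence, and the Chinese Association for Artificial Intelligence (CAAI) Outstanding Doctoral Dissertation Nomination Award. I also proposed a dynamic-horizon value expansion reinforcement learning method (IEEE TNNLS 22') and won first prize in the decision-making track of the 2020 ICRA DJI RoboMaster AI Challenge.
On "real-data-driven modeling of dynamic environments", I proposed high-fidelity interactive behavior modeling methods that combine imitation learning and reinforcement learning (IEEE TITS 22', IEEE TIV 23', IEEE TNNLS 23'). For outdoor robots in dynamic environments, I proposed reinforcement-learning-based decision and control algorithms (J FRANKLIN I 22', IJAS 21', IEEE CIM 17') and took first place in the single-vehicle decision track of the Huawei SMARTS competition at DAI 2020.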

Interns interested in these directions are welcome. Please contact me at zhangqichao2014@ia.ac.cn or visit Room 1004, Intelligence Building (智能化大厦).

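Work Experience

Employment History

2020-09 to present, Institute of Automation, Chinese Academy of Sciences, Master's Supervisor
2019-10 to present, Institute of Automation, Chinese Academy of Sciences, Associate Researcher
2017-07 to 2019-09, Institute of Automation, Chinese Academy of Sciences, Assistant Researcher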

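Professional Service

Chinese Association for Artificial Intelligence (CAAI), Technical Committee on Intelligent Driving, Member;

China Computer Federation (CCF), Technical Committee on Intelligent Vehicles, Executive Member;

Chinese Association of Automation (CAA), Technical Committee on Adaptive Dynamic Programming, Member;

Chinese Association of Automation (CAA), Technical Committee on Data-Driven Learning Control, Member;

China Productivity Promotion Center (中国生产力促进中心), Automotive Working Committee, Deputy Secretary-General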

Courses Taught

Multi-Agent Systems
Reinforcement Learning
Intelligent Control

Publications

Papers:

Yupeng Zheng, Zhongpu Xia, Qichao Zhang*, et al., "Preliminary Investigation into Data Scaling Laws for Imitation Learning-Based End-to-End Autonomous Driving", arXiv, 2024.
Songjun Tu, Jingbo Sun, Qichao Zhang*, et al., "In-Dataset Trajectory Return Regularization for Offline Preference-based Reinforcement Learning", AAAI, 2025.
Yupeng Zheng, Zebin Xing, Qichao Zhang*, et al., "PlanAgent: A Multi-modal Large Language Agent for Closed-loop Vehicle Motion Planning", arXiv, 2024.
Yinfeng Gao#, Qichao Zhang# (co-first authors), et al., "Dream to Drive With Predictive Individual World Model", IEEE TIV, 2024.
Junjie Wang, Qichao Zhang*, et al., "Prototypical Context-Aware Dynamics for Generalization in Visual Control With Model-Based Reinforcement Learning", IEEE TII, 2024.
Yuqi Liu, Qichao Zhang*, et al., "Deep Reinforcement Learning-Based Driving Policy at Intersections Utilizing Lane Graph Networks", IEEE TCDS, 2024.
Yupeng Zheng, et al., Qichao Zhang*, "MonoOcc: Digging into Monocular Semantic Occupancy Prediction", ICRA (Oral), 2024. [Code]
Zhiyuan Zhang, Qichao Zhang*, Xiaoxu Wu, et al., "User Response Modeling in Reinforcement Learning for Ads Allocation", ACM Web Conference 2024 (WWW Industry Track, 21.3%, Oral), 2024, pp. 131-140. [Code]
Ding Li, Qichao Zhang*, et al., "Planning-inspired hierarchical trajectory prediction for autonomous driving", IEEE Transactions on Intelligent Vehicles, 2023.
Ding Li, Qichao Zhang*, et al., "Conditional goal-oriented trajectory prediction for interacting vehicles with vectorized representation", IEEE Transactions on Neural Networks and Learning Systems, 2023.
Yupeng Zheng, et al., Qichao Zhang, Dongbin Zhao, "STEPS: Joint Self-supervised Nighttime Image Enhancement and Depth Estimation", ICRA, 2023. [Code]
Qichao Zhang#, Yinfeng Gao#, et al., "TrajGen: generating realistic and diverse trajectories with reactive and feasible agent behaviors for autonomous driving", IEEE Transactions on Intelligent Transportation Systems, 2022. DOI: 10.1109/TITS.2022.3202185.
Junjie Wang, Qichao Zhang*, Dongbin Zhao, et al., "Dynamic horizon value estimation for model-based reinforcement learning", IEEE Transactions on Neural Networks and Learning Systems, 2022. DOI: 10.1109/TNNLS.2022.3215788.
Junjie Wang, Qichao Zhang*, Dongbin Zhao, "Highway lane change decision-making via attention-based deep reinforcement learning", IEEE/CAA Journal of Automatica Sinica, vol. 9, no. 3, pp. 567-569, 2022. DOI: 10.1109/JAS.2021.1004395. (SCI Q1, IF 6.171)
Yuqi Liu, Yinfeng Gao, Qichao Zhang*, Dawei Ding, Dongbin Zhao, "Multi-task safe reinforcement learning for navigating intersections in dense traffic", Journal of the Franklin Institute, 2022. https://doi.org/10.1016/j.jfranklin.2022.06.052
Haoran Li, Qichao Zhang, Dongbin Zhao, "Deep reinforcement learning-based automatic exploration for navigation in unknown environment", IEEE Transactions on Neural Networks and Learning Systems, 31(6): 2064-2076, 2020. (SCI Q1)
Qichao Zhang, Dongbin Zhao*, et al., "Event-based robust control for uncertain nonlinear systems using adaptive dynamic programming", IEEE Transactions on Neural Networks and Learning Systems, 2018, 29(1): 37-50. (SCI Q1, ESI Highly Cited Paper)
Qichao Zhang, Dongbin Zhao*, et al., "Event-triggered H∞ control for continuous-time nonlinear system via concurrent learning", IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2017, 47(7): 1074-1081. (SCI Q1, ESI Highly Cited Paper)
Dongbin Zhao, Zhongpu Xia, Qichao Zhang*, "Model-free optimal control based intelligent cruise control with hardware-in-the-loop demonstration", IEEE Computational Intelligence Magazine, 2017, 12(2): 56-69.
Dongbin Zhao, Qichao Zhang, Ding Wang, et al., "Experience replay for optimal control of nonzero-sum game systems with unknown dynamics", IEEE Transactions on Cybernetics, 2015, 46(3): 854-865. (SCI Q1, ESI Highly Cited Paper)
Dong Li, Dongbin Zhao, Qichao Zhang, et al., "Reinforcement learning and deep learning based lateral control for autonomous driving [Application Notes]", IEEE Computational Intelligence Magazine, 2019, 14(2): 83-98.
...

Research Activities

Research Projects

Ongoing projects:

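(1) Research on coupled joint learning methods for prediction and planning in urban autonomous driving, Principal Investigator, Beijing Natural Science Foundation General Program, 2024.01-2026.01

(2) Research on joint trajectory planning algorithms and models for highly interactive scenarios, Principal Investigator, industry-commissioned (Chongqing Changan Automobile), 2023.11-2024.10

(3) Research on data-driven generation of simulated traffic scenarios, Principal Investigator, National Natural Science Foundation of China General Program, 2022.01-2025.12

(4) Research on offline reinforcement learning for mixed-ranking strategies, Principal Investigator, industry-commissioned (Meituan), 2022.11-2024.03

(5) Research on novel reinforcement learning algorithms for changing environments, core project member, Ministry of Science and Technology National Key R&D Program, 2022.11-2027.10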

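Completed projects:

(1) Research on offline reinforcement learning and autonomous driving policies, Principal Investigator, CCF-Baidu Songguo Fund (CCF-百度松果基金), 2022.09-2023.08

(2) Virtual-real integrated multi-robot intelligent game decision-making technology and applications, Principal Investigator, institute self-initiated project, 2021.01-2023.12

(3) Research on smart agent algorithms for traffic flow in restricted scenarios, Principal Investigator, industry-commissioned (Baidu Netcom), 2020.08-2021.07

(4) Theory and methods for roadside perception and vehicle positioning in key train sections based on intelligent computing, sub-project leader, Beijing Key Fund (北京市重点基金), 2019.12-2022.12, rated excellent at completion (project lead: Prof. Guizhen Yu (余贵珍), Beihang University)

(5) Research on deep reinforcement learning methods for intelligent driving, Principal Investigator, industry-commissioned (Huawei Noah's Ark Lab), 2019.01-2019.12

(6) Optimal control methods and applications based on multi-agent deep adaptive dynamic programming, Principal Investigator, National Natural Science Foundation of China Young Scientists Fund, 2019.01-2021.12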

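Collaboration

Team Members

Directly supervised:

1. Xing Fang (方兴), third-year Master's student, offline reinforcement learning algorithms, Pattern Recognition and Intelligent Systems, B.S. from University of Electronic Science and Technology of China, recommended-admission Master's student, since 2022.09

2. Junli Wang (王君礼), third-year Master's student, joint prediction algorithms for autonomous driving, Pattern Recognition and Intelligent Systems, B.S. from Sichuan University, Master's student, since 2022.09

3. Pengxuan Yang (杨鹏轩), first-year Master's student, end-to-end autonomous driving, Pattern Recognition and Intelligent Systems, B.S. from University of Chinese Academy of Sciences, recommended-admission Master's student, since 2024.09

4. Zebin Xing (邢泽斌), first-year Master's student, end-to-end autonomous driving, Pattern Recognition and Intelligent Systems, B.S. from Beijing University of Posts and Telecommunications, recommended-admission Master's student, since 2024.09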

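Jointly supervised:

1. Jingbo Sun (孙敬博), generalization in reinforcement learning, Master's degree from Beijing Institute of Technology, Ph.D. student (joint training with Peng Cheng Laboratory) (advisor: Dongbin Zhao, Researcher)

2. Yinfeng Gao (高胤峰), autonomous driving decision-making with accurate world models, B.S. and Master's degrees from University of Science and Technology Beijing, Ph.D. student (advisor: Prof. Dawei Ding)

3. Yupeng Zheng (郑宇鹏), perception and prediction algorithms for autonomous driving, Pattern Recognition and Intelligent Systems, B.S. from University of Chinese Academy of Sciences, Ph.D. student (National Scholarship for Master's students)

4. Xueyi Liu (刘学义), vision-language models for autonomous driving, undergraduate studies at Northwestern Polytechnical University, direct-entry Ph.D. student via recommended admission (advisor: Dongbin Zhao, Researcher), since 2023.09

5. Songjun Tu (涂崧峻), reinforcement learning from human feedback for large language models, undergraduate studies at Central South University, direct-entry Ph.D. student via recommended admission (joint training with Peng Cheng Laboratory) (advisor: Dongbin Zhao, Researcher), since 2023.09

6. Deqing Liu (刘德庆), currently in the senior-year graduation project stage, undergraduate studies at Shandong University, direct-entry Ph.D. student via recommended admission (advisor: Dongbin Zhao, Researcher), from 2025.09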

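Alumni:

1. Junjie Wang (王俊杰), overtaking and lane-change decision-making methods based on deep reinforcement learning, combined Master's-Ph.D. program (advisors: Dongbin Zhao, Researcher; Qichao Zhang, Associate Researcher), joined Baidu autonomous driving

2. Yuqi Liu (刘育琦), intersection navigation decision algorithms for autonomous driving, Ph.D. student (advisors: Dongbin Zhao, Researcher; Qichao Zhang, Associate Researcher), joined Xiaomi Robotics

3. Ding Li (李丁), joint prediction and decision algorithms for autonomous driving, Master's degree from Tianjin University, Ph.D. student (advisors: Dongbin Zhao, Researcher; Qichao Zhang, Associate Researcher), now a postdoctoral fellow at Beihang University (卓百 program)

4. Zhiyuan Zhang (张志远), advertising recommendation strategies based on user behavior prediction and reinforcement learning, B.S. from University of Chinese Academy of Sciences, Master's student, now pursuing a Ph.D. at HKUST (Guangzhou)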

Latest News