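Basic Information

Dongrui Fan (范东睿), Research Professor and Doctoral Supervisor

Address: No. 6, Zhongguancun Nanyi Street (中关村南一街), Haidian District, Beijing
Postal code: 100190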

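Research Areas

Many-core processor design, high-throughput processor design, dataflow processor design

A leading expert in China on high-throughput computing and processor architecture, Fan has led a number of major domestic and international research projects funded by the European Union, the Ministry of Science and Technology, the Ministry of Industry and Information Technology, the National Natural Science Foundation of China, and the Chinese Academy of Sciences. He has led the design of several high-performance and high-throughput processors, including the superscalar Godson-X, the 64-core Godson-T, the 256-core HGJ processor, and the high-throughput data processing unit (DPU). The Godson-T many-core processor, whose tape-out he led in 2010, established a place for the team in international many-core architecture research. Microprocessor Report named it one of the top ten events in the server field in 2011, and Godson-T was the only many-core processor chip from academia to be selected.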

Student Recruitment

   
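Recruiting Major
081201 - Computer System Architecture
Recruiting Directions
Big data processing, intelligent chips, computer architecture
High-throughput video processing, visual computing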

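Education

2000-09 to 2005-07   Institute of Computing Technology, Chinese Academy of Sciences   Ph.D. in Engineering, Computer Architecture
1996-09 to 2000-07   Beijing Jiaotong University   B.S., Department of Applied Mathematics, School of Science
Education Level
-- Postgraduate
Degree
-- Doctorate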

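Work Experience

Dongrui Fan is a Distinguished Research Professor of the Chinese Academy of Sciences (Core Talent category) and a doctoral supervisor. He has published more than 120 papers in journals and conferences in China and abroad, including top conferences such as MICRO, HPCA, HotChips, and PPoPP and top journals such as IEEE Micro, TPDS, and TC. Over the past five years he has given more than 30 invited academic talks, and more than 60 of his invention patents have been granted or accepted for examination, 9 of them international. He has served on the program committees of top conferences such as HPCA and MICRO and as a chair of international conferences such as ICPP and IGCC.

A leading expert in China on high-throughput computing and processor architecture, Dr. Fan has deep technical experience in computer architecture research and high-throughput computing. He has led a number of major domestic and international research projects funded by the European Union, the Ministry of Science and Technology, the Ministry of Industry and Information Technology, the National Natural Science Foundation of China, and the Chinese Academy of Sciences. The Godson-T many-core processor, developed under his leadership, was successfully taped out in 2010 and was named by Microprocessor Report as one of the "top ten events in the server field in 2011", earning China a place in global many-core processor architecture research.

Fan has worked on microprocessor architecture research since 2005. His honors include Capital Science and Technology Leading Talent (2018), the Beijing Science and Technology Progress Award, Second Class (2017), Beijing Haiying Talent (2016), and CAS Outstanding Young Scientist (2014).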


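Professional Service
2017-10-18 to 2017-10-21, Forum Chair, Forum on Processor Evaluation and Optimization Technology, HPC China 2017
2016-10-31 to 2017-10-31, Program Committee Member, MICRO
2016-01-01 to 2016-10-10, General Co-Chair, The 7th International Green & Sustainable Computing Conference
2015-12-31 to 2016-10-17, Industrial Liaison & Program Committee, 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2016)
2015-07-01 to 2015-12-20, Chair, Workshop on Energy-Efficient High Throughput Computing for Big Data
2015-07-01 to 2015-11-12, Forum Chair, HPC China Forum on Novel Processor Design for Exascale Computing
2014-05-01 to present, Member, High Performance Expert Committee, China Machine Press
2013-06-09 to 2014-06-09, Program Committee Member, PMAM
2013-05-09 to 2014-05-09, Program Committee Member, HPCA
2012-05-09 to 2013-05-09, Program Committee Member, ASP-DAC
2012-02-01 to present, Editorial Board Member, Sustainable Computing (international journal)
2011-08-03 to present, Standing Committee Member, CCF Technical Committee on Computer Architecture
2011-08-01 to 2011-08-31, Vice Chair, 40th International Conference on Parallel Processing (ICPP 2011)
2011-06-01 to present, NVIDIA Global Partner Professor
2010-06-01 to present, Senior Member, China Computer Federation (CCF)
2009-08-24 to present, Member, CCF Technical Committee on Engineering and Technology
2009-06-18 to present, Member, CCF Technical Committee on System Software
2009-06-01 to present, Member, HiPEAC (European network)
2007-01-01 to present, Member, IEEE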

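Patents and Awards

He has received nearly twenty awards and honors, including:
  • 2018: Leading Talent, Innovative Talents Promotion Program, Ministry of Science and Technology
  • 2018: CCF-IEEE CS Young Scientist
  • 2018: Top Ten Outstanding Youth of Haidian District
  • 2018: National "Upward and Virtuous" Good Youth in Innovation and Entrepreneurship
  • 2017: Beijing Science and Technology Award, Second Class
  • 2017: Named Capital Leading Talent
  • 2016: Named Beijing Haiying Talent
  • 2014: CAS Outstanding Young Scientist Award
  • 2014: Beijing Science and Technology Award, Second Class (ranked first)
  • 2011: Selected for the CAS Youth Innovation Promotion Association (first cohort)
  • 2010: Named Beijing Nova of Science and Technology
  • 2008: CAS Lu Jiaxi Young Talent Award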

Patents
[1] 严明玉, 李涵, 叶笑春, 曹华伟, 范东睿. 一种面向图神经网络应用的片上存储系统及方法. CN: CN111695685A, 2020-09-22.

[2] 严明玉, 李涵, 叶笑春, 曹华伟, 范东睿. 一种面向图神经网络应用的任务调度执行系统及方法. CN: CN111694643A, 2020-09-22.

[3] 王中旗, 黄俊英, 张志敏, 叶笑春, 范东睿. 板间通信接口系统. CN202210764461.0, 2022-10-04.

[4] 王中旗, 黄俊英, 张志敏, 叶笑春, 范东睿. 光纤通信转接系统. CN202210764499.8, 2022-06-15.

[5] 王中旗, 张志轩, 黄俊英, 张志敏, 叶笑春, 范东睿. 针对低温多芯片计算系统的模拟方法及其系统. CN202210880357.8, 2022-06-12.

[6] 王中旗, 黄俊英, 张志敏, 叶笑春, 范东睿. 跨平台光纤传输系统. CN202210764423.5, 2022-06-08.

[7] 刘艳欢, 李文明, 安述倩, 吴海彬, 冯煜晶, 吴萌, 叶笑春, 范东睿. 一种数据传输装置及传输方法. CN: CN111459856B, 2022-02-18.

[8] 范志华, 吴欣欣, 李文明, 安学军, 叶笑春, 范东睿. 一种卷积神经网络的加速方法及装置. CN: CN113919477A, 2022-01-11.

[9] 范志华, 秦宏, 吴欣欣, 李文明, 安学军, 叶笑春, 范东睿. 一种ECDSA算法执行系统及方法. CN: CN113505383A, 2021-10-15.

[10] 王鹏超, 李晓霖, 郝沁汾, 叶笑春, 范东睿. 一种基于系统总线的三维芯片及其三维化方法. CN: CN113451260A, 2021-09-28.

[11] 李晓霖, 郝沁汾, 叶笑春, 范东睿. 基于先进封装技术的多CPU共封架构下高速缓存的动态扩容方法及系统. CN: CN113392604A, 2021-09-14.

[12] 李文明, 安述倩, 吴萌, 吴海彬, 刘艳欢, 叶笑春, 范东睿. 基于阻变存储器的通用区块链应用处理加速方法及系统. CN: CN110890120B, 2021-08-31.

[13] 刘天雨, 吴欣欣, 范志华, 李文明, 叶笑春, 范东睿. 一种基于数据流架构的深度可分离卷积融合方法及系统. CN: CN113313251A, 2021-08-27.

[14] 刘天雨, 吴欣欣, 李文明, 叶笑春, 范东睿. 基于数据流架构的稀疏神经网络的运算方法. CN: CN113313247A, 2021-08-27.

[15] 范志华, 吴欣欣, 王珎, 李文明, 安学军, 叶笑春, 范东睿. 基于数据流结构的低精度神经网络计算装置及加速方法. CN: CN113298236A, 2021-08-24.

[16] 欧焱, 范志华, 吴欣欣, 李文明, 叶笑春, 范东睿. 一种用于动态分配片上网络带宽的方法及装置. CN: CN113296957A, 2021-08-24.

[17] 欧焱, 范志华, 吴欣欣, 范东睿, 叶笑春, 李文明. 基于路由信息的数据流指令映射方法及系统. CN: CN113297131A, 2021-08-24.

[18] 吴欣欣, 范志华, 欧焱, 李文明, 叶笑春, 范东睿. 一种基于数据流架构的多精度神经网络计算装置以及方法. CN: CN113298245A, 2021-08-24.

[19] 秦梦远, 郝沁汾, 叶笑春, 范东睿. 面向环形数据报文网络的数据传输拥塞控制方法及系统. CN: CN113225241A, 2021-08-06.

[20] 张强, 郝沁汾, 叶笑春, 范东睿. 光电转换装置、计算机主板及计算机主机. CN: CN113193919A, 2021-07-30.

[21] 黄俊英, 付荣亮, 张阔中, 叶笑春, 张志敏, 范东睿. 生成面向超导RSFQ电路的多扇出时钟信号的方法. CN: CN113128165A, 2021-07-16.

[23] 秦梦远, 郝沁汾, 叶笑春, 范东睿. 一种对CPU互连系统的网络拓扑结构进行重构的方法及装置. CN: CN113127404A, 2021-07-16.

[24] 黄俊英, 张阔中, 叶笑春, 张志敏, 范东睿. 用于双时钟架构的超导RSFQ电路布局方法. CN: CN113095033A, 2021-07-09.

[25] 李妍, 郝沁汾, 叶笑春, 范东睿. 自动检测除尘装置及除尘机箱. CN: CN113083797A, 2021-07-09.

[26] 张阔中, 张志敏, 唐光明, 黄俊英, 付荣亮, 叶笑春, 范东睿. 超导处理器及其输入输出控制模块. CN: CN112861463A, 2021-05-28.

[27] 范志华, 谭龙, 吴欣欣, 李文明, 安学军, 叶笑春, 范东睿. 面向数据流架构的SHA算法执行方法、存储介质、电子装置. CN: CN112861154A, 2021-05-28.

[28] 安述倩, 吴海彬, 刘艳欢, 李文明, 叶笑春, 范东睿. 粗粒度数据流架构执行阵列的调试方法及装置. CN: CN111008133B, 2021-04-27.

[29] 范志华, 欧焱, 吴欣欣, 李文明, 安学军, 叶笑春, 范东睿. 一种片上带宽动态分配方法及系统. CN: CN112311695A, 2021-02-02.

[30] 吴欣欣, 范志华, 欧焱, 李文明, 叶笑春, 范东睿. 基于数据流架构的稀疏卷积神经网络加速方法及装置. CN: CN112215349A, 2021-01-12.

[31] 范志华, 吴欣欣, 谭龙, 李文明, 安学军, 叶笑春, 范东睿. 一种神经网络剪枝方法及装置. CN: CN112183744A, 2021-01-05.

[32] 张志敏, 唐光明, 张阔中, 黄俊英, 付荣亮, 叶笑春, 范东睿. 一种超导并行寄存器堆装置. CN: CN112114875A, 2020-12-22.

[33] 张志敏, 唐光明, 张阔中, 黄俊英, 付荣亮, 叶笑春, 范东睿. 一种超导流水线电路及处理器. CN: CN112116094A, 2020-12-22.

[34] 范晓宣, 曹华伟, 叶笑春, 范东睿. 一种基于图数据库的蛋白质组数据管理方法、介质和设备. CN: CN112116951A, 2020-12-22.

[35] 吴欣欣, 范志华, 轩伟, 李文明, 叶笑春, 范东睿. 基于数据流架构的稀疏卷积神经网络加速方法及系统. CN: CN112015473A, 2020-12-01.

[36] 吴欣欣, 范志华, 轩伟, 李文明, 叶笑春, 范东睿. 基于数据流架构的稀疏卷积神经网络加速方法及系统. CN: CN112015472A, 2020-12-01.

[37] 付荣亮, 黄俊英, 张阔中, 唐光明, 叶笑春, 范东睿, 张志敏. 一种生成面向超导RSFQ电路的多扇出信号的方法. CN: CN111950216A, 2020-11-17.

[38] 范志华, 吴欣欣, 李文明, 安学军, 叶笑春, 范东睿. 一种加速安全散列算法的加速器. CN: CN111738703A, 2020-10-02.

[39] 安述倩, 张明喆, 叶笑春, 王达, 张浩, 范东睿, 唐志敏. 一种数据流处理器指令映射方法及系统、装置. CN: CN110941451A, 2020-03-31.

[40] 李文明, 叶笑春, 安述倩, 姜志颖, 王晨晖, 范东睿. 一种用于区块链的处理装置及方法. CN: CN110211618A, 2019-09-06.

[41] 李文明, 叶笑春, 安述倩, 姜志颖, 王晨晖, 范东睿. 一种哈希硬件处理装置及方法. CN: CN110211617A, 2019-09-06.

[42] 邹沫, 张鲁培, 李文明, 叶笑春, 范东睿. 基于数据流架构的快速傅里叶变换方法、系统和存储介质. CN: CN110008436A, 2019-07-12.

[43] 曹华伟, 张承龙, 安学军, 叶笑春, 范东睿. 一种面向宽度优先搜索算法的加速装置、方法及存储介质. CN: CN109992413A, 2019-07-09.

[44] 瞿佩瑶, 唐光明, 叶笑春, 范东睿. 一种RSFQ FFT处理器的蝶形运算处理方法及系统. CN: CN109783054A, 2019-05-21.

[45] 贾瑞花, 张承龙, 曹华伟, 叶笑春, 范东睿. 一种电信诈骗事件检测方法和检测系统. CN: CN109615116A, 2019-04-12.

[46] 郭南, 叶笑春, 王达, 范东睿, 张浩, 李文明. 基于深度线索的视频场景检索方法和系统. CN: CN109241342A, 2019-01-18.

[47] 瞿佩瑶, 唐光明, 叶笑春, 范东睿. 超导单磁通量子处理器的算术逻辑单元运算方法和系统. CN: CN108108151A, 2018-06-01.

[48] 李文明, 叶笑春, 孙凝晖, 范东睿, 王达, 马丽娜, 朱亚涛, 张洋. 一种异常事件自动推送及基于历史操作的监控方法及系统. CN: CN107071342A, 2017-08-18.

[49] 马丽娜, 祁玉琼, 叶笑春, 张浩, 范东睿, 王达. 一种字符操作加速方法、装置、芯片、处理器. CN: CN106445472A, 2017-02-22.

[50] 朱亚涛, 张志敏, 范东睿, 王达, 张浩. 一种串匹配算法的加速方法及装置. CN: CN106445891A, 2017-02-22.

[51] 朱亚涛, 张志敏, 范东睿, 王达, 张浩. 一种K近邻算法的加速装置及方法. CN: CN106355199A, 2017-01-25.

[52] 张洋, 唐志敏, 叶笑春, 张浩, 范东睿. 众核处理器片上访存距离优化的方法及其装置. CN: CN106339350A, 2017-01-18.

[53] 马丽娜, 范东睿, 谢向辉, 李宏亮, 郑方. 一种面向大数据的加速排序装置、方法、芯片、处理器. CN: CN106250097A, 2016-12-21.

[54] 李丹萍, 胡九川, 范东睿, 谢向辉, 李宏亮. 一种改善数据在缓存中空间局部性的缓存方法及装置. CN: CN106126440A, 2016-11-16.

[55] 范东睿, 宋风龙, 王达, 叶笑春. 内存访问处理方法、装置及系统. CN: CN104346285A, 2015-02-11.

[56] 徐远超, 范东睿, 张浩, 叶笑春. 一种访问数据缓存的方法和处理器. CN: CN104252392A, 2014-12-31.

[57] 张轮凯, 范东睿, 叶笑春, 王达. 基于多内核处理器的一致性处理方法和装置. CN: CN104252423A, 2014-12-31.

[58] 张轮凯, 范东睿, 张浩, 叶笑春. 一种众核系统的任务管理方法和装置. CN: CN104239134A, 2014-12-24.

[59] 熊海泉, 唐志敏, 张志敏, 范东睿. 一种操作系统进程识别跟踪及信息获取的方法和装置. CN: CN104007956A, 2014-08-27.

[60] 范东睿, 叶笑春, 王达, 张浩. 一种实时多任务调度方法和装置. CN: CN103870327A, 2014-06-18.

[61] 徐远超, 范东睿, 张浩, 叶笑春. 一种基于缓存感知的确定待迁移任务的方法和装置. CN: CN103729248A, 2014-04-16.

[62] 唐士斌, 宋风龙, 王达, 范东睿. 程序的线程关系确定方法、设备及系统. CN: CN103729166A, 2014-04-16.

[63] 范灵俊, 唐士斌, 王达, 张浩, 范东睿. 用于处理器的动态组相联高速缓存装置及其访问方法. CN: CN102662868A, 2012-09-12.

[64] 张帅, 焦帅, 张浩, 范东睿, 李海忠. 一种片上多核数据传输方法和装置. CN: CN102567278A, 2012-07-11.

[65] 张轮凯, 李海忠, 雷峥蒙, 张浩, 范东睿. 一种片上共享高速缓存的替换装置和方法以及相应处理器. CN: CN102110073A, 2011-06-29.

[66] 纪雯, 范东睿, 陈益强, 张绘国, 邢云冰. 视频信号编码装置和方法. CN: CN101977313A, 2011-02-16.

[67] 范灵俊, 林伟, 张浩, 范东睿. 采用可配置的片上存储装置实现访存操作的系统及方法. CN: CN101930357A, 2010-12-29.

[68] 安述倩, 余磊, 张浩, 范东睿. RISC处理器中执行寄存器类型指令的方法和其系统. CN: CN101916180A, 2010-12-15.

[69] 徐卫志, 焦帅, 张浩, 刘志勇, 范东睿, 雷峥蒙, 宋风龙, 王达. 众核处理器片上同步方法和其系统. CN: CN101908034A, 2010-12-08.

[70] 余磊, 张浩, 刘志勇, 范东睿. 处理器内指令级流水线控制方法及其系统. CN: CN101894013A, 2010-11-24.

[71] 雷峥蒙, 焦帅, 徐卫东, 范东睿, 张浩. 多核处理器的JTAG实时片上调试方法及其系统. CN: CN101840368A, 2010-09-22.

[72] 刘磊, 袁楠, 范东睿. 一种对访存操作进行权限检查的系统、装置及方法. CN: CN101079083B, 2010-05-12.

[73] 叶笑春, 段振中, 范东睿, 张军超. 对状态寄存器进行重命名的方法和使用该方法的处理器. CN: CN100524208C, 2009-08-05.

[74] 段振中, 范东睿. 复杂指令集体系结构中的深度优先异常处理方法. CN: CN100495324C, 2009-06-03.

[75] 袁 楠, 范东睿. 对复杂指令译码生成微码的译码装置和方法. CN: CN100492279C, 2009-05-27.

[76] 马啸宇, 范东睿, 包尔固德, 张轮凯. 一种多核或众核处理器功能验证设备及方法. CN: CN101320344A, 2008-12-10.

[77] 龙国平, 范东睿, 袁 楠, 张 浩. 基于局部相联查找的解决访存相关的方法和处理器. CN: CN101211257A, 2008-07-02.

[78] 张 浩, 范东睿. 一种快速虚实地址转换装置及其方法. CN: CN101211318A, 2008-07-02.

[79] 陈 曦, 范东睿, 张 浩. 满足SystemC语法要求的多核处理器及获得其执行代码的方法. CN: CN101196826A, 2008-06-11.

[80] 龙国平, 袁 楠, 范东睿. 复杂指令系统中TLBR内部例外的处理方法和处理器. CN: CN101114216A, 2008-01-30.

[81] 段振中, 范东睿. 对预处理微指令发生异常多层嵌套进行处理的设备及方法. CN: CN101075184A, 2007-11-21.

[82] 黄海林, 范东睿, 许 彤, 唐志敏. 一种单步执行在片调试功能的方法及装置. CN: CN1904851A, 2007-01-31.

[83] 黄海林, 唐志敏, 范东睿, 许 彤. 用于虚实地址变换及读写高速缓冲存储器的方法及装置. CN: CN1896972A, 2007-01-17.

[84] 范东睿, 唐志敏. 一种从虚拟地址向物理地址变换的方法及其装置. CN: CN1779663A, 2006-05-31.

[85] 范东睿, 唐志敏. 改进的虚拟地址变换方法及其装置. CN: CN1779662A, 2006-05-31.

Publications

   
Published Papers
[1] 吴萌, 严明玉, 叶笑春, 李文明, Xiaocheng Yang, 张志敏, 范东睿. Characterizing and Understanding Defense Methods for GNNs on GPUs. IEEE CAL[J]. 2023.
[2] Xinda Chen, Rongliang Fu, Junying Huang, Huawei Cao, Zhimin Zhang, Xiaochun Ye, Tsung-Yi Ho, Dongrui Fan. JRouter: A Multi-Terminal Hierarchical Length-Matching Router under Planar Manhattan Routing Model for RSFQ Circuits. GLSVLSI. 2023.
[3] 范志华, 李文明, 王珎, 刘天雨, 吴海彬, 刘艳欢, 吴萌, 叶笑春, 范东睿, 安学军. Accelerating Convolutional Neural Networks by Exploiting the Sparsity of Output Activation. TPDS[J]. 2023, 34(12): 3253-3265.
[4] 王铎, 严明玉, 刘昕, 邹沫, 刘天雨, 李文明, 叶笑春, 范东睿. A High-accurate Multi-objective Exploration Framework for Design Space of CPU. 60th Design Automation Conference (DAC). 2023.
[5] Xiaocheng Yang, 严明玉, 叶笑春, 范东睿. Simple and Efficient Heterogeneous Graph Neural Network. The Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI-23). 2023.
[6] 范志华, 李文明, 汤胜中, 安学军, 叶笑春, 范东睿. Improving Utilization of Dataflow Architectures Through Software and Hardware Co-Design. Euro-Par. 2023.
[7] 王铎, 严明玉, 滕亦涵, 韩登科, 叶笑春, 范东睿. A High-accurate Multi-objective Ensemble Exploration Framework for Design Space of CPU Microarchitecture. Proceedings of the Great Lakes Symposium on VLSI 2023. 2023.
[8] 范志华, 李文明, 王珎, 刘天雨, 吴海彬, 刘艳欢, 吴萌, 吴欣欣, 叶笑春, 范东睿, 孙凝晖, 安学军. Accelerating Convolutional Neural Networks by Exploiting the Sparsity of Output Activation. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS[J]. 2023.
[9] 范志华, 吴欣欣, 李文明, 曹华伟, 安学军, 叶笑春, 范东睿. 面向低精度神经网络的数据流体系结构优化. 计算机研究与发展[J]. 2023, 60(1): 43-58, http://lib.cqvip.com/Qikan/Article/Detail?id=7108741862.
[10] Liu, Xin, Yan, Mingyu, Deng, Lei, Li, Guoqi, Ye, Xiaochun, Fan, Dongrui. Sampling Methods for Efficient Training of Graph Convolutional Networks: A Survey. IEEE-CAA JOURNAL OF AUTOMATICA SINICA. 2022, 9(2): 205-234, http://dx.doi.org/10.1109/JAS.2021.1004311.
[11] Mo Zou, Mingyu Yan, Wenming Li, Zhimin Tang, Xiaochun Ye, Dongrui Fan. GEM: Execution-Aware Cache Management for Graph Analytics. ICA3PP. 2022.
[12] Zou, Mo, Zhang, Mingzhe, Wang, Rujia, Sun, XianHe, Ye, Xiaochun, Fan, Dongrui, Tang, Zhimin. Accelerating Graph Processing With Lightweight Learning-Based Data Reordering. IEEE COMPUTER ARCHITECTURE LETTERS[J]. 2022, 21(1): 5-8, http://dx.doi.org/10.1109/LCA.2022.3151087.
[13] Rongliang Fu, Junying Huang, Haibin Wu, Xiaochun Ye, Dongrui Fan, Tsung-Yi Ho. JBNN: A Hardware Design for Binarized Neural Networks Using Single-Flux-Quantum Circuits. IEEE TRANSACTIONS ON COMPUTERS[J]. 2022, 71(12): 3203-3214.
[14] Junying Huang, Rongliang Fu, Xiaochun Ye, Dongrui Fan. A survey on superconducting computing technology: circuits, architectures and design tools. CCF Transactions on High Performance Computing[J]. 2022.
[15] Sun, Gongjian, Yan, Mingyu, Wang, Duo, Li, Han, Li, Wenming, Ye, Xiaochun, Fan, Dongrui, Xie, Yuan. Multi-node Acceleration for Large-scale GCNs. IEEE TRANSACTIONS ON COMPUTERS[J]. 2022.
[16] Feng, YuJing, Li, DeJian, Tan, Xu, Ye, XiaoChun, Fan, DongRui, Li, WenMing, Wang, Da, Zhang, Hao, Tang, ZhiMin. Accelerating Data Transfer in Dataflow Architectures Through a Look-Ahead Acknowledgment Mechanism. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY[J]. 2022, 37(4): 942-959.
[17] Zhihua Fan, Wenming Li, Tianyu Liu, Shengzhong Tang, Zhen Wang, Xuejun An, Xiaochun Ye, Dongrui Fan. A Loop Optimization Method for Dataflow. High Performance Computing and Communications. 2022.
[18] Yan, Mingyu, Zou, Mo, Yang, Xiaocheng, Li, Wenming, Ye, Xiaochun, Fan, Dongrui, Xie, Yuan. Characterizing and Understanding HGNNs on GPUs. IEEE COMPUTER ARCHITECTURE LETTERS[J]. 2022, 21(2): 69-72, http://dx.doi.org/10.1109/LCA.2022.3198281.
[19] Liu, Xin, Yan, Mingyu, Deng, Lei, Li, Guoqi, Ye, Xiaochun, Fan, Dongrui, Pan, Shirui, Xie, Yuan. Survey on Graph Neural Network Acceleration: An Algorithmic Perspective. International Joint Conference on Artificial Intelligence. 2022, http://arxiv.org/abs/2202.04822.
[20] Lin, Haiyang, Yan, Mingyu, Wang, Duo, Zou, Mo, Tu, Fengbin, Ye, Xiaochun, Fan, Dongrui, Xie, Yuan. Alleviating Datapath Conflicts and Design Centralization in Graph Analytics Acceleration. DESIGN AUTOMATION CONFERENCE. 2022.
[21] Lin, Haiyang, Yan, Mingyu, Yang, Xiaocheng, Zou, Mo, Li, Wenming, Ye, Xiaochun, Fan, Dongrui. Characterizing and Understanding Distributed GNN Training on GPUs. IEEE COMPUTER ARCHITECTURE LETTERS[J]. 2022, 21(1): 21-24, http://dx.doi.org/10.1109/LCA.2022.3168067.
[22] Wang, Yinshen, Li, Wenming, Liu, Tianyu, Zhou, Liangjiang, Wang, Bingnan, Fan, Zhihua, Ye, Xiaochun, Fan, Dongrui, Ding, Chibiao. Characterization and Implementation of Radar System Applications on a Reconfigurable Dataflow Architecture. IEEE COMPUTER ARCHITECTURE LETTERS[J]. 2022, 21(2): 121-124.
[23] Xinxin Wu, Zhihua Fan, Tianyu Liu, Wenming Li, Xiaochun Ye, Dongrui Fan. LRP: Predictive output activation based on SVD approach for CNNs acceleration. Design, Automation and Test in Europe. 2022.
[24] 轩伟, 曹华伟, 严明玉, 唐志敏, 叶笑春, 范东睿. BSR-TC: Adaptively Sampling for Accurate Triangle Counting over Evolving Graph Streams. International Journal of Software Engineering and Knowledge Engineering[J]. 2021, 31(11): 1561-1581, https://worldscientific.com/doi/10.1142/S021819402140012X.
[25] 严明玉, 李涵, 邓磊, 胡杏, 叶笑春, 张志敏, 范东睿, 谢源. 图计算加速架构综述. 计算机研究与发展[J]. 2021, 58(4): 862-887, http://lib.cqvip.com/Qikan/Article/Detail?id=7104271412.
[26] Li, Yi, Wu, Meng, Ye, Xiaochun, Li, Wenming, Xue, Rui, Wang, Da, Zhang, Hao, Fan, Dongrui. An efficient scheduling algorithm for dataflow architecture using loop-pipelining. INFORMATION SCIENCES[J]. 2021, 547: 1136-1153, http://dx.doi.org/10.1016/j.ins.2020.09.029.
[27] 范东睿. 数据流计算研究进展与概述. 数据与计算发展前沿. 2021.
[28] 李涵, 严明玉, 吕征阳, 李文明, 叶笑春, 范东睿, 唐志敏. 图神经网络加速结构综述. 计算机研究与发展[J]. 2021, 58(6): 1204-1229, http://lib.cqvip.com/Qikan/Article/Detail?id=7104820799.
[29] Li, Han, Yan, Mingyu, Yang, Xiaocheng, Deng, Lei, Li, Wenming, Ye, Xiaochun, Fan, Dongrui, Xie, Yuan. Hardware Acceleration for GCNs via Bidirectional Fusion. IEEE COMPUTER ARCHITECTURE LETTERS[J]. 2021, 20(1).
[30] Chenglong Zhang, Huawei Cao, Xiaochun Ye, Guobo Wang, Qinfen Hao, Dongrui Fan. Highly Efficient Breadth-First Search on CPU-based Single-node System. INTERNATIONAL JOURNAL OF HYDROGEN ENERGY. 2021, 2066-2071.
[31] Dongrui Fan. Scalable and Efficient Graph Traversal on High-Throughput Cluster. CCF Transaction on High Performance Computing (CCF THPC). 2021.
[32] 吴欣欣, 欧焱, 李文明, 王达, 张浩, 范东睿. 基于粗粒度数据流架构的稀疏卷积神经网络加速. 计算机研究与发展[J]. 2021, 58(7): 1504-1517, http://lib.cqvip.com/Qikan/Article/Detail?id=7105055136.
[33] Cao, Dingyuan, Zhang, Mingzhe, Lu, Hang, Ye, Xiaochun, Fan, Dongrui, Che, Yuezhi, Wang, Rujia. Streamline Ring ORAM Accesses through Spatial and Temporal Optimization. 2021 27TH IEEE INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTER ARCHITECTURE (HPCA 2021). 2021, 14-25.
[34] 李灵枝, 胡九川, 叶笑春, 范东睿, 严龙. 渗透缓存命中率诱导的缓存区域动态分配机制研究. 软件导刊[J]. 2020, 19(4): 1-8, http://lib.cqvip.com/Qikan/Article/Detail?id=7101773847.
[35] Rongliang Fu, Zhimin Zhang, Guangming Tang, Junying Huang, Xiaochun Ye, Dongrui Fan, Ninghui Sun. Design Automation Methodology from RTL to Gate-level Netlist and Schematic for RSFQ Logic Circuits. Great Lakes Symposium on VLSI. 2020.
[36] Qu, PeiYao, Tang, GuangMing, Yang, JiaHong, Ye, XiaoChun, Fan, DongRui, Zhang, ZhiMin, Sun, NingHui. Design of an 8-bit Bit-Parallel RSFQ Microprocessor. IEEE TRANSACTIONS ON APPLIED SUPERCONDUCTIVITY[J]. 2020, 30(7).
[37] Yang, JiaHong, Tang, GuangMing, Zheng, XiangYu, Ye, XiaoChun, Fan, DongRui, Zhang, ZhiMin, Sun, NingHui. Distributed Self-Clock: A Suitable Architecture for SFQ Circuits. IEEE TRANSACTIONS ON APPLIED SUPERCONDUCTIVITY[J]. 2020, 30(7): http://dx.doi.org/10.1109/TASC.2020.3007175.
[38] Dongrui Fan. Pixel-Semantic Revising of Position: One-Stage Object Detector with Shared Encoder-Decoder. The 27th International Conference on Neural Information Processing (ICONIP2020). 2020.
[39] Wu, Xinxin, Li, Yi, Ou, Yan, Li, Wenming, Sun, Shibo, Xu, Wenxing, Fan, Dongrui, Qiu, M. Accelerating Sparse Convolutional Neural Networks Based on Dataflow Architecture. ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2020, PT II. 2020, 12453: 14-31.
[40] Dongrui Fan. Scalable and efficient graph traversal on high-throughput cluster. CCF Transactions on High Performance Computing. 2020.
[41] Tang, GuangMing, Qu, PeiYao, Zheng, XiangYu, Yang, JiaHong, Ye, XiaoChun, Fan, DongRui, Sun, NingHui. Bit-Slice Butterfly Processing Units for 64-Point RSFQ FFT Processors. IEEE TRANSACTIONS ON APPLIED SUPERCONDUCTIVITY[J]. 2020, 30(1): https://www.webofscience.com/wos/woscc/full-record/WOS:000482590700001.
[42] Yan Mingyu, Deng Lei, Hu Xing, Liang Ling, Feng Yujing, Ye Xiaochun, Zhang Zhimin, Fan Dongrui, Xie Yuan. HyGCN: A GCN Accelerator with Hybrid Architecture. 2020, http://arxiv.org/abs/2001.02514.
[43] Li, Qian, Guo, Nan, Ye, Xiaochun, Fan, Dongrui, Tang, Zhimin. Video Face Recognition System: RetinaFace-mnet-faster and Secondary Search. 2020, http://arxiv.org/abs/2009.13167.
[44] 范灵俊, 杨菲, 郑卫城, 洪学海, 范东睿. 构建城市“互联网+”新型基础设施发展战略研究. 中国工程科学[J]. 2020, 22(4): 106-113, http://lib.cqvip.com/Qikan/Article/Detail?id=7102599416.
[45] Yan, Mingyu, Deng, Lei, Hu, Xing, Liang, Ling, Feng, Yujing, Ye, Xiaochun, Zhang, Zhimin, Fan, Dongrui, Xie, Yuan, IEEE. HyGCN: A GCN Accelerator with Hybrid Architecture. 2020 IEEE INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE COMPUTER ARCHITECTURE (HPCA 2020). 2020, 15-29.
[46] Yan, Mingyu, Chen, Zhaodong, Deng, Lei, Ye, Xiaochun, Zhang, Zhimin, Fan, Dongrui, Xie, Yuan. Characterizing and Understanding GCNs on GPU. IEEE COMPUTER ARCHITECTURE LETTERS[J]. 2020, 19(1): 22-25, http://dx.doi.org/10.1109/LCA.2020.2970395.
[47] Ye, Xiaochun, Tan, Xu, Wu, Meng, Feng, Yujing, Wang, Da, Zhang, Hao, Pei, Songwen, Fan, Dongrui. An efficient dataflow accelerator for scientific applications. FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE[J]. 2020, 112: 580-588, http://dx.doi.org/10.1016/j.future.2020.03.023.
[48] Ou, Yan, Shen, Chongfei, Feng, Yujing, Wu, Xinxin, Li, Wenming, Ye, Xiaochun, Fan, Dongrui, Qiu, M. CTA: A Critical Task Aware Scheduling Mechanism for Dataflow Architecture. ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2020, PT I. 2020, 12452: 61-77.
[49] 叶笑春, 李文明, 张洋, 张浩, 王达, 范东睿. 高通量众核处理器设计. 数据与计算发展前沿[J]. 2020, 2(1): 70-84, https://kns.cnki.net/KCMS/detail/detail.aspx?dbcode=CJFQ&dbname=CJFDLAST2020&filename=KYXH202001006&v=MDU4OTk4ZVgxTHV4WVM3RGgxVDNxVHJXTTFGckNVUjd1Zlp1Wm5GaXZuVUwzTkxqVFRackc0SE5ITXJvOUZZb1I=.
[50] Hao, Qinfen, Hao, Kai, Xue, Haiyun, Han, Meng, Qi, Nan, Zhang, Kunming, Niu, Xingmao, Xiao, Limin, Fan, Dongrui, IEEE. A Chip-level Optical Interconnect for CPU. 2020 IEEE PHOTONICS CONFERENCE (IPC). 2020, http://apps.webofknowledge.com/CitedFullRecord.do?product=UA&colName=WOS&SID=5CCFccWmJJRAuMzNPjj&search_mode=CitedFullRecord&isickref=WOS:000612237500111.
[51] 张承龙, 曹华伟, 王国波, 郝沁汾, 张洋, 叶笑春, 范东睿. 面向高通量计算机的图算法优化技术. 计算机研究与发展[J]. 2020, 57(6): 1152-1163, http://lib.cqvip.com/Qikan/Article/Detail?id=7101851458.
[52] 董荣育, 曹华伟, 叶笑春, 张园, 郝沁汾, 范东睿. Highly Efficient and GPU-Friendly Implementation of BFS on Single-node System. International Symposium on Parallel and Distributed Processing with Applications (ISPA 2017). 2020, https://ieeexplore.ieee.org/document/9443861.
[53] Dongrui Fan. iATPG: Instruction-level Automatic Test Program Generation for Vulnerability under DVFS Attack. 2019 IEEE 25th International Symposium on On-Line Testing and Robust System Design (IOLTS). 2019.
[54] 李易, 常成娟, 卢圣健, 江道忠, 范东睿, 叶笑春. 面向数据流结构的指令映射优化方法. 计算机工程与科学[J]. 2019, 41(1): 9-13, http://lib.cqvip.com/Qikan/Article/Detail?id=7001148810.
[55] Dongrui Fan. A Sharing Path Awareness Scheduling Algorithm for Dataflow Architecture. HPCC. 2019.
[56] 范东睿. 面向数据流结构的指令内存访存冲突优化研究. 计算机研究与发展. 2019.
[57] Dongrui Fan. C-MAP: Improving the Effectiveness of Mapping Method for CGRA by Reducing NoC Congestion. HPCC 2019. 2019.
[58] 欧焱, 冯煜晶, 李文明, 叶笑春, 王达, 范东睿. 面向数据流结构的指令内访存冲突优化研究. 计算机研究与发展[J]. 2019, 56(12): 2720-2732, http://lib.cqvip.com/Qikan/Article/Detail?id=7100658631.
[59] Junying Huang, Jing Ye, Xiaochun Ye, Da Wang, Dongrui Fan, Huawei Li, Xiaowei Li, Zhimin Zhang. Instruction Vulnerability Test and Code Optimization against DVFS attack. 2019 IEEE INTERNATIONAL TEST CONFERENCE IN ASIA (ITC-ASIA 2019)[J]. 2019, 49-54.
[60] 范东睿, 叶笑春, 包云岗, 孙凝晖. 中国高通量计算机的自主研发之路. 中国科学院院刊[J]. 2019, 648-656, http://lib.cqvip.com/Qikan/Article/Detail?id=75898988504849574854484856.
[61] Zokaee, Farzaneh, Zhang, Mingzhe, Ye, Xiaochun, Fan, Dongrui, Jiang, Lei, ACM. Magma: A Monolithic 3D Vertical Heterogeneous ReRAM-based Main Memory Architecture. PROCEEDINGS OF THE 2019 56TH ACM/EDAC/IEEE DESIGN AUTOMATION CONFERENCE (DAC). 2019, http://dx.doi.org/10.1145/3316781.3317858.
[62] 张志敏. Balancing Memory Accesses for Energy-Efficient Graph Analytics Accelerators. ISLPED. 2019.
[63] Li, Wenming, Ye, Xiaochun, Wang, Da, Zhang, Hao, Tang, Zhimin, Fan, Dongrui, Sun, Ninghui. PIM-WEAVER: A High Energy-efficient, General-purpose Acceleration Architecture for String Operations in Big Data Processing. SUSTAINABLE COMPUTING-INFORMATICS & SYSTEMS[J]. 2019, 21: 129-142, http://dx.doi.org/10.1016/j.suscom.2019.01.006.
[64] 余世干, 唐志敏, 叶笑春, 范东睿. 基于推测机制异构多核处理器容错方法与仿真. 系统仿真学报[J]. 2019, 31(12): 2685-2695, http://lib.cqvip.com/Qikan/Article/Detail?id=7100565631.
[65] Wenming Li, Xiaochun Ye, Da Wang, Hao Zhang, Zhimin Tang, Dongrui Fan, Ninghui Sun. PIM-WEAVER: A High Energy-efficient, General-purpose Acceleration Architecture for String Operations in Big Data Processing. SUSTAINABLE COMPUTING: INFORMATICS AND SYSTEMS. 2019, 21: 129-142, http://dx.doi.org/10.1016/j.suscom.2019.01.006.
[66] Dongrui Fan. Applying CNN on a Scientific Application Accelerator Based on Dataflow Architecture. CCF Transaction on High Performance Computing (CCF THPC). 2019.
[67] Gao Yan, Liu Boxiao, Guo Nan, Ye Xiaochun, Wan Fang, You Haihang, Fan Dongrui. Utilizing the Instability in Weakly Supervised Object Detection. 2019, http://arxiv.org/abs/1906.06023.
[68] Yan Mingyu, Hu Xing, Li Shuangchen, Basak Abanti, Li Han, Ma Xin, Akgun Itir, Peng Yujing, Gu Peng, Deng Lei, Ye Xiaochun, Zhang Zhimin, Fan Dongrui, Xie Yuan, Assoc Comp Machinery. Alleviating Irregularity in Graph Analytics Acceleration: a Hardware/Software Co-Design Approach. MICRO'52: THE 52ND ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE. 2019, 615-628, http://dx.doi.org/10.1145/3352460.3358318.
[69] Gao, Yan, Liu, Boxiao, Guo, Nan, Ye, Xiaochun, Wan, Fang, You, Haihang, Fan, Dongrui, IEEE. C-MIDN: Coupled Multiple Instance Detection Network With Segmentation Guidance for Weakly Supervised Object Detection. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019). 2019, 9833-9842.
[70] 向陶然, 叶笑春, 李文明, 冯煜晶, 谭旭, 张浩, 范东睿. 基于细粒度数据流架构的稀疏神经网络全连接层加速. 计算机研究与发展[J]. 2019, 56(6): 1192-1204, http://lib.cqvip.com/Qikan/Article/Detail?id=7002192926.
[71] Sun, NingHui, Bao, YunGang, Fan, DongRui. The rise of high-throughput computing. FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING[J]. 2018, 19(10): 1245-1250, http://lib.cqvip.com/Qikan/Article/Detail?id=676786551.
[72] Tang, GuangMing, Qu, PeiYao, Ye, XiaoChun, Fan, DongRui, Sun, NingHui. 32-Bit 4 x 4 Bit-Slice RSFQ Matrix Multiplier. IEEE TRANSACTIONS ON APPLIED SUPERCONDUCTIVITY[J]. 2018, 28(7): https://www.webofscience.com/wos/woscc/full-record/WOS:000435190700001.
[73] Xie, Xiaolong, Liang, Yun, Li, Xiuhong, Wu, Yudong, Sun, Guangyu, Wang, Tao, Fan, Dongrui. CRAT: Enabling Coordinated Register Allocation and Thread-Level Parallelism Optimization for GPUs. IEEE TRANSACTIONS ON COMPUTERS[J]. 2018, 67(6): 890-897, https://www.webofscience.com/wos/woscc/full-record/WOS:000431902600010.
[74] Xiang Taoran, Feng Yujing, Ye Xiaochun, Tan Xu, Li Wenming, Zhu Yatao, Wu Meng, Zhang Hao, Fan Dongrui, IEEE. Accelerating CNN Algorithm with Fine-grained Dataflow Architectures. IEEE 20TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS / IEEE 16TH INTERNATIONAL CONFERENCE ON SMART CITY / IEEE 4TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND SYSTEMS (HPCC/SMARTCITY/DSS). 2018, 243-251, http://dx.doi.org/10.1109/HPCC/SmartCity/DSS.2018.00063.
[75] Feng Yujing, Li Han, Tan Xu, Ye Xiaochun, Fan Dongrui, Tang Zhimin, IEEE. Optimizing network efficiency of dataflow architectures through dynamic packet merging. 2018 NINTH INTERNATIONAL GREEN AND SUSTAINABLE COMPUTING CONFERENCE (IGSC). 2018, http://apps.webofknowledge.com/CitedFullRecord.do?product=UA&colName=WOS&SID=5CCFccWmJJRAuMzNPjj&search_mode=CitedFullRecord&isickref=WOS:000484460900038.
[76] Xu Tan, Xiao-Chun Ye, Xiao-Wei Shen, Yuan-Chao Xu, Da Wang, Lunkai Zhang, Wen-Ming Li, Dong-Rui Fan, Zhi-Min Tang. A Pipelining Loop Optimization Method for Dataflow Architecture. 计算机科学技术学报:英文版[J]. 2018, 33(1): 116-130, http://lib.cqvip.com/Qikan/Article/Detail?id=674567291.
[77] Tang, GuangMing, Qu, PeiYao, Ye, XiaoChun, Fan, DongRui. Logic Design of a 16-bit Bit-Slice Arithmetic Logic Unit for 32-/64-bit RSFQ Microprocessors. IEEE TRANSACTIONS ON APPLIED SUPERCONDUCTIVITY[J]. 2018, 28(4): https://www.webofscience.com/wos/woscc/full-record/WOS:000425742900001.
[78] Tan, Xu, Ye, XiaoChun, Shen, XiaoWei, Xu, YuanChao, Wang, Da, Zhang, Lunkai, Li, WenMing, Fan, DongRui, Tang, ZhiMin. A Pipelining Loop Optimization Method for Dataflow Architecture. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY[J]. 2018, 33(1): 116-130, http://lib.cqvip.com/Qikan/Article/Detail?id=674567291.
[79] 冯煜晶, 欧焱, 叶笑春, 范东睿, 谭旭, 唐志敏. 基于网络负载特征感知的数据流指令调度机制研究. 高技术通讯[J]. 2018, 28(11): 885-898, http://lib.cqvip.com/Qikan/Article/Detail?id=7001166774.
[80] Ninghui SUN, Yungang BAO, Dongrui FAN. The rise of high-throughput computing. 信息与电子工程前沿:英文版[J]. 2018, 19(10): 1245-1250, http://lib.cqvip.com/Qikan/Article/Detail?id=676786551.
[81] Tan, Xu, Shen, XiaoWei, Ye, XiaoChun, Wang, Da, Fan, DongRui, Zhang, Lunkai, Li, WenMing, Zhang, ZhiMin, Tang, ZhiMin. A Non-Stop Double Buffering Mechanism for Dataflow Architecture. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY[J]. 2018, 33(1): 145-157, http://lib.cqvip.com/Qikan/Article/Detail?id=674567293.
[82] 范东睿, 叶笑春. 众核处理器:高端计算的核心引擎. 前沿科学[J]. 2018, 12(4): 32-36, http://lib.cqvip.com/Qikan/Article/Detail?id=7001585981.
[83] Li Wenming, Ye Xiaochun, Wang Da, Zhang Hao, Wu Dongdong, Zhang Zhimin, Fan Dongrui, Chen JJ, Yang LT. WEAVER: An Energy Efficient, General-Purpose Acceleration Architecture for String Operations in Big Data Applications. 2018 IEEE INT CONF ON PARALLEL & DISTRIBUTED PROCESSING WITH APPLICATIONS, UBIQUITOUS COMPUTING & COMMUNICATIONS, BIG DATA & CLOUD COMPUTING, SOCIAL COMPUTING & NETWORKING, SUSTAINABLE COMPUTING & COMMUNICATIONS. 2018, 47-54.
[84] Feng Yujing, Xiang Taoran, Ye Xiaochun, Fan Dongrui, Wang Da, Wu Dongdong, Tang Zhimin, IEEE. Optimizing the efficiency of data transfer in dataflow architectures. IEEE 20TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS / IEEE 16TH INTERNATIONAL CONFERENCE ON SMART CITY / IEEE 4TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND SYSTEMS (HPCC/SMARTCITY/DSS). 2018, 140-149, http://dx.doi.org/10.1109/HPCC/SmartCity/DSS.2018.00050.
[85] Fan, Dongrui, Li, Wenming, Ye, Xiaochun, Wang, Da, Zhang, Hao, Tang, Zhimin, Sun, Ninghui, IEEE. SmarCo: An Efficient Many-Core Processor for High-Throughput Applications in Datacenters. 2018 24TH IEEE INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE COMPUTER ARCHITECTURE (HPCA). 2018, 596-607.
[86] Shen, XiaoWei, Ye, XiaoChun, Tan, Xu, Wang, Da, Zhang, Lunkai, Li, WenMing, Zhang, ZhiMin, Fan, DongRui, Sun, NingHui. An Efficient Network-on-Chip Router for Dataflow Architecture. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY[J]. 2017, 32(1): 11-25.
[87] 申小伟, 叶笑春, 王达, 张浩, 王飞, 谭旭, 张志敏, 范东睿, 唐志敏, 孙凝晖. 一种面向科学计算的数据流优化方法. 计算机学报[J]. 2017, 40(9): 2181-2196, http://lib.cqvip.com/Qikan/Article/Detail?id=673042586.
[88] 张洋, 李文明, 叶笑春, 王达, 范东睿, 李宏亮, 唐志敏, 孙凝晖. LFF:一种面向大数据应用的众核处理器访存公平性调度机制. 高技术通讯[J]. 2017, 27(2): 103-111, http://lib.cqvip.com/Qikan/Article/Detail?id=672300314.
[89] Dongrui Fan. An Adaptive Tuning Sparse Fast Fourier Transform. Pacific-Rim Conference on Multimedia (PCM). 2017.
[90] 胡九川, 范东睿, 李丹萍, 严龙, 叶笑春. 一种支持数据渗透迁移的片上缓存模型研究. 北京交通大学学报:自然科学版[J]. 2017, 41(5): 1-9, http://lib.cqvip.com/Qikan/Article/Detail?id=674102938.
[91] 刘炳涛, 王达, 叶笑春, 范东睿, 张志敏, 唐志敏. 基于数据流块的空间指令调度方法. 计算机研究与发展[J]. 2017, 54(4): 750-763, http://lib.cqvip.com/Qikan/Article/Detail?id=7000192386.
[92] Chu Yi, Luo Chuan, Huang Wenxuan, You Haihang, Fan Dongrui, IEEE. Hard Neighboring Variables Based Configuration Checking in Stochastic Local Search for Weighted Partial Maximum Satisfiability. 2017 IEEE 29TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2017). 2017, 139-146.
[93] Sheikh, Hafiz Fahad, Ahmad, Ishfaq, Fan, Dongrui. An Evolutionary Technique for Performance-Energy-Temperature Optimized Scheduling of Parallel Tasks on Multi-Core Processors. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS[J]. 2016, 27(3): 668-681, http://dx.doi.org/10.1109/TPDS.2015.2421352.
[94] Hu, Jiuchuan, Fan, Dongrui, Li, Danping, Yan, Long, Ye, Xiaochun, IEEE. On the Properties of Data Migration Based on Topology Pattern Keeping On Cache Hierarchy. 2016 SEVENTH INTERNATIONAL GREEN AND SUSTAINABLE COMPUTING CONFERENCE (IGSC). 2016, http://apps.webofknowledge.com/CitedFullRecord.do?product=UA&colName=WOS&SID=5CCFccWmJJRAuMzNPjj&search_mode=CitedFullRecord&isickref=WOS:000402169700007.
[95] Shen Xiaowei, Ye Xiaochun, Tan Xu, Wang Da, Zhang Zhimin, Fan Dongrui, Tang Zhimin, IEEE. POSTER: An Optimization of Dataflow Architectures for Scientific Applications. 2016 INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURE AND COMPILATION TECHNIQUES (PACT). 2016, 441-442, http://dx.doi.org/10.1145/2967938.2974054.
[96] Qi Yuqiong, Ma Lina, Li Wenming, Ye Xiaochun, Wang Da, Fan Dongrui, Sun Ninghui, Chen J, Yang LT. ACCC: An Acceleration Mechanism for Character Operation based on Cache Computing in Big Data Applications. PROCEEDINGS OF 2016 IEEE 18TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS; IEEE 14TH INTERNATIONAL CONFERENCE ON SMART CITY; IEEE 2ND INTERNATIONAL CONFERENCE ON DATA SCIENCE AND SYSTEMS (HPCC/SMARTCITY/DSS). 2016, 608-615, http://dx.doi.org/10.1109/HPCC-SmartCity-DSS.2016.56.
[97] Zhu Yatao, Zhang Shuai, Ye Xiaochun, Wang Da, Tan Xu, Fan Dongrui, Zhang Zhimin, Li Hongliang, IEEE. An Energy-efficient Bandwidth Allocation Method for Single-chip Heterogeneous Processor. 2016 SEVENTH INTERNATIONAL GREEN AND SUSTAINABLE COMPUTING CONFERENCE (IGSC). 2016, http://apps.webofknowledge.com/CitedFullRecord.do?product=UA&colName=WOS&SID=5CCFccWmJJRAuMzNPjj&search_mode=CitedFullRecord&isickref=WOS:000402169700033.
[98] Hu Jiuchuan, Fan Dongrui, Li Danping, Yan Long, Ye Xiaochun, IEEE. A Percolation Data Migration Schema in A Hybrid Cache Hierarchy. 2016 SEVENTH INTERNATIONAL GREEN AND SUSTAINABLE COMPUTING CONFERENCE (IGSC). 2016, http://apps.webofknowledge.com/CitedFullRecord.do?product=UA&colName=WOS&SID=5CCFccWmJJRAuMzNPjj&search_mode=CitedFullRecord&isickref=WOS:000402169700006.
[99] Zhu Yatao, Ye Xiaochun, Wang Da, Li Wenming, Zhang Yang, Fan Dongrui, Zhang Zhimin, Tang Zhimin, IEEE. A Framework for Energy-efficient Optimization on Multi-Cores. 2016 SEVENTH INTERNATIONAL GREEN AND SUSTAINABLE COMPUTING CONFERENCE (IGSC). 2016, http://apps.webofknowledge.com/CitedFullRecord.do?product=UA&colName=WOS&SID=5CCFccWmJJRAuMzNPjj&search_mode=CitedFullRecord&isickref=WOS:000402169700032.
[100] 张洋, 王达, 叶笑春, 朱亚涛, 范东睿, 李宏亮, 谢向辉. 众核处理器片上网络的层次化全局自适应路由机制. 计算机研究与发展[J]. 2016, 53(6): 1211-1220, http://lib.cqvip.com/Qikan/Article/Detail?id=669061058.
[101] Wang Fei, Wang Da, Yang Haigang, Xie Xianghui, Fan Dongrui. On-Chip Generating FPGA Test Configuration Bitstreams to Reduce Manufacturing Test Time. CHINESE JOURNAL OF ELECTRONICS[J]. 2016, 25(1): 64-70, http://lib.cqvip.com/Qikan/Article/Detail?id=667783130.
[102] Shen Xiaowei, Ye Xiaochun, Tan Xu, Wang Da, Zhang Zhimin, Tang Zhimin, Fan Dongrui, IEEE. Memory Partition for SIMD in Streaming Dataflow Architectures. 2016 SEVENTH INTERNATIONAL GREEN AND SUSTAINABLE COMPUTING CONFERENCE (IGSC). 2016, http://apps.webofknowledge.com/CitedFullRecord.do?product=UA&colName=WOS&SID=5CCFccWmJJRAuMzNPjj&search_mode=CitedFullRecord&isickref=WOS:000402169700035.
[103] 刘炳涛, 王达, 叶笑春, 张浩, 范东睿, 张志敏. 一种缓存数据流信息的处理器前端设计. 计算机研究与发展[J]. 2016, 53(6): 1221-1237, http://lib.cqvip.com/Qikan/Article/Detail?id=669061059.
[105] 李国杰, 范东睿. 面向高通量计算的可扩展、高效能并行微结构研究立项报告. 科技创新导报[J]. 2016, 13(9): 168-168, http://lib.cqvip.com/Qikan/Article/Detail?id=669805509.
[106] 李文明, 叶笑春, 张洋, 宋风龙, 王达, 唐士斌, 范东睿, 谢向辉. BDSim:面向大数据应用的组件化高可配并行模拟框架. 计算机学报[J]. 2015, 38(10): 1959-1975, http://lib.cqvip.com/Qikan/Article/Detail?id=666506311.
[107] 高珂, 陈荔城, 范东睿, 刘志勇. 多核系统共享内存资源分配和管理研究. 计算机学报[J]. 2015, 38(5): 1020-1034, http://lib.cqvip.com/Qikan/Article/Detail?id=664815060.
[108] Li Wenming, Fan Lingjun, Wang Zihou, Ye Xiaochun, Wang Da, Zhang Hao, Zhang Liang, Fan Dongrui, Xie Xianghui, IEEE. Thread ID Based Power Reduction Mechanism for Multi-thread Shared Set-associative Caches. 2015 SIXTH INTERNATIONAL GREEN COMPUTING CONFERENCE AND SUSTAINABLE COMPUTING CONFERENCE (IGSC). 2015, http://apps.webofknowledge.com/CitedFullRecord.do?product=UA&colName=WOS&SID=5CCFccWmJJRAuMzNPjj&search_mode=CitedFullRecord&isickref=WOS:000380428700018.
[109] Li Wenming, Zhang Liang, Ye Xiaochun, Wang Da, Zhang Hao, Wang Zihou, Fan Dongrui, IEEE. A High-Density Data Path Implementation fitting for HTC Applications. 2015 SIXTH INTERNATIONAL GREEN COMPUTING CONFERENCE AND SUSTAINABLE COMPUTING CONFERENCE (IGSC). 2015, http://apps.webofknowledge.com/CitedFullRecord.do?product=UA&colName=WOS&SID=5CCFccWmJJRAuMzNPjj&search_mode=CitedFullRecord&isickref=WOS:000380428700059.
[110] 李文明, 叶笑春, 王达, 郑方, 李宏亮, 林晗, 范东睿, 孙凝晖. MACT:高通量众核处理器离散访存请求批量处理机制. 计算机研究与发展[J]. 2015, 52(6): 1254-1265, http://lib.cqvip.com/Qikan/Article/Detail?id=665059268.
[111] 高珂, 范东睿, 刘志勇. 一种缓解多线程访存干扰的VRB内存机制. 计算机研究与发展[J]. 2015, 52(11): 2577-2588, http://lib.cqvip.com/Qikan/Article/Detail?id=666660942.
[112] 朱亚涛, 张帅, 王达, 叶笑春, 张洋, 胡九川, 张志敏, 范东睿, 李宏亮. EOFDM:一种面向众核架构的最低能耗搜索方法. 计算机研究与发展[J]. 2015, 52(6): 1303-1315, http://lib.cqvip.com/Qikan/Article/Detail?id=665059273.
[113] Gupta, Sandeep K S, Fan, Dongrui. Introduction to special issue on Selected Papers from 2013 International Green Computing Conference. SUSTAINABLE COMPUTING-INFORMATICS & SYSTEMS. 2015, 6: 1-2, http://dx.doi.org/10.1016/j.suscom.2015.01.001.
[115] Xie Xiaolong, Liang Yun, Li Xiuhong, Wu Yudong, Sun Guangyu, Wang Tao, Fan Dongrui, ACM. Enabling Coordinated Register Allocation and Thread-level Parallelism Optimization for GPUs. PROCEEDINGS OF THE 48TH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO-48). 2015, 395-406, http://dx.doi.org/10.1145/2830772.2830813.
[116] Sandeep K.S. Gupta, Dongrui Fan. Introduction to special issue on Selected Papers from 2013 International Green Computing Conference. SUSTAINABLE COMPUTING: INFORMATICS AND SYSTEMS. 2015, 6: 1-2, http://dx.doi.org/10.1016/j.suscom.2015.01.001.
[119] 范东睿. HD-NoC:面向高通量应用的高密度片上网络实现机制. HPC-China. 2015.
[120] 唐士斌, 宋风龙, 张帅, 范东睿, 刘志勇. 基于全局同步逻辑时间的访存依赖约减方法. 计算机学报[J]. 2014, 37(7): 1487-1499, http://lib.cqvip.com/Qikan/Article/Detail?id=662044928.
[121] 汤旭龙, 安虹, 范东睿. 主流视频编解码软件的硬件性能分析与设计. 计算机工程[J]. 2014, 40(6): 300-305, http://lib.cqvip.com/Qikan/Article/Detail?id=50016433.
[122] Chen, Zheng, Gu, Huaxi, Yang, Yintang, Fan, Dongrui. A Hierarchical Optical Network-On-Chip Using Central-Controlled Subnet and Wavelength Assignment. JOURNAL OF LIGHTWAVE TECHNOLOGY[J]. 2014, 32(5): 930-938, https://www.webofscience.com/wos/woscc/full-record/WOS:000330129500008.
[123] 魏海涛, 秦明康, 于俊清, 范东睿. 一种面向众核架构的数据流编译框架. 计算机学报[J]. 2014, 37(7): 1560-1569, http://lib.cqvip.com/Qikan/Article/Detail?id=662044935.
[124] Chen, Ke, Gu, Huaxi, Yang, Yintang, Fan, Dongrui. A Novel Two-Layer Passive Optical Interconnection Network for On-Chip Communication. JOURNAL OF LIGHTWAVE TECHNOLOGY[J]. 2014, 32(9): 1770-1776, https://www.webofscience.com/wos/woscc/full-record/WOS:000334741300004.
[125] Zhang, Na, Gu, Huaxi, Yang, Yintang, Fan, Dongrui. QBNoC: QoS-aware bufferless NoC architecture. MICROELECTRONICS JOURNAL[J]. 2014, 45(6): 751-758, http://dx.doi.org/10.1016/j.mejo.2014.04.015.
[126] Zhang Lunkai, Strukov Dmitri, Saadeldeen Hebatallah, Fan Dongrui, Zhang Mingzhe, Franklin Diana, IEEE. SpongeDirectory: Flexible Sparse Directories Utilizing Multi-Level Memristors. PROCEEDINGS OF THE 23RD INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES (PACT'14). 2014, 61-73.
[127] 孙公瑾, 安虹, 范东睿. 多标准视频编码器下的运动估计评估. 计算机工程[J]. 2014, 40(4): 295-300,304, http://lib.cqvip.com/Qikan/Article/Detail?id=49246178.
[128] Song, Fenglong, Tang, Shibin, Li, Wenming, Miao, Futao, Zhang, Hao, Fan, Dongrui, Liu, Zhiyong. CRANarch: A feasible processor micro-architecture for Cloud Radio Access Network. MICROPROCESSORS AND MICROSYSTEMS[J]. 2014, 38(8): 1025-1036, http://dx.doi.org/10.1016/j.micpro.2014.08.003.
[129] 熊海泉, 刘志勇, 徐卫志, 唐士斌, 范东睿. VMM中Guest OS非陷入系统调用指令截获与识别. 计算机研究与发展[J]. 2014, 51(10): 2348-2359, http://lib.cqvip.com/Qikan/Article/Detail?id=662435628.
[130] 张轮凯, 宋风龙, 王达, 范东睿, 孙凝晖. 提升稀疏目录缓存一致性系统性能的方法. 计算机研究与发展[J]. 2014, 51(9): 1955-1970, http://lib.cqvip.com/Qikan/Article/Detail?id=662178137.
[131] Dongrui Fan. BDSim: A component-based high configurable parallel simulation framework for big-data application evaluation. CCF Bigdata2014. 2014.
[132] 徐冉冉, 孟海波, 桂小琰, 申小伟, 安述倩. 面向门级网表的VLSI三模冗余加固设计. 计算机工程与科学[J]. 2014, 36(12): 2355-2360, http://lib.cqvip.com/Qikan/Article/Detail?id=663226939.
[133] 郑亚松, 王达, 叶笑春, 崔慧敏, 徐远超, 范东睿. MALK:一种高效处理大规模键值的MapReduce框架. 计算机研究与发展[J]. 2014, 51(12): 2711-2723, http://lib.cqvip.com/Qikan/Article/Detail?id=663245478.
[134] Song, Fenglong, Zheng, Yasong, Miao, Futao, Ye, Xiaochun, Zhang, Hao, Fan, Dongrui, Liu, Zhiyong, IEEE. Low Execution Efficiency: When General Multi-Core Processor Meets Wireless Communication Protocol. 2013 IEEE 15TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS & 2013 IEEE INTERNATIONAL CONFERENCE ON EMBEDDED AND UBIQUITOUS COMPUTING (HPCC_EUC). 2013, 906-913, http://dx.doi.org/10.1109/HPCC.and.EUC.2013.129.
[135] Zhang Shuai, Liu Zhiyong, Fan Dongrui, Song Fonglong, Zhang Mingzhe, IEEE. Energy-Performance Modeling and Optimization of Parallel Computing in On-Chip Networks. 2013 12TH IEEE INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS (TRUSTCOM 2013). 2013, 879-886.
[136] Ye Xiaochun, Fan Dongrui, Sun Ninghui, Tang Shibin, Zhang Mingzhe, Zhang Hao, IEEE. SimICT: A Fast and Flexible Framework for Performance and Power Evaluation of Large-Scale Architecture. 2013 IEEE INTERNATIONAL SYMPOSIUM ON LOW POWER ELECTRONICS AND DESIGN (ISLPED). 2013, 273-278, http://apps.webofknowledge.com/CitedFullRecord.do?product=UA&colName=WOS&SID=5CCFccWmJJRAuMzNPjj&search_mode=CitedFullRecord&isickref=WOS:000337238700048.
[137] Ding, Hui, Gu, Huaxi, Yang, Yintang, Fan, Dongrui. 3D Networks-on-Chip mapping targeting minimum signal TSVs. IEICE ELECTRONICS EXPRESS[J]. 2013, 10(18): https://www.webofscience.com/wos/woscc/full-record/WOS:000326194900004.
[138] Wei, Haitao, Qin, Mingkang, Zhang, Weiwei, Yu, Junqing, Fan, Dongrui, Gao, Guang R. StreamTMC: Stream compilation for tiled multi-core architectures. JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING[J]. 2013, 73(4): 484-494, http://dx.doi.org/10.1016/j.jpdc.2012.12.001.
[139] 吕慧伟, 程元, 白露, 陈明宇, 范东睿, 孙凝晖. 众核处理器和众核集群的并行模拟. 计算机研究与发展[J]. 2013, 50(5): 1110-1117, http://lib.cqvip.com/Qikan/Article/Detail?id=45617364.
[140] Dongrui Fan. International Symposium on Low Power Electronics and Desig. International Symposium on Low Power Electronics and Design. 2013.
[141] Zhang Mingzhe, Wang Da, Ye Xiaochun, He Liqiang, Fan Dongrui, Liu Zhiyong, IEEE. A Path-Adaptive Opto-Electronic Hybrid NoC for Chip Multi-Processor. 2013 12TH IEEE INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS (TRUSTCOM 2013). 2013, 1198-1205.
[142] 范涛, 刘高辉, 叶笑春, 李文明, 宋爽, 范东睿. SPARC平台模拟器源码级调试系统的研究与实现. 计算机工程与应用[J]. 2013, 49(4): 65-70, http://lib.cqvip.com/Qikan/Article/Detail?id=44810940.
[143] Dongrui Fan. An Efficient Parallel Mechanism for Highly-Debuggable Multicore Simulator. International Conference on Advanced Parallel Processing Technology (APPT). 2013.
[144] 张帅, 宋风龙, 王栋, 刘志勇, 范东睿. 多核结构片上网络性能-能耗分析及优化方法. 计算机学报[J]. 2013, 36(5): 988-1003, http://lib.cqvip.com/Qikan/Article/Detail?id=45850220.
[145] Peng, Liu, Tan, Guangming, Kalia, Rajiv K, Nakano, Aiichiro, Vashishta, Priya, Fan, Dongrui, Zhang, Hao, Song, Fenglong. Scalability study of molecular dynamics simulation on Godson-T many-core architecture. JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING[J]. 2013, 73(11): 1469-1482, http://dx.doi.org/10.1016/j.jpdc.2012.07.007.
[146] 范灵俊, 徐远超, 施巍松, 范东睿, 娄杰. 针对组相联缓存的无效缓存路访问混合过滤机制研究. 计算机学报[J]. 2013, 36(4): 799-807, http://lib.cqvip.com/Qikan/Article/Detail?id=45976851.
[147] 范东睿. MALK:面向共享存储多核系统高效处理大规模键值的MapReduce框架. CCF BigData2013. 2013.
[148] Cui, Huimin, Xue, Jingling, Wang, Lei, Yang, Yang, Feng, Xiaobing, Fan, Dongrui. Extendable Pattern-Oriented Optimization Directives. ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION[J]. 2012, 9(3).
[149] Jiao Shuai, Ienne Paolo, Ye Xiaochun, Wang Da, Fan Dongrui, Sun Ninghui, Kaklamanis C, Papatheodorou T, Spirakis PG. CRAW/P: A Workload Partition Method for the Efficient Parallel Simulation of Manycores. EURO-PAR 2012 PARALLEL PROCESSING. 2012, 7484: 102-114.
[150] Xu Weizhi, Liu Zhiyong, Wu Jun, Ye Xiaochun, Jiao Shuai, Wang Da, Song Fenglong, Fan Dongrui, IEEE. Auto-Tuning GEMV on Many-Core GPU. PROCEEDINGS OF THE 2012 IEEE 18TH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS (ICPADS 2012). 2012, 30-36.
[151] Dongrui Fan. Self-correction trace model: A full-system simulator for optical network-on-chip. Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2012. 2012.
[152] Wang Da, Zhang Lunkai, Xu Weizhi, Fan Dongrui, Wang Fei, IEEE. A SAT-Based Diagnosis Pattern Generation Method for Timing Faults in Scan Chains. 2012 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS 2012). 2012, 2308-2312.
[153] Fan, Dongrui, Zhang, Hao, Wang, Da, Ye, Xiaochun, Song, Fenglong, Li, Guojie, Sun, Ninghui. GODSON-T: AN EFFICIENT MANY-CORE PROCESSOR EXPLORING THREAD-LEVEL PARALLELISM. IEEE MICRO[J]. 2012, 32(2): 38-47, https://www.webofscience.com/wos/woscc/full-record/WOS:000302458600007.
[154] Peng, Liu, Nakano, Aiichiro, Tan, Guangming, Vashishta, Priya, Fan, Dongrui, Zhang, Hao, Kalia, Rajiv K, Song, Fenglong, ACM. Performance Analysis and Optimization of Molecular Dynamics Simulation on Godson-T Many-core Processor. PROCEEDINGS OF THE 2011 8TH ACM INTERNATIONAL CONFERENCE ON COMPUTING FRONTIERS (CF 2011). 2011, http://dx.doi.org/10.1145/2016604.2016643.
[155] Lei Yu, Zhi Yong Liu, Dong Rui Fan, Yi Ke Ma, Feng Long Song, Xiao Chun Ye, Wei Zhi Xu. Mapping Routing Lookup Algorithm on Many-Core Architecture Based on SPM and Cache Mixed Method. APPLIED MECHANICS AND MATERIALS. 2011, 1287.
[156] Dongrui Fan. Godson-T-- High-Efficient Architecture of Godson-T Many-Core Processor. HotChips. 2011.
[157] Dongrui Fan. An Efficient and Flexible Task Management for Many Cores. LNCS Transactions on High-Performance Embedded Architectures and Compilers. 2011.
[158] 马宜科, 常晓涛, 范东睿, 刘志勇. 混合体系结构中有状态硬件加速器的优化. 计算机学报[J]. 2011, 34(7): 1314-1322, http://lib.cqvip.com/Qikan/Article/Detail?id=38725757.
[159] Da Wang, Dongrui Fan, Yu Hu. A Case Study: Low Power Design-for-Testability Features of a Multi-core Processor Godson-T. ADVANCED MATERIALS RESEARCH. 2011, 1359.
[160] 焦帅, 徐卫志, 唐士斌, 范东睿, 孙凝晖. PartitionSim:一个面向众核结构的并行模拟器. 计算机学报[J]. 2011, 34(11): 2084-2092, http://lib.cqvip.com/Qikan/Article/Detail?id=40083654.
[161] 范灵俊, 颜成钢, 宋风龙, 马宜科, 范东睿. H.264去块滤波算法在众核结构上的并行优化. 小型微型计算机系统[J]. 2011, 32(11): 2263-2267, http://lib.cqvip.com/Qikan/Article/Detail?id=39785223.
[162] Lei Yu, Zhi Yong Liu, Dong Rui Fan, Yike Ma, Feng Long Song, Xiao Chun Ye, Wei Zhi Xu. Study on the Mapping of Streaming Application on Many-Core Architecture. APPLIED MECHANICS AND MATERIALS. 2011, 1287.
[163] Fan, DongRui, Li, XiaoWei, Li, GuoJie. New Methodologies for Parallel Architecture. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY[J]. 2011, 26(4): 578-587, http://lib.cqvip.com/Qikan/Article/Detail?id=38447509.
[164] Peng Liu, Tan Guangming, Kalia Rajiv K, Nakano Aiichiro, Vashishta Priya, Fang Dongrui, Sun Ninghui, Guarracino MR, Vivien F, Traff JL, Cannataro M, Danelutto M, Hast A, Perla F, Knupfer A, DiMartino B, Alexander M. Preliminary Investigation of Accelerating Molecular Dynamics Simulation on Godson-T Many-Core Processor. EURO-PAR 2010 PARALLEL PROCESSING WORKSHOPS. 2011, 6586: 349-356.
[165] Cui, Huimin, Xue, Jingling, Wang, Lei, Yang, Yang, Feng, Xiaobing, Fan, Dongrui, IEEE. Extendable Pattern-Oriented Optimization Directives. 2011 9TH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON CODE GENERATION AND OPTIMIZATION (CGO). 2011, 107-118.
[166] Dongrui Fan. Optimizing web browser on many-core architectures. 2011.
[167] Dongrui Fan. Thread Owned Block Cache: Managing Latency in Many-Core Architecture. International Conference on Parallel Computing (Euro-Par). 2010.
[168] 包尔固德, 李伟生, 范东睿, 杨扬, 马啸宇. Godson-T众核体系结构上的Broadcast性能优化. 计算机研究与发展[J]. 2010, 524-531, http://lib.cqvip.com/Qikan/Article/Detail?id=33116075.
[169] Silvano, Cristina, Fornaciari, William, Palermo, Gianluca, Zaccaria, Vittorio, Castro, Fabrizio, Martinez, Marcos, Bocchio, Sara, Zafalon, Roberto, Avasare, Prabhat, Vanmeerbeeck, Geert, YkmanCouvreur, Chantal, Wouters, Maryse, Kavka, Carlos, Onesti, Luka, Turco, Alessandro, Bondi, Umberto, Mariani, Giovanni, Posadas, Hector, Villar, Eugenio, Wu, Chris, Fan Dongrui, Hao, Zhang, Tang Shibin, IEEE Comp Soc. MULTICUBE: Multi-Objective Design Space Exploration of Multi-Core Architectures. IEEE ANNUAL SYMPOSIUM ON VLSI (ISVLSI 2010). 2010, 488-493.
[170] Cui, HuiMin, Wang, Lei, Fan, DongRui, Feng, XiaoBing. Landing Stencil Code on Godson-T. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY[J]. 2010, 25(4): 886-894, http://lib.cqvip.com/Qikan/Article/Detail?id=34470262.
[171] 叶笑春, 林伟, 范东睿, 张浩. 蛋白质序列比对算法在众核结构上的并行优化. 软件学报[J]. 2010, 3094-3105, http://lib.cqvip.com/Qikan/Article/Detail?id=36056005.
[172] Dongrui Fan. High Performance Comparison-Based Sorting Algorithm on Many-Core GPUs. International Parallel and Distributed Processing Symposium (IPDPS). 2010.
[173] 崔慧敏, 王蕾, 范东睿, 冯晓兵. Landing Stencil Code on Godson-T. 计算机科学技术学报(英文版)[J]. 2010, 886-894, http://lib.cqvip.com/Qikan/Article/Detail?id=34470262.
[174] 徐卫志, 宋风龙, 范东睿, 余磊, 张帅, 刘志勇. 众核处理器片上同步机制和评估方法研究. 计算机学报[J]. 2010, 1777-1787, http://lib.cqvip.com/Qikan/Article/Detail?id=35344799.
[175] Dongrui Fan. Efficient Address Mapping of Shared Cache for On-Chip Many-Core Architecture. International Conference on Parallel Computing (Euro-Par). 2010.
[176] Dongrui Fan. P-GAS: Parallelizing a cycle-accurate event-driven many-core processor simulator using parallel discrete event simulation. 2010.
[177] Dongrui Fan. GVE: Godson-T verification engine for many-core architecture rapid prototyping and debugging. 2010.
[178] Dongrui Fan. Minimal Multi-Threading: Finding and Removing Redundant Instructions in Multi-Threaded Processors. International Symposium on Microarchitecture (Micro). 2010.
[179] Yu Lei, Liu Zhiyong, Fan Dongrui, Song Fenglong, Zhang Junchao, Yuan Nan, IEEE COMPUTER SOC. Study on Fine-grained Synchronization in Many-Core Architecture. SNPD 2009: 10TH ACIS INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ARTIFICIAL INTELLIGENCES, NETWORKING AND PARALLEL DISTRIBUTED COMPUTING, PROCEEDINGS. 2009, 524-529, http://dx.doi.org/10.1109/SNPD.2009.61.
[180] Dongrui Fan. Evaluation method of synchronization for shared-memory on-chip many-core processor. Proceedings - 2009 IEEE International Symposium on Parallel and Distributed Processing with Applications, ISPA 2009. 2009.
[181] Yuan Nan, Zhou Yongbin, Tan Guangming, Zhang Junchao, Fan Dongrui, Sips H, Epema D, Lin HX. High Performance Matrix Multiplication on Many Cores. EURO-PAR 2009: PARALLEL PROCESSING, PROCEEDINGS. 2009, 5704: 948-959.
[182] Dongrui Fan. Design of new hash mapping functions. 2009.
[183] Dongrui Fan. GFFC: The global feedback based flow control in the NoC design for many-core processor. NPC 2009 - 6th International Conference on Network and Parallel Computing. 2009.
[184] DongRui Fan, Nan Yuan, JunChao Zhang, YongBin Zhou, Wei Lin, FengLong Song, XiaoChun Ye, He Huang, Lei Yu, GuoPing Long, Hao Zhang, Lei Liu. Godson-T: An Efficient Many-Core Architecture for Parallel Program Executions. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY[J]. 2009, 24(6): 1061-1073, https://www.webofscience.com/wos/woscc/full-record/WOS:000271535700008.
[185] Fan, DongRui, Yuan, Nan, Zhang, JunChao, Zhou, YongBin, Lin, Wei, Song, FengLong, Ye, XiaoChun, Huang, He, Yu, Lei, Long, GuoPing, Zhang, Hao, Liu, Lei. Godson-T: An Efficient Many-Core Architecture for Parallel Program Executions. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY[J]. 2009, 24(6): 1061-1073, http://lib.cqvip.com/Qikan/Article/Detail?id=32022578.
[186] Dongrui Fan. A fast linear-space sequence alignment algorithm with dynamic parallelization framework. Proceedings - IEEE 9th International Conference on Computer and Information Technology, CIT 2009. 2009.
[187] Dongrui Fan. A synchronization-based alternative to directory protocol. Proceedings - 2009 IEEE International Symposium on Parallel and Distributed Processing with Applications, ISPA 2009. 2009.
[188] Long, Guoping, Fan, Dongrui, Zhang, Junchao. Architectural Support for Cilk Computations on Many-core Architectures. ACM SIGPLAN NOTICES[J]. 2009, 44(4): 285-286, https://www.webofscience.com/wos/woscc/full-record/WOS:000272014600032.
[189] Dongrui Fan. A low-complexity synchronization based cache coherence solution for many cores. Proceedings - IEEE 9th International Conference on Computer and Information Technology, CIT 2009. 2009.
[190] Dongrui Fan. Software and hardware cooperate for 1-D FFT algorithm optimization on multicore processors. Proceedings - IEEE 9th International Conference on Computer and Information Technology, CIT 2009. 2009.
[191] 龙国平, 范东睿. LU分解在Godson-Tv1众核体系结构上的并行化研究. 计算机学报[J]. 2009, 2157-2167, http://lib.cqvip.com/Qikan/Article/Detail?id=32080304.
[192] 龙国平, 范东睿. LU分解在Godson-Tv1众核体系结构上的并行化研究. 计算机学报[J]. 2009, 32(11): 2157-2167, http://dx.doi.org/10.3724/SP.J.1016.2009.02157.
[193] Dongrui Fan. Characterizing and understanding the bandwidth behavior of workloads on multi-core processors. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2009.
[194] 宋风龙, 刘志勇, 范东睿, 张军超, 余磊. 一种片上众核结构共享Cache动态隐式隔离机制研究. 计算机学报[J]. 2009, 1896-1904, http://lib.cqvip.com/Qikan/Article/Detail?id=31781012.
[195] 张浩, 林伟, 周永彬, 叶笑春, 范东睿. 通用处理器的高带宽访存流水线研究. 计算机学报[J]. 2009, 142-151, http://lib.cqvip.com/Qikan/Article/Detail?id=29336464.
[196] Zhou Yongbin, Zhang Junchao, Zhang Shuai, Yuan Nan, Fan Dongrui, Liao XF, Jin H, Zheng R, Zou DQ. Data Management: The Spirit to Pursuit Peak Performance on Many-Core Processor. 2009 IEEE INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED PROCESSING WITH APPLICATIONS, PROCEEDINGS. 2009, 559-564, http://dx.doi.org/10.1109/ISPA.2009.22.
[197] Dongrui Fan. A Performance Model of Dense Matrix Operations on Many-core Architectures. International Conference on Parallel Computing (Euro-Par). 2008.
[198] 袁楠, 范东睿. 高性能代价比的两层关联间接转移预测器设计. 计算机学报[J]. 2008, 31(11): 1898-1906, http://lib.cqvip.com/Qikan/Article/Detail?id=28668923.
[199] 段振中, 范东睿. JTAG调试通信接口的软件模拟. 微电子学与计算机[J]. 2008, 25(2): 157-159, http://lib.cqvip.com/Qikan/Article/Detail?id=26550273.
[200] 龙国平, 张军超, 范东睿. 众核体系结构对Cilk语言的硬件支持及评测研究. 计算机学报[J]. 2008, 31(11): 1975-1985, http://lib.cqvip.com/Qikan/Article/Detail?id=28668931.
[201] 许彤, 王朋宇, 黄海林, 范东睿, 朱鹏飞, 郑保建, 曹非. 嵌入式处理器在片调试功能的验证. 计算机辅助设计与图形学学报[J]. 2007, 19(4): 502-507, http://lib.cqvip.com/Qikan/Article/Detail?id=24260721.
[202] 范东睿, 黄海林, 唐志敏. 嵌入式处理器TLB设计方法研究. 计算机学报[J]. 2006, 29(1): 73-80, http://lib.cqvip.com/Qikan/Article/Detail?id=21072974.
[203] 黄海林, 范东睿, 许彤, 唐志敏. 嵌入式处理器中访存部件的低功耗设计研究. 计算机学报[J]. 2006, 29(5): 815-821, http://lib.cqvip.com/Qikan/Article/Detail?id=21884374.
[204] 黄海林, 许彤, 范东睿, 唐志敏. 嵌入式处理器中降低Cache缺失代价设计方法研究. 小型微型计算机系统[J]. 2006, 27(11): 2077-2081, https://d.wanfangdata.com.cn/periodical/xxwxjsjxt200611019.
[205] 黄海林, 范东睿, 许彤, 朱鹏飞, 郑保建, 曹非, 陈亮. 嵌入式处理器在片调试功能的设计与实现. 计算机辅助设计与图形学学报[J]. 2006, 18(7): 1005-1010, http://lib.cqvip.com/Qikan/Article/Detail?id=22439361.
[206] 常晓涛, 范东睿, 韩银和, 张志敏. 应用输入向量控制技术降低漏电功耗的快速算法. 计算机研究与发展[J]. 2006, 43(5): 946-952, http://lib.cqvip.com/Qikan/Article/Detail?id=21816504.
[207] 范东睿. 嵌入式处理器中TLB设计方法研究. 计算机学报. 2006.
[208] Dongrui Fan. An Energy Efficient TLB Design Methodology. International Symposium on Low Power Electronics and Design (ISLPED). 2005.
[209] Fan, DR, Yang, HB, Gao, GR, Zhao, RC. Evaluation and choice of various branch predictors for low-power embedded processor. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY[J]. 2003, 18(6): 833-838, http://lib.cqvip.com/Qikan/Article/Detail?id=8906949.
[210] 蒋敬旗, 周旭, 李文, 范东睿. 系统芯片中低功耗测试的几种方法. 微电子学与计算机[J]. 2002, 19(10): 20-23, http://lib.cqvip.com/Qikan/Article/Detail?id=6962753.
[211] 李文, 周旭, 范东睿, 蒋敬旗. 可测试性设计中的功耗优化技术. 贵州工业大学学报:自然科学版[J]. 2002, 31(4): 1-7, http://lib.cqvip.com/Qikan/Article/Detail?id=6763121.
Published Books
(1) 计算机系统设计:片上系统 (Computer System Design: System-on-Chip), China Machine Press, 2015-06, second author
(2) 并行计算机组成与设计 (Parallel Computer Organization and Design), China Machine Press, 2017-05, first author

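Research Activities

Research Projects
( 1 ) Construction Principles, Supporting Technologies, and Cloud Service Applications of High-Throughput Computing Systems, principal investigator, national level, 2011-01--2015-12
( 2 ) Research on Novel Architectures for Ultra-High-Performance CPUs, principal investigator, national level, 2011-01--2011-12
( 3 ) Research on Percolation-Based Latency Tolerance Methods in Many-Core Architectures, principal investigator, national level, 2012-01--2015-12
( 4 ) Research on New Principles, Structures, and Methods for Microprocessor Chips to Extend Moore's Law, principal investigator, national level, 2005-01--2010-07
( 5 ) Research on Key Technologies of Runtime Systems Tailored to Many-Core Characteristics, principal investigator, national level, 2009-01--2010-12
( 6 ) Application of Multi-Objective Design Space Exploration to Embedded Multimedia Multiprocessor Systems-on-Chip, principal investigator, institute (university) level, 2009-01--2010-12
( 7 ) Research on Many-Core Architecture Design Methods for Bioinformatics Processing, principal investigator, provincial level, 2009-12--2011-12
( 8 ) Scalable Processor Architecture Unifying Data Parallelism and Thread Parallelism, participant, national level, 2014-01--2018-12
( 9 ) Research on Novel Architectures and Key Technology Roadmaps for Exascale Supercomputers, participant, national level, 2015-01--2016-12
( 10 ) Hyper-Parallel High-Efficiency Computer Architecture, participant, national level, 2010-01--2018-12
( 11 ) Outstanding Young Scientist, principal investigator, ministry level, 2014-10--2017-12
( 12 ) CAS Talent Program: Youth Innovation Promotion Association, principal investigator, ministry level, 2014-06--2018-12
( 13 ) Research on Thousand-Thread Parallel Many-Core CPU Architecture and Supporting Technologies, principal investigator, national level, 2014-01--2016-12
( 14 ) Research on High-Throughput Many-Core Processors, principal investigator, municipal level, 2015-01--2016-12
( 15 ) Adaptive Sensing and Correlation-Based Condensation Technologies for Multi-Source Data, principal investigator, national level, 2017-07--2020-12
( 16 ) Novel Energy-Efficient Processor Architectures for the Post-Exascale Era, principal investigator, national level, 2018-01--2022-12
( 17 ) Superconducting Computer R&D Program: System Integration Technology for Superconducting Computers, principal investigator, ministry level, 2018-01--2022-12
( 18 ) Key Technologies of High-Throughput Many-Core Architectures, principal investigator, ministry level, 2020-01--2021-12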

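Students Supervised

Former Students

雷峥蒙  Master's student  081201-Computer Architecture

安述倩  Master's student  081201-Computer Architecture

孟沫舒  Master's student  081201-Computer Architecture

宋爽  Master's student  081220-Information Security

乔雪笛  Master's student  081201-Computer Architecture

常蕾  Master's student  081201-Computer Architecture

郑亚松  Doctoral student  081201-Computer Architecture

吴飞  Master's student  081203-Computer Application Technology

张洋  Doctoral student  081201-Computer Architecture

李文明  Doctoral student  081201-Computer Architecture

王国江  Master's student  081201-Computer Architecture

马丽娜  Master's student  081201-Computer Architecture

葛长恩  Master's student  085211-Computer Technology

吴萌  Master's student  081201-Computer Architecture

Current Students

谭旭  Doctoral student  081201-Computer Architecture

冯煜晶  Doctoral student  081201-Computer Architecture

张承龙  Doctoral student  081201-Computer Architecture

向陶然  Doctoral student  081201-Computer Architecture

常成娟  Master's student  081201-Computer Architecture

欧焱  Doctoral student  081201-Computer Architecture

薛瑞  Doctoral student  081201-Computer Architecture

轩伟  Master's student  081201-Computer Architecture

吴欣欣  Master's student  081201-Computer Architecture

李涵  Doctoral student  081201-Computer Architecture

朱晓晨  Master's student  081201-Computer Architecture

谭龙  Master's student  081201-Computer Architecture

高龑  Master's student  081201-Computer Architecture

李易  Doctoral student  081201-Computer Architecture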