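Basic Information
Guangming Tan (谭光明)   Male   Doctoral supervisor   Institute of Computing Technology
Email: tgm@ict.ac.cn
Mailing address: No. 6 Kexueyuan South Road
Postal code: 100190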

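Research Areas

Parallel algorithm design and analysis, parallel programming and optimization, computer architecture, bioinformatics, big data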

Research Homepage (English): http://www.ncic.ac.cn/~tgm

Admissions Information

Admission Areas
High-performance computing
Computer architecture
Parallel programming

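Education

2006-08--2007-08   University of Delaware   Visiting Scholar
2002-09--2008-03   Institute of Computing Technology, Chinese Academy of Sciences   Doctor of Engineering
1998-09--2002-07   Xiangtan University   Bachelor of Science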

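Patents and Awards

2007  President's Excellence Award, Chinese Academy of Sciences
2008  Outstanding Doctoral Dissertation Award, China Computer Federation
2010  Lu Jiaxi Young Talent Award, Chinese Academy of Sciences
2011  Youth Innovation Promotion Association, Chinese Academy of Sciences
2013  National Science and Technology Progress Award, Second Class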

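Research Activities

His research focuses on algorithm design and optimization for high-performance computing. As lead of the algorithm and performance optimization direction in the Sugon (Dawning) high-performance computer team, he took part in developing the Dawning 4000, Dawning 5000, and Dawning 6000 (Nebulae) series of domestically built supercomputer systems. He has produced a number of innovative and systematic results in three areas: high-performance algorithm design, optimization of basic mathematical libraries, and domain-specific accelerated computing. He has published dozens of papers, including at top international conferences such as SC (supercomputing), PPoPP (parallel programming), and PLDI (program optimization); the SC'06 and SPAA'07 papers were first-time breakthroughs for scholars from mainland China. This work contributed key technologies to the performance optimization and application adoption of Dawning high-performance computers. He received one National Science and Technology Progress Award (Second Class), serves as an Associate Editor of IEEE TPDS, has served on the program committees of several international conferences (ISC2013, ICPP2015/2012, ICS2010, HiPC2011-12, ICPADS2009, among others), and has been a member of the Graph500 benchmark steering committee since 2010.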

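Research Projects
( 1 ) Adaptive optimization methods for sparse matrix and graph computation, principal investigator, national level, 2013-01--2016-12
( 2 ) Computing models and architecture for large-volume deep-sequencing data, participant, national level, 2012-01--2016-12
( 3 ) Construction principles, supporting technologies, and cloud service applications of high-throughput computing systems, participant, national level, 2011-01--2015-12
( 4 ) Efficient algorithms and an accelerated computing platform for massive image data processing, principal investigator, national level, 2015-01--2018-12
( 5 ) High-efficiency implementation of common algorithms in exascale scientific computing, principal investigator, national level, 2015-01--2018-12
( 6 ) Algorithm design and performance optimization for exascale high-performance scientific computing, principal investigator, municipal level, 2014-01--2017-12
( 7 ) Development and application demonstration of the exascale version of the GRAPHINE framework, principal investigator, national level, 2016-07--2020-12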

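Collaborations

Project Partner Institutions
  • Argonne National Laboratory (two doctoral students on one-year study visits)
  • University of Delaware
  • MSRA (one doctoral student interned for six months; two master's students interned for three months)
  • Institute of Applied Physics and Computational Mathematics, Beijing
  • Intel (joint laboratory)
  • NVIDIA
  • AMD
  • Sugon (joint laboratory)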

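Work Experience

Employment History
2014-10--present   Institute of Computing Technology, Chinese Academy of Sciences   Professor (研究员)
2011-11--present   State Key Laboratory of Computer Architecture   Associate Professor (副研究员)
2008-03--present   Institute of Computing Technology, Chinese Academy of Sciences   Associate Professor (副研究员)
2006-08--2007-08   University of Delaware   Visiting Scholar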

Publications

Published Papers
[1] Xie, Zhen, Tan, Guangming, Liu, Weifeng, Sun, Ninghui. A Pattern-Based SpGEMM Library for Multi-Core and Many-Core Architectures. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS[J]. 2022, 33(1): 159-175, http://dx.doi.org/10.1109/TPDS.2021.3090328.
[2] Shao, En, Tan, Guangming, Wang, Zhan, Yuan, Guojun, Cao, Zheng, Sun, Ninghui. A New Optoelectronic Hybrid Network Based on Scheduling Optimization of Optical Links. IEEE TRANSACTIONS ON COMPUTERS[J]. 2021, 70(6): 863-876, http://dx.doi.org/10.1109/TC.2021.3054308.
[3] Tan, Guangming, Shui, Chaoyang, Wang, Yinshan, Yu, Xianzhi, Yan, Yujin. Optimizing the LINPACK Algorithm for Large-Scale PCIe-Based CPU-GPU Heterogeneous Systems. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS[J]. 2021, 32(9): 2367-2380, https://www.webofscience.com/wos/woscc/full-record/WOS:000638398900004.
[4] Zhang, Xiaoyang, Xiao, Junmin, Tan, Guangming. I/O Lower Bounds for Auto-tuning of Convolutions in CNNs. 2020, http://arxiv.org/abs/2012.15667.
[5] Shui, Chaoyang, Yu, Xianzhi, Yan, Yujin, Wang, Yinshan, Meng, Ke, Tan, Guangming. Revisiting Linpack Algorithm on Large-scale CPU-GPU Heterogeneous Systems. PROCEEDINGS OF THE 25TH ACM SIGPLAN SYMPOSIUM ON PRINCIPLES AND PRACTICE OF PARALLEL PROGRAMMING (PPOPP '20). 2020, 411-412, http://dx.doi.org/10.1145/3332466.3374530.
[6] Xie, Zhen, Tan, Guangming, Liu, Weifeng, Sun, Ninghui. IA-SpGEMM: An Input-aware Auto-tuning Framework for Parallel Sparse Matrix-Matrix Multiplication. INTERNATIONAL CONFERENCE ON SUPERCOMPUTING (ICS 2019). 2019, 94-105, http://dx.doi.org/10.1145/3330345.3330354.
[7] Xiao, Junmin, Wang, Shijie, Wan, Weiqiang, Hong, Xuehai, Tan, Guangming. S-EnKF: Co-designing for Scalable Ensemble Kalman Filter. PROCEEDINGS OF THE 24TH SYMPOSIUM ON PRINCIPLES AND PRACTICE OF PARALLEL PROGRAMMING (PPOPP '19). 2019, 15-26, http://dx.doi.org/10.1145/3293883.3295722.
[8] Meng, Ke, Li, Jiajia, Tan, Guangming, Sun, Ninghui. A Pattern Based Algorithmic Autotuner for Graph Processing on GPUs. PROCEEDINGS OF THE 24TH SYMPOSIUM ON PRINCIPLES AND PRACTICE OF PARALLEL PROGRAMMING (PPOPP '19). 2019, 201-213, http://dx.doi.org/10.1145/3293883.3295716.
[9] Tan, Guangming, Liu, Junhong, Li, Jiajia. Design and Implementation of Adaptive SpMV Library for Multicore and Many-Core Architecture. ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE[J]. 2018, 44(4): https://www.webofscience.com/wos/woscc/full-record/WOS:000445637100010.
[10] Wang, Yuanrong, Li, Xueqi, Zang, Dawei, Tan, Guangming, Sun, Ninghui. Accelerating FM-index Search for Genomic Data Processing. PROCEEDINGS OF THE 47TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING. 2018.
[11] Zhou, Keren, Tan, Guangming, Zhou, Wei. Quadboost: A Scalable Concurrent Quadtree. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS[J]. 2018, 29(3): 673-686, https://www.webofscience.com/wos/woscc/full-record/WOS:000425173200014.
[12] Junhong Liu, Xin He, Weifeng Liu, Guangming Tan. Register-based Implementation of the Sparse General Matrix-Matrix Multiplication on GPUs. ACM SIGPLAN NOTICES. 2018, 53(1): 407-408.
[13] Li Xueqi, Tan Guangming, Wang Bingchen, Sun Ninghui. High-Performance Genomic Analysis Framework with In-Memory Computing. PPOPP'18: PROCEEDINGS OF THE 23RD PRINCIPLES AND PRACTICE OF PARALLEL PROGRAMMING. 2018, 317-+, http://dx.doi.org/10.1145/3178487.3178511.
[14] Li Xueqi, Tan Guangming, Zhang Chunming, Li Xu, Zhang Zhonghai, Sun Ninghui. Quantifying and Mitigating Computational Inefficiency of Genomics Data Analysis. 2017 19TH IEEE INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS (HPCC) / 2017 15TH IEEE INTERNATIONAL CONFERENCE ON SMART CITY (SMARTCITY) / 2017 3RD IEEE INTERNATIONAL CONFERENCE ON DATA SCIENCE AND SYSTEMS (DSS). 2017, 262-269, http://dx.doi.org/10.1109/HPCC-SmartCity-DSS.2017.34.
[15] Guangming Tan. A Performance Analysis Framework for Exploiting GPU Microarchitectural Capability. International Conference on Supercomputing. 2017.
[16] Guangming Tan. Understanding GPU Microarchitecture to Achieve Bare-Metal Performance Tuning. ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. 2017.
[17] Tan, Guangming, Zhang, Chunming, Tang, Wen, Zhang, Peiheng, Sun, Ninghui. Accelerating Irregular Computation in Massive Short Reads Mapping on FPGA Co-Processor. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS[J]. 2016, 27(5): 1253-1264, https://www.webofscience.com/wos/woscc/full-record/WOS:000374238100002.
[18] Guangming Tan. Graphine: Programming Graph-Parallel Computation of Large Natural Graphs on Multicore Cluster. IEEE Transactions on Parallel and Distributed Systems. 2016.
[19] Luo Yulong, Tan Guangming, Mo Zeyao, Sun Ninghui. FAST: A Fast Stencil Autotuning Framework Based on an Optimal-solution Space Model. PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON SUPERCOMPUTING (ICS'15). 2015, 187-196, http://dx.doi.org/10.1145/2751205.2751214.
[20] Yan Jie, Tan Guangming, Sun Ninghui. Study on Partitioning Real-world Directed Graphs of Skewed Degree Distribution. 2015 44TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING (ICPP). 2015, 699-708.
[21] Jie Yan, Guangming Tan, Xiuxia Zhang, Erlin Yao, Ninghui Sun. vLock: Lock Virtualization Mechanism for Exploiting Fine-grained Parallelism in Graph Traversal Algorithms. Proceedings of the 2013 IEEE/ACM International Symposium on Code Generation and Optimization (CGO 2013). 2015, 141-150.
[22] Tan, Guangming, Zhang, Chunming, Wang, Wendi, Zhang, Peiheng. SuperDragon: A Heterogeneous Parallel System for Accelerating 3D Reconstruction of Cryo-Electron Microscopy Images. ACM TRANSACTIONS ON RECONFIGURABLE TECHNOLOGY AND SYSTEMS[J]. 2015, 8(4): https://www.webofscience.com/wos/woscc/full-record/WOS:000363659000005.
[23] Yan, Jie, Tan, Guangming, Sun, Ninghui. Exploiting fine-grained parallelism in graph traversal algorithms via lock virtualization on multi-core architecture. JOURNAL OF SUPERCOMPUTING[J]. 2014, 69(3): 1462-1490, https://www.webofscience.com/wos/woscc/full-record/WOS:000342454300025.
[24] Li, Jiajia, Tan, Guangming, Chen, Mingyu, Sun, Ninghui. SMAT: An Input Adaptive Auto-Tuner for Sparse Matrix-Vector Multiplication. ACM SIGPLAN NOTICES[J]. 2013, 48(6): 117-126, https://www.webofscience.com/wos/woscc/full-record/WOS:000321865400012.
[25] Yan, Jie, Tan, GuangMing, Sun, NingHui. Optimizing Parallel Sn Sweeps on Unstructured Grids for Multi-Core Clusters. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY[J]. 2013, 28(4): 657-670, https://www.webofscience.com/wos/woscc/full-record/WOS:000321566800009.
[26] Guangming Tan. Scalability study of molecular dynamics simulation on Godson-T many-core architecture. Journal of Parallel and Distributed Computing. 2012.
[27] Guangming Tan. An Optimized Large-Scale Hybrid DGEMM Design for CPUs and ATI GPUs. 26th ACM International Conference on Supercomputing (ICS). 2012.
[28] Li Linchuan, Li Xingjian, Tan Guangming, Chen Mingyu, Zhang Peiheng. Experience of Parallelizing cryo-EM 3D Reconstruction on a CPU-GPU Heterogeneous System. HPDC 11: PROCEEDINGS OF THE 20TH INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE DISTRIBUTED COMPUTING. 2011, 195-204, http://apps.webofknowledge.com/CitedFullRecord.do?product=UA&colName=WOS&SID=5CCFccWmJJRAuMzNPjj&search_mode=CitedFullRecord&isickref=WOS:000291897200019.
[29] Guangming Tan. Fast Implementation of DGEMM on Fermi GPU. ACM/IEEE Supercomputing (SC). 2011.
[30] Guangming Tan. Analysis and Performance Results of Computing Betweenness Centrality on IBM Cyclops64. The Journal of Supercomputing. 2009.
[31] Tan Guangming, Guo Ziyu, Chen Mingyu, Meng Dan, Gschwind M, Nicolau A, Salapura V, Moreira J. Single-particle 3D Reconstruction from Cryo-Electron Microscopy Images on GPU. ICS'09: PROCEEDINGS OF THE 2009 ACM SIGARCH INTERNATIONAL CONFERENCE ON SUPERCOMPUTING. 2009, 380-389, http://apps.webofknowledge.com/CitedFullRecord.do?product=UA&colName=WOS&SID=5CCFccWmJJRAuMzNPjj&search_mode=CitedFullRecord&isickref=WOS:000268247500037.
[32] Tan, Guangming, Sun, Ninghui, Gao, Guang R. Improving Performance of Dynamic Programming via Parallelism and Locality on Multicore Architectures. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS[J]. 2009, 20(2): 261-274, http://dx.doi.org/10.1109/TPDS.2008.78.
[33] Guangming Tan. A Parallel Algorithm for Computing Betweenness Centrality. 2009.
[34] Tan Guangming, Fan Dongrui, Zhang Junchao, Russo Andrew, Gao Guang R. Experience on Optimizing Irregular Computation for Memory Hierarchy in Manycore Architecture. PPOPP'08: PROCEEDINGS OF THE 2008 ACM SIGPLAN SYMPOSIUM ON PRINCIPLES AND PRACTICE OF PARALLEL PROGRAMMING. 2008, 279-280, http://apps.webofknowledge.com/CitedFullRecord.do?product=UA&colName=WOS&SID=5CCFccWmJJRAuMzNPjj&search_mode=CitedFullRecord&isickref=WOS:000266619600035.
[35] Tan, G, Xu, L, Dai, Z, Fong, S, Sun, N. A study of architectural optimization methods in bioinformatics applications. INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS[J]. 2007, 21(3): 371-384, http://dx.doi.org/10.1177/1094342007078175.
[36] Guangming Tan. Cache Oblivious Algorithms for Nonserial Polyadic Dynamic Programming. The Journal of Supercomputing. 2007.
[37] Tan, Guangming, Sun, Ninghui, Gao, Guang R. A Parallel Dynamic Programming Algorithm on a Multi-core Architecture. SPAA'07: PROCEEDINGS OF THE NINETEENTH ANNUAL SYMPOSIUM ON PARALLELISM IN ALGORITHMS AND ARCHITECTURES. 2007, 135-+, http://apps.webofknowledge.com/CitedFullRecord.do?product=UA&colName=WOS&SID=5CCFccWmJJRAuMzNPjj&search_mode=CitedFullRecord&isickref=WOS:000266371200017.
[38] Guangming Tan. Locality and Parallelism Optimization for Dynamic Programming Algorithm in Bioinformatics. ACM/IEEE Supercomputing (SC). 2006.

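Graduate Placements

  • Xiuxia Zhang (张秀霞) (2017, NVIDIA, Silicon Valley)
  • Keren Zhou (周可人) (2017, Rice University)
  • Chuang Liu (刘闯) (2016, Microsoft Australia)
  • Hongyin Li (李红印) (2016, Baidu)
  • Zhonghai Zhang (张中海) (2015, Institute of Computing Technology, Chinese Academy of Sciences)
  • Jie Yan (闫洁) (2014, China Academy of Engineering Physics; Huawei)
  • Huiwei Lü (吕慧伟) (2013, Argonne National Laboratory)
  • Jiajia Li (李佳佳) (2013, Georgia Tech)
  • Fuxin Dai (戴福鑫) (2013, MSRA)
  • Wendi Wang (王文迪) (2012, MSRA)
  • Linchuan Li (李临川) (2012, Taobao)
  • Dengbiao Tu (涂登彪) (2011, 网安 [network security])
  • Xingjian Li (李兴建) (2011, Baidu)
  • Ziyu Guo (郭子昱) (2010, William & Mary)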