Professor & Associate Director

Department of High-Performance Computing Technology and Application Development

The Supercomputing Center of Chinese Academy of Sciences

Box 349

Beijing, China, 100190

Phone: 010-58812132, 18101079362

Email: &

Research Areas

My main research field is the development of computational methods for the analysis of biological datasets, especially cancer genomic and metagenomic data, using high-performance computing technology. In particular, I have been developing computational tools for working with data from next-generation sequencing technologies (Illumina and 454).



Education

Ph.D. in Computer Software and Theory, Supercomputing Center, Chinese Academy of Sciences, Beijing, China.

Thesis Title: Research on Short Oligonucleotide Alignment and Assembly Algorithm. 

Advisor: Professor Xuebin Chi

1998 – 2002

S.B. in Computer Science, Shandong Agricultural University, Shandong, China.


Work Experience

2015 – present

Professor (100 Talents Program of the Chinese Academy of Sciences), (Computational Cancer Genomics and Precision Medicine Big Data Analysis), The Supercomputing Center of Chinese Academy of Sciences, Beijing, China.

Sponsor: Chinese Academy of Sciences.

·  Lead the development of the Precision Medicine Workstation (PMW) pipeline for precision medicine big data analysis.

·  Lead the development of the HotSpot3D software. HotSpot3D is an automated, statistically based approach for spatial clustering of mutations and drugs that presents an opportunity for identifying novel functional mutations and therapeutic targets in human diseases, including cancer.


Staff Scientist, (Computational Cancer Genomics), The Genome Institute, School of Medicine, Washington University in St. Louis, Missouri, US.

Sponsor: Professor Li Ding.

·  Led the development of a computational pipeline (MUSIC2) for determining the significance of somatic mutations found within a given cohort of cancer samples and with respect to a variety of external data sources. Involved in the analysis of 3D mutation proximity.

·  Set up cloud computing environments using Apache Hadoop (MapReduce framework) for the large-scale cancer genomics BD2K (Big Data to Knowledge) project from ICGC (International Cancer Genome Consortium).

·  Identifying and characterizing somatic/germline genetic changes relevant to cancer initiation and progression as well as drug response by integrating various data types including DNA, RNA, and proteomics data.

·  Development of algorithms and computational tools to facilitate the translation of genomic findings to clinical practice.


Postdoctoral Associate, (Bioinformatics & Parallel Computing), California Institute for Telecommunications and Information Technology (CALIT2) & The Center for Research in Biological Systems (CRBS) & San Diego Supercomputer Center (SDSC), University of California, San Diego, California, US.

Sponsor: Doctor Weizhong Li.

·  Developing novel computational methods for HMP (Human Microbiome Project) and other metagenomics projects. Developing computational and informatics tools for next-generation sequencing data analysis.

·  Testing the Gordon supercomputer at SDSC (San Diego Supercomputer Center) using large bioinformatics datasets for the NSF XSEDE (eXtreme Science and Engineering Discovery Environment) project.

·  Study of artifact detection in 454 pyrosequencing data. Designed an algorithm for detecting artifacts in metagenomic sequencing data produced by 454 pyrosequencing technology.

·  Improvement of metagenome assembly using a pre-clustering method.

2007 – 2008

Research Intern, (Bioinformatics), Beijing Genomics Institute (BGI), Beijing, China.

Sponsor: Professor Jun Wang

·  Study of the parallelization of short-read assembly algorithms based on de Bruijn graphs.

2003 – 2007

Research Assistant, (Parallel Computing), Supercomputing Center of Chinese Academy of Sciences, Beijing, China.

Sponsor: Professor Xuebin Chi.

·  Study of parallel computing, grid computing and bioinformatics (short oligonucleotide alignment and assembly algorithms).

·  Conducted parallelization research on the alternative splicing of mRNA and parallelized the AltSplice program as MPI_AltSplice, which was designed and implemented in an MPI environment.

·  Development of a commodity supercomputing environment, ScBioGrid, for supporting bioinformatics research.

·  Linux system administration using ROCKS and Sun Grid Engine.

Publications



Cyriac Kandoth*, Michael D McLellan*, Fabio Vandin, Kai Ye, Beifang Niu, Charles Lu, Mingchao Xie, Qunyuan Zhang, Joshua F McMichael, Matthew A Wyczalkowski, Mark DM Leiserson, Christopher A Miller, John S Welch, Matthew J Walter, Michael C Wendl, Timothy J Ley, Richard K Wilson, Benjamin J Raphael, Li Ding#. (2013) Mutational landscape and significance across 12 major cancer types. Nature. 502 (7471), 333-339. (SCI)

Beifang Niu*, Adam D. Scott*, Sohini Sengupta*, Matthew H. Bailey, Prag Batra, Jie Ning, Matthew A. Wyczalkowski, Wen-Wei Liang, Qunyuan Zhang, Michael D. McLellan, Sam Q. Sun, Piyush Tripathi, Carolyn Lou, Kai Ye, Robert J. Mashl, John Wallis, Michael C. Wendl, Feng Chen#, and Li Ding#. (2016) Protein-structure-guided discovery of functional mutations across 19 cancer types. Nature Genetics. doi: 10.1038/ng.3586. (SCI)

Beifang Niu*#, Xianyu Lang, Zhonghua Lu and Xuebin Chi. (2007) ScBioGrid: a commodity supercomputing environment supporting bioinformatics research. International Journal of Computer Mathematics. 84(2):177-182. (SCI)

Song Cao*, Michael C. Wendl, Matthew A. Wyczalkowski, Kristine Wylie, Kai Ye, Reyka Jayasinghe, Mingchao Xie, Song Wu, Beifang Niu, Robert Grubb III, Kimberly J. Johnson, Hiram Gay, Ken Chen, Janet S. Rader, John F. Dipersio, Feng Chen# & Li Ding#. (2016) Divergent viral presentation among human tumors and adjacent normal tissues. Scientific Reports. 6:28294. doi: 10.1038/srep28294. (SCI)

Beifang Niu*, Kai Ye*, Qunyuan Zhang, Charles Lu, Mingchao Xie, Michael D. McLellan, Michael C. Wendl and Li Ding#. (2014) MSIsensor: microsatellite instability detection using paired tumor-normal sequence data. Bioinformatics. 30(7):1015-6. doi: 10.1093/bioinformatics/btt755. (SCI)

Beifang Niu*#, Xianyu Lang, Zhonghua Lu and Xuebin Chi. (2009) Parallel Algorithm Research on Several Important Open Problems in Bioinformatics. Interdisciplinary Sciences--Computational Life Sciences. 1(3): 187-195. (SCI)

Matthew J. Walter*, Dong Shen*, Jin Shao*, Li Ding, Brian White, Cyriac Kandoth, Christopher A. Miller, Beifang Niu, Michael D. McLellan, Nathan D. Dees, Robert Fulton, Kevin Elliot, Sharon Heath, Marcus Grillot, Peter Westervelt, Daniel C. Link, John F. DiPersio, Elaine Mardis, Timothy J. Ley, Richard K. Wilson, Timothy A. Graubert#. (2013) Clonal Diversity of Recurrently Mutated Genes in Myelodysplastic Syndromes. Leukemia. 27, 1275–1282. doi:10.1038/leu.2013.58. (SCI)

Ying Huang*, Beifang Niu, Ying Gao, Limin Fu, and Weizhong Li#. (2010) CD-HIT Suite: a web server for clustering and comparing biological sequences. Bioinformatics. 26(5): 680–682. doi:10.1093/bioinformatics/btq003.  (SCI)

Kai Ye*, Jiayin Wang*, Reyka Jayasinghe, Eric-Wubbo Lameijer, Joshua F McMichael, Jie Ning, Michael D McLellan, Mingchao Xie, Song Cao, Venkata Yellapantula, Kuan-lin Huang, Adam Scott, Steven Foltz, Beifang Niu, Kimberly J Johnson, Matthijs Moed, P Eline Slagboom, Feng Chen, Michael C Wendl, Li Ding#. (2016) Systematic discovery of complex insertions and deletions in human cancers. Nature Medicine. 22(1):97-104. doi: 10.1038/nm.4002. (SCI)

Beifang Niu*, Zhengwei Zhu, Limin Fu, Sitao Wu, and Weizhong Li#. (2011) FR-HIT, a Very Fast Program to Recruit Metagenomic Reads to Homologous Reference Genomes. Bioinformatics. 27(12): 1704-1705. doi: 10.1093/bioinformatics/btr252. (SCI)

Charles Lu*, Mingchao Xie*, Michael C. Wendl, Jiayin Wang, Michael D. McLellan, Mark D. M. Leiserson, Kuan-lin Huang, Matthew A. Wyczalkowski, Reyka Jayasinghe, Tapahsama Banerjee, Jie Ning, Piyush Tripathi, Qunyuan Zhang, Beifang Niu, Kai Ye, Heather K. Schmidt, Robert S. Fulton, Joshua F. McMichael, Prag Batra, Cyriac Kandoth, Maheetha Bharadwaj, Daniel C. Koboldt, Christopher A. Miller, Krishna L. Kanchi, James M. Eldred, David E. Larson, John S. Welch, Ming You, Bradley A. Ozenberger, Ramaswamy Govindan, Matthew J. Walter, Matthew J. Ellis, Elaine R. Mardis, Timothy A. Graubert, John F. Dipersio, Timothy J. Ley, Richard K. Wilson, Paul J. Goodfellow, Benjamin J. Raphael, Feng Chen, Kimberly J. Johnson, Jeffrey D. Parvin & Li Ding#. (2015) Patterns and functional implications of rare germline variants across 12 cancer types. Nature Communications. 6: 10086. doi: 10.1038/ncomms10086. (SCI)

Beifang Niu*, Limin Fu, Shulei Sun and Weizhong Li#.  (2010) Artificial and natural duplicates in pyrosequencing reads of metagenomic data. BMC Bioinformatics. 11:187. doi: 10.1186/1471-2105-11-187. (SCI)

Katherine A Hoadley*, Christina Yau*, Denise M Wolf*, Andrew D Cherniack*, David Tamborero, Sam Ng, Max DM Leiserson, Beifang Niu, Michael D McLellan, Vladislav Uzunangelov, Jiashan Zhang, Cyriac Kandoth, Rehan Akbani, Hui Shen, Larsson Omberg, Andy Chu, Adam A Margolin, Laura J van’t Veer, Nuria Lopez-Bigas, Peter W Laird, Benjamin J Raphael, Li Ding, A Gordon Robertson, Lauren A Byers, Gordon B Mills, John N Weinstein, Carter Van Waes, Zhong Chen, Eric A Collisson, Christopher C Benz#, Charles M Perou#, Joshua M Stuart#, Cancer Genome Atlas Research Network. (2014) Multiplatform analysis of 12 cancer types reveals molecular classification within and across tissues of origin. Cell. 158(4): 929-944. doi: 10.1016/j.cell.2014.06.049. (SCI)

Ken Chen*#, Nicholas E Navin, Yong Wang, Heather K Schmidt, John W Wallis, Beifang Niu, Xian Fan, Hao Zhao, Michael D McLellan, Katherine A Hoadley, Elaine R Mardis, Timothy J Ley, Charles M Perou, Richard K Wilson, Li Ding#. (2013) BreakTrans: uncovering the genomic architecture of gene fusions. Genome Biology. 14(8):R87. doi: 10.1186/gb-2013-14-8-r87. (SCI)

Limin Fu*, Beifang Niu, Zhengwei Zhu, Sitao Wu and Weizhong Li#. (2012) CD-HIT: accelerated for clustering the next generation sequencing data. Bioinformatics. 28(23): 3150–3152. doi:  10.1093/bioinformatics/bts565. (SCI)

Zhengwei Zhu*, Beifang Niu, Sitao Wu, and Weizhong Li#. (2013) MGAviewer: A desktop visualization tool for analysis of metagenomics alignment data. Bioinformatics. 29(1):122-3. doi: 10.1093/bioinformatics/bts567. (SCI)

Beifang Niu*, Xianyu Lang, Zhonghua Lu and Xuebin Chi. (2008) Parallelization research on mRNA alternative splicing. Application Research of Computers. Vol. 25, No. 3, Mar. 2008, 705-708. (EI)

Beifang Niu*, Xiguang Zhang, Tao Liu, Xianyu Lang, Zhonghua Lu and Xuebin Chi. (2009) An Alignment and Assembly Algorithm Based on New Genome Sequencing Technique. Computer Engineering. Vol. 35, No. 20, Oct. 2009, 4-6.

Mark DM Leiserson*, Fabio Vandin*, Hsin-Ta Wu, Jason R Dobson, Jonathan V Eldridge, Jacob L Thomas, Alexandra Papoutsaki, Younhun Kim, Beifang Niu, Michael McLellan, Michael S Lawrence, Abel Gonzalez-Perez, David Tamborero, Yuwei Cheng, Gregory A Ryslik, Nuria Lopez-Bigas, Gad Getz, Li Ding, Benjamin J Raphael#. (2015) Pan-cancer network analysis identifies combinations of rare somatic mutations across pathways and protein complexes. Nature Genetics. 47(2): 106-114. doi: 10.1038/ng.3168. (SCI)

Weizhong Li*#, Limin Fu, Beifang Niu, Sitao Wu and John Wooley. (2012) Ultrafast clustering algorithms for metagenomic sequence analysis. Briefings in Bioinformatics. 13(6):656-668. doi: 10.1093/bib/bbs035. (SCI)

Sitao Wu*, Zhengwei Zhu, Limin Fu, Beifang Niu and Weizhong Li#. (2011) WebMGA: a Customizable Web Server for Fast Metagenomic Sequence Analysis. BMC Genomics. 12:444. doi:10.1186/1471-2164-12-444. (SCI)

Beifang Niu*, Zhengwei Zhu, Limin Fu, Sitao Wu, and Weizhong Li. FR-HIT Overview. Encyclopedia of Metagenomics. SpringerReference. 2014. Book chapter.

Beifang Niu*, Sitao Wu, and Weizhong Li. Clustering-based HMP Sequence Comparison. Encyclopedia of Metagenomics. SpringerReference. 2014. Book chapter.

Research Interests

  • Bioinformatics software development and algorithm optimization.

  • Cancer genomics and Metagenomics.

  • Large-scale precision medicine data analysis; sequence clustering, alignment and assembly algorithms for next-generation sequencing data.

  • High performance parallel computing and cloud computing.


Beifang Niu, John Wallis, Qunyuan Zhang, Piyush Tripathi, Mike McLellan, Matt Wyczalkowski, Cyriac Kandoth, Maheetha Bharadwaj, Kai Ye, R. Jay Mashl, Mike Wendl, Feng Chen, Li Ding. HOTSPOT3D: A novel computational tool for inferring functional importance of cancer mutations through 3D proximity analyses. Genome Informatics 2013, Cold Spring Harbor, New York, US. Oct 31-Nov 2, 2013. Poster.

Beifang Niu, Limin Fu, Sitao Wu and Weizhong Li. Filtering out extra redundancy significantly improves the assembly of metagenomic short reads. International Human Microbiome Congress (IHMC), St Louis, US. March 9-11, 2011. Poster.

Beifang Niu, Zhengwei Zhu, Limin Fu, Sitao Wu, and Weizhong Li. FR-HIT, a Very Fast Program to Recruit Metagenomic Reads to Homologous Reference Genomes. International Human Microbiome Congress (IHMC), St Louis, US. March 9-11, 2011. Poster.

Beifang Niu, Xianyu Lang, Zhonghua Lu and Xuebin Chi. ScBioGrid: a commodity supercomputing environment supporting bioinformatics research. Distributed Computing and its Applications in Business, Engineering, and Sciences / International Conference on Parallel Algorithms and Computational Environment (DCABES/ICPACE), August 25-27, 2005, London, UK. Presentation.


The McDonnell Genome Institute, School of Medicine, Washington University in St. Louis, Missouri, US.

San Diego Supercomputer Center (SDSC), University of California, San Diego, California, US.

The J. Craig Venter Institute (JCVI), La Jolla, San Diego, California, US. 

The General Hospital of the People's Liberation Army (PLAGH), Beijing, China.



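Students

陈玮  Master's student  081202 - Computer Software and Theory

李瑞琳  Ph.D. student  081202 - Computer Software and Theory

赵丹  Master's student  085211 - Computer Technology

张裕  Master's student  081202 - Computer Software and Theory

祝海栋  Master's student  081203 - Computer Application Technology

何小雨  Ph.D. student  081202 - Computer Software and Theory

韩鑫胤  Master's student  081202 - Computer Software and Theory

郝卉群  Ph.D. student  081202 - Computer Software and Theory

何志鹏  Master's student  085211 - Computer Technology

张舒莹  Master's student  085211 - Computer Technology

袁丹阳  Master's student  081202 - Computer Software and Theory

代闯闯  Ph.D. student  081202 - Computer Software and Theory

余果  Master's student  025100 - Finance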

Honors & Distinctions


Top 10 Clinical Research Achievement Awards for 2015 (US) (Multiplatform analysis of 12 cancer types reveals molecular classification within and across tissues of origin)


Member, The Clinical Proteomic Tumor Analysis Consortium (CPTAC) Cancer Biology Working Group


Member, American Association for Cancer Research (AACR)


Member, iSeqTools Network


Member, International Cancer Genome Consortium (ICGC)


Member, The Cancer Genome Atlas (TCGA) Pan Cancer Working Group


Member, The Cancer Genome Atlas (TCGA) Lung Cancer Analysis Working Group


Member, The Cancer Genome Atlas (TCGA) Endometrial Cancer Analysis Working Group


Member, The International Society for Computational Biology (ISCB)


Genome Research, Bioinformatics, Nucleic Acids Research, BMC Bioinformatics and BMC Genomics journals, IEEE Transactions on Computational Biology and Bioinformatics (TCBB)