Zhang Zhang, Ph.D.

Professor in the CAS 100-Talent Program

Executive Director of BIG Data Center

Beijing Institute of Genomics (BIG), Chinese Academy of Sciences (CAS)

No.1 Beichen West Road, Chaoyang District

Beijing 100101, China

Email: zhangzhang(AT)

Lab site:,

Personal profile: Google Scholar


Research Areas

Computational Biology & Bioinformatics


Education

  • Ph.D. in Computer Science, Institute of Computing Technology, Chinese Academy of Sciences, China, 2007
  • M.S. in Computer Science, Nanjing University of Science and Technology, China, 2004
  • B.S. in Computer Science, Ningxia University, China, 2002

Professional Experience

  • Executive Director of BIG Data Center, BIG, CAS, China, 2016–Present
  • Professor in the CAS 100-Talent Program, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences (CAS), China, 2011–Present
  • Research Scientist, King Abdullah University of Science and Technology, Kingdom of Saudi Arabia, 2009–2011
  • Postdoctoral Associate, Yale University, United States of America, 2007–2009


Publications

  1. BIG Data Center Members (Zhang Z, corresponding author): The BIG Data Center: from deposition to integration to translation. Nucleic Acids Res 2017, accepted and in press.

  2. The RNAcentral Consortium (including Zhang Z): RNAcentral: a comprehensive database of non-coding RNA sequences. Nucleic Acids Res 2017, [Epub ahead of print].

  3. Xue Y, ..., Zhang Z, Huang K, Yu J: Precision Medicine: What Challenges Are We Facing? Genomics Proteomics Bioinformatics 2016, 14(5):253-261.

  4. Yin HY, Wang GY, Ma LN, Yi SV, Zhang Z: What signatures dominantly associate with gene age? Genome Biology and Evolution 2016, 8(10):3083-9.

  5. Xu XJ, Ji ZH, Zhang Z: CloudPhylo: a fast and scalable tool for phylogeny reconstruction. Bioinformatics 2016, [Epub ahead of print].

  6. Sun SX, Xiao JF, Zhang HY*, Zhang Z: Pangenome evidence for higher codon usage bias and stronger translational selection in core genes of Escherichia coli. Frontiers in Microbiology 2016, 7:1180.

  7. Yin HY, Ma LN, Wang GY, Li MW, Zhang Z: Old genes experience stronger translational selection than young genes. Gene 2016, 590(1):29-34.

  8. Wang GY, Sun SX, Zhang Z: Randomness in sequence evolution increases over time. PLoS One 2016, 11(5):e0155935.

  9. IC4R Project Consortium (Zhang Z, corresponding author): Information Commons for Rice (IC4R). Nucleic Acids Res 2016, 44(D1):D1172-1180. [PMID=26519466]

  10. Zou D, Sun S, Li R, Liu J, Zhang J, Zhang Z: MethBank: a database integrating next-generation sequencing single-base-resolution DNA methylation programming data. Nucleic Acids Res 2015, 43(Database issue):D54-58. [PMID=25294826]

  11. Zou D, Ma L, Yu J, Zhang Z: Biological databases for human research. Genomics Proteomics Bioinformatics 2015, 13(1):55-63. [PMID=25712261]

  12. Ma L, Li A, Zou D, Xu X, Xia L, Yu J, Bajic VB, Zhang Z: LncRNAWiki: harnessing community knowledge in collaborative curation of human long non-coding RNAs. Nucleic Acids Res 2015, 43(Database issue):D187-192. [PMID=25399417]

  13. Bai B, Zhao WM, Tang BX, Wang YQ, Wang L, Zhang Z, Yang HC, Liu YH, Zhu JW, Irwin DM, Wang GD, Zhang YP: DoGSD: the dog and wolf genome SNP database. Nucleic Acids Res 2015, 43(Database issue):D777-783. [PMID=25404132]

  14. Zhao Y, Jia X, Yang J, Ling Y, Zhang Z, Yu J, Wu J, Xiao J: PanGP: a tool for quickly analyzing bacterial pan-genome profile. Bioinformatics 2014, 30(9):1297-1299. [PMID=24420766]

  15. Zhang Z, Zhu W, Luo J: Bringing biocuration to China. Genomics Proteomics Bioinformatics 2014, 12(4):153-155. [PMID=25042682]

  16. Zhang Z, Sang J, Ma L, Wu G, Wu H, Huang D, Zou D, Liu S, Li A, Hao L, Tian M, Xu C, Wang X, Wu J, Xiao J, Dai L, Chen LL, Hu S, Yu J: RiceWiki: a wiki-based database for community curation of rice genes. Nucleic Acids Res 2014, 42(Database issue):D1222-1228. [PMID=24136999]

  17. Xu P, Zhang X, Wang X, Li J, Liu G, Kuang Y, Xu J, Zheng X, Ren L, Wang G, Zhang Y, Huo L, Zhao Z, Cao D, Lu C, Li C, Zhou Y, Liu Z, Fan Z, Shan G, Li X, Wu S, Song L, Hou G, Jiang Y, Jeney Z, Yu D, Wang L, Shao C, Song L, Sun J, Ji P, Wang J, Li Q, Xu L, Sun F, Feng J, Wang C, Wang S, Wang B, Li Y, Zhu Y, Xue W, Zhao L, Wang J, Gu Y, Lv W, Wu K, Xiao J, Wu J, Zhang Z, Yu J, Sun X: Genome sequence and genetic diversity of the common carp, Cyprinus carpio. Nat Genet 2014, 46(11):1212-1219. [PMID=25240282]

  18. Wu J, Xiao J, Zhang Z, Wang X, Hu S, Yu J: Ribogenomics: the science and knowledge of RNA. Genomics Proteomics Bioinformatics 2014, 12(2):57-63. [PMID=24769101]

  19. Wu H, Fang Y, Yu J, Zhang Z: The quest for a unified view of bacterial land colonization. The ISME Journal 2014, 8(7):1358-1369. [PMID=24451209]

  20. Wu G, Zhu J, Yu J, Zhou L, Huang JZ, Zhang Z: Evaluation of five methods for genome-wide circadian gene identification. Journal of Biological Rhythms 2014, 29(4):231-242. [PMID=25238853]

  21. Ma L, Cui P, Zhu J, Zhang Z, Zhang Z: Translational selection in human: more pronounced in housekeeping genes. Biol Direct 2014, 9:17. [PMID=25011537]

  22. Kang Y, Gu C, Yuan L, Wang Y, Zhu Y, Li X, Luo Q, Xiao J, Jiang D, Qian M, Ahmed Khan A, Chen F, Zhang Z, Yu J: Flexibility and symmetry of prokaryotic genome rearrangement reveal lineage-associated core-gene-defined genome organizational frameworks. mBio 2014, 5(6):e01867. [PMID=25425232]

  23. Zhang Z, Yu J: Does the genetic code have a eukaryotic origin? Genomics Proteomics Bioinformatics 2013, 11(1):41-55. [PMID=23402863]

  24. Zhang Z, Wong GK, Yu J: Protein coding. Encyclopedia of Life Sciences (eLS) 2013. [Link]

  25. Wu J, Xiao J, Wang L, Zhong J, Yin H, Wu S, Zhang Z, Yu J: Systematic analysis of intron size and abundance parameters in diverse lineages. Sci China Life Sci 2013, 56(10):968-974. [PMID=24022126]

  26. Tong X, Yang Y, Wang W, Bai Z, Ma L, Zheng X, Sun H, Zhang Z, Zhao M, Yu J, Ge RL: Expression profiling of abundant genes in pulmonary and cardiac muscle tissues of Tibetan Antelope (Pantholops hodgsonii). Gene 2013, 523(2):187-191. [PMID=23612247]

  27. Ma L, Bajic VB, Zhang Z: On the classification of long non-coding RNAs. RNA Biol 2013, 10(6):925-933. [PMID=23696037]

  28. Dai L, Xu C, Tian M, Sang J, Zou D, Li A, Liu G, Chen F, Wu J, Xiao J, Wang X, Yu J, Zhang Z: Community intelligence in knowledge curation: an application to managing scientific nomenclature. PLoS One 2013, 8(2):e56961. [PMID=23451119]

  29. Dai L, Tian M, Wu J, Xiao J, Wang X, Townsend JP, Zhang Z: AuthorReward: increasing community curation in biological knowledge wikis through automated authorship quantification. Bioinformatics 2013, 29(14):1837-1839. [PMID=23732274]

  30. Chen M, Xiao J, Zhang Z, Liu J, Wu J, Yu J: Identification of human HK genes and gene expression regulation study in cancer from transcriptomics data analysis. PLoS One 2013, 8(1):e54082. [PMID=23382867]

  31. Zhang Z, Yu J: The pendulum model for genome compositional dynamics: from the four nucleotides to the twenty amino acids. Genomics Proteomics Bioinformatics 2012, 10(4):175-180. [PMID=23084772]

  32. Zhang Z, Xiao J, Wu J, Zhang H, Liu G, Wang X, Dai L: ParaAT: a parallel tool for constructing multiple protein-coding DNA alignments. Biochem Biophys Res Commun 2012, 419(4):779-781. [PMID=22390928]

  33. Zhang Z, Li J, Cui P, Ding F, Li A, Townsend JP, Yu J: Codon Deviation Coefficient: a novel measure for estimating codon usage bias and its statistical significance. BMC Bioinformatics 2012, 13(1):43. [PMID=22435713]

  34. Wu H, Zhang Z, Hu S, Yu J: On the molecular mechanism of GC content variation among eubacterial genomes. Biol Direct 2012, 7(1):2. [PMID=22230424]

  35. Wu H, Qu H, Wan N, Zhang Z, Hu S, Yu J: Strand-biased gene distribution in bacteria is related to both horizontal gene transfer and strand-biased nucleotide composition. Genomics Proteomics Bioinformatics 2012, 10(4):186-196. [PMID=23084774]

  36. Dai L, Gao X, Guo Y, Xiao J, Zhang Z: Bioinformatics clouds for big data manipulation. Biol Direct 2012, 7:43; discussion 43. [PMID=23190475]

  37. Cui P, Liu W, Zhao Y, Lin Q, Zhang D, Ding F, Xin C, Zhang Z, Song S, Sun F, Yu J, Hu S: Comparative analyses of H3K4 and H3K27 trimethylations between the mouse cerebrum and testis. Genomics Proteomics Bioinformatics 2012, 10(2):82-93. [PMID=22768982]

  38. Cui P, Ding F, Lin Q, Zhang L, Li A, Zhang Z, Hu S, Yu J: Distinct contributions of replication and transcription to mutation rate variation of human genomes. Genomics Proteomics Bioinformatics 2012, 10(1):4-10. [PMID=22449396]

  39. Zhang Z, Yu J: On the organizational dynamics of the genetic code. Genomics Proteomics Bioinformatics 2011, 9(1-2):21-29. [PMID=21641559]

  40. Zhang Z, Bajic VB, Yu J, Cheung K-H, Townsend JP: Data Integration in Bioinformatics: Current Efforts and Challenges. In: Bioinformatics - Trends and Methodologies. Edited by Mahdavi MA, vol. 1. Rijeka, Croatia: InTech; 2011: 41-56. [Link]

  41. Zhang Z, Yu J: Modeling compositional dynamics based on GC and purine contents of protein-coding sequences. Biol Direct 2010, 5(1):63. [PMID=21059261]

  42. Zhang Z, Townsend JP: The filamentous fungal gene expression database (FFGED). Fungal Genet Biol 2010, 47(3):199-204. [PMID=20025988]

  43. Zhang Z, Lopez-Giraldez F, Townsend JP: LOX: inferring Level Of eXpression from diverse methods of census sequencing. Bioinformatics 2010, 26(15):1918-1919. [PMID=20538728]

  44. Wang D, Zhang Y, Zhang Z, Zhu J, Yu J: KaKs_Calculator 2.0: a toolkit incorporating gamma-series methods and sliding window strategies. Genomics Proteomics Bioinformatics 2010, 8(1):77-80. [PMID=20451164]

  45. Qu H, Wu H, Zhang T, Zhang Z, Hu S, Yu J: Nucleotide compositional asymmetry between the leading and lagging strands of eubacterial genomes. Res Microbiol 2010, 161(10):838-846. [PMID=20868744]

  46. Zhang Z, Townsend JP: Maximum-likelihood model averaging to profile clustering of site types across discrete linear sequences. PLoS Comput Biol 2009, 5(6):e1000421. [PMID=19557160]

  47. Zhang Z, Cheung KH, Townsend JP: Bringing Web 2.0 to bioinformatics. Briefings in Bioinformatics 2009, 10(1):1-10. [PMID=18842678]

  48. Li J, Zhang Z, Vang S, Yu J, Wong GK, Wang J: Correlation between Ka/Ks and Ks is related to substitution model and evolutionary lineage. J Mol Evol 2009, 68(4):414-423. [PMID=19308632]

  49. Zheng H, Shi J, Fang X, Li Y, Vang S, Fan W, Wang J, Zhang Z, Wang W, Kristiansen K, Wang J: FGF: a web tool for Fishing Gene Family in a whole genome database. Nucleic Acids Res 2007, 35(Web Server issue):W121-125. [PMID=17584790]

  50. Zhao X, Zhang Z, Yan J, Yu J: GC content variability of eubacteria is governed by the pol III alpha subunit. Biochem Biophys Res Commun 2007, 356(1):20-25. [PMID=17336933]

  51. Hu J, Zhao X, Zhang Z, Yu J: Compositional dynamics of guanine and cytosine content in prokaryotic genomes. Res Microbiol 2007, 158(4):363-370. [PMID=17449227]

  52. Zhang Z, Yu J: Evaluation of six methods for estimating synonymous and nonsynonymous substitution rates. Genomics Proteomics Bioinformatics 2006, 4(3):173-181. [PMID=17127215]

  53. Zhang Z, Li J, Zhao XQ, Wang J, Wong GK, Yu J: KaKs_Calculator: calculating Ka and Ks through model selection and model averaging. Genomics Proteomics Bioinformatics 2006, 4(4):259-263. [PMID=17531802]

  54. Zhang Z, Li J, Yu J: Computing Ka and Ks with a consideration of unequal transitional substitutions. BMC Evol Biol 2006, 6:44. [PMID=16740169]

  55. Li H, Coghlan A, Ruan J, Coin LJ, Heriche JK, Osmotherly L, Li R, Liu T, Zhang Z, Bolund L, Wong GK, Zheng W, Dehal P, Wang J, Durbin R: TreeFam: a curated database of phylogenetic trees of animal gene families. Nucleic Acids Res 2006, 34(Database issue):D572-580. [PMID=16381935]

Research Interests

Big Data Integration and Analytics

Computational Molecular Evolution

Students



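孙世翔  Ph.D. student  0710J3-Bioinformatics

殷红彦  Ph.D. student  0710J3-Bioinformatics

于春蕾  Master's student  0710J3-Bioinformatics

徐行健  Ph.D. student  0710J3-Bioinformatics

王光煜  Ph.D. student  0710J3-Bioinformatics

夏琳  Ph.D. student  0710J3-Bioinformatics

李漫  Master's student  0710J3-Bioinformatics

曹佳宝  Master's student  085238-Bioengineering

桑健  Ph.D. student  0710J3-Bioinformatics

王佩  Master's student  0710J3-Bioinformatics

连明  Master's student  0710Z1-Genomics

张阳  Master's student  085211-Computer Technology

刘琳  Ph.D. student  0710J3-Bioinformatics

高纯纯  Ph.D. student  0710J3-Bioinformatics

张源笙  Master's student  085238-Bioengineering

陈茹茹  Master's student  0710J3-Bioinformatics

程远  Master's student  0710J3-Bioinformatics

杜强  Master's student  0710J3-Bioinformatics

冯昶瑞  Master's student  0710J3-Bioinformatics

滕徐菲  Ph.D. student  0710J3-Bioinformatics


刘晓楠  Ph.D. student  0710J3-Bioinformatics

牛广艺  Ph.D. student  0710J3-Bioinformatics

朱彤彤  Ph.D. student  0710J3-Bioinformatics

李乾鹏  Ph.D. student  0710J3-Bioinformatics

陈铭  Ph.D. student  0710J3-Bioinformatics

刘畅  Master's student  0710Z1-Genomics

张阳  Ph.D. student  0710J3-Bioinformatics

李昭  Ph.D. student  0710J3-Bioinformatics

潘榕  Master's student  0710J3-Bioinformatics

荆蔚  Master's student  0710J3-Bioinformatics

罗思成  Master's student  0710J3-Bioinformatics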