Publications
(1) Online Symbolic Regression with Informative Query, AAAI Conference on Artificial Intelligence (AAAI), 2023, 11th author
(2) Learning controllable elements oriented representations for reinforcement learning, NEUROCOMPUTING, 2023, 10th author
(3) Online Prototype Alignment for Few-shot Policy Transfer, International Conference on Machine Learning (ICML), 2023, 11th author
(4) Conceptual Reinforcement Learning for Language-Conditioned Tasks, AAAI Conference on Artificial Intelligence (AAAI), 2023, 11th author
(5) BALTO: fast tensor program optimization with diversity-based active learning, ICLR 2023, 2023, 10th author
(6) Rescue to the Curse of universality, SCIENCE CHINA-INFORMATION SCIENCES, 2023, 11th author
(7) BabelTower: Learning to Auto-parallelized Program Translation, Proceedings of the 39th International Conference on Machine Learning, 2022, 11th author
(8) Object-Category Aware Reinforcement Learning, NeurIPS 2022, 2022, 11th author
(9) Neural Program Synthesis with Query, ICLR 2022, 2022, 9th author
(10) Tetris: A Heuristic Static Memory Management Framework for Uniform Memory Multicore Neural Network Accelerators, JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2022, 6th author
(11) Cambricon-G: A Polyvalent Energy-Efficient Accelerator for Dynamic Graph Neural Networks, IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2022, 11th author
(12) Accelerating Sparse Convolution with Column Vector-Wise Sparsity, NeurIPS 2022, 2022, 6th author
(13) Causality-driven Hierarchical Structure Discovery for Reinforcement Learning, NeurIPS 2022, 2022, 11th author
(14) Breaking the interaction wall: A DLPU-centric deep learning computing system, IEEE Transactions on Computers, 2022, 11th author
(15) ScaleCert: Scalable Certified Defense against Adversarial Patches with Sparse Superficial Layers, 2021, 9th author
(16) Cambricon-G: A Polyvalent Energy-efficient Accelerator for Dynamic Graph Neural Networks, IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2021, 11th author
(17) A Decomposable Winograd Method for N-D Convolution Acceleration in Video Analysis, INTERNATIONAL JOURNAL OF COMPUTER VISION, 2021, 9th author
(18) Distilling Object Detectors with Feature Richness, 2021, 7th author
(19) Space-address decoupled scratchpad memory management for neural network accelerators, CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2021, 6th author
(20) Hindsight Value Function for Variance Reduction in Stochastic Dynamic Environment, 2021, 9th author
(21) 实现科技强国梦,青年科技工作者的使命担当, 科技导报, 2021, 1st author
(22) ALT: Optimizing Tensor Compilation in Deep Learning Compilers with Active Learning, 2020 IEEE 38TH INTERNATIONAL CONFERENCE ON COMPUTER DESIGN (ICCD 2020), 2020, 6th author
(23) Fixed-Point Back-Propagation Training, 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, 11th author
(24) Self-Aware Neural Network Systems: A Survey and New Perspective, PROCEEDINGS OF THE IEEE, 2020, 11th author
(25) QingLong:一种基于常变量异步拷贝的神经网络编程模型, QingLong: A Neural Network Programming Model Based on Asynchronous Copy of Constant and Variable, 计算机学报, 2020, 2nd author
(26) Addressing Irregularity in Sparse Neural Networks Through a Cooperative Software/Hardware Approach, IEEE TRANSACTIONS ON COMPUTERS, 2020, 11th author
(27) 提升高性能计算程序性能可移植性的领域特定语言, Domain-specific language for improving the performance portability of high-performance computing programs, 高技术通讯, 2020, 4th author
(28) ParaML: A Polyvalent Multicore Accelerator for Machine Learning, IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2020, 12th author
(29) Machine Learning Computers With Fractal von Neumann Architecture, IEEE TRANSACTIONS ON COMPUTERS, 2020, 10th author
(30) Cambricon-F: Machine Learning Computers with Fractal von Neumann Architecture, PROCEEDINGS OF THE 2019 46TH INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA '19), 2019, 11th author
(31) An Instruction Set Architecture for Machine Learning, ACM TRANSACTIONS ON COMPUTER SYSTEMS, 2019, 1st author
(32) TDSNN: From Deep Neural Networks to Deep Spike Neural Networks with Temporal-Coding, THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, 11th author
(33) 低面积低功耗的机器学习运算单元设计, An area and power-efficient machine learning functional unit, 高技术通讯, 2019, 5th author
(34) BSHIFT: A Low Cost Deep Neural Networks Accelerator, INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 2019, 5th author
(35) Addressing Sparsity in Deep Neural Networks, IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2019, 11th author
(36) 树立良好作风学风,是科研工作者的责任, Building a good tradition for research, 中国科学基金, 2019, 1st author
(37) Guest Editors' Introduction: Special Issue on Big Data Systems on Emerging Architectures, IEEE TRANSACTIONS ON BIG DATA, 2019, 2nd author
(38) TDSNN: from DNN to deep SNN with temporal coding, AAAI Conference on Artificial Intelligence, 2019, 11th author
(39) 智能芯片的评述和展望, A Survey of Artificial Intelligence Chip, 计算机研究与发展, 2019, 4th author
(40) 稀疏神经网络加速器设计, An accelerator for sparse neural network, 高技术通讯, 2019, 3rd author
(41) Addressing Sparsity in Deep Neural Networks, IEEE Trans. on CAD of Integrated Circuits and Systems, 2018, 1st author
(42) Using Local Clocks to Reproduce Concurrency Bugs, IEEE TRANSACTIONS ON SOFTWARE ENGINEERING (CCF A), 2018, 10th author
(43) Cambricon-S: Addressing Irregularity in Sparse Neural Networks through A Cooperative Software/Hardware Approach, 2018 51ST ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO), 2018, 11th author
(44) 深度学习编程框架, Programming frameworks for deep learning algorithms, 大数据, 2018, 3rd author
(45) BENCHIP: Benchmarking Intelligence Processors, JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2018, 14th author
(46) 新时代“科学春天”众人谈, 中国科技奖励, 2018, 10th author
(47) TuNao: A High-Performance and Energy-Efficient Reconfigurable Accelerator for Graph Processing, 2017 17TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND GRID COMPUTING (CCGRID), 2017, 9th author
(48) An Accelerator for High Efficient Vision Processing, IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2017, 10th author
(49) DLPlib: A Library for Deep Learning Processor, JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2017, 9th author
(50) Service-Oriented Architecture on FPGA-Based MPSoC, IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2017, 3rd author
(51) Secure Outsourcing of Virtual Appliance, IEEE TRANSACTIONS ON CLOUD COMPUTING, 2017, 4th author
(52) DaDianNao: A Neural Network Supercomputer, IEEE TRANSACTIONS ON COMPUTERS, 2017, 11th author
(53) 人工神经网络处理器, 中国科学. 生命科学, 2016, 1st author
(54) Accelerating Architectural Simulation Via Statistical Techniques: A Survey, IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2016, 11th author
(55) DianNao Family: Energy-Efficient Hardware Accelerators for Machine Learning, COMMUNICATIONS OF THE ACM, 2016, 11th author
(56) 多媒体技术研究:2015——类脑计算的研究进展与发展趋势, Research on multimedia technology 2015: advances and trend of brain-like computing, 中国图象图形学报, 2016, 5th author
(57) A survey of routing algorithm for mesh Network-on-Chip, FRONTIERS OF COMPUTER SCIENCE, 2016, 11th author
(58) Near-Data Processing of Neural Networks, IEEE MICRO, 2016, 11th author
(59) Cambricon-X: An Accelerator for Sparse Neural Networks, 2016 49TH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO), 2016, 9th author
(60) Cambricon: An Instruction Set Architecture for Neural Networks, Computer Architecture (ISCA), 2016 ACM/IEEE 43rd Annual International Symposium on, 2016, 11th author
(61) IMR: High-Performance Low-Cost Multi-Ring NoCs, IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2016, 11th author
(62) 类脑计算芯片与类脑智能机器人发展现状与思考, Current Status and Consideration on Brain-like Computing Chip and Brain-like Intelligent Robot, 中国科学院院刊, 2016, 2nd author
(63) Deterministic Replay: A Survey, ACM COMPUTING SURVEYS, 2015, 1st author
(64) 数据触发的基本块间弹性控制电路综合方法, A method for elastic controller synthesis using data-triggered execution across basic blocks, 高技术通讯, 2015, 2nd author
(65) 计算机系统模拟器研究综述, Survey on Computer System Simulator, 计算机研究与发展, 2015, 3rd author
(66) PuDianNao: A Polyvalent Machine Learning Accelerator, ACM SIGPLAN NOTICES, 2015, 9th author
(67) Robust Design Space Modeling, ACM TRANSACTIONS ON DESIGN AUTOMATION OF ELECTRONIC SYSTEMS, 2015, 11th author
(68) Leveraging the Error Resilience of Neural Networks for Designing Highly Energy Efficient Accelerators, IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2015, 3rd author
(69) Architecture Support for Task Out-of-Order Execution in MPSoCs, IEEE TRANSACTIONS ON COMPUTERS, 2015, 5th author
(70) Robust Architectural Design Space Modeling, ACM Transactions on Design Automation of Electronic Systems, 2015, 11th author
(71) HERMES: A Fast Cross-ISA Binary Translator with Post-Optimization, 2015 IEEE/ACM INTERNATIONAL SYMPOSIUM ON CODE GENERATION AND OPTIMIZATION (CGO), 2015, 3rd author
(72) ReCBuLC: Reproducing Concurrency Bugs Using Local Clocks, International Conference on Software Engineering, 2015, 1st author
(73) Statistical Performance Comparisons of Computers, IEEE TRANSACTIONS ON COMPUTERS, 2015, 7th author
(74) Neuromorphic Accelerators: A Comparison Between Neuroscience and Machine-Learning Approaches, PROCEEDINGS OF THE 48TH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO-48), 2015, 3rd author
(75) Reproducing Concurrency Bugs Using Local Clocks, ACM/IEEE 37th International Conference on Software Engineering (ICSE 2015) (CCF A), 2015, 9th author
(76) Retraining-Based Timing Error Mitigation for Hardware Neural Networks, 2015 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE), 2015, 10th author
(77) ShiDianNao: Shifting Vision Processing Closer to the Sensor, 2015 ACM/IEEE 42ND ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA), 2015, 8th author
(78) FreeRider: Non-Local Adaptive Network-on-Chip Routing with Packet-Carried Propagation of Congestion Information, IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2015, 9th author
(79) A Small-Footprint Accelerator for Large-Scale Neural Networks, ACM TRANSACTIONS ON COMPUTER SYSTEMS, 2015, 11th author
(80) Practical Iterative Optimization for the Data Center, ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 2015, 6th author
(81) 二维Mesh结构的片上网络中利用全局信息的路由算法, A Novel Routing Algorithm for 2D Mesh Network-on-Chip Leveraging Global Information, 计算机辅助设计与图形学学报, 2014, 2nd author
(82) 基于二进制插桩的共享指令集异构多核处理器进程迁移方法, A binary-instrumentation based execution migration method for shared ISA heterogeneous multi-core processors, 高技术通讯, 2014, 4th author
(83) Performance Portability Across Heterogeneous SoCs Using a Generalized Library-Based Approach, ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 2014, 9th author
(84) An Elastic Architecture Adaptable to Various Application Scenarios, JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2014, 2nd author
(85) 基于可行序的数据竞争检测, Data race detection via feasibleahead relation, 高技术通讯, 2014, 2nd author
(86) Pre-Silicon Bug Forecast, IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2014, 3rd author
(87) Auxiliary stream for optimizing memory access of video decoders, SCIENCE CHINA-INFORMATION SCIENCES, 2014, 3rd author
(88) DaDianNao: A Machine-Learning Supercomputer, 2014 47TH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO), 2014, 11th author
(89) An 8-Core MIPS-Compatible Processor in 32/28 nm Bulk CMOS, IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2014, 5th author
(90) Auxiliary stream for optimizing memory access of video decoders, SCIENCE CHINA-INFORMATION SCIENCES, 2014, 3rd author
(91) 面向低能耗的非精确异构多核上的运行时技术, Run-time technology for low-power oriented inexact heterogeneous multi-core, 高技术通讯, 2014, 6th author
(92) Prevention from Soft Errors via Architecture Elasticity, JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2014, 2nd author
(93) A General-Purpose Many-Accelerator Architecture Based on Dataflow Graph Clustering of Applications, JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2014, 4th author
(94) DianNao: A Small-Footprint High-Throughput Accelerator for Ubiquitous Machine-Learning, ACM SIGPLAN NOTICES, 2014, 6th author
(95) Deterministic Replay Using Global Clock, ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 2013, 1st author
(96) Microarchitectural design space exploration made fast, MICROPROCESSORS AND MICROSYSTEMS, 2013, 3rd author
(97) Effective and Efficient Microprocessor Design Space Exploration Using Unlabeled Design Configurations, ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2013, 2nd author
(98) LDet: Determinizing Asynchronous Transfer for Postsilicon Debugging, IEEE TRANSACTIONS ON COMPUTERS, 2013, 11th author
(99) Motion Estimation Without Integer-Pel Search, IEEE TRANSACTIONS ON IMAGE PROCESSING, 2013, 3rd author
(100) 超大规模集成电路可调试性设计综述, Survey of Design-for-Debug of VLSI, 计算机研究与发展, 2012, 4th author
(101) Linear Time Memory Consistency Verification, IEEE TRANSACTIONS ON COMPUTERS, 2012, 2nd author
(102) 一种用于通用处理器结构优化的矩阵乘法性能模型, Matrix Multiplication Performance Model for Optimizing General-purpose Processor Architecture, 小型微型计算机系统, 2012, 3rd author
(103) Program Regularization in Memory Consistency Verification, IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2012, 11th author
(104) 基于模型树的多核设计空间探索技术, Model Tree Based Multi-core Design Space Exploration, 计算机辅助设计与图形学学报, 2012, 3rd author
(105) 多核处理器片上网络trace压缩方法, A NOC trace compression method for multi-core processors, 高技术通讯, 2011, 3rd author
(106) 基于向量扩展多核处理器的矩阵乘法算法优化研究, Optimization of matrix multiplication based on a multi-core architecture extended with vector units, 中国科学技术大学学报, 2011, 2nd author
(107) 一种面向多核处理器的通用可调试性架构, A General DFD Infrastructure for Chip Multiprocessor, 计算机辅助设计与图形学学报, 2011, 6th author
(108) 片上多核处理器存储一致性验证, Memory Consistency Verification of Chip Multi-Processor, 软件学报, 2010, 2nd author
(109) System Architecture of Godson-3 Multi-Core Processors, JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2010, 2nd author
(110) System Architecture of Godson-3 Multi-Core Processors, 计算机科学技术学报:英文版, 2010, 2nd author
(111) LReplay: A Pending Period Based Deterministic Replay Scheme, ISCA 2010: THE 37TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE, 2010, 1st author
(112) 覆盖率驱动的随机测试生成技术综述, A Survey on Coverage Directed Generation Technology, 计算机辅助设计与图形学学报, 2009, 3rd author
(113) Testing content addressable memories with physical fault models, 半导体学报, 2009, 4th author
(114) GODSON-3: A SCALABLE MULTICORE RISC PROCESSOR WITH X86 EMULATION, IEEE MICRO, 2009, 4th author
(115) Fast Complete Memory Consistency Verification, HPCA-15 2009: FIFTEENTH INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTER ARCHITECTURE, PROCEEDINGS, 2009, 1st author
(116) Testing content addressable memories with physical fault models, JOURNAL OF SEMICONDUCTORS, 2009, 4th author
(117) 龙芯3号互联系统的设计与实现, Interconnection of Godson-3 Multi-Core Processor, 计算机研究与发展, 2008, 3rd author
(118) 一种基于SAT的运算电路查错方法, A SAT-Based Arithmetic Circuit Bug-Hunting Method, 计算机学报, 2007, 1st author
(119) 龙芯2号微处理器浮点除法功能部件的形式验证, Formal Verification of Godson-2 Microprocessor Floating-Point Division Unit, 计算机研究与发展, 2006, 1st author