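Basic Information
Xiaohui Nie (聂晓辉), male, Computer Network Information Center, Chinese Academy of Sciences
Email: xhnie@cnic.cn
Mailing address: Computer Network Information Center, Chinese Academy of Sciences, No. 2 Dongsheng South Road, Haidian District, Beijing
Postal code: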
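Admissions Information
Admission Majors
081201 - Computer Architecture
081203 - Computer Application Technology
Admission Research Directions
Intelligent operations (AIOps)
AI for Networking
Monitoring and governance of Internet fundamental resources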
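Education
2013-09--2019-07 Department of Computer Science and Technology, Tsinghua University, Doctor of Engineering
2009-09--2013-07 College of Computer Science and Technology, Jilin University, Bachelor of Science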
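Work Experience
Employment History
2024-05~present, Computer Network Information Center, Chinese Academy of Sciences, Associate Researcher
2021-10~2024-05, Beijing BizSeer Technology Co., Ltd. (北京必示科技有限公司), Researcher
2019-09~2021-09, Department of Computer Science and Technology, Tsinghua University, Postdoctoral Researcher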
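Patents and Awards
Awards
(1) First Prize of the Science Progress Award, Chinese Institute of Electronics (中国电子学会科学进步一等奖), First Prize, provincial level, 2023
Patents
(1) A call-chain-based root cause localization method, apparatus, device, and storage medium, invention patent, 2023, 2nd author, patent no. CN116820826B
(2) A TCP initial window optimization method and system, invention patent, 2022, 1st author, patent no. CN108111430B
(3) An evaluation method and system for fault localization applications based on red-blue adversarial exercises, invention patent, 2023, 4th author, patent no. CN116302762B
(4) An anomalous change detection method, apparatus, device, and storage medium, invention patent, 2023, 4th author, patent no. CN115391160B
(5) A network troubleshooting method and system, invention patent, 2022, 3rd author, patent no. CN114785666B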
Publications
Published Papers
[1] Guanglei He, Xiaohui Nie, Ruming Tang, Kun Wang, Zhaoyang Yu, Xidao Wen, Kanglin Yin, Dan Pei. Guardian of the Resiliency: Detecting Erroneous Software Changes Before They Make Your Microservice System Less Fault-Resilient. IEEE/ACM 29th International Symposium on Quality of Service (IWQoS). 2024.
[2] Shenglin Zhang, Jun Zhu, Bowen Hao, Yongqian Sun, Xiaohui Nie, Jingwen Zhu, Xilin Liu, Xiaoqian Li, Yuchi Ma, Dan Pei. Fault Diagnosis for Test Alarms in Microservices Through Multi-source Data. Proceedings of the 32nd ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE). 2024.
[3] Zhenhe Yao, Changhua Pei, Wenxiao Chen, Hanzhang Wang, Liangfei Su, Huai Jiang, Zhe Xie, Xiaohui Nie, Dan Pei. Chain-of-Event: Interpretable Root Cause Analysis for Microservices through Automatically Learning Weighted Event Causal Graph. Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering. 2024.
[4] Zhe Xie, Shenglin Zhang, Yitong Geng, Yao Zhang, Xiaohui Nie, Zhenhe Yao, Longlong Xu, Yongqian Sun, Wentao Li, Dan Pei. Microservice Root Cause Analysis With Limited Observability Through Intervention Recognition in the Latent Space. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 2024.
[5] Shenglin Zhang, Yongxin Zhao, Xiao Xiong, Yongqian Sun, Xiaohui Nie, Jiacheng Jiang, Fenglai Wang, Xian Zheng, Yuzhi Zhang, Dan Pei. Illuminating the Gray Zone: Non-intrusive Gray Failure Localization in Server Operating Systems. Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering. 2024.
[6] Yiran Cheng, Bo Cheng, Pengxiang Jin, Yongqian Sun, Xiaohui Nie, Nengwen Zhao, Shenglin Zhang, Dan Pei. Effective Attribute Selection for Multi-dimensional Root Cause Analysis. IEEE 33rd International Symposium on Software Reliability Engineering (ISSRE). 2022.
[7] Mingjie Li, Minghua Ma, Xiaohui Nie, Kanglin Yin, Li Cao, Xidao Wen, Duogang Wu, Guoying Li, Wei Liu, Xin Yang, Dan Pei, Zhiyun Yuan. Mining Fluctuation Propagation Graph Among Time Series with Active Learning. Database and Expert Systems Applications: Part I. 2022, 220-233, http://dx.doi.org/10.1007/978-3-031-12423-5_17.
[8] Xianglin Lu, Zhe Xie, Zeyan Li, Mingjie Li, Xiaohui Nie, Nengwen Zhao, Qingyang Yu, Shenglin Zhang, Kaixin Sui, Lin Zhu, Dan Pei. Generic and Robust Performance Diagnosis via Causal Inference for OLTP Database Systems. 2022 22nd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid). 2022.
[9] Mingjie Li, Zeyan Li, Kanglin Yin, Xiaohui Nie, Wenchi Zhang, Kaixin Sui, Dan Pei. Causal Inference-Based Root Cause Analysis for Online Service Systems with Intervention Recognition. Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 2022, https://dl.acm.org/doi/10.1145/3534678.3539041.
[10] Zeyan Li, Nengwen Zhao, Mingjie Li, Xianglin Lu, Lixin Wang, Dongdong Chang, Xiaohui Nie, Li Cao, Wenzhi Zhang, Kaixin Sui, Yanhua Wang, Xu Du, Guoqiang Duan, Dan Pei. Actionable and Interpretable Fault Localization for Recurring Failures in Online Service Systems. Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 2022, http://arxiv.org/abs/2207.09021.
[11] Canhua Wu, Nengwen Zhao, Lixin Wang, Xiaoqin Yang, Shining Li, Ming Zhang, Xing Jin, Xidao Wen, Xiaohui Nie, Wenchi Zhang, Kaixin Sui, Dan Pei. Identifying root-cause metrics for incident diagnosis in online service systems. IEEE 32nd International Symposium on Software Reliability Engineering (ISSRE). 2021.
[12] Zeyan Li, Junjie Chen, Rui Jiao, Nengwen Zhao, Zhijun Wang, Shuwei Zhang, Yanjun Wu, Long Jiang, Leiqin Yan, Zikai Wang, Zhekang Chen, Wenchi Zhang, Xiaohui Nie, Kaixin Sui, Dan Pei. Practical root cause localization for microservice systems via trace analysis. IEEE/ACM 29th International Symposium on Quality of Service (IWQoS). 2021.
[13] Minghua Ma, Shenglin Zhang, Junjie Chen, Jim Xu, Haozhe Li, Yongliang Lin, Xiaohui Nie, Bo Zhou, Yong Wang, Dan Pei. Jump-Starting Multivariate Time Series Anomaly Detection for Online Service Systems. USENIX ATC '21. 2021.
[14] Yuchao Zhang, Xiaohui Nie, Junchen Jiang, Wendong Wang, Ke Xu, Youjian Zhao, Martin J. Reed, Kai Chen. BDS+: A Centralized Near-Optimal Network System for Inter-Datacenter Data Replication. IEEE/ACM Transactions on Networking[J]. 2021, 29(2): 918-934, https://ieeexplore.ieee.org/abstract/document/9352539.
[15] Nengwen Zhao, Junjie Chen, Zhou Wang, Xiao Peng, Gang Wang, Yong Wu, Fang Zhou, Zhen Feng, Xiaohui Nie, Wenchi Zhang, Kaixin Sui, Dan Pei. Real-time incident prediction for online service systems. Proceedings of the 28th ACM ESEC/FSE. 2020.
[16] Nengwen Zhao, Junjie Chen, Xiao Peng, Honglin Wang, Xinya Wu, Yuanzong Zhang, Zikai Chen, Xiangzhong Zheng, Xiaohui Nie, Gang Wang, Yong Wu, Fang Zhou, Wenchi Zhang, Kaixin Sui, Dan Pei. Understanding and Handling Alert Storm for Online Service Systems. 2020 ACM/IEEE 42nd International Conference on Software Engineering: Companion Proceedings (ICSE-Companion 2020). 2020, 262-263.
[17] Lei Zou, Jing Zhu, Xiaohui Nie, Ya Su, Dan Pei, Yu Sun. A clustering-based hotspot discovery algorithm for multi-dimensional data (基于聚类的多维数据热点发现算法). Journal of Chinese Computer Systems (小型微型计算机系统)[J]. 2019, 465-471, http://lib.cqvip.com/Qikan/Article/Detail?id=88888788504849574851484849.
[18] Ping Liu, Yu Chen, Xiaohui Nie, Jing Zhu, Shenglin Zhang, Kaixin Sui, Ming Zhang, Dan Pei. FluxRank: A Widely-Deployable Framework to Automatically Localizing Root Cause Machines for Web Service Failure Mitigation. The 30th International Symposium on Software Reliability Engineering (ISSRE). 2019.
[19] Xiaohui Nie, Youjian Zhao, Zhihan Li, Guo Chen, Kaixin Sui, Jiyang Zhang, Zijie Ye, Dan Pei. Dynamic TCP Initial Windows and Congestion Control Schemes Through Reinforcement Learning. IEEE Journal on Selected Areas in Communications[J]. 2019, 37(6): 1231-1247, https://www.webofscience.com/wos/woscc/full-record/WOS:000468234400005.
[20] Yongqian Sun, Youjian Zhao, Ya Su, Dapeng Liu, Xiaohui Nie, Yuan Meng, Shiwen Cheng, Dan Pei, Shenglin Zhang, Xianping Qu, Xuanyou Guo. HotSpot: Anomaly Localization for Additive KPIs With Multi-Dimensional Attributes. IEEE Access[J]. 2018, 6: 10909-10923, https://doaj.org/article/5b0cf8b9121e42cdbba61b7d560cf337.
[21] Yuchao Zhang, Junchen Jiang, Ke Xu, Xiaohui Nie, Martin J. Reed, Haiyang Wang, Guang Yao, Miao Zhang, Kai Chen. BDS: A Centralized Near-Optimal Overlay Network for Inter-Datacenter Data Replication. EuroSys '18: Proceedings of the Thirteenth EuroSys Conference. 2018, http://dx.doi.org/10.1145/3190508.3190519.
[22] Xiaohui Nie, Youjian Zhao, Dan Pei, Guo Chen, Kaixin Sui, Jiyang Zhang. Reducing Web Latency through Dynamically Setting TCP Initial Window with Reinforcement Learning. 2018 IEEE/ACM 26th International Symposium on Quality of Service (IWQoS). 2018.
[23] Xiaohui Nie, Youjian Zhao, Guo Chen, Kaixin Sui, Yazheng Chen, Dan Pei, Miao Zhang, Jiyang Zhang. TCP WISE: One Initial Congestion Window Is Not Enough. 2017 IEEE 36th International Performance Computing and Communications Conference (IPCCC). 2017.
[24] Yuchao Zhang, Ke Xu, Guang Yao, Miao Zhang, Xiaohui Nie. PieBridge: A Cross-DR scale Large Data Transmission Scheduling System. Proceedings of the 2016 ACM Conference on Special Interest Group on Data Communication (SIGCOMM '16). 2016, 553-554, http://dx.doi.org/10.1145/2934872.2959046.
[25] Xiaohui Nie, Youjian Zhao, Kaixin Sui, Dan Pei, Yu Chen, Xianping Qu. Mining Causality Graph For Automatic Web-based Service Diagnosis. 2016 IEEE 35th International Performance Computing and Communications Conference (IPCCC). 2016.
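Research Activities
Research Projects
(1) Unsupervised anomaly detection and localization for microservices based on multimodal monitoring data, participant, national project, 2021-01--2024-12
(2) Research on an Internet service performance management architecture based on big data analysis, participant, national project, 2016-01--2018-12
(3) Network sensing and diagnosis technology, participant, domestic commissioned project, 2021-07--2021-12
(4) AIOps research project cooperation agreement with Shenzhen Securities Communication Co., Ltd. (深圳证券通信有限公司), participant, domestic commissioned project, 2020-08--2021-12
(5) Research and preliminary practice of AIOps in network operations, participant, domestic commissioned project, 2019-12--2020-12
(6) Machine learning fault diagnosis technology project for ADOTP template combination anomaly detection and ADOTS template sequence anomaly detection, participant, domestic commissioned project, 2019-08--2020-08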