
Faculty Directory

Xiaodong Gu (顾小东)
Tenure-Track Associate Professor

Email: xiaodong.gu@sjtu.edu.cn

Institute: Institute of Interdisciplinary Frontiers (in preparation)

Homepage: https://guxd.github.io

Biography

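Xiaodong Gu is an associate professor in the School of Computer Science at Shanghai Jiao Tong University. His research focuses on large language models, intelligent software engineering, and natural language processing, with the aim of developing efficient machine learning algorithms for software code. His research topics include code LLMs, program generation and repair, and agent-based intelligent question answering. His work has been published in leading international journals and conferences such as ICSE, ICLR, FSE, AAAI, ASE, and ICPC. He teaches foundational artificial intelligence courses in the School of Computer Science, including "Machine Learning" and "Mathematical Foundations of Computer Science". He has led or participated in more than ten projects funded by the National Natural Science Foundation of China, the National Key R&D Program of China, the Natural Science Foundation of Shanghai, and companies such as Huawei and Tencent. His honors include the Shanghai Overseas High-Level Talent Program and the Huawei Spark Award.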

Courses Taught

SE3332 "Machine Learning"

SE2324 "Mathematical Foundations of Computer Science"

CS0001W "Foundations and Practice of Large Language Models"

Publications

[FSE 2026] Neuron-Guided Interpretation of Code LLMs: Where, Why, and How?
Zhe Yin, Xiaodong Gu, Beijun Shen

[FSE 2026] Beyond Language Boundaries: Uncovering Program Language Families with Code Language Models
Shangho Yun, Xiaodong Gu, Jianghong Huang, Beijun Shen

[FSE 2026] In Line with Context: Repository-Level Code Generation via Context Inlining
Chao Hu, Wenhao Zeng, Yuling Shi, Beijun Shen, Xiaodong Gu

[ICSE 2026] SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution
Han Li, Yuling Shi, Shaoxin Lin, Xiaodong Gu, Heng Lian, Xin Wang, Yantao Jia, Tao Huang, Qianxiang Wang
[paper]

[ICSE 2026 SEIP] EVOC2RUST: A Skeleton-guided Framework for Project-Level C-to-Rust Translation
Chaofan Wang, Tingrui Yu, Chen Xie, Jie Wang, Dong Chen, Wenrui Zhang, Yuling Shi, Xiaodong Gu, Beijun Shen
[paper]

[ICSE 2026] From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging
Yuling Shi, Songsong Wang, Chengcheng Wan, Min Wang and Xiaodong Gu
[paper] [code]

[ICLR 2026] Robust Preference Alignment via Directional Neighborhood Consensus
Ruochen Mao, Yuling Shi, Xiaodong Gu, and Jiaheng Wei
[paper] [code]

[ICLR 2026] Attention as a Compass: Efficient Exploration for Process-Supervised RL in Reasoning Models
Runze Liu, Jiakang Wang, Yuling Shi, Zhihui Xie, Chenxin An, Kaiyan Zhang, Jian Zhao, Xiaodong Gu, Lei Lin, Wenping Hu, Xiu Li, Fuzheng Zhang, Guorui Zhou, Kun Gai
[paper]

[AAAI 2026] Anti-Adversarial Learning: Desensitizing Prompts for Large Language Models
Xuan Li, Zhe Yin, Xiaodong Gu, and Beijun Shen
[paper]

[AAMAS 2026] HyperAgent: Leveraging Hypergraphs for Topology Optimization in Multi-Agent Communication
Heng Zhang, Yuling Shi, Xiaodong Gu, Zijian Zhang, Haochen You, Lubin Gan, Yilei Yuan, Jin Huang

[AAMAS 2026] GraphTracer: Graph-Guided Failure Tracing in LLM Agents for Robust Multi-Turn Deep Search
Heng Zhang, Yuling Shi, Xiaodong Gu, Zijian Zhang, Haochen You, Lubin Gan, Yilei Yuan, Jin Huang

[AAMAS 2026] D³MAS: Decompose, Deduce, and Distribute for Enhanced Knowledge Sharing in Multi-Agent Systems
Heng Zhang, Yuling Shi, Xiaodong Gu, Zijian Zhang, Haochen You, Lubin Gan, Yilei Yuan, Jin Huang

[TSE 2025] Synthetic Malware at Scale: Malicious Code Generation with Code Transplanting
Guangzhan Wang, Diwei Chen, Yuting Chen, Beijun Shen, Xiaodong Gu

[ASE 2025] LongCodeZip: Compress Long Context for Code Language Models
Yuling Shi, Yichun Qian, Hongyu Zhang, Beijun Shen, Xiaodong Gu
[paper] [code]

[EMNLP 2025] Transplant Then Regenerate: A New Paradigm for Text Data Augmentation
Guangzhan Wang, Hongyu Zhang, Beijun Shen and Xiaodong Gu
[paper] [code]

[EMNLP 2025 Findings] LastingBench: Defend Benchmarks Against Knowledge Leakage
Yixiong Fang, Tianran Sun, Yuling Shi, Min Wang and Xiaodong Gu
[paper] [code]

[ICSE 2025] Between Lines of Code: Unraveling the Distinct Patterns of Machine and Human Programmers
Yuling Shi, Hongyu Zhang, Chengcheng Wan and Xiaodong Gu
[paper] [code] [bibtex]

[TOSEM 2025] On the Effectiveness of Large Language Models in Domain-Specific Code Generation
Xiaodong Gu, Meng Chen, Yalan Lin, Yuhan Hu, Hongyu Zhang, Chengcheng Wan, Zhao Wei, Yong Xu, Juhong Wang
(ESI Highly Cited Paper)
[paper]

[JSS 2025] Just-in-time software defect prediction via bi-modal change representation learning
Yuze Jiang, Beijun Shen, Xiaodong Gu
[paper]

[ASE 2024] How Effectively Do Code Language Models Understand Poor-Readability Code?
Chao Hu, Yitian Chai, Hao Zhou, Fandong Meng, Jie Zhou and Xiaodong Gu
[paper] [code] [bibtex]

[TSE 2024] VarGAN: Adversarial Learning of Variable Semantic Representations
Yalan Lin, Chengcheng Wan, Shuwen Bai, Xiaodong Gu
[paper] [code]

[APSEC 2024] Unraveling the Potential of Large Language Models in Code Translation: How Far are We?
Qinxiao Tao, Tingrui Yu, Xiaodong Gu, Beijun Shen
[paper] [code]

[ASE 2023] On the Evaluation of Neural Code Translation: Taxonomy and Benchmark
Mingsheng Jiao, Tingrui Yu, Xuan Li, Guanjie Qiu, Xiaodong Gu, Beijun Shen
[paper] [slides] [code]

[ASE 2023] InfeRE: Step-by-Step Regex Generation via Chain of Inference
Shuai Zhang, Xiaodong Gu, Yuting Chen, Beijun Shen
[paper] [slides] [code] [bibtex]

[ESEC/FSE 2023] Self-Supervised Query Reformulation for Code Search
Yuetian Mao, Chengcheng Wan, Yuze Jiang, Xiaodong Gu
[paper] [slides] [code] [bibtex]

[FSE 2022] Diet Code Is Healthy: Simplifying Programs for Pre-Trained Models of Code
Zhaowei Zhang, Hongyu Zhang, Beijun Shen, Xiaodong Gu
[paper] [slides] [code] [bibtex]

[ICSE 2022] Cross-Domain Deep Code Search with Meta Learning
Yitian Chai, Hongyu Zhang, Beijun Shen and Xiaodong Gu
[paper] [code] [slides] [bibtex]

Funded Projects

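  • Natural Science Foundation of Shanghai (General Program), Automatic Program Generation for Complex Scenarios, 2025.7-2028.6, Principal Investigator

  • CCF-Huawei Populus Grove Fund, Improving Multi-Agent Capabilities for Issue Resolution, 2025.1.1-2025.7.31, Principal Investigator

  • Huawei, Scenario-Knowledge-Enhanced Automatic Java Code Generation, 2024.9.1-2025.2.25, Principal Investigator

  • CATL (宁德时代), LLM-Based Software Requirement Standardization, 2024.6.1-2025.5.31, Principal Investigator

  • CATL (宁德时代), LLM-Based Test Case Conversion, 2024.6.1-2025.1.31, Principal Investigator

  • CATL (宁德时代), LLM-Based Fuzzy Variable Search, 2024.6.1-2025.1.31, Principal Investigator

  • National Key R&D Program of China, Low-Code Development Methods and Environments for Scenario Computing, 2023.12-2026.12, Participant

  • China National Aeronautical Radio Electronics Research Institute, Civil Aircraft Software Development Process Assistance System, 2022.12-2026.6, Principal Investigator

  • Shanghai Jiao Tong University-Huawei Innovation Lab for Cryptography and Digital Trust, LLM-Based Generation of Malicious Code Samples, 2023.5.1-2024.4.31, Principal Investigator

  • CCF-Tencent Rhino-Bird Fund, Domain-Specific Automatic Program Generation, 2022.10.1-2023.12.31, Principal Investigator

  • CCF-Baidu Songguo (Pinecone) Fund, Program Representation Based on Pre-Trained Models, 2021.9.1-2022.8.30, Principal Investigator

  • National Natural Science Foundation of China, Cross-Language Automatic Program Generation Based on Few-Shot Learning, 2022.1.1-2024.12.31, Principal Investigator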


Awards

Huawei Spark Award, 2022