Ran Xu

Room N410, Mathematics and Science Center

400 Dowman Dr, Atlanta, GA 30307

My name is Ran Xu. I’m a final-year Ph.D. student in the Department of Computer Science at Emory University, co-advised by Prof. Carl Yang and Prof. Joyce C. Ho. Before that, I obtained my bachelor’s degree (with Highest Honors) from the Department of Computer Science at Emory University in 2021, where I worked with Prof. Jinho Choi.

My current research focuses on large language models, with a particular interest in search (retrieval)-augmented [NAACL’25, COLM’25, NeurIPS’25] and tool-augmented language models [EMNLP’24a, arXiv’25a, arXiv’25b]. I have also worked on synthetic data generation [ACL Findings’24, EMNLP’24b], LLM alignment [arXiv’24, arXiv’25c], and few/zero-shot learning [ACL’23, AAAI’23].

Feel free to drop me an email (ran.xu at emory dot edu) if you have any questions about my research or want to discuss potential collaborations.

I am looking for full-time industry opportunities starting in Fall 2025. Feel free to reach out if there is a good fit!

Education

Emory University (2021 - Present): Ph.D. in Computational Science and Informatics; GPA: 3.98/4.00; Research Focus: Large Language Models, Retrieval-Augmented Generation, Agents, and Data Synthesis, with applications in healthcare; Advisors: Prof. Carl Yang and Prof. Joyce C. Ho
Emory University (2017 - 2021): B.S. in Computer Science, Double Major in Applied Mathematics; GPA: 3.97/4.00; Research Focus: Natural Language Processing; Advisor: Prof. Jinho Choi

Industrial Experience

Search Intelligence, Google DeepMind (Jun 2025 - Nov 2025): Research Intern; Topic: Agentic Judge Training via Tool-Augmented RL [under review]; Mentors: Jingjing Chen, Jiayu Ye, Yu Wu; Manager: Hongkun Yu

AI Lab, Tencent America (Feb 2025 - May 2025): Artificial General Intelligence Research Intern; Topic: Retrieval-Augmented GUI Agents with Generative Guidelines [EMNLP Main Conference]; Mentors: Kaixin Ma, Wenhao Yu, Hongming Zhang; Manager: Dong Yu

Query Understanding Team, Amazon (May 2024 - Oct 2024): Applied Scientist Intern; Topic: LLM Self-Training for Retrieval-Augmented Generation [NAACL Main Conference]; Mentor: Hui Liu; Manager: Qi He

Meta Platforms, Inc. (May 2020 - Aug 2020): Enterprise Engineer Intern; Mentor: Zexi Zhang

News

Sep 18, 2025	Our paper AceSearcher: Bootstrapping Reasoning and Search for LLMs via Reinforced Self-Play was accepted to NeurIPS 2025 as a Spotlight (top 3.2%). See you in San Diego!
Aug 20, 2025	Our paper on improving GUI Agents with tutorials was accepted to the EMNLP 2025 Main Conference.
Sep 20, 2024	Three papers on LLMs for Text Retrieval, LLM Agents for Complex Tabular Reasoning, and LLM Test-time Adaptation were accepted to EMNLP 2024.
May 16, 2024	Two papers on Synthetic Data Generation and Retrieval-Augmented Clinical Predictions were accepted to ACL 2024.
Nov 28, 2022	Our paper Counterfactual and Factual Reasoning over Hypergraphs for Interpretable Clinical Predictions on EHR received the Best Paper Award (2 in total) at Machine Learning for Health (ML4H) 2022.

Selected Publications

AceSearcher: Bootstrapping Reasoning and Search for LLMs via Reinforced Self-Play

Ran Xu, Yuchen Zhuang, Zihan Dong, Jonathan Wang, Yue Yu, Joyce C. Ho, Linjun Zhang, Haoyu Wang, Wenqi Shi, and Carl Yang

Proceedings of NeurIPS, 2025. (Spotlight)

MedAgentGym: Training LLM Agents for Code-Based Medical Reasoning at Scale

Ran Xu*, Yuchen Zhuang*, Yishan Zhong, Yue Yu, Xiangru Tang, Hang Wu, May D Wang, Peifeng Ruan, Donghan Yang, Tao Wang, Guanghua Xiao, Carl Yang, Yang Xie, and Wenqi Shi

arXiv preprint arXiv:2506.04405, 2025.

SimRAG: Self-Improving Retrieval-Augmented Generation for Adapting Large Language Models to Specialized Domains

Ran Xu, Hui Liu, Sreyashi Nag, Zhenwei Dai, Yaochen Xie, Xianfeng Tang, Chen Luo, Yang Li, Joyce C. Ho, Carl Yang, and Qi He

Proceedings of NAACL, 2025.

Retrieval-augmented generation (RAG) enhances the question answering (QA) abilities of large language models (LLMs) by integrating external knowledge. However, adapting general-purpose RAG systems to specialized fields such as science and medicine poses unique challenges due to distribution shifts and limited access to domain-specific data. To tackle this, we propose SimRAG, a self-training approach that equips LLMs with joint capabilities of question answering and question generation for domain adaptation. Our method first fine-tunes LLMs on instruction-following, question-answering, and search-related data. Then, it prompts LLMs to generate diverse domain-relevant questions from unlabeled corpora, with an additional filtering strategy to retain high-quality synthetic examples. By leveraging these synthetic examples, the LLMs can improve their performance on domain-specific RAG tasks. Experiments on 11 datasets across three different domains verify the efficacy of SimRAG over baselines by 1.2%–8.6%.
BMRetriever: Tuning Large Language Models as Better Biomedical Text Retrievers

Ran Xu*, Wenqi Shi*, Yue Yu*, Yuchen Zhuang, Yanqiao Zhu, May Dongmei Wang, Joyce C. Ho, Chao Zhang, and Carl Yang

Proceedings of EMNLP, 2024.

Developing effective biomedical retrieval models is important for excelling at knowledge-intensive biomedical tasks but still challenging due to the lack of sufficient publicly annotated biomedical data and computational resources. We present BMRetriever, a series of dense retrievers for enhancing biomedical retrieval via unsupervised pre-training on large biomedical corpora, followed by instruction fine-tuning on a combination of labeled datasets and synthetic pairs. Experiments on 5 biomedical tasks across 11 datasets verify BMRetriever’s efficacy on various biomedical applications. BMRetriever also exhibits strong parameter efficiency, with the 410M variant outperforming baselines up to 11.7 times larger, and the 2B variant matching the performance of models with over 5B parameters. The training data and model checkpoints are released at https://huggingface.co/BMRetriever to ensure transparency, reproducibility, and application to new domains.
Counterfactual and Factual Reasoning over Hypergraphs for Interpretable Clinical Predictions on EHR

Ran Xu, Yue Yu, Chao Zhang, Mohammed K Ali, Joyce C Ho, and Carl Yang

Proceedings of ML4H, 2022. (Best Paper Award)

Electronic Health Record modeling is crucial for digital medicine. However, existing models ignore higher-order interactions among medical codes and their causal relations towards downstream clinical predictions. To address such limitations, we propose a novel framework, CACHE, to provide effective and insightful clinical predictions based on hypergraph representation learning and counterfactual and factual reasoning techniques. Experiments on two real EHR datasets show the superior performance of CACHE. Case studies with a domain expert illustrate a preferred capability of CACHE in generating clinically meaningful interpretations towards the correct predictions.