About

Thank you for visiting! I’m Long (*), a second-year CS Ph.D. student at the National University of Singapore (NUS), advised by Professors Min-Yen Kan, Kenji Kawaguchi, Nancy Chen, and Shafiq Joty. I am also joining the Amazon Search team as an Applied Scientist Intern, and was previously a Student Researcher on the Google Cloud AI Research team. My research focuses on efficient reasoning and adaptation of large language and vision-language models:

(1) Efficient LLM and LVLM alignment [LongGuide (ACL’25), Probe sampling (NeurIPS’24)].
(2) Prompt understanding [NLPromptEval (ACL’25), FormatBias (NAACL’25)], design [Multi-expert Prompting (EMNLP’24), Chain-of-Opinion (COLING’25)], and optimization [adv-ICL (ACL’24)].
(3) Multi-agent systems [Multi-expert Prompting (EMNLP’24)].
(4) Vision-language [PromptChart (preprint’23), UniChart (EMNLP’23), OpenCQA (EMNLP’22), ChartQA (ACL’22)].

Previously, I received my B.Sc. in Mathematics (Statistics) and Computer Science from Nanyang Technological University, Singapore (NTU), advised by Prof. Shafiq Joty. At NTU, I won the SPMS Outstanding Undergraduate Award 2023 (Outstanding Achievement).

(*) My name means "Dragon" in Vietnamese.

Education

National University of Singapore (NUS), Doctor of Philosophy (Ph.D.) in Computer Science (Highest Distinction), Aug. 2023 - present
Nanyang Technological University, Singapore (NTU), Bachelor of Science (B.Sc.) in Mathematical and Computer Sciences (Double major) (Honours, Highest Distinction), Aug. 2019 - Jul. 2023

Publications

(*) denotes equal contribution.

2025

17. What Makes a Good Natural Language Prompt?

Do Xuan Long, Duy Dinh, Ngoc-Hai Nguyen, Kenji Kawaguchi, Nancy F. Chen, Shafiq Joty, Min-Yen Kan; Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025).

Featured by multiple X accounts.

16. Beyond In-Context Learning: Aligning Long-form Generation of Large Language Models via Task-Inherent Attribute Guidelines

Do Xuan Long, Duong Ngoc Yen, Do Xuan Trong, Anh Tuan Luu, Kenji Kawaguchi, Shafiq Joty, Min-Yen Kan, Nancy F. Chen; Proceedings of Findings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025 Findings).

15. Systematically Evaluating and Mitigating Output Format Bias of LLMs

Do Xuan Long, Hai Nguyen Ngoc, Tiviatis Sim, Hieu Dao, Shafiq Joty, Kenji Kawaguchi, Nancy Chen, Min-Yen Kan; Proceedings of the 2025 Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL 2025).

14. Aligning Large Language Models with Human Opinions through Persona Selection and Value–Belief–Norm Reasoning

Do Xuan Long, Kenji Kawaguchi, Min-Yen Kan, Nancy F. Chen; Proceedings of the 31st International Conference on Computational Linguistics (COLING 2025).

2024

13. Multi-expert Prompting Improves Safety, Reliability, and Usefulness of Large Language Models

Do Xuan Long, Yen Duong, Anh Tuan Luu, Kenji Kawaguchi, Min-Yen Kan, Nancy Chen; Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP 2024).

Featured by elvis on X and by other X, Reddit, and LinkedIn accounts.

12. Prompt Optimization via Adversarial In-Context Learning

Do Xuan Long*, Yiran Zhao*, Hannah Brown*, Yuxi Xie, James Xu Zhao, Nancy F. Chen, Kenji Kawaguchi, Michael Qizhe Xie, Junxian He; Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024, Oral).

11. Accelerating Greedy Coordinate Gradient and General Prompt Optimization via Probe Sampling

Yiran Zhao, Wenyue Zheng, Tianle Cai, Do Xuan Long, Kenji Kawaguchi, Anirudh Goyal, Michael Shieh; Proceedings of the Thirty-eighth Annual Conference on Neural Information Processing Systems (NeurIPS 2024).

10. xCodeEval: A Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval

Mohammad Abdullah Matin Khan*, M Saiful Bari*, Xuan Long Do, Weishi Wang, Md Rizwan Parvez, Shafiq Joty; Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024).

9. ToXCL: A Unified Framework for Toxic Speech Detection and Explanation

Nhat M. Hoang*, Xuan Long Do*, Duc Anh Do, Duc Anh Vu, Anh Tuan Luu; Proceedings of the 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2024).

8. ChatGPT as a Math Questioner? Evaluating ChatGPT on Generating Pre-university Math Questions

Phuoc Pham Van Long*, Duc Anh Vu*, Nhat Minh Hoang*, Xuan Long Do*, Anh Tuan Luu; Proceedings of the 39th ACM/SIGAPP Symposium On Applied Computing, AI for Education Track (ACM/SIGAPP SAC 2024).

2022-2023

7. UniChart: A Universal Vision-language Pretrained Model for Chart Comprehension and Reasoning

Ahmed Masry*, Parsa Kavehzadeh*, Xuan Long Do, Enamul Hoque, Shafiq Joty; Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023).

6. Retrieving Multimodal Information for Augmented Generation: A Survey

Ruochen Zhao, Hailin Chen, Weishi Wang, Fangkai Jiao, Xuan Long Do, Chengwei Qin, Bosheng Ding, Xiaobao Guo, Minzhi Li, Xingxuan Li, Shafiq Joty; Proceedings of Findings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023 Findings).

Selected Research Experiences

Amazon Search, Applied Scientist Intern, Aug. 2025 - Present
Google Cloud AI Research, Student Researcher, hosted by Xingchen Wan and Sercan Ö. Arik, May 2025 - Jul. 2025
Natural Language Processing group at NTU (NTU-NLP), NTU President Research Scholar under FYP-URECA programme at NTU, advised by Prof. Shafiq Rayhan Joty, Jul. 2021 - Jul. 2023

Institute for Infocomm Research, A*STAR, Singapore, Research Intern, advised by Dr. Bowei Zou and Dr. Nancy F. Chen, Dec. 2021 - Jan. 2023

Teaching

National University of Singapore, Teaching Assistant (SoC, NUS), Jan. 2024 - present
- AY24-25:
  - Sem 2: CS5339 Theory and Algorithms for Machine Learning.
  - Sem 1: CS4248 Natural Language Processing.
- AY23-24:
  - Sem 2: CS3244 Machine Learning.
Nanyang Technological University, Singapore, Teaching Assistant (SPMS, NTU), Aug. 2022 - May 2023
- AY22-23 Sem 2: MH3500: Statistics, PS0002: Introduction to Data Science and Artificial Intelligence.
- AY22-23 Sem 1: PS0001: Introduction to Computational Thinking.

Selected Awards

Graduate Awards
- [2025] Research Achievement Award, SoC, NUS
- ARR Great Review (Feb’25, Oct’24, Aug’24)
Undergraduate Awards
- [2023] SPMS Outstanding Undergraduate Award (Outstanding Achievement)
- [2023] A*STAR Computing and Information Science (ACIS) Scholarship
- [2022, 2021] ACM-ICPC Jakarta Regional Contest, team NTUDragons & WCRush, ICPCID
- [2021] Second Place Award, ISC Student Cluster Competition
- [2020] Second Prize, International Mathematics Competition for University Students 2020 (IMC)
High-school Awards
- [2018] Gold Medal, Iranian Geometry Olympiad (IGO)
- [2018] Honorable Prize, Vietnamese Mathematical Olympiads (VMO)

Services & Volunteers

Professional Membership:
- Association for Computational Linguistics (ACL) Member, May 2021 - present
- Association for Computing Machinery (ACM) Member, Dec. 2023 - Dec. 2024
Program Committee/Reviewer:
- Journals: IEEE TIP (2025), TPAMI (2025), IEEE/ACM TASLP (2022-2023).
- Conferences: NeurIPS (2025), ICLR (2024-2025), ACL (2024-2025), EMNLP (2024), NAACL (2024-2025), ACL RR (2022-2025), COLING (2022).
Student Volunteer Award:
- ACL (2022-2024), EMNLP (2022).

Do Xuan Long

About

Education

Publications

(*) denotes equal contribution.

2025

17. What Makes a Good Natural Language Prompt?

16. Beyond In-Context Learning: Aligning Long-form Generation of Large Language Models via Task-Inherent Attribute Guidelines

15. Systematically Evaluating and Mitigating Output Format Bias of LLMs

14. Aligning Large Language Models with Human Opinions through Persona Selection and Value–Belief–Norm Reasoning

2024

13. Multi-expert Prompting Improves Safety, Reliability, and Usefulness of Large Language Models

12. Prompt Optimization via Adversarial In-Context Learning

11. Accelerating Greedy Coordinate Gradient and General Prompt Optimization via Probe Sampling

10. xCodeEval: A Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval

9. ToXCL: A Unified Framework for Toxic Speech Detection and Explanation

8. ChatGPT as a Math Questioner? Evaluating ChatGPT on Generating Pre-university Math Questions

2022-2023

7. UniChart: A Universal Vision-language Pretrained Model for Chart Comprehension and Reasoning

6. Retrieving Multimodal Information for Augmented Generation: A Survey

5. Modeling What-to-ask and How-to-ask for Answer-unaware Conversational Question Generation

4. CoHS-CQG: Context and History Selection for Conversational Question Generation

3. OpenCQA: Open-ended Question Answering with Charts

2. ChartQA: A Benchmark for Question Answering about Charts with Visual and Logical Reasoning

1. A Deep Learning Platform for Language Education Research and Development

Selected Research Experiences

Teaching

Selected Awards

Services & Volunteers

Media Coverage
