About

Thank you for visiting! I’m Long (*), a second-year CS Ph.D. student at the National University of Singapore (NUS), advised by Professors Kenji Kawaguchi, Min-Yen Kan, and Nancy Chen (NTU, A*STAR). My research focuses on efficient adaptation of large language and vision-language models.

Previously, I received my B.Sc. in Mathematics (Statistics) and Computer Science from Nanyang Technological University, Singapore (NTU), where I was advised by Prof. Shafiq Joty. At NTU, I won the SPMS Outstanding Undergraduate Award 2023 (Outstanding Achievement).

(*) My name means “Dragon” in Vietnamese.

Education

Preprints

(*) denotes equal contribution.

2. LLMs Are Biased Towards Output Formats! Systematically Evaluating and Mitigating Output Format Bias of LLMs

Do Xuan Long, Hai Nguyen Ngoc, Tiviatis Sim, Hieu Dao, Shafiq Joty, Kenji Kawaguchi, Nancy Chen, Min-Yen Kan; Under review 2024.

1. Aligning Large Language Models with Human Opinions through Persona Selection and Value–Belief–Norm Reasoning

Do Xuan Long, Kenji Kawaguchi, Min-Yen Kan, Nancy F. Chen; Under review 2024.

Publications

(*) denotes equal contribution.

13. Multi-expert Prompting Improves Safety, Reliability, and Usefulness of Large Language Models

Do Xuan Long, Yen Duong, Anh Tuan Luu, Kenji Kawaguchi, Min-Yen Kan, Nancy Chen; Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP 2024).

Featured by elvis (X), Hugging Face, and other X, Reddit, and LinkedIn accounts.

12. Prompt Optimization via Adversarial In-Context Learning

Do Xuan Long*, Yiran Zhao*, Hannah Brown*, Yuxi Xie, James Xu Zhao, Nancy F. Chen, Kenji Kawaguchi, Michael Qizhe Xie, Junxian He; Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024, Oral).

11. Accelerating Greedy Coordinate Gradient and General Prompt Optimization via Probe Sampling

Yiran Zhao, Wenyue Zheng, Tianle Cai, Do Xuan Long, Kenji Kawaguchi, Anirudh Goyal, Michael Shieh; Proceedings of the Thirty-eighth Annual Conference on Neural Information Processing Systems (NeurIPS 2024).

10. xCodeEval: A Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval

Mohammad Abdullah Matin Khan*, M Saiful Bari*, Xuan Long Do, Weishi Wang, Md Rizwan Parvez, Shafiq Joty; Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024).

9. ToXCL: A Unified Framework for Toxic Speech Detection and Explanation

Nhat M. Hoang*, Xuan Long Do*, Duc Anh Do, Duc Anh Vu, Anh Tuan Luu; Proceedings of the 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2024).

8. ChatGPT as a Math Questioner? Evaluating ChatGPT on Generating Pre-university Math Questions

Phuoc Pham Van Long*, Duc Anh Vu*, Nhat Minh Hoang*, Xuan Long Do*, Anh Tuan Luu; Proceedings of the 39th ACM/SIGAPP Symposium on Applied Computing, AI for Education Track (ACM/SIGAPP SAC 2024).

7. UniChart: A Universal Vision-language Pretrained Model for Chart Comprehension and Reasoning

Ahmed Masry*, Parsa Kavehzadeh*, Xuan Long Do, Enamul Hoque, Shafiq Joty; Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023).

6. Retrieving Multimodal Information for Augmented Generation: A Survey

Ruochen Zhao, Hailin Chen, Weishi Wang, Fangkai Jiao, Xuan Long Do, Chengwei Qin, Bosheng Ding, Xiaobao Guo, Minzhi Li, Xingxuan Li, Shafiq Joty; Proceedings of Findings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023 Findings).

5. Modeling What-to-ask and How-to-ask for Answer-unaware Conversational Question Generation

Xuan Long Do, Bowei Zou, Shafiq Joty, Anh Tai Tran, Liangming Pan, Nancy F. Chen, Ai Ti Aw; Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023).

4. CoHS-CQG: Context and History Selection for Conversational Question Generation

Xuan Long Do, Bowei Zou, Liangming Pan, Nancy F. Chen, Shafiq Joty, Ai Ti Aw; Proceedings of the 29th International Conference on Computational Linguistics (COLING 2022).

3. OpenCQA: Open-ended Question Answering with Charts

Shankar Kantharaj, Xuan Long Do, Rixie Tiffany Leong, Jia Qing Tan, Enamul Hoque, Shafiq Joty; Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022).

2. ChartQA: A Benchmark for Question Answering about Charts with Visual and Logical Reasoning

Ahmed Masry, Do Xuan Long, Jia Qing Tan, Shafiq Joty, Enamul Hoque; Proceedings of Findings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022 Findings).

1. A Deep Learning Platform for Language Education Research and Development

Kye Min Tan, Richeng Duan, Xin Huang, Bowei Zou, Xuan Long Do; Proceedings of the 2022 Conference of the International Speech Communication Association (INTERSPEECH 2022).

Research Experiences

Teaching

  • National University of Singapore, Teaching Assistant (SoC, NUS), Jan. 2024 - present
    • AY23-24 Sem 2: CS3244: Machine Learning.
  • Nanyang Technological University, Singapore, Teaching Assistant (SPMS, NTU), Aug. 2022 - May 2023
    • AY22-23 Sem 2: MH3500: Statistics, PS0002: Introduction to Data Science and Artificial Intelligence.
    • AY22-23 Sem 1: PS0001: Introduction to Computational Thinking.

Awards

  • Undergraduate Awards
    • SPMS Outstanding Undergraduate Award 2023 (Outstanding Achievement)
    • A*STAR Computing and Information Science (ACIS) Scholarship, 2023-2027
    • NTU President Research Scholar, 2022
    • ACM-ICPC Jakarta Regional Contest 2021, 2022, team NTUDragons & WCRush, ICPCID
    • Second Place Award, ISC 2021 Student Cluster Competition
    • Second Prize, International Mathematics Competition for University Students 2020 (IMC)
    • Dean’s List AY2019-2020, School of Computer Science and Engineering (SCSE), NTU
  • High-school Awards
    • Gold Medal, Iranian Geometry Olympiad 2018, Open Section (IGO)
    • Honorable Prize, Vietnamese Mathematical Olympiad 2018 (VMO)

Services & Volunteers

  • Professional Membership:
    • Association for Computational Linguistics (ACL) Member, May 2021 - present
    • Association for Computing Machinery (ACM) Member, Dec. 2023 - present
  • Program Committee/Reviewer:
    • Journals: IEEE/ACM TASLP (2023, 2022).
    • Conferences: ICLR (2024), EMNLP (2024), ACL Rolling Review (2024, 2023, 2022), COLING (2022).
  • Student Volunteer Award:
    • ACL (2024, 2023, 2022), EMNLP (2022).

Media Coverage