About
Thank you for visiting! I’m Long (*), a second-year CS Ph.D. student at the National University of Singapore (NUS), advised by Professors Kenji Kawaguchi, Min-Yen Kan, and Nancy F. Chen (NTU, A*STAR). My research focuses on efficient adaptation of large language and vision-language models:
- (1) Efficient LLM and LVLM alignment [Probe sampling (NeurIPS’24)].
- (2) Prompt understanding [FormatBias (preprint’24)], design [Multi-expert Prompting (EMNLP’24), Chain-of-Opinion (preprint’24)], and optimization [adv-ICL (ACL’24)].
- (3) Multi-agent systems [Multi-expert Prompting (EMNLP’24)].
- (4) Vision-language [PromptChart (preprint’23), UniChart (EMNLP’23), OpenCQA (EMNLP’22), ChartQA (ACL’22)].
Previously, I received my B.Sc. in Mathematics (Statistics) and Computer Science from Nanyang Technological University, Singapore (NTU), where I was advised by Prof. Shafiq Joty. At NTU, I won the SPMS Outstanding Undergraduate Award 2023 (Outstanding Achievement).
(*) My name means “Dragon” in Vietnamese.
Education
National University of Singapore (NUS), Doctor of Philosophy (Ph.D.) in Computer Science (Highest Distinction), Aug. 2023 - present
Nanyang Technological University, Singapore (NTU), Bachelor of Science (B.Sc.) in Mathematical and Computer Sciences (Double major) (Honours, Highest Distinction), Aug. 2019 - Jul. 2023
Preprints
(*) denotes equal contribution.
2. LLMs Are Biased Towards Output Formats! Systematically Evaluating and Mitigating Output Format Bias of LLMs
Do Xuan Long, Hai Nguyen Ngoc, Tiviatis Sim, Hieu Dao, Shafiq Joty, Kenji Kawaguchi, Nancy Chen, Min-Yen Kan; Under review 2024.
1. Aligning Large Language Models with Human Opinions through Persona Selection and Value–Belief–Norm Reasoning
Do Xuan Long, Kenji Kawaguchi, Min-Yen Kan, Nancy F. Chen; Under review 2024.
Publications
(*) denotes equal contribution.
13. Multi-expert Prompting Improves Safety, Reliability, and Usefulness of Large Language Models
Do Xuan Long, Yen Duong, Anh Tuan Luu, Kenji Kawaguchi, Min-Yen Kan, Nancy Chen; Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP 2024).
12. Prompt Optimization via Adversarial In-Context Learning
Do Xuan Long*, Yiran Zhao*, Hannah Brown*, Yuxi Xie, James Xu Zhao, Nancy F. Chen, Kenji Kawaguchi, Michael Qizhe Xie, Junxian He; Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024, Oral).
11. Accelerating Greedy Coordinate Gradient and General Prompt Optimization via Probe Sampling
Yiran Zhao, Wenyue Zheng, Tianle Cai, Do Xuan Long, Kenji Kawaguchi, Anirudh Goyal, Michael Shieh; Proceedings of the Thirty-eighth Annual Conference on Neural Information Processing Systems (NeurIPS 2024).
10. xCodeEval: A Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval
Mohammad Abdullah Matin Khan*, M Saiful Bari*, Xuan Long Do, Weishi Wang, Md Rizwan Parvez, Shafiq Joty; Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024).
9. ToXCL: A Unified Framework for Toxic Speech Detection and Explanation
Nhat M. Hoang*, Xuan Long Do*, Duc Anh Do, Duc Anh Vu, Anh Tuan Luu; Proceedings of the 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2024).
8. ChatGPT as a Math Questioner? Evaluating ChatGPT on Generating Pre-university Math Questions
Phuoc Pham Van Long*, Duc Anh Vu*, Nhat Minh Hoang*, Xuan Long Do*, Anh Tuan Luu; Proceedings of the 39th ACM/SIGAPP Symposium On Applied Computing, AI for Education Track (ACM/SIGAPP SAC 2024).
7. UniChart: A Universal Vision-language Pretrained Model for Chart Comprehension and Reasoning
Ahmed Masry*, Parsa Kavehzadeh*, Xuan Long Do, Enamul Hoque, Shafiq Joty; Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023).
6. Retrieving Multimodal Information for Augmented Generation: A Survey
Ruochen Zhao, Hailin Chen, Weishi Wang, Fangkai Jiao, Xuan Long Do, Chengwei Qin, Bosheng Ding, Xiaobao Guo, Minzhi Li, Xingxuan Li, Shafiq Joty; Proceedings of Findings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023 Findings).
5. Modeling What-to-ask and How-to-ask for Answer-unaware Conversational Question Generation
Xuan Long Do, Bowei Zou, Shafiq Joty, Anh Tai Tran, Liangming Pan, Nancy F. Chen, Ai Ti Aw; Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023).
4. CoHS-CQG: Context and History Selection for Conversational Question Generation
Xuan Long Do, Bowei Zou, Liangming Pan, Nancy F. Chen, Shafiq Joty, Ai Ti Aw; Proceedings of the 29th International Conference on Computational Linguistics (COLING 2022).
3. OpenCQA: Open-ended Question Answering with Charts
Shankar Kantharaj, Xuan Long Do, Rixie Tiffany Leong, Jia Qing Tan, Enamul Hoque, Shafiq Joty; Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022).
2. ChartQA: A Benchmark for Question Answering about Charts with Visual and Logical Reasoning
Ahmed Masry, Do Xuan Long, Jia Qing Tan, Shafiq Joty, Enamul Hoque; Proceedings of Findings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022 Findings).
1. A Deep Learning Platform for Language Education Research and Development
Kye Min Tan, Richeng Duan, Xin Huang, Bowei Zou, Xuan Long Do; Proceedings of the 2022 Conference of the International Speech Communication Association (INTERSPEECH 2022).
Research Experiences
Natural Language Processing group at NTU (NTU-NLP), NTU President Research Scholar under FYP-URECA programme at NTU, advised by Prof. Shafiq Rayhan Joty, Jul. 2021 - Jul. 2023
Institute for Infocomm Research, A*STAR, Singapore, NLP Research Intern, advised by Dr. Bowei Zou and Dr. Nancy F. Chen, Dec. 2021 - Jan. 2023
Nanyang Technological University, Singapore, Research Assistant at NAIL, advised by Prof. Luu Anh Tuan, Aug. 2022 - Dec. 2022
Eureka Robotics, Singapore, Computer Vision Intern under SGInnovate Summation Programme, supervised by Dr. Xu Zhang, May 2021 - Aug. 2021
Earth Observatory of Singapore, Research Assistant, advised by Dr. Christina Widiwijayanti, Aug. 2020 - May 2021
Panasonic R&D Center, Singapore, Video Algorithm Research Intern, advised by Han Boon Teo, Jun. 2020 - Aug. 2020
Teaching
- National University of Singapore, Teaching Assistant (SoC, NUS), Jan. 2024 - present
- AY23-24 Sem 2: CS3244 Machine Learning.
- Nanyang Technological University, Singapore, Teaching Assistant (SPMS, NTU), Aug. 2022 - May 2023
- AY22-23 Sem 2: MH3500: Statistics, PS0002: Introduction to Data Science and Artificial Intelligence.
- AY22-23 Sem 1: PS0001: Introduction to Computational Thinking.
Awards
- Undergraduate Awards
- SPMS Outstanding Undergraduate Award 2023 (Outstanding Achievement)
- A*STAR Computing and Information Science (ACIS) Scholarship, 2023-2027
- NTU President Research Scholar, 2022
- ACM-ICPC Jakarta Regional Contest 2021, 2022, team NTUDragons & WCRush, ICPCID
- Second Place Award, ISC 2021 Student Cluster Competition
- Second Prize, International Mathematics Competition for University Students 2020 (IMC)
- Dean’s List AY2019-2020, School of Computer Science and Engineering (SCSE), NTU
- High-school Awards
- Gold Medal, Iranian Geometry Olympiad 2018, Open Section (IGO)
- Honorable Prize, Vietnamese Mathematical Olympiads 2018 (VMO)
Services & Volunteers
- Professional Membership:
- Association for Computational Linguistics (ACL) Member, May 2021 - present
- Association for Computing Machinery (ACM) Member, Dec. 2023 - present
- Program Committee/Reviewer:
- Journals: IEEE/ACM TASLP (2023, 2022).
- Conferences: ICLR (2024), EMNLP (2024), ACL RR (2024, 2023, 2022), COLING (2022).
- Student Volunteer Award:
- ACL (2024, 2023, 2022), EMNLP (2022).