About

Hello! I am Soyeong Jeong, a Ph.D. student at the MLAI Lab at KAIST. My research focuses on information retrieval for open-domain language tasks and on interpreting large language models so that their behavior can be understood when they are deployed in real-world applications. Beyond these topics, I am broadly interested in natural language understanding.

Experiences

Applied Scientist Intern

Jul 2024 - Oct 2024
Amazon, Bellevue

Conducting research on a proactive conversational task with LLM-powered multi-agents.

Publications

Selected publications (see my CV for the full list):

  • Adaptive Multi-Agent Response Refinement in Conversational Systems
    Soyeong Jeong, Aparna Elangovan, Emine Yilmaz, and Oleg Rokhlenko
    Under Review
  • Database-Augmented Query Representation for Information Retrieval
    Soyeong Jeong, Jinheon Baek, Sukmin Cho, Sung Ju Hwang, and Jong C. Park
    Under Review
  • Unified Multi-Modal Interleaved Document Representation for Information Retrieval
    Jaewoo Lee, Joonho Ko, Jinheon Baek, Soyeong Jeong, and Sung Ju Hwang
    Under Review
  • Lossless Acceleration of Large Language Models with Hierarchical Drafting based on Temporal Locality in Speculative Decoding
    Sukmin Cho, Sangjin Choi, Taeho Hwang, Jeongyeon Seo, Soyeong Jeong, Huije Lee, Hoyun Song, Jong C. Park, and Youngjin Kwon
    Under Review
  • CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark
    David Romero, Chenyang Lyu, Haryo Akbarianto Wibowo, Teresa Lynn, Injy Hamed, Aditya Nanda Kishore, …, Jinheon Baek, …, Soyeong Jeong, …, Thamar Solorio, and Alham Fikri Aji
    Neural Information Processing Systems Datasets and Benchmarks Track (NeurIPS Datasets & Benchmarks), 2024. (Oral)
  • Typos that Broke the RAG's Back: Genetic Attack on RAG Pipeline by Simulating Documents in the Wild via Low-level Perturbations
    Sukmin Cho, Soyeong Jeong, Jeongyeon Seo, Taeho Hwang, and Jong C. Park
    Findings of the Empirical Methods in Natural Language Processing (Findings of EMNLP), 2024.
  • DSLR: Document Refinement with Sentence-Level Re-ranking and Reconstruction to Enhance Retrieval-Augmented Generation
    Taeho Hwang, Soyeong Jeong, Sukmin Cho, SeungYoon Han, and Jong C. Park
    Knowledge Augmented Methods for NLP at Association for Computational Linguistics (KnowledgeNLP@ACL), 2024.
  • Ask LLMs Directly, "What shapes your bias?": Measuring Social Bias in Large Language Models
    Jisu Shin, Hoyun Song, Huije Lee, Soyeong Jeong, and Jong C. Park
    Findings of the Association for Computational Linguistics (Findings of ACL), 2024.
  • Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity
    Soyeong Jeong, Jinheon Baek, Sukmin Cho, Sung Ju Hwang, and Jong C. Park
    North American Chapter of the Association for Computational Linguistics (NAACL), 2024.
  • Test-Time Self-Adaptive Small Language Models for Question Answering
    Soyeong Jeong, Jinheon Baek, Sukmin Cho, Sung Ju Hwang, and Jong C. Park
    Findings of the Empirical Methods in Natural Language Processing (Findings of EMNLP), 2023.
  • Knowledge-Augmented Language Model Verification
    Jinheon Baek, Soyeong Jeong, Minki Kang, Jong C. Park, and Sung Ju Hwang
    Empirical Methods in Natural Language Processing (EMNLP), 2023.
  • Improving Zero-shot Reader by Reducing Distractions from Irrelevant Documents in Open-Domain Question Answering
    Sukmin Cho, Jeongyeon Seo, Soyeong Jeong, and Jong C. Park
    Findings of the Empirical Methods in Natural Language Processing (Findings of EMNLP), 2023.
  • Phrase Retrieval for Open Domain Conversational Question Answering with Conversational Dependency Modeling via Contrastive Learning
    Soyeong Jeong, Jinheon Baek, Sung Ju Hwang, and Jong C. Park
    Findings of the Association for Computational Linguistics (Findings of ACL), 2023.
  • Discrete Prompt Optimization via Constrained Generation for Zero-shot Re-ranker
    Sukmin Cho, Soyeong Jeong, Jeongyeon Seo, and Jong C. Park
    Findings of the Association for Computational Linguistics (Findings of ACL), 2023.
  • Realistic Conversational Question Answering with Answer Selection based on Calibrated Confidence and Uncertainty Measurement
    Soyeong Jeong, Jinheon Baek, Sung Ju Hwang, and Jong C. Park
    European Chapter of the Association for Computational Linguistics (EACL), 2023.
  • Augmenting Document Representations for Dense Retrieval with Interpolation and Perturbation
    Soyeong Jeong, Jinheon Baek, Sukmin Cho, Sung Ju Hwang, and Jong C. Park
    Association for Computational Linguistics (ACL), 2022. (Oral)
  • Query Generation with External Knowledge for Dense Retrieval
    Sukmin Cho, Soyeong Jeong, Wonsuk Yang, and Jong C. Park
    Deep Learning Inside Out at Association for Computational Linguistics (DeeLIO@ACL), 2022.
  • Unsupervised Document Expansion for Information Retrieval with Stochastic Text Generation
    Soyeong Jeong, Jinheon Baek, ChaeHun Park, and Jong C. Park
    Scholarly Document Processing at Conference of the North American Chapter of the Association for Computational Linguistics (SDP@NAACL), 2021. (Oral)
  • Development of Speech Emotion Recognition Algorithm using MFCC and Prosody
    Hyejin Koo, Soyeong Jeong, Sungjae Yoon, and Wonjong Kim
    International Conference on Electronics, Information, and Communication (ICEIC), 2020.

Education

Korea Advanced Institute of Science and Technology (KAIST)

Mar. 2022 - Present
  • Ph.D. in the Graduate School of AI

Korea Advanced Institute of Science and Technology (KAIST)

Mar. 2020 - Feb. 2022
  • M.S. in the School of Computing

Korea University

Mar. 2016 - Feb. 2020
  • Computer Science and Engineering (Graduated with Honors)
  • Software Technology and Enterprise Program (Interdisciplinary Program)

Anyang Foreign Language High School

Mar. 2013 - Feb. 2016
  • Major in English