Hallucinations can improve large language models in drug discovery

Authors: Shuzhou Yuan, Zhan Qu, Ashish Yashwanth Kangen, Michael Färber | Date: 2025 | DOI: N/A


Essence

Figure 1

Factual consistency scores computed with the HHEM-2.1-Open model. Most LLMs show low consistency with the MolT5 reference descriptions, indicating widespread hallucination.

The paper presents a paradoxical finding: hallucinations in large language models (LLMs), usually regarded as a problem, can improve model performance on molecular property prediction tasks in drug discovery. It shows that certain types of hallucination, such as structural misdescription, act as implicit counterfactuals that increase the model's ability to generalize.

Motivation

Achievement

Figure 2

Illustration of the method using a sample from the HIV dataset. A hallucinated molecule description is generated from the SMILES string and then included in the prompt for the binary classification task.
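
The two steps in the Figure 2 caption (describe the molecule, then classify with the description in the prompt) can be sketched roughly as below. This is a minimal illustration, not the authors' code: the function names, the prompt wording, the example SMILES string, and the `fake_llm` stand-in are all assumptions made so the sketch runs without a model.

```python
from typing import Callable


def describe_molecule(smiles: str, generate: Callable[[str], str]) -> str:
    """Step 1: ask a text generator for a free-text description of the molecule.

    `generate` is any callable mapping a prompt string to a completion string.
    The prompt wording here is an assumption, not taken from the paper.
    """
    return generate(f"Describe the molecule in natural language: {smiles}")


def build_prompt(smiles: str, description: str, question: str) -> str:
    """Step 2: place the (possibly hallucinated) description in the task prompt."""
    return (
        f"SMILES: {smiles}\n"
        f"Description: {description}\n"
        f"Question: {question}\n"
        "Answer with Yes or No."
    )


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM; returns a fixed, unverified description."""
    return "A placeholder description written without checking the structure."


smiles = "CCO"  # ethanol, used only as a toy input
description = describe_molecule(smiles, fake_llm)
prompt = build_prompt(smiles, description, "Does this molecule inhibit HIV replication?")
print(prompt)
```

Passing the generator in as a plain callable keeps the sketch independent of any particular model or client library; a real run would swap `fake_llm` for a call to the LLM that produces the descriptions.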

  1. Performance gains: With hallucinations included, Falcon3-Mamba-7B exceeds all baselines and improves ROC-AUC by 8.22% over the PubChem baseline. Llama-3.1-8B improves by 15.8% over the SMILES baseline and by 11.2% over the MolT5 baseline. Hallucinations generated by GPT-4o give the most consistent gains across models.
  2. Hallucination analysis: More than 18,000 beneficial hallucinations were identified and categorized. Structural misdescription emerges as the most impactful type, suggesting that hallucinated statements about molecular structure can increase model confidence. Other types: functional hallucination, analogical hallucination, and generic fluff.

How

Figure 3

Average ROC-AUC improvement across seven LLMs.

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 4/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

Overall assessment: A creative study that empirically demonstrates the paradoxical benefit of hallucinations and offers a new perspective for the drug discovery domain. Its impact would be stronger with a better understanding of the underlying mechanism and validation of its real-world applicability.

Related papers worth reading

Foundational work
The discussion of how retrieval-augmented generation mitigates hallucination, and where it falls short, connects theoretically to the question of how to assess the value of LLM hallucinations (this paper).
Foundational work
Provides a scientific framework for evaluating model hallucination and reliability, giving a theoretical basis for interpreting how hallucination affects performance.
Alternative approach
Examines in depth how applying LLMs to scientific writing and similar creative work can unintentionally spread misinformation, weighing both the risks and the value of hallucination.
Alternative approach
Related work proposing an alternative methodology for molecular optimization based on evolutionary algorithms.
Alternative approach
Related methodological work on efficient fine-tuning of large pretrained models on biological data.
Alternative approach
Paper 397 analyzes the positive side of LLM hallucination for drug discovery prediction, in contrast to paper 3276's mutational signature extraction and evaluation of generative model performance.
Alternative approach
A similar benchmarking study comparing the performance of various machine learning models in drug discovery.
Follow-up work
Experimentally extends and validates the problem of information distortion arising from LLM hallucination, along with its effects.
Follow-up work
Provides additional methodology for improving the performance of drug repurposing models.
Application case
The paper Efficient Evolutionary Search Over Chemical Space takes a case-driven look at how structural hallucination plays out in real molecular property prediction and in a range of drug search problems.
Application case
Paper 3276 also covers how the idiosyncrasies and errors of generative models affect performance on real molecular tasks such as mutational signature prediction, so it can be read as a practical example of paper 397's point about how hallucination affects generalization.
๋ฐ˜๋ก /๋น„ํŒ
Hallucination mitigation ๋…ผ๋ฌธ์€ ํ™˜๊ฐ ์ค„์ด๊ธฐ๋ฅผ ๋ชฉํ‘œ๋กœ ํ•˜์—ฌ, ์˜๋„์  ํ™˜๊ฐ ํ™œ์šฉ ๊ฐ€๋Šฅ์„ฑ๊ณผ ํ•œ๊ณ„๋ฅผ ๋Œ€์กฐ์ ์œผ๋กœ ๋ณด์—ฌ์ค€๋‹ค.
๋ฐ˜๋ก /๋น„ํŒ
Hallucinations can improve large language models in drug discovery ๋…ผ๋ฌธ์€ '๋ถˆ์•ˆ์ •์„ฑ'์ด ํ•ญ์ƒ ๋ถ€์ •์ ์ด์ง€ ์•Š๋‹ค๋Š” ์‹œ๊ฐ์„ ์ œ์‹œํ•˜์—ฌ, reward-guided fine-tuning์˜ ํ•œ๊ณ„์™€ ํ•ด์„์„ ๊ท ํ˜•๊ฐ์žˆ๊ฒŒ ๋ณด์—ฌ์ค๋‹ˆ๋‹ค.
๋ฐ˜๋ก /๋น„ํŒ
์•ฝ๋ฌผ ๋ฐœ๊ฒฌ LLM์˜ ์˜ˆ์ธก ์‹ ๋ขฐ์„ฑ์— ๋Œ€ํ•ด ๋ณด์ˆ˜์  ์‹œ๊ฐ(์‹ ๋ขฐํ•  ์ˆ˜ ์žˆ๋Š” ์˜ˆ์ธก์€ ๋ฌด์—‡์ธ๊ฐ€)๊ณผ ํ™˜๊ฐ ์ˆ˜์šฉ(์„ฑ๋Šฅ ํ–ฅ์ƒ ์š”์†Œ)์ด๋ผ๋Š” ์ƒ๋ฐ˜๋œ ๊ด€์ ์„ ๋น„๊ตํ•  ์ˆ˜ ์žˆ๋‹ค.
๋ฐ˜๋ก /๋น„ํŒ
397 ๋…ผ๋ฌธ์€ LLM์˜ ํ™˜๊ฐ ํ˜„์ƒ์ด ์˜คํžˆ๋ ค ํ™”ํ•ฉ๋ฌผ ์ฐฝ์˜์„ฑ์— ๊ธ์ •์ ์œผ๋กœ ์ž‘์šฉํ•  ์ˆ˜ ์žˆ์Œ์„ ๋…ผ์˜ํ•˜๋ฉฐ, 3131์˜ ์ฐฝ์˜์„ฑ ๋ฉ”์ปค๋‹ˆ์ฆ˜ ํ•ด์„๊ณผ ์ƒ๋ฐ˜๋ฉ๋‹ˆ๋‹ค.