Retrieval-Augmented Foundation Models for Matched Molecular Pair Transformations to Recapitulate Medicinal Chemistry Intuition

Authors: | Date: 2026-02-18 | URL: https://arxiv.org/abs/2602.16684


Essence

Figure 1: An example of (a) Matched Molecular Pairs (MMP);

Reflecting the intuition of medicinal chemists, the paper presents a foundation model that learns context-independent molecular substructure transformations (MMPT), and a retrieval-augmented framework (MMPT-RAG) that uses external reference analogs to generate controllable and diverse drug candidate structures.
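
To make the retrieval idea concrete, here is a minimal sketch of how reference transformations might be retrieved for a query fragment and packed into a conditioning input. It is not the paper's implementation: the `Transformation` class, the `REFERENCES` list, the character-bigram similarity (a crude stand-in for a real chemical fingerprint), and the output format of `build_conditioning_input` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transformation:
    """One matched-molecular-pair transformation: swap `source` for `target`."""
    source: str  # fragment SMILES with an attachment point, e.g. "[*:1]Cl"
    target: str


# Hypothetical reference set; a real system would mine these from MMP databases.
REFERENCES = [
    Transformation("[*:1]C", "[*:1]CC"),    # methyl -> ethyl
    Transformation("[*:1]Cl", "[*:1]F"),    # chloro -> fluoro
    Transformation("[*:1]OC", "[*:1]OCC"),  # methoxy -> ethoxy
]


def bigrams(smiles: str) -> set[str]:
    """Character bigrams: an assumed, crude stand-in for a chemical fingerprint."""
    return {smiles[i:i + 2] for i in range(len(smiles) - 1)}


def tanimoto(a: str, b: str) -> float:
    """Tanimoto (Jaccard) similarity of the two bigram sets."""
    fa, fb = bigrams(a), bigrams(b)
    union = fa | fb
    return len(fa & fb) / len(union) if union else 0.0


def retrieve(query: str, references: list[Transformation], k: int = 2) -> list[Transformation]:
    """Return the k references whose source fragment is most similar to `query`."""
    ranked = sorted(references, key=lambda t: tanimoto(query, t.source), reverse=True)
    return ranked[:k]


def build_conditioning_input(query: str, retrieved: list[Transformation]) -> str:
    """Join the query with retrieved analogs (an illustrative, assumed format)."""
    analogs = " | ".join(f"{t.source}>>{t.target}" for t in retrieved)
    return f"query: {query} ; references: {analogs}"
```

Calling `retrieve("[*:1]Cl", REFERENCES)` ranks the chloro -> fluoro transformation first and methyl -> ethyl second. The second hit scores highly mostly because the shared attachment-point prefix inflates bigram overlap, which is one reason a real system would use a proper molecular fingerprint instead.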

Motivation

Achievement

Figure 4: UMAP visualization of MMPT-FM and MMPT-RAG's

How

Figure 2: Overview of the proposed MMPT framework. (a) The foundation model (MMPT-FM) is trained on large-scale MMPT

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 3/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

Overall assessment: This paper brings medicinal-chemistry intuition into ML systems through an innovative approach that treats MMPTs as explicit units of generation. It combines large-scale data-driven training with retrieval augmentation to achieve controllable, practical drug candidate generation. Strong results across multiple benchmarks and its potential for industrial application make it a meaningful contribution to drug development.

Related Papers

Foundational work
Covers the basic concepts of how AI-based chemistry agents carry out real drug design and exploration through tool use.
Foundational work
Surveys the structural transfer and applications of foundation models in bioinformatics, providing the basis for the data-augmentation and retrieval-based MMPT framework.
Foundational work
Provides the methodological basis for language-model-based frameworks for molecular design.
Foundational work
Its problem formulation, neuro-symbolic integration combined with LLM-based synthesis strategy generation, connects directly to the retrieval-augmentation technique of this paper (3231).
Alternative approach
Presents an alternative generative-model approach to molecular structure transformation and drug candidate generation.
Alternative approach
Presents a similar approach to molecular substructure transformation and drug optimization.
Alternative approach
Tackles drug-candidate structure transformation with a contrastive-learning framework, so its way of using data can be compared with the MMPT-RAG of this paper (3231).
Alternative approach
The Mol-Debate paper attempts structural reasoning based on multi-agent debate for molecular structure generation, which makes it a useful point of comparison with the retrieval-augmented transformations of this paper (3231).
Alternative approach
The AgenticPosesRanker paper treats molecular structure design in an agent-based way, so comparing it with the molecular transformations of this paper (3231) helps clarify the strengths and weaknesses of different drug generation strategies.
Follow-up work
Uses molecular pair matching and retrieval to extend property prediction to new transition-metal oxides.
Follow-up work
Extends BioMiner's literature-based extraction of molecule and protein information toward molecular matching built on retrieval-augmented foundation models.