Do Larger Models Really Win in Drug Discovery? A Benchmark Assessment of Model Scaling in AI-Driven Molecular Property and Activity Prediction

Author: Jinjiang Guo | Date: 2026-04-29 | URL: https://arxiv.org/abs/2604.26498


Essence

Figure 4: Proportional summary of model-family wins across ADMET, Tox21 and anti-infective

Comparing RF, GNN, and large pretrained models across 22 molecular endpoints with 167,056 validations, the paper shows that predictive performance in drug discovery is better explained by the alignment of representation, inductive bias, data regime, and validation protocol than by model scale.

Motivation

Achievement

How

Figure 3: Structure-similarity-separated five-fold cross-validation workflow. Molecules are stan-
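The idea named in the Figure 3 caption, keeping structurally similar molecules out of different folds, can be sketched as follows. This is only an illustration under stated assumptions, not the paper's workflow: the review does not give the standardization steps, the fingerprint, the similarity threshold, or the clustering method. Here fingerprints are assumed to be sets of on-bits, similarity is Tanimoto, molecules are grouped by single-linkage clustering at a hypothetical threshold of 0.5, and whole clusters are assigned greedily to the smallest fold. All function names and the toy data are hypothetical.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint on-bits."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity_clusters(fps, threshold=0.5):
    """Single-linkage clusters: any pair with similarity >= threshold is linked."""
    parent = list(range(len(fps)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(fps)), 2):
        if tanimoto(fps[i], fps[j]) >= threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(fps)):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())


def similarity_separated_folds(fps, n_folds=5, threshold=0.5):
    """Assign whole clusters to folds, largest cluster first, smallest fold first."""
    folds = [[] for _ in range(n_folds)]
    ordered = sorted(similarity_clusters(fps, threshold), key=len, reverse=True)
    for cluster in ordered:
        min(folds, key=len).extend(cluster)
    return folds


# Hypothetical toy fingerprints (sets of on-bit indices), not real molecules.
toy_fps = [
    frozenset({1, 2, 3, 4}), frozenset({1, 2, 3, 5}),
    frozenset({10, 11, 12}), frozenset({10, 11, 12, 13}),
    frozenset({20, 21}), frozenset({20, 21, 22}),
    frozenset({30, 31, 32}),
    frozenset({40, 41}),
    frozenset({50, 51, 52, 53}), frozenset({50, 51, 52, 54}),
]

folds = similarity_separated_folds(toy_fps, n_folds=5, threshold=0.5)
for k, test_idx in enumerate(folds):
    train_idx = [i for f in folds if f is not test_idx for i in f]
    print(f"fold {k}: test={sorted(test_idx)} train={sorted(train_idx)}")
```

Because linking is single-linkage, any two molecules at or above the threshold end up in the same cluster and therefore the same fold, so no test molecule has a near neighbor in its training set. The trade-off is that fold sizes can become uneven when one cluster is very large.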

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 Technical Soundness: 3/5 Significance: 4/5 Clarity: 4/5 Overall: 4/5

Overall assessment: This paper empirically challenges the drug discovery community's scale-centric assumption. Through 167,056 validations, it clearly shows that model-task alignment (the fit among representation, inductive bias, data regime, and validation protocol) matters more than scale. The broad comparison across ADMET, Tox21, and anti-infective data, combined with structure-similarity-separated cross-validation, lends the results high reliability and gives practitioners practical guidance for model selection.

Related papers worth reading

Foundational work
Comprehensively reviews materials-discovery results obtained with large-scale deep learning models and discusses in depth how model size and expressiveness affect actual predictive performance.
Foundational work
Treats AI-driven drug discovery from the perspective of whole-laboratory orchestration, so its basic discussion of how to evaluate each prediction and design component connects to this paper.
Alternative approach
A benchmarking and comparison study of molecular optimization for drug discovery that shares a similar context with MOLLEO.
Alternative approach
A study of a similar approach that applies LLMs to molecular property prediction for drug discovery.
Alternative approach
A similar benchmarking study that compares the performance of various machine learning models in drug discovery.
Alternative approach
A similar benchmarking study that comparatively evaluates GNNs and pretrained models on drug discovery problems.
Alternative approach
A related study on generalization and validation protocols for molecular property prediction models.
Alternative approach
The design of large-scale generative models for drug search and high-performance property prediction helps in comparing data efficiency and scalability against MLIPs on similar functions.
Alternative approach
A classifier-based confidence estimation method, in contrast to ensemble-based uncertainty quantification.
Alternative approach
A study proposing a different approach to active learning for machine-learning interatomic potentials.
Follow-up work
Empirically verifies, in drug discovery, the domain fit of various large models such as LLMs, GNNs, and RF, and the effect of fine-tuning on performance.
Rebuttal/critique
In contrast to work claiming that large models are superior for molecular optimization, a study that empirically shows factors other than model scale matter more.
Rebuttal/critique
The finding that variant models such as domain-adaptation deep learning do not necessarily improve predictive performance is worth reading critically alongside this paper, which stresses that protocol and data alignment matter more than the scale of large models.
Rebuttal/critique
Systematically analyzes the real risks and evaluation limits concerning the reliability and consistency of AI-driven drug discovery.