CAGenMol: Condition-Aware Diffusion Language Model for Goal-Directed Molecular Generation

Authors: | Date: 2026-04-13 | URL: https://arxiv.org/abs/2604.11483


Essence

Figure 1

Figure 1: Overview of CAGenMol. UCA encodes either protein-pocket structure or target properties, which guides

This paper proposes CAGenMol, a framework that combines discrete diffusion with reinforcement learning to generate molecules that simultaneously satisfy multiple objectives such as protein binding, drug-likeness, and toxicity. Conditional denoising steers generation toward non-differentiable objectives using anisotropic structure-property signals, and the non-autoregressive architecture allows iterative refinement.
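To make the idea of non-autoregressive iterative refinement concrete, here is a minimal sketch of a masked-denoising loop. It is a toy illustration, not the paper's implementation: `toy_denoiser`, `iterative_unmask`, the `favoured_token` condition, the vocabulary, and the confidence-based commit schedule are all assumptions introduced for this example, and the reinforcement-learning component is left out entirely.

```python
import random

MASK = "[MASK]"
VOCAB = ["C", "N", "O", "c", "1", "(", ")", "="]  # toy SMILES-like tokens (assumed)


def toy_denoiser(tokens, condition, rng):
    """Hypothetical stand-in for a trained conditional denoiser.

    Returns {position: (token, confidence)} for every masked position.
    The condition only biases the random proposal here; a real model
    would attend to pocket or property embeddings instead.
    """
    proposals = {}
    for i, tok in enumerate(tokens):
        if tok == MASK:
            # Bias sampling toward the token the (assumed) condition favours.
            weights = [3.0 if v == condition["favoured_token"] else 1.0 for v in VOCAB]
            choice = rng.choices(VOCAB, weights=weights, k=1)[0]
            proposals[i] = (choice, rng.random())
    return proposals


def iterative_unmask(length, condition, steps, seed=0):
    """Fill a fully masked sequence over a fixed number of parallel steps."""
    rng = random.Random(seed)
    tokens = [MASK] * length
    for step in range(steps):
        proposals = toy_denoiser(tokens, condition, rng)
        if not proposals:
            break
        # Commit an even share of the remaining masks, most confident first.
        remaining_steps = steps - step
        n_commit = -(-len(proposals) // remaining_steps)  # ceiling division
        ranked = sorted(proposals.items(), key=lambda kv: kv[1][1], reverse=True)
        for pos, (tok, _conf) in ranked[:n_commit]:
            tokens[pos] = tok
    return tokens


if __name__ == "__main__":
    out = iterative_unmask(length=12, condition={"favoured_token": "c"}, steps=4)
    print("".join(out))
```

Unlike left-to-right decoding, every masked position is proposed in parallel at each step and only the most confident proposals are committed. That parallel proposal step is the point where a condition signal can influence the whole sequence at once.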

Motivation

Achievement

How

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 4/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

Overall: This paper presents a creative framework that applies a discrete diffusion language model to goal-directed molecular generation, and technical contributions such as Step-PPO and EFO effectively reconcile multiple heterogeneous constraints. Experimental validation and a public code release make reproducibility strong, but further analysis of computational efficiency and large-scale applicability is needed.

Related Papers Worth Reading

Foundational work
A systematic analysis of reward-guided fine-tuning methods for diffusion models; it provides the theoretical foundation for the implementation details of CAGenMol's reinforcement-learning-based rewards.
Foundational work
Provides the theoretical and practical background for reward-based iterative refinement and sampling techniques in diffusion models.
Alternative approach
Another approach to reward-guided discrete diffusion; it allows a comparison of performance and design choices for reward-based diffusion modeling on molecular design problems.
Alternative approach
CAGenMol uses a goal-directed diffusion model for molecular generation, showing a molecule/sequence generation approach distinct from continuous optimization.
Alternative approach
CAGenMol is a condition-based diffusion model that performs generative search, which allows a comparison with structure-sampling methods that combine normalizing flows and diffusion.
Alternative approach
CAGenMol uses a diffusion language model for conditional molecular generation, which invites discussion of how it differs from GSS and physics-based integrated search.
Follow-up work
CAGenMol extends the reward-guided fine-tuning framework to biomolecule and materials design with a condition-aware, goal-directed diffusion language model.