MP2D: Constrained Monte Carlo Tree-Guided Diffusion for Multi-Objective Protein Sequence Design

Authors: Zitai Kong, Yifan Dong, Yixuan Wu, Zhaokang Liang, Jian Wu, Hongxia Xu | Date: 2026-05-07 | URL: https://arxiv.org/abs/2605.05829


Essence

Figure 1

Figure 1: Overview of MP2D. (A) Illustration of global-level iterative refinement process. (B) Visualization of the cons

This paper proposes MP2D, a framework that integrates a conditional discrete diffusion model, constrained MCTS, and global iterative refinement to solve multi-objective protein sequence design. It can optimize 4 to 5 conflicting properties in a balanced way without retraining the model.

Motivation

Achievement

MP2D unified framework: combines conditional discrete diffusion, constrained MCTS, and global iterative refinement in a single framework.
CMDLM model: a classifier-free, label-guided conditional masked diffusion language model.
Training-free MCTS-based multi-objective optimization: MCTS-guided diffusion that supports iterative refinement at inference time without retraining.
Dynamic Pareto constraint (CMCTD): a constraint strategy that prevents optimization from collapsing when the Pareto frontier is updated.
Strong experimental results: on antimicrobial peptide and protein binder design, MP2D improves 4 to 5 conflicting properties in a balanced way and consistently outperforms existing multi-objective baselines.
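To make the "classifier-free, label-guided" part of CMDLM concrete, here is a minimal sketch of the standard classifier-free guidance mix applied to per-token logits. The review does not give the exact form the paper uses, so the function names, the `guidance_scale` parameter, and the toy logits are all illustrative assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cfg_logits(cond_logits, uncond_logits, guidance_scale):
    """Standard classifier-free guidance: uncond + w * (cond - uncond).

    w = 0 gives the unconditional prediction, w = 1 the conditional one,
    and w > 1 pushes further toward the label-conditioned prediction.
    """
    return [
        u + guidance_scale * (c - u)
        for c, u in zip(cond_logits, uncond_logits)
    ]


# Toy logits over three candidate tokens at one masked position (assumed values).
cond = [2.0, 0.5, 0.1]    # prediction given the property label
uncond = [1.0, 1.0, 1.0]  # prediction with the label dropped

guided = cfg_logits(cond, uncond, guidance_scale=2.0)
probs = softmax(guided)
```

With a scale above 1, the token favored by the label-conditioned prediction receives more probability mass than it would under the conditional prediction alone.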

How

Conditional masked diffusion model design: uses classifier-free guidance to produce label-conditioned predictions.
MCTS-based search: at each diffusion step, selects nodes by a UCB criterion and runs simulations scored with a Pareto reward.
Dynamic Pareto constraint: controls the frontier size and updates dominance relations.
Global iterative refinement: re-selects candidates, then partially masks them so that individual properties get a chance to improve.
Inference-time optimization: handles different combinations of objectives by swapping only the reward function, without retraining the model.
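The search components above can be sketched in a few lines of Python. This is an illustrative toy, not the paper's implementation: the dictionary-based node layout, the exploration constant `c`, the binary reward, and the rule used to cap the frontier (keep the highest score sums) are all assumptions made for the example.

```python
import math


def dominates(a, b):
    """True if score vector a Pareto-dominates b (all objectives maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def update_frontier(frontier, candidate, max_size):
    """Try to add a candidate to the frontier and return (frontier, reward).

    A dominated or duplicate candidate is rejected with reward 0.0.
    Otherwise members it dominates are dropped, the candidate is added,
    and the frontier is trimmed to max_size by score sum (assumed rule).
    Reward is 1.0 only if the candidate survives the trim.
    """
    if any(dominates(f, candidate) or f == candidate for f in frontier):
        return frontier, 0.0
    kept = [f for f in frontier if not dominates(candidate, f)]
    kept.append(candidate)
    if len(kept) > max_size:
        kept = sorted(kept, key=sum, reverse=True)[:max_size]
    reward = 1.0 if candidate in kept else 0.0
    return kept, reward


def ucb_select(children, c=1.4):
    """Pick the child with the highest UCB1 score; unvisited children go first."""
    total = sum(ch["visits"] for ch in children)

    def score(ch):
        if ch["visits"] == 0:
            return math.inf
        mean = ch["value"] / ch["visits"]
        return mean + c * math.sqrt(math.log(total) / ch["visits"])

    return max(children, key=score)


# Toy run: two-objective score vectors arriving one at a time.
frontier = []
for scores in [(0.5, 0.5), (0.6, 0.4), (0.4, 0.3), (0.7, 0.6)]:
    frontier, reward = update_frontier(frontier, scores, max_size=2)
```

In this toy run the last candidate dominates everything seen before it, so the frontier ends with that single point. Capping the frontier is one simple way to keep a binary "non-dominated" reward from being handed out too freely as the frontier grows.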

Originality

CMDLM: narrows the search space by using a task-specific conditional diffusion model instead of a plain pretrained model.
Combining MCTS and diffusion: reinterprets the denoising process as sequential decision-making and applies MCTS search to it.
Dynamic Pareto constraint (CMCTD): a new constraint mechanism that explicitly addresses the frontier-expansion problem of existing Pareto optimization.
Global iterative refinement: targeted improvement through partial masking, an optimization paradigm fundamentally different from one-shot generation.
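The idea of refinement through partial masking can be illustrated with a toy loop: mask part of the current best sequence, re-fill only the masked positions, and keep the result if it scores better. Everything here is a stand-in: `random_fill` replaces the diffusion model with uniform sampling, `count_lysine` replaces the property predictors with a single scalar objective, and the greedy acceptance rule is an assumption rather than the paper's Pareto-based selection.

```python
import random

MASK = "<mask>"
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def partial_mask(sequence, mask_fraction, rng):
    """Replace a random subset of positions with MASK; the rest stay fixed."""
    n_mask = max(1, round(len(sequence) * mask_fraction))
    positions = set(rng.sample(range(len(sequence)), n_mask))
    return [MASK if i in positions else tok for i, tok in enumerate(sequence)]


def random_fill(masked, rng):
    """Hypothetical stand-in for the denoiser: fill masks uniformly at random."""
    return [rng.choice(AMINO_ACIDS) if tok == MASK else tok for tok in masked]


def count_lysine(seq):
    """Toy scalar objective: number of lysine (K) residues."""
    return sum(tok == "K" for tok in seq)


def refine(sequence, score_fn, fill_fn, rounds, mask_fraction, seed=0):
    """Repeatedly re-mask and re-fill, keeping a candidate only if it improves."""
    rng = random.Random(seed)
    best, best_score = list(sequence), score_fn(sequence)
    for _ in range(rounds):
        masked = partial_mask(best, mask_fraction, rng)
        candidate = fill_fn(masked, rng)
        candidate_score = score_fn(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score


start = list("ACDEFGHIAC")
refined, refined_score = refine(
    start, count_lysine, random_fill, rounds=50, mask_fraction=0.3
)
```

Because unmasked positions are carried over unchanged, each round only perturbs part of the sequence, which is what makes the improvement targeted rather than a fresh generation.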

Limitation & Further Study

Dependence on initial CMDLM quality: a task-specific conditional diffusion model must be pretrained, which limits extension to new protein types.
Computational complexity: MCTS search and iterative refinement are expensive, which may limit real-time applications.
Evaluation noise: global property evaluation may be imperfect, so the choice of evaluation function strongly affects the results.
Further study: few-shot adaptation methods for pretrained diffusion models, better computational efficiency, and validation of scalability to more complex property combinations (5 or more) are needed.

Evaluation

Novelty: 4/5 | Technical Soundness: 4/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

Overall assessment: MP2D is a strong paper that creatively combines diffusion, MCTS, and Pareto optimization to address a practical problem in multi-objective protein design. Its clear technical contributions, the dynamic Pareto constraint and global iterative refinement, together with the balanced optimization of 4 to 5 properties, are notable. However, the dependence on the initial model and the computational complexity need to be considered in practical use.

Related papers worth reading

Foundational work
Presents general strategies for reward-guided fine-tuning of diffusion models and provides the theoretical basis for MP2D's reward-guided sampling design.
Foundational work
Reward-guided iterative refinement of diffusion models plays a basic algorithmic role in how MP2D solves the multi-objective design problem.
Alternative approach
Covers the principles of reward-guided discrete diffusion in molecular generation, such as chemical synthesis, and can be directly compared with and referenced against MP2D's reward-based approach.
Application
Applies a range of objective-optimization strategies in multi-agent drug design and has practical points of connection with MP2D's multi-objective design problem.
Counterpoint / critique
Offers a critical perspective on the limits of generative AI in scientific research design and on human-computer interaction issues.