Agentic End-to-End De Novo Protein Design for Tailored Dynamics Using a Language Diffusion Model

Authors: Bo Ni, Markus J. Buehler | Date: 2025 | DOI: 10.48550/arXiv.2502.10173


Essence

Figure 1

Fig. 1. Workflow of developing the end-to-end protein generation model based on dynamics signature, featuring an

VibeGen is an agentic dual-model framework that uses a language diffusion model to design proteins de novo from specified normal mode vibrations. A protein designer and a protein predictor collaborate to produce designs that are both accurate and diverse.
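The designer and predictor collaboration described above can be illustrated with a small generate-then-rank loop. This is only a sketch under stated assumptions, not the paper's implementation: `toy_designer`, `toy_predictor`, the error metric, and the target profile are all hypothetical stand-ins, and the real models are trained networks rather than the toy functions used here.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_designer(target, rng):
    """Hypothetical stand-in for the protein designer: propose one random
    sequence whose length matches the target profile."""
    return "".join(rng.choice(AMINO_ACIDS) for _ in target)


def toy_predictor(sequence):
    """Hypothetical stand-in for the protein predictor: map each residue to a
    made-up amplitude in [0, 1]. A real predictor would be a trained model."""
    return [AMINO_ACIDS.index(aa) / (len(AMINO_ACIDS) - 1) for aa in sequence]


def mean_abs_error(predicted, target):
    """Average per-residue gap between predicted and target amplitudes."""
    return sum(abs(p - t) for p, t in zip(predicted, target)) / len(target)


def design(target, n_candidates=64, keep=3, seed=0):
    """Generate candidates, score each with the predictor, keep the best."""
    rng = random.Random(seed)
    candidates = {toy_designer(target, rng) for _ in range(n_candidates)}
    scored = sorted(
        (mean_abs_error(toy_predictor(seq), target), seq) for seq in candidates
    )
    return scored[:keep]


# Assumed target: a per-residue vibration amplitude profile (made-up values).
target = [0.9, 0.6, 0.3, 0.1, 0.1, 0.3, 0.6, 0.9]
for error, seq in design(target):
    print(f"{seq}  error={error:.3f}")
```

Proposing many candidates and keeping several top-ranked ones, rather than a single best, is one simple way to get designs that are both close to the target and varied.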

Motivation

Achievement

Figure 4

Fig. 4 shows some examples of the designed proteins and their normal mode shapes measured using our protocol.

How

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 3/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

Overall assessment: This paper presents an innovative end-to-end framework that integrates dynamics information directly into protein design. Through a language diffusion model and an agentic collaborative structure, it successfully implements a bidirectional mapping of the sequence-dynamics relationship in de novo protein generation. It makes a substantial contribution to protein engineering by demonstrating, through MD validation, that the designed proteins accurately reproduce the target dynamics while being entirely novel sequences free of evolutionary constraints. Experimental validation of biological function and extension to the design of complex dynamics remain future work.

Related papers worth reading

Foundational work
The Agentic End-to-End De Novo Protein Design paper presents a full pipeline for automating protein design and is closely connected to ProtAgents' approach of integrating tools and simulations.
Foundational work
A key reference for de novo protein design algorithms and performance evaluation, used in developing VibeGen's protein design methodology.
Foundational work
A study that provides the foundation for deep learning methods for generating molecular dynamics trajectories.
Foundational work
An agent-based study of de novo protein inverse design that provides the theoretical background for LLM-based inverse design workflows and practical design pipeline strategies.
Foundational work
A study that provides the methodological foundation for reinforcement-learning-based autonomous experiment control.
Alternative approach
Paper 638 proposes a multi-agent protein design framework built on multiple LLMs; it is similar to VibeGen (paper 065) but uses a different design strategy.
Alternative approach
The Agentic End-to-End De Novo Protein Design paper focuses on automating protein design and offers a point of comparison with AutoProteinEngine's LLM-based automation approach.
Alternative approach
An alternative study that uses a similar generative-model-based approach to de novo protein design.
Alternative approach
Paper 065 covers an agentic approach to protein structure design and can be contrasted with paper 594, OSDA Agent, a framework for discovering organic structure-directing agents.
Alternative approach
A similar study on language-model-based approaches to protein sequence generation.
Alternative approach
A study presenting a similar AI-based framework for protein dynamics or structure design.
Alternative approach
A similar study that applies agentic or multi-model collaborative approaches to protein design.
Alternative approach
A study with a similar methodology that addresses de novo design targeting the physical properties of proteins.
Alternative approach
Because paper 065 covers agent-based de novo protein design, it can be compared with the generative LSTM model of paper 3025 as a different design approach at the framework level.
Alternative approach
A generative de novo protein design model; it allows a comparison of approaches that account for protein dynamics and vibrational properties differently.
Alternative approach
Paper 065 covers AI-based de novo protein design; it is similar to the generative-AI amino acid redesign in paper 3262 but uses a different design framework.
Alternative approach
Paper 065 aims at agentic automation of de novo protein design, a goal similar to the compiler-verified execution of scientific protocols in paper 3263.
Alternative approach
An agent-based protein structure design framework that can be compared and contrasted with SE(3)-equivariant diffusion models such as Genie 3.
Follow-up work
A study that directly extends, or provides a foundation for, protein design or language diffusion models, the core technologies underlying VibeGen.
Follow-up work
A case that, similarly to VibeGen, applies geometric foundation models to protein-ligand modeling, showing a practical application of protein design.
Follow-up work
Agentic End-to-End De Novo Protein Design presents an agent-based method for automating the generative design of protein dynamics and is complementary to paper 3112.
Application
An agent-based de novo protein design framework that provides a link to applying design strategies in practice.