De novo design of protein structure and function with RFdiffusion

Authors: Joseph L. Watson, David Juergens, Nathaniel R. Bennett, Brian L. Trippe, Jason Yim, Helen E. Eisenach, Woody Ahern, Andrew J. Borst, Robert J. Ragotte, Lukas F. Milles, Basile I. M. Wicky, Nikita Hanikel, Samuel J. Pellock, Alexis Courbet, William Sheffler, Jue Wang, Preetham Venkatesh, Isaac Sappington, Susana Vázquez Torres, Anna Lauko, Valentin De Bortoli, Emile Mathieu, Sergey Ovchinnikov, Regina Barzilay, Tommi S. Jaakkola, Frank DiMaio, Minkyung Baek, David Baker | Date: 2023-08-31 | DOI: 10.1038/s41586-023-06415-8


Essence

Fig. 1 | Protein design using RFdiffusion.

RFdiffusion์€ RoseTTAFold ๊ตฌ์กฐ ์˜ˆ์ธก ๋„คํŠธ์›Œํฌ๋ฅผ ๋‹จ๋ฐฑ์งˆ ๊ตฌ์กฐ denoising ์ž‘์—…์œผ๋กœ fine-tuningํ•˜์—ฌ ๋‹ค์–‘ํ•œ ๋‹จ๋ฐฑ์งˆ ์„ค๊ณ„ ๋ฌธ์ œ(de novo binder, ๋Œ€์นญ ์˜ฌ๋ฆฌ๊ณ ๋จธ, ํšจ์†Œ scaffolding ๋“ฑ)๋ฅผ ํ•ด๊ฒฐํ•˜๋Š” ์ƒ์„ฑํ˜• diffusion model์ด๋‹ค.

Motivation

Achievement

Fig. 2 | Outstanding performance of RFdiffusion for monomer generation.

How

Fig. 1 | Protein design using RFdiffusion.

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 3/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

์ดํ‰: RFdiffusion์€ ๊ตฌ์กฐ ์˜ˆ์ธก ๋„คํŠธ์›Œํฌ์˜ ๊ฐ•๋ ฅํ•œ ํ‘œํ˜„๋ ฅ์„ generative diffusion model๋กœ ์ „ํ™˜ํ•˜์—ฌ ๋‹จ๋ฐฑ์งˆ ์„ค๊ณ„์˜ ๋‹ค์–‘ํ•œ ๋„์ „์„ ํ†ต์ผ์ ์œผ๋กœ ํ•ด๊ฒฐํ•˜๋Š” ํš๊ธฐ์  ๋ฐฉ๋ฒ•๋ก ์ด๋ฉฐ, ๊ด‘๋ฒ”์œ„ํ•œ ์‹คํ—˜์  ๊ฒ€์ฆ๊ณผ cryo-EM ๊ตฌ์กฐ ํ™•์ธ์œผ๋กœ ๊ทธ ์‹ค์šฉ์„ฑ๊ณผ ์ •ํ™•์„ฑ์„ ์ž…์ฆํ•œ ๋งค์šฐ ์ค‘์š”ํ•œ ๊ธฐ์—ฌ์ด๋‹ค.

๊ฐ™์ด ๋ณด๋ฉด ์ข‹์€ ๋…ผ๋ฌธ

Foundational work
Paper 344 is a survey of foundation models in bioinformatics and provides the technical and theoretical basis for the protein design diffusion model of paper 256.
The RFdiffusion-based study of protein structure and function design provides a basis for exploring molecular-level structure-function relationships beyond existing composition-centered heuristics.
A study of structure-based protein design (RFdiffusion) that explains how interpretability and generative techniques are combined.
A study of de novo protein/peptide design using RFdiffusion; it serves as a direct theoretical foundation.
Paper 256 covers the latest theory and applications of RFdiffusion-based de novo design of protein structure and function, providing key theory for the benchmark items of paper 3028.
A theoretical basis for de novo protein-ligand binding design with RFdiffusion that deepens experimental design methodology.
The RFdiffusion-based de novo protein design paper provides a theoretical basis for research on protein sequence-structure relationships, such as mutational robustness.
Provides the basic algorithms and benchmarking methodology for de novo protein structure prediction and design.
Provides a foundation for the latest major methodologies in structure-based protein-ligand binding prediction, such as AF2 and RFdiffusion.
Paper 256 presents an RFdiffusion-based de novo protein design pipeline and can serve as a representative example of the generated outputs that IARA would pre-evaluate.
Paper 256 provides the RFdiffusion approach to de novo protein structure design and is fundamentally connected to CLAIRE, a workflow for designing small-molecule binding proteins.
An LLM-based approach to de novo protein structure and function design that provides a basis for this paper's generative protein design.
Paper 256 (RFdiffusion-based generation of protein structure/function) and the Genie 3 protein diffusion model of paper 3097 are both recent structure generation methods, so cross-referencing them is useful.
Alternative approaches
Paper 256 covers RFdiffusion-based protein structure/function design and is a representative molecular generation approach to compare with paper 459, which centers on DNA sequence design.
A de novo protein structure design paper based on RFdiffusion; useful as a reference for comparing mechanisms and performance when the method is applied to general proteins rather than antibodies.
An RFdiffusion-based protein structure design model; it allows a comparison between CryoNet.Refine's one-step diffusion scheme and iterative refinement of generated structures.
Presents an alternative approach to biomolecular time-series modeling using a State Space Model.
Approaches de novo protein design with RFdiffusion, so the results of diffusion-based generation can be compared against RL-based generation.
"De novo design of protein structure and function with RFdiff" covers the de novo design of biologically functional proteins, but its application domain differs from the transfer-learning generative AI for energy materials in paper 3114.
A study proposing computational methodology for protein binder design or enzyme scaffolding.
Paper 2990 approaches the de novo protein design problem with neuro-symbolic methodology, in contrast to RFdiffusion's purely neural generative approach.
An alternative approach to autonomous scientific experimentation that uses transfer learning from simulation to experimental settings.
A zero-shot protein structure generation approach using RFdiffusion, presenting a methodology distinct from manipulating AlphaFold's internal activations.
Follow-up work
The paper "De novo design of protein structure with RFdiff" presents a new machine-learning basis for protein structure design and can be compared with ProteinMPNN.
The RFdiffusion paper is a case of extending the structure prediction and generative modeling approach of de novo protein design to high-affinity antibody design.
A study that directly extends, or provides the basis for, the protein design or language diffusion models that are the core underlying technology of VibeGen.
Paper 112 extends RFdiffusion to antibody design, deepening the protein generative modeling approach of paper 256 into a concrete life-science application.
The Latent-Y paper proposes an autonomous agent design method for de novo antibodies and shows experimental results that extend the RFdiffusion model.
Paper 2988 presents a GAT-based pre-evaluation model aimed at improving the binding-site prediction accuracy of generative protein models such as RFdiffusion.
A paper on RFdiffusion-based de novo design of protein structure and function; it serves as a practical application example for fine-grained stereoselectivity or catalyst-substrate design prediction.