On improving experimental binding affinity predictions with synthetic data

Authors: | Date: 2026-03-02 | URL: https://www.biorxiv.org/content/10.64898/2026.03.02.708607v1


Essence

Figure 1

Figure 1. Overview of the work presented. a) The construction of the SAIR-FEP dataset, which includes computed docking a

This paper presents a strategy that uses synthetic data (absolute free energy perturbation calculations, AFEP) for protein-ligand binding affinity prediction, an essential step in drug discovery. It adds about 80,000 AFEP calculations to the SAIR database to build two data splits (SAIR-FEP and SAIR-OOD), then compares a language-based model (PCM) against a structure-based deep learning model (AEVPLIG).

Motivation

Achievement

Improved PCM performance: adding physics-informed descriptors raises prediction accuracy.
Quality dependence of structure-based models: filtering by confidence score is shown to improve model performance in a predictable way.
Joint training on synthetic and experimental data: training on both at once with the SAIR-OOD split improves performance on public benchmarks (CASF 2016, AEVPLIG-OOD).
Systematic comparison: comprehensive benchmarking that clearly lays out the relative strengths and weaknesses of PCM and structure-based models.
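As a rough illustration of the confidence-score filtering and joint training described above, the sketch below merges an experimental pool with a synthetic pool and drops low-confidence synthetic samples. It is not the authors' pipeline: the `Sample` fields, the 0.8 threshold, the example values, and the convention that experimental samples carry a confidence of 1.0 are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    complex_id: str    # hypothetical identifier for a protein-ligand complex
    affinity: float    # label on a shared scale (illustrative values only)
    source: int        # 0 = experimental, 1 = synthetic (e.g. an AFEP result)
    confidence: float  # assumed structure confidence score in [0, 1]


def filter_by_confidence(samples, threshold):
    """Keep all experimental samples; keep synthetic ones at or above threshold."""
    return [s for s in samples if s.source == 0 or s.confidence >= threshold]


def build_training_set(experimental, synthetic, threshold=0.8):
    """Merge both pools into one training list.

    The integer `source` field is kept on every sample so that a model could
    use it as a source indicator (an assumption about how that would be wired).
    """
    return filter_by_confidence(experimental + synthetic, threshold)


# Made-up data, for illustration only.
experimental = [Sample("exp-1", 7.2, 0, 1.0), Sample("exp-2", 5.9, 0, 1.0)]
synthetic = [
    Sample("syn-1", 6.8, 1, 0.91),
    Sample("syn-2", 4.3, 1, 0.42),  # below threshold, dropped
    Sample("syn-3", 8.1, 1, 0.80),  # exactly at threshold, kept
]

train = build_training_set(experimental, synthetic, threshold=0.8)
kept_ids = [s.complex_id for s in train]
print(kept_ids)
```

Sweeping `threshold` and re-evaluating on a held-out set would be one way to check whether performance tracks synthetic data quality, which is the kind of dependence the review describes.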

How

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 4/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

Overall assessment: The paper presents a clear strategy for improving binding affinity prediction models by systematically using synthetic data (AFEP calculations). With the SAIR-FEP/OOD datasets, confidence-based filtering, the introduction of source embeddings, and comprehensive benchmarking, it is a strong study that offers practical value for drug discovery. It would be stronger still with deeper analysis of generalizability and of synthetic data quality.

Related papers worth reading

Foundational work
Provides the basis for the main recent methods in structure-based protein-ligand binding prediction, such as AF2 and RFdiffusion.
Foundational work
The paper "How to gain valuable insight from scarce data" covers approaches that combine ML with experimental data to make robust predictions even from scarce, limited data, which connects to the synthetic data augmentation strategy here.
Foundational work
A benchmarking study of single-cell models across multiple stages and modalities; it provides theoretical context for building datasets for drug discovery and binding affinity evaluation.
Alternative approach
The paper "Structure-guided generative design of peptides" takes a generative approach to binding affinity prediction, so the effectiveness of using synthetic data can be compared.
Alternative approach
A case study of active learning and synthetic data used to improve performance in predicting explosive properties; another example of molecule-material affinity prediction.
Alternative approach
Benchmarks docking-based in silico prediction models for antiviral drug candidates at scale, so the limits and relative performance of using synthetic versus experimental data can be compared directly.
Follow-up work
By improving the reliability of real-world binding affinity prediction and strengthening the interpretation of experimental results, spectral-map-based analyses can be linked directly to biophysical experiments.
Application
The paper on binding affinity prediction for novel antibodies and proteins can serve as a practical application of the fitness landscape generalization study addressed in FLIP2.