Protein structure-informed deep learning enables species-specific codon optimization

Authors: | Date: 2026-04-24 | URL: https://www.biorxiv.org/content/10.64898/2026.04.21.720047v1


Essence

Figure 1

Fig. 1: (A) Overview of PISCO tasks: 1. Codon optimization conditioned on host

This paper proposes PISCO, a GVP-based deep learning model that performs species-specific codon optimization by integrating protein sequence, 3D structure, and species-specific codon usage statistics. By explicitly accounting for protein structure, it models co-translational folding and outperforms existing sequence-centric methods.
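To make "species-specific codon usage statistics" concrete, here is a minimal Python sketch of a sequence-only baseline: it estimates the relative usage of synonymous codons from a host's coding sequences and back-translates a protein with the most frequent codon per amino acid. This is not PISCO and uses no structural information. The function names (`codon_usage`, `most_frequent_codon_baseline`) and the toy host sequence are hypothetical and exist only for illustration.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_usage(cds_list):
    """Relative frequency of each observed codon among its synonymous codons."""
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper()
        # Read in-frame triplets; ignore a trailing partial codon.
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:  # skip codons with ambiguous bases
                counts[codon] += 1
    totals = defaultdict(int)
    for codon, n in counts.items():
        totals[CODON_TABLE[codon]] += n
    return {codon: n / totals[CODON_TABLE[codon]] for codon, n in counts.items()}


def most_frequent_codon_baseline(protein, usage):
    """Back-translate a protein using the host's most frequent codon per residue.

    Raises KeyError if the protein contains an amino acid that never
    appears in the host sequences used to build `usage`.
    """
    best = {}
    for codon, freq in usage.items():
        aa = CODON_TABLE[codon]
        if aa not in best or freq > usage[best[aa]]:
            best[aa] = codon
    return "".join(best[aa] for aa in protein)


# Toy host "genome" with a single coding sequence (illustrative only).
toy_host_cds = ["ATGGCTGCTGCCAAATAA"]
usage = codon_usage(toy_host_cds)
print(most_frequent_codon_baseline("MAK", usage))  # prints ATGGCTAAA
```

A baseline like this always picks the same codon for a given amino acid, regardless of where the residue sits in the folded protein. That position-independence is the kind of limitation a structure-informed model is meant to address.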

Motivation

Achievement

Figure 2

Fig. 2: Species-specific distributions of global codon usage patterns across models.

How


Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 4/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

Overall assessment: This paper presents an innovative approach that explicitly integrates protein structure into codon optimization, and it demonstrates the approach's effectiveness through comprehensive ablation studies and wet-lab validation. The method's novelty and practical value are high, but an analysis of how structure-prediction errors affect the results and broader validation data still need to be addressed in future work.

Related papers worth reading

Foundational work
A paper on applying pretrained models built on the analysis of scientific-paper text; it provides a theoretical foundation for integrating protein sequence and structure information.
Foundational work
A broad review of language models for DNA sequence design and control; it provides a theoretical basis for the codon optimization problem addressed in 3223.
Foundational work
A paper on AlphaFold2-based prediction of protein surface binding; it is conceptually linked to PISCO's structure-informed binding prediction techniques.
Alternative approach
Allows the characteristics and performance of sequence-, structure-, and statistics-based models (GVP, DCA, etc.) to be compared for antibody-antigen specificity prediction.
Alternative approach
In binding-specificity prediction using protein structure information, enables comparison with computationally designed and experimentally validated models.
Alternative approach
A deep learning-based protein codon optimization method; it offers a point of comparison with the alternative RL-based approach to generating diverse sequences.
Follow-up work
Extends the structure and codon optimization problem proposed in 3223, for example by guiding a PLM with reinforcement learning to generate species-specific protein sequences.
Follow-up work
Paper 3223 applies deep learning to species-specific protein interaction prediction, illustrating an extended application of precise structural modeling with HADDOCK3 (3139).
Follow-up work
"Protein structure-informed deep learning enables species-specific serogroup prediction" and similar papers address serotype classification with different approaches, which makes them suitable for comparative analysis.
Follow-up work
Paper 3223 covers a deep learning method for species-specific protein structure prediction and can be applied in a complementary way with the cryo-EM-based structure generation of 3022.
Application
The protein binding-site prediction model of 3218 can be applied to practical problems such as cross-species specificity, using a structure-informed deep learning approach as in 3223.