BioReason-Pro: Advancing Protein Function Prediction with Multimodal Biological Reasoning

Authors: | Date: 2026-03-19 | URL: https://www.biorxiv.org/content/10.64898/2026.03.19.712954v1


Essence

Figure 1 | Overview of BioReason-Pro for protein function prediction. (A) BioReason-Pro architecture. A multi-

BioReason-Pro is the first multimodal reasoning LLM that performs protein function prediction by integrating protein embeddings (ESM3) with biological context. It predicts GO terms through an auxiliary autoregressive transformer called GO-GPT and, building on those predictions, generates structured reasoning traces that yield explainable functional annotations.

Motivation

Achievement

Figure 3 | BioReason-Pro evaluation on protein function prediction. (A) LLM-as-Judge framework. GPT-5.1 evaluated

GO-GPT performance: weighted Fmax of 0.65–0.70, exceeding the top publicly available methods from the CAFA 5 competition.
BioReason-Pro GO prediction: 73.6% Fmax.
Function summaries: LLM-judge score of 8/10.
Human expert evaluation: preferred over UniProt annotations in 79% of cases.
Binding partner prediction: de novo prediction of experimentally validated binding partners, with per-residue attention concentrated on the exact contact residues in cryo-EM structures.
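For readers unfamiliar with the Fmax figures quoted above, the following is a minimal sketch of the protein-centric Fmax as it is commonly defined in CAFA-style evaluations. It is not the paper's evaluation code. The function name `fmax`, its parameters, and the toy GO identifiers are hypothetical, and the optional `weights` mapping (for example, per-term information content) is an assumption about how a weighted variant would be supplied.

```python
from typing import Mapping, Optional, Set


def fmax(
    predictions: Mapping[str, Mapping[str, float]],
    truth: Mapping[str, Set[str]],
    weights: Optional[Mapping[str, float]] = None,
    n_thresholds: int = 101,
) -> float:
    """Protein-centric Fmax: the best F1 over a grid of score thresholds.

    predictions: protein -> {GO term -> score in [0, 1]}
    truth:       protein -> set of true GO terms
    weights:     optional GO term -> weight (assumed: information content);
                 terms missing from the mapping count as weight 0.
    """

    def mass(terms: Set[str]) -> float:
        # Unweighted: count terms. Weighted: sum the per-term weights.
        if weights is None:
            return float(len(terms))
        return sum(weights.get(t, 0.0) for t in terms)

    best = 0.0
    for k in range(n_thresholds):
        tau = k / (n_thresholds - 1)
        precisions = []
        recall_sum = 0.0
        for protein, true_terms in truth.items():
            scores = predictions.get(protein, {})
            predicted = {t for t, s in scores.items() if s >= tau}
            hit = mass(predicted & true_terms)
            pred_mass = mass(predicted)
            true_mass = mass(true_terms)
            # Precision averages only over proteins with a prediction at tau.
            if pred_mass > 0:
                precisions.append(hit / pred_mass)
            # Recall averages over every protein in the ground truth.
            if true_mass > 0:
                recall_sum += hit / true_mass
        if not precisions:
            continue
        precision = sum(precisions) / len(precisions)
        recall = recall_sum / len(truth)
        if precision + recall > 0:
            best = max(best, 2 * precision * recall / (precision + recall))
    return best


# Toy example with made-up term identifiers.
toy_predictions = {
    "P1": {"GO:A": 0.9, "GO:B": 0.4},
    "P2": {"GO:A": 0.2, "GO:C": 0.8},
}
toy_truth = {"P1": {"GO:A"}, "P2": {"GO:B", "GO:C"}}
print(round(fmax(toy_predictions, toy_truth), 3))  # prints 0.857
```

In the toy example the best threshold keeps only the high-confidence terms, giving precision 1.0 and recall 0.75. A reported value such as 73.6% corresponds to the same quantity expressed as a percentage.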

How

Originality

Limitation & Further Study

Follow-up work:

Evaluation

Novelty: 4/5 | Technical Soundness: 4/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

Overall assessment: BioReason-Pro is an innovative approach that integrates multimodal embeddings with structured reasoning for protein function prediction. Its strengths include the strong performance of GO-GPT, the high preference among human experts (79%), and the accuracy of its structurally grounded binding partner predictions. Its limitations are a dependence on synthetic training data and constraints on context length, but overall it is a meaningful work that contributes substantially to automating protein function annotation.

Related papers worth reading

Foundational work
Paper 344 surveys recent trends in core bioinformatics technologies such as foundation models and protein embeddings, providing context for BioReason-Pro.
Foundational work
Paper 015 provides a recent review of foundation models in chemistry and the life sciences, forming the theoretical basis for the multimodal protein function prediction model of 3045.
Foundational work
Provides the use of foundation models in protein function prediction, together with domain-specific autoregressive reasoning theory, as a basis for BioReason-Pro.
Alternative approach
Paper 638 applies large-scale multi-agent LLM methods to protein structure prediction and design, making it a useful reference for how protein function prediction models are integrated in practice.
Alternative approach
Interprets the mechanism by which protein language models detect repeat patterns internally, offering a contrasting perspective on BioReason-Pro's function prediction and explainability research.
Alternative approach
3045 emphasizes a different model architecture and set of experiments for protein function prediction using multimodal information, and can be compared with 3135.
Follow-up work
Paper 509 analyzes LLM-based combinatorial creativity and experimental generation capability, linking the structure and function reasoning results of 3045 to practical creativity.
Follow-up work
BioReason-Pro: Advancing Protein Function Prediction with Multimodal Biological Reasoning extends a range of AI-based function prediction approaches, including HADDOCK, and could create experimental and theoretical synergy with the methods of 3139.
Follow-up work
A study that evaluates and extends a variety of AI models for multimodal protein function prediction.