CrossLLM-Mamba: Multimodal State Space Fusion of LLMs for RNA Interaction Prediction

Authors: | Date: 2026-02-23 | URL: https://arxiv.org/abs/2602.22236


Essence

Figure 2: The CrossLLM-Mamba Model Architecture. The framework processes multi-modal inputs (Protein, RNA, or Molecule f

CrossLLM-Mamba predicts RNA-protein, RNA-small molecule, and RNA-RNA interactions by using Mamba-based state-space modeling to exploit the dynamic crosstalk between BioLLM embeddings.
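
To make the state-space idea concrete, the following is a minimal sketch and not the paper's architecture. It assumes the two modality embeddings are stacked as a two-step sequence and passed through a plain linear recurrence h_t = A h_{t-1} + B x_t, y_t = C h_t with fixed parameters. Mamba itself makes these parameters input-dependent, and the dimensions, the random stand-in embeddings, and the helper name `ssm_scan` are all hypothetical.

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """Run h_t = A h_{t-1} + B x_t, y_t = C h_t over the rows of x."""
    h = np.zeros(A.shape[0])
    ys = []
    for x_t in x:  # single pass: cost grows linearly with sequence length
        h = A @ h + B @ x_t
        ys.append(C @ h)
    return np.stack(ys)

rng = np.random.default_rng(0)
d_model, d_state = 8, 4  # illustrative sizes, not taken from the paper

# Fixed toy parameters (assumption: a real Mamba block learns these and
# conditions them on the input).
A = 0.9 * np.eye(d_state)
B = rng.normal(size=(d_state, d_model))
C = rng.normal(size=(d_model, d_state))

# Random vectors standing in for BioLLM embeddings of an RNA and a protein.
rna_emb = rng.normal(size=d_model)
protein_emb = rng.normal(size=d_model)

# Sequence [RNA, protein]: the output at the protein step carries RNA state.
fused = ssm_scan(np.stack([rna_emb, protein_emb]), A, B, C)

# The protein embedding processed on its own, for comparison.
alone = ssm_scan(protein_emb[None, :], A, B, C)

# The gap between the two is exactly the RNA contribution carried by the state.
crosstalk = fused[1] - alone[0]
```

In this toy setting the hidden state is the only channel through which the first modality can influence the output for the second, which is one simple way to read "crosstalk" in a state-space model.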

Motivation

Achievement

Figure 3: Performance Comparison on the RPI1460 Dataset. The boxplots

How

Figure 2: The CrossLLM-Mamba Model Architecture. The framework processes multi-modal inputs (Protein, RNA, or Molecule f

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 3/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

Overall assessment: CrossLLM-Mamba presents a new paradigm for biological interaction prediction based on state-space modeling. It effectively combines Mamba's linear complexity with the strength of BioLLM embeddings and achieves the best performance in three RNA interaction categories.

Related Papers

Foundational work
The structure and concepts of Multi-Modal Foundation Models provide the basis for CrossLLM-Mamba's multimodal state-space modeling.
Foundational work
Provides the theoretical basis for linking bio foundation models across modalities and for the principles of state-space modeling.
Foundational work
A study combining BioLLM embeddings with foundation models for RNA-interaction prediction, providing the theoretical basis for the CrossLLM-Mamba concept.
Alternative approach
A study proposing a similar deep learning-based methodology for RNA-protein interaction prediction.
Alternative approach
A study presenting an alternative deep learning methodology for RNA-protein or RNA-small molecule interaction prediction.
Alternative approach
A similar study applying Mamba-based state-space models to biological sequence analysis.
Alternative approach
Analyzes the spectral geometry of biological knowledge and interaction prediction in connection with multimodal embeddings, so it can be compared with the dynamic crosstalk perspective.
Alternative approach
A study covering alternative cross-attention or fusion-based methodologies for biomolecular interaction prediction.
Alternative approach
Paper 3064 also presents a deep learning method for interpreting genomic structural variants, allowing comparison with MEIsensor's insertion variant detection.
Alternative approach
A study covering alternative multimodal or LLM-based approaches to RNA interaction prediction.
Alternative approach
CORAL is a cross-attention-based deep learning framework for RNA-protein prediction, offering a contrast with CrossLLM-Mamba's state-space design.
Alternative approach
PUFFIN takes a residue-graph-level prediction approach to protein structural unit discovery, a different direction from CrossLLM-Mamba's state-space fusion method.
Alternative approach
A different methodology that evaluates the applicability range and generalization performance of MLIP.
Follow-up work
CrossLLM-Mamba strengthens multimodal state-space modeling that combines RNA and protein sequences, suggesting a way to extend Orthrus's ability to predict evolutionary and functional properties.
Follow-up work
CrossLLM-Mamba presents a multimodal model for RNA-protein interaction prediction, substantially extending the bidirectional cross-attention concept of the CORAL framework.