What Do Biological Foundation Models Compute? Sparse Autoencoders from Feature Recovery to Mechanistic Interpretability

Authors: | Date: 2026-03-04 | URL: https://www.biorxiv.org/content/10.64898/2026.03.04.709491v1


Essence

Figure 2

Figure 2. Sparse autoencoder architecture and the feature-to-experiment pipeline. (Left) Model activations (h ∈

This paper presents a systematic review of how sparse autoencoders (SAEs) decompose the internal representations of protein and DNA foundation models to learn dictionaries of biologically interpretable features, and of how this exposes the way the models organize their internal computation, which existing behavioral methods miss.
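To make the decomposition concrete, here is a minimal sketch of a single SAE forward pass and its training objective. It assumes the plain ReLU encoder with an L1 sparsity penalty; the review covers several architectures, and this particular variant, the decoder-bias centering step, and all names (`sae_forward`, `sae_loss`, `l1_coeff`) are illustrative assumptions rather than details taken from the paper.

```python
import random


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def sae_forward(h, W_enc, b_enc, W_dec, b_dec):
    """Forward pass of a plain ReLU sparse autoencoder (assumed variant).

    h      : model activation vector, length d
    W_enc  : m x d encoder weights, with m > d (overcomplete dictionary)
    W_dec  : d x m decoder weights; each column is one dictionary feature
    Returns (f, h_hat): sparse feature activations and the reconstruction.
    """
    centered = [x - b for x, b in zip(h, b_dec)]  # assumed convention
    pre = [z + b for z, b in zip(matvec(W_enc, centered), b_enc)]
    f = [max(0.0, z) for z in pre]                # ReLU keeps features non-negative
    h_hat = [z + b for z, b in zip(matvec(W_dec, f), b_dec)]
    return f, h_hat


def sae_loss(h, f, h_hat, l1_coeff=1e-3):
    """Squared reconstruction error plus an L1 penalty on feature activations."""
    recon = sum((a - b) ** 2 for a, b in zip(h, h_hat))
    sparsity = sum(abs(x) for x in f)
    return recon + l1_coeff * sparsity


# Toy example with random, untrained weights (d = 4 activations, m = 16 features).
rng = random.Random(0)
d, m = 4, 16
W_enc = [[rng.gauss(0, 0.5) for _ in range(d)] for _ in range(m)]
W_dec = [[rng.gauss(0, 0.5) for _ in range(m)] for _ in range(d)]
h = [rng.gauss(0, 1) for _ in range(d)]
f, h_hat = sae_forward(h, W_enc, [0.0] * m, W_dec, [0.0] * d)
print(len(f), len(h_hat), sae_loss(h, f, h_hat))
```

In practice the weights would be trained on activations collected from one layer of the foundation model, and the columns of `W_dec` would then serve as the feature dictionary to be annotated.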

Motivation

Achievement

Figure 1

Figure 1. Overview of behavioral interpretation methods for biological foundation models. (A) Attention analysis extract

Broader scope of SAE application: In less than a year, SAE analysis has been applied to a wide range of biological foundation models, including protein language models, genomic language models, pathology vision transformers, single-cell foundation models, and protein structure generators.
Convergent evidence: Independent studies using different architectures and evaluation strategies consistently recover features that span biological scales, such as secondary structure elements, functional domains, transcription factor binding sites, and regulatory elements.
Three-level interpretability framework: The paper introduces a hierarchical framing of interpretation at the representational, computational, and mechanistic levels.
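As an illustration of what "recovering" an annotated feature can mean in practice, the sketch below scores how well one SAE feature's per-position activations line up with a binary annotation (for example, "this residue is in a helix") using F1. This scoring choice is an assumption made for illustration; the summary above does not say which evaluation strategy each study used, and the function name, threshold, and toy data are all hypothetical.

```python
def feature_annotation_f1(activations, labels, threshold=0.0):
    """F1 between 'feature fires' (activation > threshold) and a binary label."""
    tp = fp = fn = 0
    for a, y in zip(activations, labels):
        fired = a > threshold
        if fired and y:
            tp += 1
        elif fired and not y:
            fp += 1
        elif not fired and y:
            fn += 1
    if tp == 0:
        return 0.0  # no overlap between feature and annotation
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up per-residue data: one feature's activations and a helix annotation.
acts = [0.0, 0.9, 1.2, 0.7, 0.0, 0.0, 0.3, 0.0]
helix = [0, 1, 1, 1, 1, 0, 0, 0]
print(feature_annotation_f1(acts, helix))
```

A feature that scores highly against an annotation it was never trained on is evidence of alignment at the representational level only; it does not by itself show that the model uses that feature causally.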

How

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 4/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

Overall assessment: This paper provides a systematic and balanced review of interpretability for biological foundation models and makes clear that sparse autoencoders are a new tool for revealing how internal computation is organized. However, most work in the field has not yet been peer reviewed, and routes to experimental validation and to identifying causal mechanisms are not yet established, so turning these findings into genuine biological understanding remains a task for the future.

Related papers

Foundational work
017 summarizes recent trends in the mechanistic interpretation of transformers, providing a direct theoretical basis for 3281's methodology for interpreting the internal representations of foundation models.
Alternative approach
An alternative line of work that focuses on self-reflection-based machine learning (for biological tasks) rather than on using the internal representations of foundation models.
Alternative approach
Both attempt to interpret the internal representations of biological foundation models (structure, sparsity, and so on), but 3282 focuses on topological and geometric structure, whereas 3281 pursues sparse-autoencoder-based interpretation.
Alternative approach
The What Do Biological Foundation Models Compute paper offers an empirical alternative interpretation of the problems of biological data generation and reliability.
Alternative approach
A different approach to interpreting the feature sparsity and representational properties that biological foundation models compute.
Follow-up work
344 surveys the role of foundation models in bioinformatics and serves as an extension that connects 3281's fine-grained interpretation of internal layers to actual biological meaning.
Follow-up work
Goes beyond SAE-based interpretation of internal structure and validates topological and geometric meaning in conjunction with an experimental loop.
Follow-up work
Deepens intrinsic feature extraction combined with sparse autoencoders, for example by interpreting the internal latent representations of life-science foundation models on an OT basis.
Follow-up work
Treats in depth the analysis of bio foundation model internal representations that ViraHinter addressed, and interprets the internal computational structure through sparse autoencoders.
Application
The large-scale foundation model of 3237 can be applied, as in 3281, to real tasks such as pandemic prediction or protein function inference, through biological structure recognition and sparse autoencoder applications.
Counterpoint/critique
Paper 3281 points out the computational limits of bio foundation models and prompts a reconsideration of the implications of 3091's ultra-large-scale search approach.