Advancing Mathematics Research with AI-Driven Formal Proof Search

์ €์ž: George Tsoukalas, Anton Kovsharov, Sergey Shirobokov, Anja Surina, Moritz Firsching, Gergely Bรฉrczi, Francisco J. R. Ruiz, Arun Suggala, Adam Zsolt Wagner, Eric Wieser, Lei Yu, Aja Huang, Miklรณs Z. Horvรกth, Andrew Ferrauiolo, Henryk Michalewski, Codrut Grosu, Thomas Hubert, Matej Balog, Pushmeet Kohli, Swarat Chaudhuri | ๋‚ ์งœ: 2026/05/21 | URL: https://arxiv.org/abs/2605.22763v1 📄 PDF


Essence

Figure 1

Figure 1 | Example inputs/outputs for an AlphaProof-equipped agent (applied to Erdล‘s

์ด ๋…ผ๋ฌธ์€ LLM์„ ํ™œ์šฉํ•œ ํ˜•์‹ ์ฆ๋ช… ํƒ์ƒ‰(formal proof search)์œผ๋กœ ๊ฐœ๋ฐฉํ˜• ์ˆ˜ํ•™ ๋ฌธ์ œ๋ฅผ ์ž๋™์œผ๋กœ ํ•ด๊ฒฐํ•˜๋Š” ๋ฐฉ๋ฒ•์„ ์ œ์‹œํ•œ๋‹ค. AlphaProof Nexus ํ”„๋ ˆ์ž„์›Œํฌ๋ฅผ ํ†ตํ•ด Lean ์ฆ๋ช… ์–ด์‹œ์Šคํ„ดํŠธ์™€ LLM์„ ํ†ตํ•ฉํ•˜์—ฌ Erdล‘s ๋ฌธ์ œ 353๊ฐœ ์ค‘ 9๊ฐœ, OEIS ์ถ”์ธก 492๊ฐœ ์ค‘ 44๊ฐœ๋ฅผ ์ฆ๋ช…ํ•˜์—ฌ, AI ๊ธฐ๋ฐ˜ ํ˜•์‹ ์ฆ๋ช…์˜ ์‹ค์งˆ์  ๊ฐ€์น˜๋ฅผ ์ž…์ฆํ•œ๋‹ค.

Motivation

Achievement

Figure 2

Figure 2 | Design of the full-featured AlphaProof Nexus agent. The mathematician provides

์ฃผ์š” ์„ฑ๊ณผ:

How

Figure 1

Figure 1 | Example inputs/outputs for an AlphaProof-equipped agent (applied to Erdล‘s

Originality

Limitation & Further Study

ํ•œ๊ณ„:

ํ›„์† ์—ฐ๊ตฌ: ๋น„์šฉ ํšจ์œจ์„ฑ ๊ฐœ์„ , ๋” ๊ด‘๋ฒ”์œ„ํ•œ ์ˆ˜ํ•™ ๋ถ„์•ผ ํ™•๋Œ€, ํ˜•์‹ํ™” ์ž๋™ํ™”

Evaluation

Novelty: 4/5 Technical Soundness: 4/5 Significance: 4/5 Clarity: 4/5 Overall: 4/5

์ดํ‰: ์ด ๋…ผ๋ฌธ์€ LLM ๊ธฐ๋ฐ˜ ํ˜•์‹ ์ฆ๋ช…์ด ์‹ค์ œ ๊ฐœ๋ฐฉํ˜• ์—ฐ๊ตฌ ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•  ์ˆ˜ ์žˆ์Œ์„ ์ฒ˜์Œ ๋Œ€๊ทœ๋ชจ๋กœ ์ž…์ฆํ•œ ์ค‘์š”ํ•œ ์—ฐ๊ตฌ์ด๋‹ค. ํ˜์‹ ์ ์ธ ์—์ด์ „ํŠธ ์•„ํ‚คํ…์ฒ˜์™€ ์‹ค์งˆ์ ์ธ ์ˆ˜ํ•™ ์—ฐ๊ตฌ ์„ฑ๊ณผ(Erdล‘s ๋ฌธ์ œ ํ•ด๊ฒฐ, OEIS ์ถ”์ธก ์ฆ๋ช…)๋Š” AI์˜ ์ˆ˜ํ•™ ์—ฐ๊ตฌ ๊ธฐ์—ฌ๋ฅผ ๊ตฌ์ฒด์ ์œผ๋กœ ๋ณด์—ฌ์ค€๋‹ค. ๋‹ค๋งŒ ๋น„์šฉ ํšจ์œจ์„ฑ ๊ฐœ์„  ํ•„์š” ๋ฐ ๊ธฐ๋ณธ ์—์ด์ „ํŠธ๋„ ๋™๋“ฑํ•œ ์„ฑ๋Šฅ์„ ๋ณด์ธ ์ ์€ ์ œํ•œ์‚ฌํ•ญ์ด๋‚˜, ์ „๋ฐ˜์ ์œผ๋กœ ํ˜•์‹ ์ฆ๋ช… ๊ธฐ๋ฐ˜ AI ํ™œ์šฉ์˜ ๊ฐ€๋Šฅ์„ฑ์„ ๊ฐ•ํ•˜๊ฒŒ ์ž…์ฆํ•˜๋Š” ์˜๋ฏธ ์žˆ๋Š” ๋…ผ๋ฌธ์ด๋‹ค.

๊ฐ™์ด ๋ณด๋ฉด ์ข‹์€ ๋…ผ๋ฌธ

๊ธฐ๋ฐ˜ ์—ฐ๊ตฌ
FIMO ๋ฐ์ดํ„ฐ์…‹์€ ๋Œ€๊ทœ๋ชจ ์–ธ์–ด๋ชจ๋ธ์˜ ์ˆ˜ํ•™ ์ž๋™์ •๋ฆฌ์ฆ๋ช… ์„ฑ๋Šฅ์˜ ํ•œ๊ณ„๋ฅผ ์ง€์ ํ•˜๋ฉฐ, 3372์˜ ํ˜•์‹ ์ฆ๋ช… ์ž๋™ํ™” ์—ฐ๊ตฌ์˜ ํ•„์š”์„ฑ์„ ์ด๋ก ์ ์œผ๋กœ ๋’ท๋ฐ›์นจํ•œ๋‹ค.
๊ธฐ๋ฐ˜ ์—ฐ๊ตฌ
๋”ฅ๋Ÿฌ๋‹์„ ํ†ตํ•œ ์ •๋ฆฌ ์ฆ๋ช… ์ž๋™ํ™” ๊ด€๋ จ ์„œ๋ฒ ์ด๋กœ์„œ ๋ณธ ๋…ผ๋ฌธ์˜ ์ด๋ก ์  ํ† ๋Œ€๋ฅผ ์ œ๊ณตํ•ฉ๋‹ˆ๋‹ค.
๊ธฐ๋ฐ˜ ์—ฐ๊ตฌ
504๋Š” ๊ณผํ•™ ๋ฐฉ์ •์‹ ์ž๋™ ๋ฐœ๊ฒฌ ๋ฒค์น˜๋งˆํฌ/๋ฐฉ๋ฒ•๋ก ์„ ์ œ๊ณตํ•˜๋ฉฐ, 3372์˜ LLM+Lean ๊ธฐ๋ฐ˜ ์ˆ˜ํ•™ ์ •๋ฆฌ ๋ฐœ๊ฒฌ์— ์‹ค์งˆ์  ํ‰๊ฐ€ ์ง€ํ‘œ์™€ ๋น„๊ต๊ตฐ์„ ๋งˆ๋ จํ•ฉ๋‹ˆ๋‹ค.
๊ธฐ๋ฐ˜ ์—ฐ๊ตฌ
์ˆ˜ํ•™ ๋ฐ ๊ณผํ•™ ๋ถ„์•ผ์—์„œ AI ๊ธฐ๋ฐ˜ ๊ณต์‹ ์ฆ๋ช… ๋ฐ ์‹œ๊ฐํ™” ์ž๋™ํ™”์˜ ๊ธฐ๋ฐ˜์„ ๋‹ค๋ฃจ๋Š” ๋…ผ๋ฌธ์œผ๋กœ PaperBanana์˜ ๊ธฐ์ˆ ์  ํ† ๋Œ€๊ฐ€ ๋ฉ๋‹ˆ๋‹ค.
๋‹ค๋ฅธ ์ ‘๊ทผ
์ž๋™ ์ •๋ฆฌ ์ฆ๋ช…์—์„œ ์ƒ์„ฑ ๊ธฐ๋ฐ˜ ์–ธ์–ด๋ชจ๋ธ์„ ์‚ฌ์šฉํ•˜๋Š” ๋˜ ๋‹ค๋ฅธ ๋ฐฉ์‹์œผ๋กœ ๋ณธ ๋…ผ๋ฌธ๊ณผ 3372๊ฐ€ ๋ฐฉ๋ฒ•๋ก ์ ์œผ๋กœ ๋น„๊ต๋  ์ˆ˜ ์žˆ๋‹ค.
๋‹ค๋ฅธ ์ ‘๊ทผ
Deepseek-prover๋Š” LLM ๊ธฐ๋ฐ˜ ์ž๋™ ์ •๋ฆฌ ์ฆ๋ช… ๋ถ„์•ผ์—์„œ ๋Œ€๊ทœ๋ชจ ์ฆ๋ช… ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ ๊ตฌ์ถ•์— ์ฃผ๋ชฉํ•˜๋Š” ๋Œ€์•ˆ์  ๋ฐฉ๋ฒ•์„ ์ œ์‹œํ•ฉ๋‹ˆ๋‹ค.
๋‹ค๋ฅธ ์ ‘๊ทผ
AI ๊ธฐ๋ฐ˜ ์ˆ˜ํ•™ ์ •๋ฆฌ ์ฆ๋ช… ์ž๋™ํ™”์˜ ๋Œ€์•ˆ์  ์‹œ๋„๋กœ, AI-๊ตฌ๋™ ๊ณต์‹ ์ฆ๋ช… ํ”„๋ ˆ์ž„์›Œํฌ๋ฅผ ๋‹ค๋ฃน๋‹ˆ๋‹ค.
ํ›„์† ์—ฐ๊ตฌ
AI-๊ธฐ๋ฐ˜ ์ˆ˜ํ•™ ๊ณต์‹ํ™” ์—ฐ๊ตฌ์—์„œ TheoremQA ๋ฐ์ดํ„ฐ์…‹ ํ™œ์šฉ ๋ฐ ํ™•์žฅ ์‚ฌ๋ก€๋ฅผ ํ™•์ธํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
ํ›„์† ์—ฐ๊ตฌ
3372 ๋…ผ๋ฌธ์€ ์ˆ˜๋ฆฌ์  ๊ณต์‹์ฆ๋ช…์—์„œ AI ๊ธฐ๋ฐ˜ ํฌ๋ฉ€๋ฆฌ์ฆ˜ ํ™•์žฅ ์‚ฌ๋ก€๋ฅผ ์†Œ๊ฐœํ•˜๋ฉฐ, 568์—์„œ ์ƒ์„ฑ๋˜๋Š” ๋Œ€๊ทœ๋ชจ ๋ฐ์ดํ„ฐ์…‹ ํ™œ์šฉ๋ฒ•๊ณผ ์—ฐ๊ฒฐ๋œ๋‹ค.
ํ›„์† ์—ฐ๊ตฌ
3372 ๋…ผ๋ฌธ์€ AI ๊ธฐ๋ฐ˜ ์ˆ˜ํ•™ ์—ฐ๊ตฌ์™€ ์ž๋™ ํ˜•์‹ ์ฆ๋ช… ์‹œ์Šคํ…œ์˜ ์ตœ์‹  ๋ฐœ์ „์„ ๋‹ค๋ฃจ๋ฉฐ, 489์˜ ๋…ผ๋ฆฌ์  ๊ธฐ์ดˆ๋ฅผ ์‹ค์ œ ์—ฐ๊ตฌ์— ํ™•์žฅ ์ ์šฉํ•œ ์‚ฌ๋ก€๋ฅผ ์ œ์‹œํ•ฉ๋‹ˆ๋‹ค.
ํ›„์† ์—ฐ๊ตฌ
์ด ๋…ผ๋ฌธ ์—ญ์‹œ ์ธ๊ณต์ง€๋Šฅ์„ ํ†ตํ•œ ์ž์œจ ์ˆ˜ํ•™ ์—ฐ๊ตฌ์˜ ์ „๋ฐ˜์  ํ”„๋ ˆ์ž„์›Œํฌ๋ฅผ ์ œ์‹œํ•˜๋ฉฐ, 3372์˜ ์ž๋™ ์ฆ๋ช… ํƒ์ƒ‰์„ ์ˆ˜ํ•™ ์ „์ฒด๋กœ ํ™•์žฅํ•œ๋‹ค.
ํ›„์† ์—ฐ๊ตฌ
3372๋Š” ์ˆ˜ํ•™ ์—ฐ๊ตฌ ๋ถ€๋ฌธ์—์„œ ์—์ด์ „ํŠธ ๊ธฐ๋ฐ˜ ์ž๋™ํ™”ยทํ˜‘๋ ฅ ํ”„๋ ˆ์ž„์›Œํฌ๋ฅผ ๋‹ค๋ฃจ์–ด, AI Co-Mathematician์˜ ์ „์ฒด ์‹œ์Šคํ…œ์„ SOTA๋กœ ํ™•์žฅํ•ฉ๋‹ˆ๋‹ค.
← ๋ชฉ๋ก์œผ๋กœ ๋Œ์•„๊ฐ€๊ธฐ

๐ŸŽง Audio Overview

์ด ๋…ผ๋ฌธ ๋ฆฌ๋ทฐ๋ฅผ ํŒŸ์บ์ŠคํŠธํ˜• ์˜ค๋””์˜ค๋กœ ์ƒ์„ฑํ•ฉ๋‹ˆ๋‹ค. (Gemini ยท ํ‚ค๋Š” ๋ธŒ๋ผ์šฐ์ €์—๋งŒ ์ €์žฅ ยท ์™„์„ฑ๋ณธ์€ ์ด๋ฉ”์ผ๋กœ๋„ ์ „์†ก)
โ–ธ ๊ณ ๊ธ‰: ๊ตฌ์„ฑ ๋ฐฉํ–ฅ(๋Œ€๋ณธ ์ž‘์„ฑ ์ง€์นจ) ์ง์ ‘ ์ˆ˜์ •