AI Co-Mathematician: Accelerating Mathematicians with Agentic AI

Authors: Daniel Zheng, Ingrid von Glehn, Yori Zwols, Iuliya Beloshapka, Lars Buesing, Daniel M. Roy, Martin Wattenberg, Bogdan Georgiev, Tatiana Schmidt, Andrew Cowie, Fernanda Viegas, Dimitri Kanevsky, Vineet Kahlon, Hartmut Maennel, Sophia Alj, George Holland, Alex Davies, Pushmeet Kohli | Date: 2026-05-07 | URL: https://arxiv.org/abs/2605.06651


Essence

Figure 1 | A simplified diagram of the organization of agents in a typical AI co-mathematician

The paper proposes the AI co-mathematician, a stateful agentic system that supports the full workflow of mathematical research. A project coordinator agent orchestrates several specialized agents to provide an interactive research environment spanning ideation, literature search, computational exploration, theorem proving, and theory building. The system reaches 48% accuracy on FrontierMath Tier 4, which the paper reports as state of the art (SOTA).

Motivation

Achievement

Figure 5 | Accuracy scores for Gemini 3.1 Pro, Gemini 3.1 Deep Think, and the AI co-mathematician

Design and implementation of the AI co-mathematician: development of an interactive workbench system for mathematical research.
Stateful collaborative architecture: a project coordinator agent and parallel workstreams manage long-running research.
Native mathematical outputs: the system produces artifacts familiar to the mathematical community, such as working papers, annotations, and proof tracking.
Benchmark result: 48% accuracy on FrontierMath Tier 4 (SOTA).
Qualitative validation: case studies in which working mathematicians used the system to resolve open problems, discover new research directions, and surface overlooked literature.
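The stateful, parallel-workstream architecture described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `ProjectState` class, the three stub agents, the `AGENTS` registry, and `run_project` are hypothetical names, and the stubs return fixed strings where a real system would call a model or tool. It is not the paper's implementation.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class ProjectState:
    """Hypothetical persistent project memory kept across coordinator runs."""
    goal: str
    notes: dict = field(default_factory=dict)  # workstream name -> latest result


# Stand-ins for specialized agents; real agents would call a model or tool.
def literature_search(goal: str) -> str:
    return f"candidate references for: {goal}"


def computational_exploration(goal: str) -> str:
    return f"numerical experiments for: {goal}"


def theorem_proving(goal: str) -> str:
    return f"proof attempt for: {goal}"


# Hypothetical registry mapping workstream names to agents.
AGENTS = {
    "literature": literature_search,
    "computation": computational_exploration,
    "proof": theorem_proving,
}


def run_project(state: ProjectState, workstreams: list) -> ProjectState:
    """Coordinator: run the chosen workstreams in parallel, then merge results into state."""
    with ThreadPoolExecutor(max_workers=len(workstreams)) as pool:
        futures = {name: pool.submit(AGENTS[name], state.goal) for name in workstreams}
        for name, future in futures.items():
            state.notes[name] = future.result()
    return state


state = run_project(ProjectState(goal="bound the spectral gap"), ["literature", "proof"])
# A later call reuses the same state, so earlier results are retained.
state = run_project(state, ["computation"])
print(sorted(state.notes))
```

Keeping results in one state object, rather than returning them per call, is what lets a later session build on earlier workstreams in this sketch.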

How

Figure 4 | A single workstream consists of a sequence of actions taken by a workstream coordinator
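As a rough illustration of the Figure 4 caption, a workstream can be modeled as a loop in which a coordinator picks one action at a time and records the result. The sketch below makes several assumptions: the action names in `PLAN`, the fixed ordering, and the `choose_action` and `execute` helpers are hypothetical, and a real coordinator would presumably choose actions adaptively rather than follow a fixed list.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Workstream:
    task: str
    history: list = field(default_factory=list)  # (action, result) pairs, in order


# Hypothetical fixed plan, used only to keep the sketch deterministic.
PLAN = ["search_literature", "run_computation", "draft_proof", "review"]


def choose_action(ws: Workstream) -> Optional[str]:
    """Return the next action, or None once every planned action has run."""
    step = len(ws.history)
    return PLAN[step] if step < len(PLAN) else None


def execute(action: str, task: str) -> str:
    # Stand-in for delegating the action to a tool or sub-agent.
    return f"{action} done for '{task}'"


def run_workstream(ws: Workstream, max_steps: int = 10) -> Workstream:
    """Coordinator loop: choose an action, execute it, append it to the history."""
    for _ in range(max_steps):
        action = choose_action(ws)
        if action is None:
            break
        ws.history.append((action, execute(action, ws.task)))
    return ws


ws = run_workstream(Workstream(task="lemma 2"))
print([action for action, _ in ws.history])
```

The `max_steps` cap is a design choice for the sketch: it bounds the loop even if the action chooser never signals completion.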

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 4/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

์ดํ‰: ๋ณธ ๋…ผ๋ฌธ์€ ์ˆ˜ํ•™ ์—ฐ๊ตฌ์˜ ์ „์ฒด ์›Œํฌํ”Œ๋กœ์šฐ๋ฅผ ์ง€์›ํ•˜๋Š” ํ˜์‹ ์ ์ธ ์ƒํƒœํ˜• agent ์‹œ์Šคํ…œ์„ ์ œ์‹œํ•˜๋ฉฐ, ์„ค๊ณ„ ์›์น™์˜ ์ฒ ํ•™์  ๊ทผ๊ฑฐ๊ฐ€ ํƒ„ํƒ„ํ•˜๊ณ , FrontierMath Tier 4์—์„œ SOTA ์„ฑ๊ณผ๋ฅผ ๋‹ฌ์„ฑํ–ˆ๋‹ค. ๋‹ค๋งŒ ํ˜„์žฌ ์ œํ•œ๋œ ๋ฆด๋ฆฌ์Šค ์ƒํƒœ์ด๊ณ , ์ •์„ฑ์  ๊ฒ€์ฆ ์‚ฌ๋ก€๋Š” ํ’๋ถ€ํ•˜์ง€๋งŒ ๊ด‘๋ฒ”์œ„ํ•œ ์ •๋Ÿ‰์  ํ‰๊ฐ€๊ฐ€ ์ถ”๊ฐ€์ ์œผ๋กœ ํ•„์š”ํ•˜๋‹ค.

๊ฐ™์ด ๋ณด๋ฉด ์ข‹์€ ๋…ผ๋ฌธ

๊ธฐ๋ฐ˜ ์—ฐ๊ตฌ
์ž๋™ ์ •๋ฆฌ ์ฆ๋ช…๊ณผ ์ˆ˜ํ•™ ๋ฌธ์ œ ํ•ด๊ฒฐ์„ ์œ„ํ•œ ์ƒ์„ฑํ˜• LLM์˜ ๊ธฐ์ดˆ ๋ชจ๋ธ๋ง๊ณผ ์›Œํฌํ”Œ๋กœ์šฐ ๊ฐœ์„  ๋ฐฉํ–ฅ์ด ๋…ผ์˜๋จ.
๊ธฐ๋ฐ˜ ์—ฐ๊ตฌ
133 ๋…ผ๋ฌธ์€ ์‹คํ—˜๋ฌผ๋ฆฌ ์—ฐ๊ตฌ ์ž๋™ํ™”์˜ ์‚ฌ๋ก€์ด์ง€๋งŒ, ๋ณต์žก ์ด๋ก ๊ณผ ์‹คํ—˜ ์›Œํฌํ”Œ๋กœ์šฐ ์ž๋™ํ™”์˜ ์›๋ฆฌ์™€ ๊ตฌ์กฐ๊ฐ€ ์ˆ˜ํ•™ ์—ฐ๊ตฌ ์—์ด์ „ํŠธ ์„ค๊ณ„์—๋„ ๊ฐœ๋…์  ๊ธฐ๋ฐ˜์ด ๋  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
๋‹ค๋ฅธ ์ ‘๊ทผ
482๋ฒˆ ๋…ผ๋ฌธ์€ ์ƒ๊ฐ-์ฆ๋ช…์˜ ์ƒํ˜ธ ๊ต์ฐจ ๋ฐฉ์‹์„ ํ•™์Šตํ•˜๋Š” ์ƒˆ๋กœ์šด ์‹ ๊ฒฝ ์ •๋ฆฌ์ฆ๋ช… ํ•ต์‹ฌ ์ „๋žต์„ ์ œ์‹œํ•˜์—ฌ, ์ƒํ˜ธ์ž‘์šฉ์  ์ˆ˜ํ•™ ์ž๋™ํ™”์˜ ๋‹ค์–‘ํ•œ ์„ค๊ณ„์•ˆ์„ ๋น„๊ตํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
๋‹ค๋ฅธ ์ ‘๊ทผ
AI๊ฐ€ ์ˆ˜ํ•™ ์—ฐ๊ตฌ ์ „๋ฐ˜์— ๊ฑธ์ณ ์–ด๋–ค ์—ญํ• ์„ ํ•˜๋Š”์ง€, co-mathematician๊ณผ ์œ ์‚ฌํ•˜๊ฒŒ ์—์ด์ „ํŠธ ๊ธฐ๋ฐ˜ ์ ‘๊ทผ์„ ๋…ผํ•จ.
ํ›„์† ์—ฐ๊ตฌ
3372๋Š” ์ˆ˜ํ•™ ์—ฐ๊ตฌ ๋ถ€๋ฌธ์—์„œ ์—์ด์ „ํŠธ ๊ธฐ๋ฐ˜ ์ž๋™ํ™”ยทํ˜‘๋ ฅ ํ”„๋ ˆ์ž„์›Œํฌ๋ฅผ ๋‹ค๋ฃจ์–ด, AI Co-Mathematician์˜ ์ „์ฒด ์‹œ์Šคํ…œ์„ SOTA๋กœ ํ™•์žฅํ•ฉ๋‹ˆ๋‹ค.
์‘์šฉ ์‚ฌ๋ก€
AI Co-Mathematician ๋…ผ๋ฌธ์€ MUSTARD์—์„œ ๋‹ค๋ฃฌ LLM ๊ธฐ๋ฐ˜ ์ˆ˜ํ•™ ๋ฐ์ดํ„ฐ ์ƒ์„ฑ์„ ์‹ค์ œ ์ˆ˜ํ•™์ž ์ง€์› ์‹œ์Šคํ…œ์— ์‘์šฉํ•œ ๊ตฌ์ฒด์  ์‚ฌ๋ก€๋ฅผ ์ œ์‹œํ•œ๋‹ค.
์‘์šฉ ์‚ฌ๋ก€
ํ™”ํ•™ ๋ถ„์•ผ์—์„œ LLM ๊ธฐ๋ฐ˜ ๊ณผํ•™์  ์ถ”๋ก ์„ ํ™•์žฅํ•ด ์ˆ˜ํ•™์  AI ์‹œ์Šคํ…œ์˜ ๋ฒ”์šฉ์„ฑ๊ณผ ํ•œ๊ณ„๋ฅผ ๋น„๊ต ๋ถ„์„ํ•  ์ˆ˜ ์žˆ์Œ.