GraphInstruct: A Progressive Benchmark for Diagnosing Capability Gaps in LLM Graph Generation

Authors: Zihe Wei, Sheng Xiang, Ying Zhang, Changjun Jiang | Date: 2026-05-11 | DOI: 10.48550/arxiv.2605.09997


Essence

Figure 3: Per-level Quality by capability tier, averaged over the 45 (model, strategy) configurations in

GraphInstruct is a progressive benchmark for diagnosing the graph generation ability of LLMs. It stratifies structural complexity into six levels and scores outputs along five evaluation dimensions. Across 180K outputs from 12 LLMs, it finds that multi-constraint composition discriminates between models more strongly than reasoning depth does.
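To make the value of per-level stratification concrete, here is a minimal sketch of how a single averaged score can hide a gap that a per-level breakdown exposes. The records, the model names, and the 0 to 100 quality scale are all hypothetical and chosen only for illustration; they are not results from the paper, and the function names are not part of any GraphInstruct API.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (model, level, quality on an assumed 0-100 scale).
# These values are invented for illustration and are NOT from the paper.
RECORDS = [
    ("model_a", "L0", 90), ("model_a", "L3", 70), ("model_a", "L5", 50),
    ("model_b", "L0", 100), ("model_b", "L3", 80), ("model_b", "L5", 30),
]


def overall_quality(records):
    """Flat view: one averaged score per model."""
    by_model = defaultdict(list)
    for model, _level, quality in records:
        by_model[model].append(quality)
    return {model: mean(scores) for model, scores in by_model.items()}


def per_level_quality(records):
    """Stratified view: one averaged score per (model, level) pair."""
    by_key = defaultdict(list)
    for model, level, quality in records:
        by_key[(model, level)].append(quality)
    return {key: mean(scores) for key, scores in by_key.items()}


if __name__ == "__main__":
    print(overall_quality(RECORDS))
    for (model, level), score in sorted(per_level_quality(RECORDS).items()):
        print(model, level, score)
```

In this toy data both models share the same flat average, yet they differ by 20 points at the highest level, which is the kind of difference a level-wise report is meant to surface.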

Motivation

Achievement

How

Figure 1: The GraphInstruct benchmark framework. The Progressive Instruction Layer (L0–L5)

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 4/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

Overall assessment: GraphInstruct is the first comprehensive benchmark to diagnose LLM graph generation ability precisely across structural complexity and multiple evaluation dimensions. It addresses the averaging problem of prior work and lays a foundation for method development with 180K outputs of large-scale empirical data and six key findings. It also convincingly validates an improvement pipeline built on VGIG and CAAP, and is expected to have a strong impact on the graph generation field.

Related Papers

Foundational work
145 provides the theoretical and methodological basis for evaluating the structural reasoning ability of LLMs, which GraphInstruct draws on in its design.
Alternative approach
712 presents a different approach to evaluating the complex reasoning ability of LLMs, making it an alternative point of comparison for GraphInstruct.
Alternative approach
A benchmark that evaluates the graph-related abilities of LLMs; a similar study addressing the same problem domain.
Alternative approach
A study evaluating the complex reasoning or structural understanding of LLMs; it shares a similar purpose.
Alternative approach
A benchmark study for diagnosing and evaluating LLM performance; it takes a similar approach.
Alternative approach
A benchmark for evaluating the reasoning ability of LLMs; it shares a similar methodology and goals.
Alternative approach
3288 presents another benchmark for evaluating the graph-related reasoning ability of LLMs and serves as an alternative comparison to GraphInstruct.