Retrieval-Augmented Generation for Large Language Models: A Survey

Authors: Yunfan Gao, Yun Xiong, Xinyu Gao, Kangxiang Jia, Jin Pan | Date: 2023 | URL: https://arxiv.org/abs/2312.10997


Essence

Figure 2

Fig. 2. A representative instance of the RAG process applied to question answering. It mainly consists of 3 steps. 1) In

This paper is a comprehensive survey that systematically reviews the development of Retrieval-Augmented Generation (RAG) as a way to address problems of Large Language Models (LLMs) such as hallucination, knowledge gaps, and opaque reasoning. It classifies RAG research into three paradigms (Naive RAG, Advanced RAG, and Modular RAG) and analyzes in detail the core technical components of retrieval, generation, and augmentation.

Motivation

Achievement

Figure 1

Fig. 1. Technology tree of RAG research. The stages of involving RAG mainly include pre-training, fine-tuning, and infer

• RAG paradigm taxonomy: clearly defines the evolutionary stages of Naive RAG (Retrieve-Read), Advanced RAG, and Modular RAG, and systematizes the characteristics and improvements of each stage.

• Core technology analysis: a detailed analysis of the three core components, Retrieval (indexing, query optimization, embedding), Generation (post-retrieval processing, fine-tuning), and Augmentation.

• Evaluation framework: organizes evaluation objectives, evaluation metrics, benchmarks, and evaluation tools, covering 26 tasks and about 50 datasets.

• Technology tree: visualizes the development path of RAG research across the pre-training, fine-tuning, and inference stages, making its historical evolution clear.
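To make the evaluation-metrics item above more concrete, here is a minimal sketch of three common retrieval-quality measures: hit rate at k, reciprocal rank, and precision/recall at k. The review itself does not name specific metrics, so the choice of these three, the function names, and the document IDs are all assumptions made for illustration.

```python
def hit_rate_at_k(ranked_ids, relevant_ids, k):
    """Return 1.0 if any relevant document appears in the top k results, else 0.0."""
    return 1.0 if any(d in relevant_ids for d in ranked_ids[:k]) else 0.0


def reciprocal_rank(ranked_ids, relevant_ids):
    """Return 1 / rank of the first relevant document, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def precision_recall_at_k(ranked_ids, relevant_ids, k):
    """Return (precision@k, recall@k) for a single query."""
    hits = sum(1 for d in ranked_ids[:k] if d in relevant_ids)
    precision = hits / k if k else 0.0
    recall = hits / len(relevant_ids) if relevant_ids else 0.0
    return precision, recall


# Hypothetical retriever output and ground truth for one query.
ranked = ["d3", "d1", "d7"]
relevant = {"d1", "d9"}

hit1 = hit_rate_at_k(ranked, relevant, k=1)
hit2 = hit_rate_at_k(ranked, relevant, k=2)
rr = reciprocal_rank(ranked, relevant)
p2, r2 = precision_recall_at_k(ranked, relevant, k=2)
```

Averaging `reciprocal_rank` over many queries gives mean reciprocal rank (MRR). Metrics of this kind score only the retriever; judging the generated answer needs separate measures.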

How

• Identifies the technical challenges at each stage of the three-step Naive RAG process (indexing, retrieval, generation): precision/recall problems, hallucination, augmentation coherence, and others

• Analyzes step by step how Advanced RAG and Modular RAG address these challenges

• Optimization methods for the retrieval stage (vector indexing, query transformation, embedding improvements)

• Post-retrieval processing and LLM fine-tuning techniques for the generation stage

• Compares the characteristics and efficiency of the three augmentation processes

• Presents application cases and evaluation metrics by downstream task (question answering, summarization, domain-specific tasks, and others)
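The three-step Naive RAG process named in the list above (indexing, retrieval, generation) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the survey's implementation: the bag-of-words `embed` function stands in for a real embedding model, the fixed-size word chunking is one simple choice among many, and the generation step stops at building the prompt because no particular LLM API is assumed. All function names and sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def chunk(text, size=12):
    """Indexing, part 1: split a document into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Toy embedding: bag-of-words counts (assumed stand-in for a real encoder)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(documents, size=12):
    """Indexing, part 2: store each chunk together with its vector."""
    return [(c, embed(c)) for doc in documents for c in chunk(doc, size)]


def retrieve(index, query, k=2):
    """Retrieval: rank chunks by similarity to the query and keep the top k."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(query, contexts):
    """Generation input: retrieved chunks are placed in the prompt before the question."""
    joined = "\n".join(f"- {c}" for c in contexts)
    return f"Answer using only the context below.\nContext:\n{joined}\nQuestion: {query}"


docs = [
    "RAG retrieves relevant chunks from an external knowledge base before generation.",
    "Fine-tuning updates model weights on task specific data.",
]
question = "What does RAG retrieve from the knowledge base?"

index = build_index(docs)
top = retrieve(index, question, k=1)
prompt = build_prompt(question, top)  # this string would be sent to an LLM
```

The weaknesses listed for Naive RAG show up even in this sketch: a lexical-overlap retriever can miss paraphrased content (low recall) or return loosely related chunks (low precision), and nothing forces the model to stay faithful to the retrieved context. Query transformation, better embeddings, and post-retrieval processing such as reranking would slot in around `retrieve`.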

Originality

• Classifies RAG into three clear paradigms and presents its evolutionary path systematically (earlier work treated RAG methods in a scattered way)

• A framework that separates retrieval, generation, and augmentation and analyzes both the independent characteristics of each component and their interactions

• Structures the development of RAG in the LLM era into three stages: pre-training, fine-tuning, and inference

• Systematically organizes the evaluation framework, correcting the existing bias toward methodology

Limitation & Further Study

• The scope of the paper is very broad, so in-depth analysis of each individual technique may be limited

• No quantitative comparison of whether the limitations of Naive RAG (precision/recall problems, hallucination) are fully resolved in Advanced and Modular RAG

• The presentation of evaluation benchmarks is comprehensive, but a meta-analysis of the consistency and validity of the various evaluation metrics is lacking

• Limited discussion of practical efficiency aspects such as computational cost and latency

• Proposals for future research directions are general, so a concrete technical roadmap may be lacking

Evaluation

Novelty: 4/5 Technical Soundness: 4/5 Significance: 5/5 Clarity: 4/5 Overall: 4/5

์ดํ‰: ์ด ์„ค๋ฌธ ๋…ผ๋ฌธ์€ RAG์˜ ๋น ๋ฅธ ๋ฐœ์ „์— ๋Œ€์‘ํ•˜์—ฌ 100๊ฐœ ์ด์ƒ์˜ ์—ฐ๊ตฌ๋ฅผ ์ฒด๊ณ„์ ์œผ๋กœ ์ •๋ฆฌํ•˜๊ณ  ์„ธ ๊ฐ€์ง€ ํŒจ๋Ÿฌ๋‹ค์ž„์œผ๋กœ ๋ถ„๋ฅ˜ํ•˜๋ฉฐ ํฌ๊ด„์ ์ธ ํ‰๊ฐ€ ํ”„๋ ˆ์ž„์›Œํฌ๋ฅผ ์ œ์‹œํ•จ์œผ๋กœ์จ ํ•™๊ณ„์™€ ์‚ฐ์—…์— ์‹ค์งˆ์ ์ธ ๊ธฐ์—ฌ๋ฅผ ํ•œ๋‹ค. ํŠนํžˆ LLM ์‹œ๋Œ€์˜ RAG ์ง„ํ™”๋ฅผ ๋ช…ํ™•ํžˆ ํ•˜๊ณ  retrieval-generation-augmentation์˜ ์ƒํ˜ธ์ž‘์šฉ์„ ๋ถ„์„ํ•œ ์ ์ด ๊ฐ•์ ์ด๋‹ค. ๋‹ค๋งŒ ๊ฐ ์„ธ๋ถ€ ๊ธฐ์ˆ ์— ๋Œ€ํ•œ ์‹ฌํ™” ๋ถ„์„๊ณผ ์ •๋Ÿ‰์  ๋น„๊ต, ์‹ค๋ฌด์  ํšจ์œจ์„ฑ ๋…ผ์˜๊ฐ€ ๋ณด์™„๋˜๋ฉด ๋”์šฑ ์™„์„ฑ๋„ ๋†’์€ ์ž๋ฃŒ๊ฐ€ ๋  ๊ฒƒ์œผ๋กœ ํŒ๋‹จ๋œ๋‹ค.

๊ฐ™์ด ๋ณด๋ฉด ์ข‹์€ ๋…ผ๋ฌธ

Foundational work
Research on RAG pre-training frameworks such as REALM provides the basis for exploring the history and fundamental operating principles of RAG-based LLMs.
Foundational work
3391 'Retrieval-Augmented Generation for Large Language Models' gives an overall understanding of the RAG concept and highlights recent algorithmic trends, providing a theoretical foundation for the RAG design of the 366 Futuregen approach.
Foundational work
The discussion of the role and limits of retrieval-augmented generation in mitigating hallucination connects theoretically to the problem of assessing the value of LLM hallucinations (this paper).
Foundational work
Retrieval-Augmented Generation for Large Language Models: A Survey plays a foundational role in the research design of 318 through its theoretical and empirical summary of RAG's limitations and context-length issues.
Foundational work
Retrieval-Augmented Generation for Large Language Models systematically presents the latest trends and limitations of retrieval-augmented generation, which is the core foundation of citation error detection.
Alternative approach
A recent benchmark-analysis paper focused on hybrid QA using RAG and on question answering over tables plus text.
Alternative approach
A review paper covering both theoretical and practical advances in Retrieval-Augmented Generation; its analysis by evolutionary path of RAG technology offers a complementary perspective.
Follow-up work
FRAG: A Flexible Modular Framework for Retrieval-Augmented Generation presents a new framework that balances flexibility and quality in real RAG systems, and so connects as an application and extension case in the RAG field.
Follow-up work
Turning Citation Networks Inside Out realizes a new extension of RAG applications: triple extraction based on paper content, in place of the existing knowledge graphs built from citation networks.