Nanostructured Material Design via a Retrieval-Augmented Generation (RAG) Approach: Bridging Laboratory Practice and Scientific Literature

์ €์ž: Nikita A. Krotkov, Dmitrii A. Sbytov, Anna A. Chakhoyan, Polina I. Kornienko, Anna A. Starikova, Maxim G. Stepanov, Anastasiia O. Piven, Timur A. Aliev, Tetiana Orlova, Mushegh S. Rafayelyan, Ekaterina V. Skorb | ๋‚ ์งœ: 2025-10-27 | DOI: 10.1021/acs.jcim.5c01897 📄 PDF


Essence

Figure 2

Figure 2. A schematic of a Retrieval-Augmented Generation (RAG) system processing user queries and categorizing them int

This study proposes an agent-based platform that integrates a Retrieval-Augmented Generation (RAG) system with LLMs to automate the design of nanostructured materials (particularly those fabricated by two-photon polymerization) and to extract and analyze information from vast scientific literature databases.
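
To make the retrieve-then-generate idea concrete, here is a minimal sketch of the retrieval step only. It is not the authors' implementation: the passages, the function names (`retrieve`, `build_prompt`), and the token-overlap (Jaccard) scoring are all assumptions chosen for illustration, standing in for whatever embedding search and literature database the real platform uses.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Token-overlap score between two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def retrieve(query, passages, k=2):
    """Return up to k passages with non-zero overlap, best match first."""
    q = tokenize(query)
    scored = [(jaccard(q, tokenize(p)), p) for p in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for score, p in scored[:k] if score > 0]

def build_prompt(query, passages, k=2):
    """Assemble the retrieved context and the question into one LLM prompt."""
    context = "\n".join(f"- {p}" for p in retrieve(query, passages, k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )

# Hypothetical toy corpus; a real system would index full-text papers.
PASSAGES = [
    "Two-photon polymerization writes polymer nanostructures with a focused femtosecond laser.",
    "Metal-organic frameworks are porous crystalline materials.",
    "Laser power and scan speed control feature size in two-photon polymerization.",
]

QUERY = "What controls feature size in two-photon polymerization?"

if __name__ == "__main__":
    print(build_prompt(QUERY, PASSAGES))
```

The prompt produced by `build_prompt` would then be passed to an LLM. Grounding the answer in retrieved passages rather than in the model's parametric memory is what distinguishes RAG from plain prompting.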

Motivation

Achievement

Figure 3

Figure 3 provides a comprehensive evaluation of baseline

How

Figure 4

Figure 4. Architecture and workflow of the microservice-based RAG web application. The diagram illustrates the complete

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 3/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

์ดํ‰: ์ด ์—ฐ๊ตฌ๋Š” RAG์™€ LLM์„ ํ™œ์šฉํ•˜์—ฌ ๋‚˜๋…ธ์žฌ๋ฃŒ ์„ค๊ณ„ ๋ถ„์•ผ์˜ ๋ฌธํ—Œ ๋ถ„์„์„ ํšจ๊ณผ์ ์œผ๋กœ ์ž๋™ํ™”ํ•˜๋Š” ํ˜์‹ ์  ํ”Œ๋žซํผ์„ ์ œ์‹œํ•˜๋ฉฐ, ๋†’์€ ์ •ํ™•๋„(0.82 cosine similarity, 0.81 precision)์™€ ์ง๊ด€์  ์ธํ„ฐํŽ˜์ด์Šค๋กœ ์—ฐ๊ตฌ ์ƒ์‚ฐ์„ฑ์„ ํฌ๊ฒŒ ํ–ฅ์ƒ์‹œํ‚จ๋‹ค. ๋‹ค๋งŒ domain-specific ์šฉ์–ด ์ปค๋ฒ„๋ฆฌ์ง€์™€ ์ผ๋ฐ˜ํ™” ๋Šฅ๋ ฅ ๊ฐœ์„ ์ด ํ•„์š”ํ•˜๊ณ , ํ–ฅํ›„ MatSci-LLM ๊ฐœ๋ฐœ๊ณผ ์‹คํ—˜์‹ค ์ž๋™ํ™” ํ†ตํ•ฉ์ด ์ค‘์š”ํ•œ ๊ณผ์ œ์ด๋‹ค.

๊ฐ™์ด ๋ณด๋ฉด ์ข‹์€ ๋…ผ๋ฌธ

๊ธฐ๋ฐ˜ ์—ฐ๊ตฌ
๊ณผํ•™์  ๋ฒ•์น™ ๋ฐ ์†Œ์žฌ ๊ฐ€์„ค ๋ฐœ๊ฒฌ ์ž๋™ํ™”๋ฅผ ์œ„ํ•ด ๋ฉ€ํ‹ฐ์—์ด์ „ํŠธ ํ”„๋ ˆ์ž„์›Œํฌ์˜ ์—ญํ• ๊ณผ ๊ตฌํ˜„ ๊ฐ€๋Šฅ์„ฑ์„ ์ด๋ก ์ ์œผ๋กœ ํƒ๊ตฌํ•œ๋‹ค.
๊ธฐ๋ฐ˜ ์—ฐ๊ตฌ
589๋ฒˆ(OpenFOAMGPT)์€ retrieval-augmented LLM์ด ํ™œ์šฉ๋œ ํŠน์ • ์‘์šฉ์‚ฌ๋ก€๋กœ, 569๋ฒˆ์ด ์ œ์•ˆํ•œ RAG ์‹œ์Šคํ…œ์˜ ๊ตฌ์ฒด์  ์ ์šฉ ์˜ˆ์‹œ๋กœ ์ฐธ๊ณ ํ•  ์ˆ˜ ์žˆ๋‹ค.
๊ธฐ๋ฐ˜ ์—ฐ๊ตฌ
MOF ๋…ผ๋ฌธ ๋ฐ์ดํ„ฐ๋งˆ์ด๋‹ยท์กฐ๊ฑด ์ถ”์ฒœ์„ ๋ฐ”ํƒ•์œผ๋กœ retrieval-augmented generation ๊ธฐ๋ฐ˜ ์žฌ๋ฃŒ ์„ค๊ณ„๋ฅผ ์‹ค์ œ ์ˆ˜ํ–‰ํ•œ ์‚ฌ๋ก€์ž…๋‹ˆ๋‹ค.
๊ธฐ๋ฐ˜ ์—ฐ๊ตฌ
A Survey of AI for Materials Science ๋…ผ๋ฌธ์€ ์žฌ๋ฃŒ๊ณผํ•™ ๋ถ„์•ผ LLM/RAG ํ™œ์šฉ ํŠธ๋ Œ๋“œ์™€ ํ•œ๊ณ„๋ฅผ ๋ถ„์„ํ•˜์—ฌ ๋ณธ ๋…ผ๋ฌธ์˜ ๊ธฐ์ˆ ์ , ์‚ฌํšŒ์  ๋งฅ๋ฝ์„ ์ œ๊ณตํ•œ๋‹ค.
๊ธฐ๋ฐ˜ ์—ฐ๊ตฌ
569 ๋…ผ๋ฌธ์€ ํšŒ์ˆ˜ ๋ฐ ์ ์ธต/์ด์ข…์†Œ์žฌ ์„ค๊ณ„์— ๋Œ€ํ•œ RAG ๋ฐ ์ƒ์„ฑํ˜• ๋ชจ๋ธ ์ ‘๊ทผ๋ฒ•์„ ๋‹ค๋ฃจ์–ด, 3039์˜ ์ฐจ์„ธ๋Œ€ ์ฐจํ ์†Œ์žฌ ์„ค๊ณ„ ๋…ผ์˜์— ๊นŠ์ด๋ฅผ ๋”ํ•œ๋‹ค.
๊ธฐ๋ฐ˜ ์—ฐ๊ตฌ
๊ณ ์—”ํŠธ๋กœํ”ผ ์ด‰๋งค์— ๋Œ€ํ•œ RAG ๊ธฐ๋ฐ˜ ์†Œ์žฌ์„ค๊ณ„์˜ ์›๋ฆฌ๋ฅผ ์ •๋ฆฌํ•œ ๋ฆฌ๋ทฐ๋กœ, ๋ฐ์ดํ„ฐ-์ฃผ๋„ ์ด‰๋งค ์„ค๊ณ„์˜ ๋ฐฉ๋ฒ•๋ก ์  ํ† ๋Œ€๋ฅผ ์ œ๊ณตํ•œ๋‹ค.
Alternative approach
No. 602 uses scientific literature search and a RAG approach, so it can be compared with the nanomaterial-design RAG system of No. 569.
Alternative approach
A study that presents an alternative methodology for physics-based materials discovery systems.
Alternative approach
A frame-wise LLM-based materials science automation agent (MATPilot) that presents a benchmarking framework beyond RAG-based design.
Alternative approach
Shares the goal of RAG-LLM-based automation of nanomaterial design, but targets a different domain (materials vs. nanostructures) and takes a different approach.
Alternative approach
Presents another methodology: materials design with a domain-knowledge-injected LLM rather than RAG.
Alternative approach
No. 588 is a multi-agent LLM system for fully automating CFD simulation, pointing in a direction similar to the agent-based automated design work of No. 569.
Alternative approach
No. 569 (Nanostructured Material Design) shows a contrasting approach, combining a multimodal LLM with RAG to solve nanomaterial design problems.
Alternative approach
No. 569 addresses RAG-based structure prediction and design for nanomaterials and similar systems, a use of AI that differs from the topological materials rule discovery of No. 1104.
Alternative approach
Compares contemporary methodologies that use retrieval-augmented generative AI to generate material synthesis routes or new compositions.
Alternative approach
Explores different neuro-symbolic methodologies in areas such as the use of SMILES/SMARTS patterns and generative molecular synthesis design.
Alternative approach
Presents an alternative strategy that attempts nanomaterial design with retrieval-based generative AI.
Alternative approach
Allows an empirical comparison with practical inorganic structure design approaches that use retrieval-based generative AI.
Alternative approach
A study on an alternative approach that integrates manufacturability and practical constraints into materials discovery.
Follow-up work
No. 651 likewise applies RAG-based collaborative LLM agents to drug development, making it an extension of the system architecture proposed in No. 569.
Follow-up work
Goes a step beyond extracting and using MOF experimental data and applies it to retrieval-augmented generation based materials design.
Follow-up work
A case that combines RAG with generative models and extends them to automated material synthesis and catalyst discovery.
Follow-up work
An approach that extends the integration of retrieval-augmented generation (RAG) and materials design in a more structured way.
Follow-up work
"Nanostructured Material Design via a Retrieval-Augmented Generative Model" extends the framework of No. 3113 to materials exploration, in nanomaterial generation based on generative models, evolutionary search, and spectrum fusion.
Application
A concrete platform-development case in which a materials science agent is used for actual materials design and information extraction.
Application
The RAG-based automation of nanomaterial design presented in No. 569 can serve as a real-world example of the laboratory automation that No. 614 aims for.
Application
No. 569 develops an automated design platform for nanostructured materials using RAG and LLMs, so the scientific agent architecture of No. 594 can be applied and extended to it a priori.
← ๋ชฉ๋ก์œผ๋กœ ๋Œ์•„๊ฐ€๊ธฐ

๐ŸŽง Audio Overview

์ด ๋…ผ๋ฌธ ๋ฆฌ๋ทฐ๋ฅผ ํŒŸ์บ์ŠคํŠธํ˜• ์˜ค๋””์˜ค๋กœ ์ƒ์„ฑํ•ฉ๋‹ˆ๋‹ค. (Gemini ยท ํ‚ค๋Š” ๋ธŒ๋ผ์šฐ์ €์—๋งŒ ์ €์žฅ ยท ์™„์„ฑ๋ณธ์€ ์ด๋ฉ”์ผ๋กœ๋„ ์ „์†ก)
โ–ธ ๊ณ ๊ธ‰: ๊ตฌ์„ฑ ๋ฐฉํ–ฅ(๋Œ€๋ณธ ์ž‘์„ฑ ์ง€์นจ) ์ง์ ‘ ์ˆ˜์ •