MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simulated-World Control

Authors: Enshen Zhou, Yiran Qin, Zhenfei Yin, Yuzhou Huang, Ruimao Zhang, Lu Sheng, Yu Qiao, Jing Shao | Date: 2024-03-18 | URL: https://arxiv.org/abs/2403.12037


Essence

Figure 1

Fig. 1: Comparison between MineDreamer and previous studies. In “Chop

MineDreamer is an embodied agent that uses a Chain-of-Imagination (CoI) mechanism, built on an MLLM and a diffusion model, to imagine and then execute natural-language instructions step by step in Minecraft. CoI repeatedly generates visual prompts tailored to the current state, which substantially improves the agent's instruction-following ability.
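The imagine-then-act loop described above can be sketched roughly as follows. This is a minimal toy illustration, not the paper's implementation: the "frame" is a single integer rather than an image, and `ToyEnv`, `toy_imaginator`, `toy_policy`, `chain_of_imagination`, and the `reimagine_every` interval are all hypothetical names and parameters introduced here. The only thing taken from the review is the overall pattern: an Imaginator produces a goal conditioned on the instruction and the current state, a low-level controller acts toward that goal, and the goal is regenerated as the state changes.

```python
from dataclasses import dataclass
from typing import Callable, List

# Assumption for illustration: a "frame" is an int position, not an image.
Frame = int


@dataclass
class ToyEnv:
    """Hypothetical 1-D world whose observation is the agent's position."""
    position: Frame = 0

    def observe(self) -> Frame:
        return self.position

    def step(self, action: int) -> Frame:
        self.position += action
        return self.position


def toy_imaginator(instruction: str, frame: Frame) -> Frame:
    """Stand-in for the Imaginator: returns a near-term goal that depends
    on both the instruction and the current frame."""
    target = int(instruction.split()[-1])  # e.g. "reach 10" -> 10
    horizon = 3                            # how far ahead to "imagine"
    if frame < target:
        return min(frame + horizon, target)
    return max(frame - horizon, target)


def toy_policy(frame: Frame, goal: Frame) -> int:
    """Stand-in for a goal-conditioned controller: one unit toward the goal."""
    if goal > frame:
        return 1
    if goal < frame:
        return -1
    return 0


def chain_of_imagination(
    env: ToyEnv,
    instruction: str,
    imaginator: Callable[[str, Frame], Frame],
    policy: Callable[[Frame, Frame], int],
    max_steps: int = 12,
    reimagine_every: int = 3,
) -> List[Frame]:
    """Alternate between imagining a goal for the current state and acting
    toward it. Returns the sequence of imagined goals."""
    goals: List[Frame] = []
    frame = env.observe()
    goal = frame
    for t in range(max_steps):
        if t % reimagine_every == 0:
            # Re-imagine from the *current* state, so the goal tracks progress.
            goal = imaginator(instruction, frame)
            goals.append(goal)
        frame = env.step(policy(frame, goal))
    return goals
```

Calling `chain_of_imagination(ToyEnv(), "reach 10", toy_imaginator, toy_policy)` yields a chain of intermediate goals that each depend on where the agent currently is, which is the property the review attributes to CoI. In the real system the Imaginator would be the MLLM-plus-diffusion model producing goal images; how often it is re-invoked is not stated in this review.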

Motivation

Achievement

Figure 5

Fig. 5: Performance on Programmatic Evaluation. MineDreamer surpasses the

How

Figure 2

Fig. 2: The Overview of Chain-of-Imagination. The Imaginator imagines a goal

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 3/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

Overall assessment: MineDreamer offers a creative approach to designing natural-language instruction-following agents through its Chain-of-Imagination mechanism. By combining an MLLM-enhanced diffusion model with Goal Drift Collection, it achieves markedly better performance than prior methods. Although the work is limited to the Minecraft environment, it makes an important contribution to improving the instruction-following ability of embodied AI.
