Learning Interactive Real-World Simulators

Authors: Sherry Yang, Yilun Du, Kamyar Ghasemipour, Jonathan Tompson, Leslie Kaelbling, Dale Schuurmans, Pieter Abbeel | Date: 2023-10-09 | URL: https://arxiv.org/abs/2310.06114


Essence

Figure 1: A universal simulator (UniSim). The simulator of the real-world learns from broad data with diverse

The paper proposes a universal simulator (UniSim) that builds on a generative model trained on internet data to simulate the visual outcomes of interactions by humans, robots, and other agents. It integrates diverse datasets so that actions in different modalities, such as language instructions, robot controls, and human activities, can be given as input, and it generates consistent video in response.
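One way to picture accepting actions in several modalities through a single interface is sketched below. This is a minimal illustration, not the paper's implementation: the class names, `EMBED_DIM`, and the hash-based text encoding are all hypothetical stand-ins (a real system would presumably use a learned text encoder), and the only idea taken from the summary above is that language instructions, robot controls, and human activities are all mapped to one conditioning input.

```python
from dataclasses import dataclass
from typing import List, Union
import hashlib

EMBED_DIM = 8  # hypothetical payload size, chosen only for illustration


@dataclass
class LanguageInstruction:
    text: str


@dataclass
class RobotControl:
    deltas: List[float]  # assumed low-level control values


@dataclass
class HumanActivity:
    label: str


Action = Union[LanguageInstruction, RobotControl, HumanActivity]


def _pad(values, dim=EMBED_DIM):
    """Truncate or zero-pad a sequence of floats to exactly `dim` entries."""
    values = list(values)[:dim]
    return values + [0.0] * (dim - len(values))


def _text_payload(text):
    """Placeholder text encoding: hash bytes scaled to [0, 1].

    This stands in for a learned text encoder and carries no semantics.
    """
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:EMBED_DIM]]


def encode_action(action: Action) -> List[float]:
    """Map any supported action to one fixed-length conditioning vector.

    Layout: 3-entry one-hot modality tag followed by an EMBED_DIM payload.
    """
    if isinstance(action, LanguageInstruction):
        return [1.0, 0.0, 0.0] + _text_payload(action.text)
    if isinstance(action, RobotControl):
        return [0.0, 1.0, 0.0] + _pad(action.deltas)
    if isinstance(action, HumanActivity):
        return [0.0, 0.0, 1.0] + _text_payload(action.label)
    raise TypeError(f"unsupported action type: {type(action).__name__}")
```

Because every action ends up with the same shape, a downstream video model could in principle be conditioned on any of them without knowing which dataset the action came from.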

Motivation

Achievement

Figure 3: Action-rich simulations. UniSim can support manipulation actions such as “cut carrots”, “wash

How

Figure 2: Training and inference of UniSim. UniSim is a video diffusion model trained to predict the

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 3/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

Overall assessment: This paper is a meaningful piece of work that brings heterogeneous datasets together under a unified interface to build an interactive real-world simulator, and it demonstrates both a technical implementation based on a video diffusion model and a range of possible applications. That said, more quantitative validation of realism and broader validation in real robot environments would make the contribution stronger.
