World Models

Authors: David Ha, Jürgen Schmidhuber | Date: 2018-03-27 | URL: https://arxiv.org/abs/1803.10122


Essence

Figure 3. In this work, we build probabilistic generative models of

ํ™˜๊ฒฝ์˜ ์ƒ์„ฑํ˜• ์‹ ๊ฒฝ๋ง world model์„ ๋น„์ง€๋„ํ•™์Šต์œผ๋กœ ํ•™์Šตํ•œ ํ›„, ์ถ”์ถœ๋œ ํŠน์ง•์œผ๋กœ ๊ฐ„๋‹จํ•œ policy๋ฅผ ํ›ˆ๋ จํ•˜์—ฌ ๊ฐ•ํ™”ํ•™์Šต ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•˜๋Š” ๋ฐฉ๋ฒ•์„ ์ œ์‹œํ•œ๋‹ค. ์‹ฌ์ง€์–ด world model์ด ์ƒ์„ฑํ•œ ์ƒ์ƒ์˜ ํ™˜๊ฒฝ์—์„œ ํ›ˆ๋ จํ•œ policy๋ฅผ ์‹ค์ œ ํ™˜๊ฒฝ์— ์ „์ด ๊ฐ€๋Šฅํ•จ์„ ๋ณด์ธ๋‹ค.

Motivation

Achievement

How

Figure 5. Flow diagram of a Variational Autoencoder (VAE).

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 3/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

Overall assessment: This paper elegantly combines reinforcement learning with generative models to achieve efficient policy learning, and it is an influential work that clearly demonstrates the practicality of world-model-based approaches. Its modular design and the concept of dream training greatly inspired subsequent research.
