Learning Latent Plans from Play

Authors: Corey Lynch, Mohi Khansari, Ted Xiao, Vikash Kumar, Jonathan Tompson, Sergey Levine, Pierre Sermanet | Date: 2019-03-05 | URL: https://arxiv.org/abs/1903.01973


Essence

Figure 1: Play-LMP: A single model that self-supervises control from play data, then generalizes to a wide

The paper proposes Play-LMP, a method that learns by self-supervision from unlabeled, human-teleoperated play data. It organizes behaviors in a latent plan space and reuses them to perform a variety of manipulation tasks.
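
To make the idea of a latent plan space more concrete, the sketch below shows one way such a training objective could be written. In the paper, Play-LMP pairs a plan recognition encoder, which sees a whole window of play, with a plan proposal encoder, which sees only the current state and the goal; a policy then reconstructs the play actions from a sampled plan. The function names, the diagonal-Gaussian parameterization, the squared-error reconstruction term, and the `beta` weight are all illustrative assumptions, not the authors' implementation.

```python
import math


def kl_diag_gaussians(mu_q, sigma_q, mu_p, sigma_p):
    """KL(q || p) between two diagonal Gaussians given as per-dimension lists."""
    total = 0.0
    for mq, sq, mp, sp in zip(mu_q, sigma_q, mu_p, sigma_p):
        total += math.log(sp / sq) + (sq ** 2 + (mq - mp) ** 2) / (2.0 * sp ** 2) - 0.5
    return total


def mean_squared_error(predicted, target):
    """Stand-in action reconstruction term (an assumption, not the paper's likelihood)."""
    diffs = [
        (p - t) ** 2
        for p_step, t_step in zip(predicted, target)
        for p, t in zip(p_step, t_step)
    ]
    return sum(diffs) / len(diffs)


def latent_plan_loss(recognized, proposed, predicted_actions, play_actions, beta=0.01):
    """Reconstruction error plus a KL term pulling the two encoders together.

    recognized: (mu, sigma) from the encoder that saw the full play window.
    proposed:   (mu, sigma) from the encoder that saw only current state and goal.
    beta:       hypothetical weight on the KL term.
    """
    mu_q, sigma_q = recognized
    mu_p, sigma_p = proposed
    reconstruction = mean_squared_error(predicted_actions, play_actions)
    regularizer = kl_diag_gaussians(mu_q, sigma_q, mu_p, sigma_p)
    return reconstruction + beta * regularizer


# Toy usage: one latent dimension, one timestep with a 2-D action.
loss = latent_plan_loss(
    recognized=([0.0], [1.0]),
    proposed=([1.0], [1.0]),
    predicted_actions=[[0.0, 0.0]],
    play_actions=[[1.0, 0.0]],
    beta=1.0,
)
```

The KL term is what would let the proposal encoder stand in for the recognition encoder at test time, when only the current state and a goal are available.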

Motivation

Achievement

Figure 2: The continuum of skills and its coverage. We advocate for learning the full continuum of skills

How

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 3/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

Overall assessment: By treating play data as a new source of supervision, the paper takes an innovative approach to the scalability problem in robot learning, and its combination of a dual-encoder structure with self-supervised learning handles the multimodal control problem elegantly. Despite strong empirical results in simulation and a clear presentation, validation on real robots appears important for judging the method's practical impact.
