ASE: Large-Scale Reusable Adversarial Skill Embeddings for Physically Simulated Characters

Authors: Xue Bin Peng, Yunrong Guo, Lina Halper, Sergey Levine, Sanja Fidler | Date: 2022-05-04 | URL: https://arxiv.org/abs/2205.01906


Essence

Fig. 1. Our framework enables physically simulated characters to learn versatile and reusable skill embeddings from larg

The paper presents a data-driven framework that combines adversarial imitation learning with unsupervised reinforcement learning to learn reusable skill embeddings for physically simulated characters from large, unstructured motion datasets. The learned skill embeddings transfer effectively to a variety of new tasks and synthesize natural-looking behaviors.
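
To make the combination of the two learning signals concrete, the following is a minimal sketch, not the authors' implementation. It assumes the per-step pre-training reward is the sum of a discriminator-based imitation term and a weighted skill-discovery term that rewards transitions from which the latent skill can be recovered. The function name `ase_style_reward`, the weights `beta` and `kappa`, and the unnormalized dot-product form of the encoder log-likelihood are illustrative assumptions introduced here.

```python
import math


def ase_style_reward(disc_prob, encoder_mean, latent, beta=0.5, kappa=1.0, eps=1e-6):
    """Hypothetical per-step reward: imitation term plus weighted skill-discovery term.

    disc_prob    -- discriminator's probability that the transition is from the dataset
    encoder_mean -- unit vector predicted by a skill encoder from the transition
    latent       -- unit-norm skill latent the policy was conditioned on
    beta, kappa  -- illustrative weights (assumed values, not from the review)
    """
    # Clamp so that log(1 - d) stays finite when the discriminator is fully fooled.
    d = min(max(disc_prob, 0.0), 1.0 - eps)
    imitation = -math.log(1.0 - d)

    # Unnormalized log-likelihood of the latent under an encoder centered on
    # encoder_mean: larger when the transition reveals which skill produced it.
    alignment = sum(m * z for m, z in zip(encoder_mean, latent))
    skill_discovery = kappa * alignment

    return imitation + beta * skill_discovery


if __name__ == "__main__":
    aligned = ase_style_reward(0.5, [1.0, 0.0], [1.0, 0.0])
    orthogonal = ase_style_reward(0.5, [1.0, 0.0], [0.0, 1.0])
    print(aligned, orthogonal)
```

Under these assumptions, a transition that both fools the discriminator and reveals the conditioning latent scores higher than one that only looks realistic, which is the intuition behind pairing imitation with an information-maximization objective.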

Motivation

Achievement

Fig. 4. Our framework is used to learn skill embeddings for a 37 degrees-of-

How

Fig. 2. The ASE framework consists of two stages: pre-training and transfer.

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 4/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

์ดํ‰: ๋ณธ ๋…ผ๋ฌธ์€ adversarial imitation learning๊ณผ information maximization์„ ๊ฒฐํ•ฉํ•˜์—ฌ ๋Œ€๊ทœ๋ชจ ๋น„์ •ํ˜• ๋ชจ์…˜ ๋ฐ์ดํ„ฐ๋กœ๋ถ€ํ„ฐ ์žฌ์‚ฌ์šฉ ๊ฐ€๋Šฅํ•œ ์Šคํ‚ฌ ์ž„๋ฒ ๋”ฉ์„ ํ•™์Šตํ•˜๋Š” ํ˜์‹ ์ ์ธ ํ”„๋ ˆ์ž„์›Œํฌ๋ฅผ ์ œ์‹œํ•œ๋‹ค. ์‹ญ ๋…„ ๊ทœ๋ชจ์˜ ๋Œ€๊ทœ๋ชจ ์‚ฌ์ „ ํ•™์Šต๊ณผ ํƒ์›”ํ•œ ์ „์ด ์„ฑ๋Šฅ์œผ๋กœ ๋ฌผ๋ฆฌ ๊ธฐ๋ฐ˜ ์บ๋ฆญํ„ฐ ์• ๋‹ˆ๋ฉ”์ด์…˜ ๋ถ„์•ผ์— significant contribution์„ ์ œ๊ณตํ•œ๋‹ค.
