Deep Imitation Learning for Humanoid Loco-manipulation through Human Teleoperation

Authors: Mingyo Seo, Steve Han, Kyutae Sim, Seung Hyeon Bang, Carlos Gonzalez, Luis Sentis, Yuke Zhu | Date: 2023-09-05 | URL: https://arxiv.org/abs/2309.01952


Essence

Fig. 1: Overview of TRILL. TRILL addresses the challenge of learning

This paper presents TRILL, a framework that uses deep imitation learning to learn loco-manipulation skills for a humanoid robot from human demonstration data collected through VR teleoperation. A hierarchical policy structure built on whole-body control lets the complex motions of a high-degree-of-freedom humanoid be learned data-efficiently.

Motivation

Achievement

Fig. 3: Timelapse of deploying TRILL in simulation. We present the deployment of policies trained through our method acr

How

Fig. 2: Model architecture of TRILL. The trained policies generate the target task-space command u_t at 20 Hz from the on
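The two-level structure described above (a learned policy that emits task-space commands at 20 Hz, with whole-body control underneath) can be illustrated with a minimal Python sketch. Only the 20 Hz policy rate comes from the caption; the 100 Hz control rate, the `Rates` class, `run_hierarchy`, and both dummy functions are hypothetical names and values chosen for illustration, and the proportional "controller" is a stand-in, not a whole-body controller or the authors' implementation.

```python
from dataclasses import dataclass
from typing import Callable, List

Vector = List[float]


@dataclass(frozen=True)
class Rates:
    policy_hz: int = 20    # stated in the Fig. 2 caption
    control_hz: int = 100  # ASSUMPTION: illustrative value, not from this summary

    def ticks_per_command(self) -> int:
        """Number of low-level control ticks that share one policy command."""
        if self.control_hz % self.policy_hz:
            raise ValueError("control_hz must be a multiple of policy_hz")
        return self.control_hz // self.policy_hz


def run_hierarchy(
    policy: Callable[[Vector], Vector],
    controller: Callable[[Vector, Vector], Vector],
    state: Vector,
    policy_steps: int,
    rates: Rates = Rates(),
) -> List[Vector]:
    """Run a two-rate loop and return every low-level output in order."""
    low_level_outputs: List[Vector] = []
    for _ in range(policy_steps):
        u_t = policy(state)  # high level: one task-space command per policy step
        for _ in range(rates.ticks_per_command()):
            # low level: the same command is held across several control ticks
            low_level_outputs.append(controller(state, u_t))
    return low_level_outputs


# Hypothetical stand-ins so the sketch runs end to end.
def dummy_policy(state: Vector) -> Vector:
    return [0.1 * s for s in state]


def dummy_controller(state: Vector, u_t: Vector) -> Vector:
    # Proportional pull toward the commanded target; NOT a real whole-body controller.
    return [2.0 * (u - s) for s, u in zip(state, u_t)]


outputs = run_hierarchy(dummy_policy, dummy_controller, [1.0, -1.0], policy_steps=3)
```

The point of the split is that the learned component only has to predict a low-dimensional command at a slow rate, while a faster model-based layer is responsible for turning that command into stable motion. In this sketch the state is held fixed for simplicity; a real loop would refresh it from sensors on every tick.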

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 4/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

Overall assessment: This paper presents a data-efficient deep imitation learning method for humanoid loco-manipulation. By cleverly combining it with whole-body control, it achieves both stability and learning efficiency for a high-degree-of-freedom system. As a pioneering result that is the first to successfully learn complex manipulation on a real humanoid robot, it is expected to make an important contribution to improving the autonomous capabilities of humanoids.
