Sim-to-Real Learning for Humanoid Box Loco-Manipulation

Authors: Jeremy Dao, Helei Duan, Alan Fern | Date: 2023-10-04 | URL: https://arxiv.org/abs/2310.03191


Essence

Figure 1

Fig. 1: We learn box loco-manipulation policies in simulation

This work presents a reinforcement-learning-based sim-to-real approach for box pickup and carrying with the humanoid robot Digit. It trains five separate policies (walking, standing, picking up, walking with a box, and standing with a box) and transfers them successfully to real hardware.
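As a rough illustration of what running five separate policies implies, the sketch below picks one policy per control step from a few boolean flags. The names `Policy` and `select_policy`, the flags, and the selection rule are all hypothetical and chosen only for illustration. They are not the authors' switching logic, which this review does not describe.

```python
from enum import Enum


class Policy(Enum):
    """The five separately trained policies named in the summary above."""
    WALK = "walk"
    STAND = "stand"
    PICKUP = "pickup"
    WALK_WITH_BOX = "walk_with_box"
    STAND_WITH_BOX = "stand_with_box"


def select_policy(holding_box: bool, moving: bool,
                  pickup_requested: bool = False) -> Policy:
    """Choose the active policy (hypothetical rule, not from the paper)."""
    if holding_box:
        # Once a box is held, only the two box-carrying policies apply.
        return Policy.WALK_WITH_BOX if moving else Policy.STAND_WITH_BOX
    if pickup_requested:
        # Assumed: a pickup request takes priority over plain locomotion.
        return Policy.PICKUP
    return Policy.WALK if moving else Policy.STAND
```

For example, `select_policy(holding_box=True, moving=False)` returns `Policy.STAND_WITH_BOX`. Keeping the selection in one pure function makes the assumed hand-offs between policies easy to test in isolation.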

Motivation

Achievement

How

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 3/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

Overall assessment: This paper presents the first successful sim-to-real RL result on a complex loco-manipulation task for a bipedal humanoid robot. Its significance lies in learning natural-looking motions through practical reward function design and choice of action space. There is still room for improvement, notably the rigidity of phase management and errors in box pose estimation, so the technical depth is moderate. Even so, the deployment on real hardware and the clearly stated contributions make the work highly valuable.
