Learning Humanoid Locomotion with World Model Reconstruction

Authors: Wandong Sun, Long Chen, Yongbo Su, Baoshi Cao, Yang Liu, Zongwu Xie | Date: 2025-02-22 | URL: https://arxiv.org/abs/2502.16230


Essence

Figure 2

Fig. 2: Illustration of the World Model Reconstruction framework. Our framework explicitly reconstructs world state from

This paper proposes World Model Reconstruction (WMR) for blind locomotion of humanoid robots. WMR explicitly reconstructs the world state from noisy sensor observations and uses a gradient cutoff to train the estimator and the policy independently, which yields robust locomotion over complex real-world terrain.
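The effect of a gradient cutoff between an estimator and a policy can be illustrated with a toy scalar example. This is a minimal sketch, not the paper's implementation: the linear estimator and policy, the squared-error losses (the policy loss stands in for the actual reinforcement-learning objective), and all names such as `losses_and_grads`, `w_est`, and `w_pol` are assumptions made for illustration.

```python
# Toy scalar illustration of a gradient cutoff (stop-gradient) between an
# estimator and a policy. The linear models, the squared-error losses, and
# every name here are hypothetical; they are not taken from the paper.

def losses_and_grads(w_est, w_pol, obs, true_state, target_action, cutoff=True):
    """Return (est_loss, pol_loss, grad_w_est, grad_w_pol) for the toy models."""
    est_state = w_est * obs        # estimator: reconstructs the state from an observation
    action = w_pol * est_state     # policy: acts on the reconstructed state

    est_loss = (est_state - true_state) ** 2
    pol_loss = (action - target_action) ** 2   # stand-in for the policy objective

    # Gradient of the estimation loss with respect to the estimator weight.
    grad_w_est = 2.0 * (est_state - true_state) * obs
    # Gradient of the policy loss with respect to the policy weight.
    grad_w_pol = 2.0 * (action - target_action) * est_state

    if not cutoff:
        # Without the cutoff, the policy loss also backpropagates through
        # est_state into the estimator weight (chain rule).
        grad_w_est += 2.0 * (action - target_action) * w_pol * obs

    return est_loss, pol_loss, grad_w_est, grad_w_pol


with_cutoff = losses_and_grads(0.5, 2.0, obs=1.0, true_state=1.0,
                               target_action=0.0, cutoff=True)
without_cutoff = losses_and_grads(0.5, 2.0, obs=1.0, true_state=1.0,
                                  target_action=0.0, cutoff=False)

print("estimator gradient with cutoff:   ", with_cutoff[2])
print("estimator gradient without cutoff:", without_cutoff[2])
```

With these inputs the estimator gradient is -1.0 with the cutoff and 3.0 without it. The sign flip shows how, in this toy setting, the policy loss can pull the estimator away from the true state, which is the kind of conflict a cutoff avoids. In an autodiff framework the same effect is usually obtained by detaching the estimator output before it enters the policy.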

Motivation

Achievement

Figure 1

Fig. 1: Deployment to outdoor environments. We deployed the model in an outdoor environment covered in ice and snow.

How

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 4/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

์ดํ‰: ๋ณธ ๋…ผ๋ฌธ์€ humanoid ๋กœ๋ด‡์˜ blind locomotion์„ ์œ„ํ•œ ๋ช…์‹œ์  world model reconstruction์˜ ํšจ๊ณผ๋ฅผ ์ฒด๊ณ„์ ์œผ๋กœ ์ž…์ฆํ•˜๊ณ , gradient cutoff ๋ฉ”์ปค๋‹ˆ์ฆ˜์„ ํ†ตํ•ด estimation๊ณผ policy learning์˜ ์ถฉ๋Œ์„ ์ฐฝ์˜์ ์œผ๋กœ ํ•ด๊ฒฐํ•œ๋‹ค. ๋‹จ์ผ ํ•™์Šต ๋‹จ๊ณ„๋กœ ๋ณต์žกํ•œ ์‹ค์ œ ์ง€ํ˜•์—์„œ์˜ ์žฅ๊ฑฐ๋ฆฌ ์ฃผํ–‰์„ ๋‹ฌ์„ฑํ•œ ๊ฒƒ์€ ์‹ค์งˆ์  ์ž„ํŒฉํŠธ๊ฐ€ ํฌ๋ฉฐ, 3.2 km hike์˜ ๊ตฌ์ฒด์  ์„ฑ๊ณผ๋Š” ๋ฐฉ๋ฒ•์˜ ์‹คํšจ์„ฑ์„ ๋ช…ํ™•ํžˆ ๋ณด์—ฌ์ค€๋‹ค. ๋‹ค๋งŒ ๋‹จ์ผ ๋กœ๋ด‡ ํ”Œ๋žซํผ ์‹คํ—˜๊ณผ failure case ๋ถ„์„์˜ ๋ถ€์กฑ์ด ์•„์‰ฌ์šฐ๋‚˜, ์ „์ฒด์ ์œผ๋กœ humanoid locomotion ๋ถ„์•ผ์— ์˜๋ฏธ์žˆ๋Š” ๊ธฐ์—ฌ๋ฅผ ํ•˜๋Š” ๊ณ ํ’ˆ์งˆ ์—ฐ๊ตฌ์ด๋‹ค.
