Booster Gym: An End-to-End Reinforcement Learning Framework for Humanoid Robot Locomotion

Authors: Yushi Wang, Penghui Chen, Xinyu Han, Feng Wu, Mingguo Zhao | Date: 2025-06-18 | URL: https://arxiv.org/abs/2506.15132


Essence

Figure 1

Fig. 1: Training, testing, and deployment on Booster T1

Booster Gym is an end-to-end framework for training and deploying RL-based policies for humanoid robot locomotion, covering the full path from simulation to the real robot. The framework includes domain randomization, reward function design, and handling of parallel structures, and it achieves omnidirectional walking, disturbance resistance, and terrain adaptability on the Booster T1 robot.
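To make the domain randomization idea concrete, the following is a minimal sketch of per-episode parameter sampling. It is not Booster Gym's API: the `EpisodeParams` fields, the `sample_episode_params` helper, and every numeric range are hypothetical and chosen only for illustration. The sketch assumes that each training episode draws its physical parameters independently from fixed uniform ranges.

```python
import random
from dataclasses import dataclass


@dataclass
class EpisodeParams:
    """Physical parameters drawn once per training episode (illustrative names)."""
    friction: float
    base_mass_offset_kg: float
    motor_strength_scale: float
    action_delay_steps: int


# Hypothetical ranges for illustration only; not taken from the paper.
RANGES = {
    "friction": (0.3, 1.2),
    "base_mass_offset_kg": (-1.0, 1.0),
    "motor_strength_scale": (0.9, 1.1),
}
MAX_ACTION_DELAY_STEPS = 3


def sample_episode_params(rng: random.Random) -> EpisodeParams:
    """Draw one randomized set of simulator parameters from uniform ranges."""
    return EpisodeParams(
        friction=rng.uniform(*RANGES["friction"]),
        base_mass_offset_kg=rng.uniform(*RANGES["base_mass_offset_kg"]),
        motor_strength_scale=rng.uniform(*RANGES["motor_strength_scale"]),
        action_delay_steps=rng.randint(0, MAX_ACTION_DELAY_STEPS),
    )


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sampled values are reproducible
    for episode in range(3):
        print(episode, sample_episode_params(rng))
```

Training a policy across many such draws is what is meant to keep it from overfitting to a single simulator configuration, so that it tolerates the mismatch between simulated and real dynamics.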

Motivation

Achievement

How

Figure 2

Fig. 2: An overview of the control architecture for training

Originality

Limitation & Further Study

Evaluation

Novelty: 3/5 | Technical Soundness: 3/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

Overall assessment: This paper presents a practical and complete open-source framework for RL-based training and deployment of humanoid robot locomotion, and it demonstrates that practicality through multi-simulator validation and real-robot deployment. Its academic contribution is limited, but it is valuable because it gives the robotics community a tool that can be used immediately.
