HuMam: Humanoid Motion Control via End-to-End Deep Reinforcement Learning with Mamba

์ €์ž: Yinuo Wang, Yuanyang Qi, Jinzhao Zhou, Pengxiang Meng, Xiaowen Tao | ๋‚ ์งœ: 2025-09-22 | URL: https://arxiv.org/abs/2509.18046 📄 PDF


Essence

Figure 1: Overall architecture of the proposed humanoid locomotion framework. At each time step, robot-centric and exter

HuMam is an end-to-end reinforcement-learning framework for humanoid robot locomotion control that uses a Mamba encoder as its backbone. It efficiently fuses robot-centric states with target footsteps to achieve stable, energy-efficient control.
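
To make the fusion idea concrete, here is a minimal sketch in Python. It assumes a heavily simplified selective state-space recurrence (one state value per feature, scalar gates) and is not the paper's actual Mamba encoder. All names (`selective_scan`, `fuse`, the weight lists, the three-dimensional toy vectors) are hypothetical and chosen for illustration only. The robot-centric state and the target footstep are treated as two tokens of equal length, and the fused feature is read off the last step.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable softplus; keeps the step size positive."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def selective_scan(tokens, a, w_delta, w_b, w_c):
    """Simplified selective state-space scan (illustrative assumption).

    tokens            : list of feature vectors, all of length d
    a                 : d negative decay rates (diagonal state matrix)
    w_delta, w_b, w_c : d weights each; they make the step size, input gate
                        and output gate depend on the current token
    Returns one output vector of length d per token.
    """
    d = len(a)
    h = [0.0] * d  # hidden state, one value per feature
    outputs = []
    for x in tokens:
        # Input-dependent ("selective") parameters for this token.
        delta = softplus(sum(w * xi for w, xi in zip(w_delta, x)))
        b = sum(w * xi for w, xi in zip(w_b, x))
        c = sum(w * xi for w, xi in zip(w_c, x))
        # Discretised diagonal recurrence: h <- exp(delta * a) * h + delta * b * x
        h = [math.exp(delta * a[i]) * h[i] + delta * b * x[i] for i in range(d)]
        outputs.append([c * h[i] for i in range(d)])
    return outputs


def fuse(robot_state, target_footstep, params):
    """Fuse two tokens and return the output at the final step."""
    return selective_scan([robot_state, target_footstep], **params)[-1]


# Hypothetical toy parameters and inputs.
params = {
    "a": [-1.0, -0.5, -0.1],
    "w_delta": [0.1, 0.2, 0.3],
    "w_b": [1.0, 0.5, 0.2],
    "w_c": [1.0, 0.0, 0.0],
}
fused = fuse([0.1, -0.2, 0.3], [0.5, 0.0, -0.5], params)
```

The sketch only illustrates that the step size and gates depend on the current token, so the recurrence can decide how much of the robot-centric state to carry forward when the footstep token arrives. A real Mamba block additionally uses a higher-dimensional state, learned projections, a convolution, and gating.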

Motivation

Achievement

Figure 3: Training curves of HuMam and Baseline across

How

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 3/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

์ดํ‰: HuMam์€ Mamba๋ฅผ ํ™œ์šฉํ•œ ํœด๋จธ๋…ธ์ด๋“œ ๋ณดํ–‰ ์ œ์–ด์˜ ์ฒซ ์„ฑ๊ณต ์‚ฌ๋ก€๋กœ, ํ•™์Šต ํšจ์œจ์„ฑ๊ณผ ์—๋„ˆ์ง€ ํšจ์œจ์„ฑ์„ ๋™์‹œ์— ๊ฐœ์„ ํ•˜๋Š” ์‹ค์งˆ์  ๊ธฐ์—ฌ๋ฅผ ํ•œ๋‹ค. ๋‹ค๋งŒ ์‹œ๋ฎฌ๋ ˆ์ด์…˜ ๊ธฐ๋ฐ˜ ๊ฒฐ๊ณผ์™€ ๋‹จ์ผ ํ”Œ๋žซํผ ๊ฒ€์ฆ์˜ ์ œ์•ฝ์ด ์žˆ์–ด ์‹ค์ œ ์‘์šฉ ๊ฐ€๋Šฅ์„ฑ ์ž…์ฆ์„ ์œ„ํ•œ ์ถ”๊ฐ€ ์—ฐ๊ตฌ๊ฐ€ ํ•„์š”ํ•˜๋‹ค.
