Cost-Matching Model Predictive Control for Efficient Reinforcement Learning in Humanoid Locomotion

Authors: | Date: 2026-03-30 | URL: https://arxiv.org/abs/2603.28243


Essence

Fig. 1: Cost-Matching MPC-RL framework for humanoids.

The paper proposes Cost-Matching MPC, a method that removes the computational burden of repeatedly solving the MPC problem when an MPC controller for humanoid locomotion is trained with RL. Learning stays efficient because the method minimizes the mismatch between the parameterized MPC's cost-to-go and the returns actually measured from rollouts.
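The cost-matching idea can be illustrated with a toy sketch. This is not the authors' implementation: the scalar dynamics, the fixed feedback gain, the cost weights, the discount factor, and all function names below are hypothetical, and a one-parameter quadratic `theta * x^2` stands in for the parameterized MPC cost-to-go. The sketch only shows the fitting step: collect measured returns from rollouts, then choose the parameter that minimizes the squared mismatch between the predicted cost-to-go and those returns.

```python
# Toy cost-matching sketch: fit a parameterized cost-to-go to measured returns.
# All constants and names are hypothetical stand-ins, not the paper's setup.

A, B = 0.9, 0.5   # assumed scalar dynamics: x_next = A*x + B*u
K = 0.4           # assumed fixed feedback gain: u = -K*x
Q, R = 1.0, 0.1   # assumed stage-cost weights: Q*x^2 + R*u^2
GAMMA = 0.95      # assumed discount factor


def measured_return(x0, horizon=400):
    """Roll out the closed loop from x0 and sum the discounted stage costs."""
    x, total, discount = x0, 0.0, 1.0
    for _ in range(horizon):
        u = -K * x
        total += discount * (Q * x * x + R * u * u)
        x = A * x + B * u
        discount *= GAMMA
    return total


def cost_to_go(theta, x):
    """Parameterized cost-to-go V_theta(x) = theta * x^2 (stand-in for the MPC value)."""
    return theta * x * x


def fit_theta(states, returns):
    """Closed-form least-squares minimizer of sum_i (V_theta(x_i) - G_i)^2."""
    num = sum((x * x) * g for x, g in zip(states, returns))
    den = sum((x * x) ** 2 for x in states)
    return num / den


states = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
returns = [measured_return(x) for x in states]
theta = fit_theta(states, returns)

# Remaining mismatch between the fitted cost-to-go and the measured returns.
mismatch = sum((cost_to_go(theta, x) - g) ** 2 for x, g in zip(states, returns))
print(f"fitted theta = {theta:.4f}, squared mismatch = {mismatch:.3e}")
```

In this toy case the true return is exactly quadratic in the state, so the fit drives the mismatch to numerical zero. With a real parameterized MPC and humanoid rollouts the cost-to-go would generally only approximate the returns, and the fit would more likely be an iterative gradient-based update than a closed-form solve.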

Motivation

Achievement

Fig. 4: Simulation snapshots of the humanoid during locomo-

How

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 3/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

Overall assessment: This paper presents a creative cost-matching method that addresses the computational bottleneck of MPC-RL and applies it systematically to the complex problem of humanoid robot control, making it a strong piece of work. However, the absence of validation on a real robot limits its impact, so follow-up research on sim-to-real transfer is needed.
