Sim-to-Real Reinforcement Learning for Vision-Based Dexterous Manipulation on Humanoids

Authors: Toru Lin, Kartik Sachdev, Linxi Fan, Jitendra Malik, Yuke Zhu | Date: 2025-02-27 | URL: https://arxiv.org/abs/2502.20396


Essence

Figure 2: A sim-to-real RL recipe for vision-based dexterous manipulation. We close the environment

๋ณธ ๋…ผ๋ฌธ์€ ํœด๋จธ๋…ธ์ด๋“œ ๋กœ๋ด‡์˜ ๋‹ค์ค‘ ์†๊ฐ€๋ฝ ์†์„ ์ด์šฉํ•œ ์‹œ๊ฐ ๊ธฐ๋ฐ˜ ์ •๊ตํ•œ ์กฐ์ž‘์„ ์œ„ํ•ด sim-to-real RL์„ ์ ์šฉํ•˜๋Š” ์‹ค์šฉ์ ์ธ ๋ ˆ์‹œํ”ผ๋ฅผ ์ œ์‹œํ•˜๋ฉฐ, ์ž๋™ํ™”๋œ ์‹ค-์‹œ๋ฎฌ๋ ˆ์ด์…˜ ํŠœ๋‹, ์ผ๋ฐ˜ํ™”๋œ ๋ณด์ƒ ์„ค๊ณ„, ๋ถ„ํ• -์ •๋ณต ์ •์ฑ… ์ฆ๋ฅ˜, ํ•˜์ด๋ธŒ๋ฆฌ๋“œ ๊ฐ์ฒด ํ‘œํ˜„์„ ํ†ตํ•ฉํ•œ๋‹ค.

Motivation

Achievement

Figure 1: Overview. We train a humanoid robot with two multi-fingered hands to perform a range of contact-

How


Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 3/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

์ดํ‰: ๋ณธ ๋…ผ๋ฌธ์€ sim-to-real RL์„ ์‹ค์ œ ํœด๋จธ๋…ธ์ด๋“œ ๋‹ค์ค‘ ์†๊ฐ€๋ฝ ์กฐ์ž‘์œผ๋กœ ์ฒ˜์Œ ํ™•์žฅํ•˜๋Š” ์‹ค์šฉ์ ์ด๊ณ  ํฌ๊ด„์ ์ธ ์†”๋ฃจ์…˜์„ ์ œ์‹œํ•˜๋ฉฐ, ์ž๋™ํ™”๋œ ์‹œ์Šคํ…œ ์‹๋ณ„๊ณผ ์ •์ฑ… ์ฆ๋ฅ˜ ๋“ฑ ์—ฌ๋Ÿฌ ํ˜์‹ ์„ ํ†ตํ•ด ๋†’์€ ์„ฑ๊ณต๋ฅ ๊ณผ ์ผ๋ฐ˜ํ™” ๋Šฅ๋ ฅ์„ ์ž…์ฆํ•œ๋‹ค. ๋‹ค๋งŒ ๋ฏธ๋ณธ ๊ฐ์ฒด ์„ฑ๋Šฅ๊ณผ ๋ฐฉ๋ฒ•์˜ ๋ณต์žก์„ฑ ๊ฐœ์„ ์—๋Š” ์—ฌ์ง€๊ฐ€ ์žˆ๋‹ค.
