GR-RL: Going Dexterous and Precise for Long-Horizon Robotic Manipulation

์ €์ž: Yunfei Li, Xiao Ma, Jiafeng Xu, Yu Cui, Zhongren Cui, Zhigang Han, Liqun Huang, Tao Kong, Yuxiao Liu, Hao Niu, Wanli Peng, Jingchao Qiao, Zeyu Ren, Haixin Shi, Zhi Su, Jiawen Tian, Yuyang Xiao, Shenyu Zhang, Liwei Zheng, Hang Li, Yonghui Wu | ๋‚ ์งœ: 2025-12-01 | URL: https://arxiv.org/abs/2512.01801 📄 PDF


Essence

Figure 1 GR-RL performs long-horizon, dexterous, and high-precision manipulation, in the task of shoe lacing, by

GR-RL is a robot learning framework that turns a generalist vision-language-action (VLA) policy into a high-precision specialist policy for long-horizon, complex manipulation through a multi-stage training pipeline: data filtering, morphological symmetry augmentation, and online RL.
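To make the symmetry-augmentation stage more concrete, here is a minimal sketch of mirroring one bimanual training sample. The review does not describe the actual procedure, so everything here is an assumption for illustration: the sample layout (`image`, `left_action`, `right_action`), the helper name `mirror_sample`, and the choice of index 1 as the lateral axis are all hypothetical and not taken from the paper.

```python
from copy import deepcopy


def mirror_sample(sample, lateral_axis=1):
    """Return a left-right mirrored copy of one bimanual training sample.

    Assumed (hypothetical) layout: sample["image"] is a list of pixel rows,
    and sample["left_action"] / sample["right_action"] are [x, y, z] deltas.
    """
    mirrored = deepcopy(sample)

    # Flip every image row so the scene is reflected left-to-right.
    mirrored["image"] = [row[::-1] for row in sample["image"]]

    def reflect(action):
        # Negate the component along the assumed lateral axis.
        return [-v if i == lateral_axis else v for i, v in enumerate(action)]

    # Swap the arms: the mirrored left arm does what the right arm did.
    mirrored["left_action"] = reflect(sample["right_action"])
    mirrored["right_action"] = reflect(sample["left_action"])
    return mirrored


sample = {
    "image": [[1, 2, 3], [4, 5, 6]],
    "left_action": [0.1, 0.2, 0.0],
    "right_action": [0.3, -0.4, 0.5],
}
mirrored = mirror_sample(sample)
```

Under these assumptions, mirroring twice returns the original sample, which is a cheap sanity check that the image flip and the arm swap are consistent with each other.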

Motivation

Achievement

Figure 5 Left: the success rate of our multi-stage training recipe. Data filtering, mirror augmentation, and online

How

Figure 2

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 3/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

์ดํ‰: GR-RL์€ ์ธ๊ฐ„ ์‹œ์—ฐ์˜ ๋ถ€๋ถ„์ตœ์ ์„ฑ๊ณผ ํ•™์Šต-๋ฐฐํฌ ๋ถˆ์ผ์น˜๋ผ๋Š” ์‹ค์งˆ์  ๋ฌธ์ œ๋ฅผ ์ฒด๊ณ„์ ์œผ๋กœ ํ•ด๊ฒฐํ•˜๋Š” ์‹ค์šฉ์ ์ธ ๋‹ค๋‹จ๊ณ„ ํŒŒ์ดํ”„๋ผ์ธ์„ ์ œ์‹œํ•˜๋ฉฐ, ์‹ ๋ฐœ๋ˆ ๊ฟฐ๊ธฐ์™€ ๊ฐ™์€ ๊ทน๋„๋กœ ์ •๋ฐ€ํ•œ ์กฐ์ž‘ ๊ณผ์ œ๋ฅผ ์„ฑ๊ณต์‹œํ‚ด์œผ๋กœ์จ ๋กœ๋ด‡ ๊ธฐ์ดˆ ๋ชจ๋ธ์˜ ์ „๋ฌธํ™” ๋ฐฉํ–ฅ์„ ์ œ์‹œํ•˜๋Š” ์ค‘์š”ํ•œ ๊ธฐ์—ฌ๋ฅผ ํ•œ๋‹ค.
