RoboBrain: A Unified Brain Model for Robotic Manipulation from Abstract to Concrete

Authors: Yuheng Ji, Huajie Tan, Jiayu Shi, Xiaoshuai Hao, Yuan Zhang, Hengyuan Zhang, Pengwei Wang, Mengdi Zhao, Yao Mu, Pengju An, Xinda Xue, Qinghang Su, Huaihai Lyu, Xiaolong Zheng, Jiaming Liu, Zhongyuan Wang, Shanghang Zhang | Date: 2025-02-28 | URL: https://arxiv.org/abs/2502.21257


Essence

Figure 1. Overview of RoboBrain. RoboBrain consists of three key robotic capabilities: planning capability, affordance p

RoboBrain is a unified MLLM for robotic manipulation with three core capabilities: Planning Capability, Affordance Perception, and Trajectory Prediction. To train it, the authors present ShareRobot, a large-scale, high-quality heterogeneous dataset.

Motivation

Achievement

Figure 5. The performance of our model RoboBrain on the OpenEQA, ShareRobot, and RoboVQA benchmarks. RoboBrain surpassed

How

Figure 2. The generation process of our ShareRobot dataset. Our dataset labels multi-dimensional information, includi

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 3/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

Overall assessment: RoboBrain systematically defines three core capabilities for robotic manipulation and presents both an MLLM that integrates them and a high-quality dataset, making a meaningful contribution to improving the concrete execution ability of robot AI.
