RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots
Authors: Soroush Nasiriany, Abhiram Maddukuri, Lance Zhang, Adeet Parikh, Aaron Lo, Abhishek Joshi, Ajay Mandlekar, Yuke Zhu | Date: 2024-06-04 | URL: https://arxiv.org/abs/2406.02523
Essence
Fig. 1: Overview of RoboCasa. RoboCasa is a simulation framework for training generalist robot agents. Four pillars unde
RoboCasa is a large-scale robot simulation framework centered on kitchen environments. It uses generative AI to obtain diverse 3D assets and tasks, and it supports generalist robot training with more than 100K synthetic trajectories.
Motivation
- Known: Robot learning requires large-scale datasets, and collecting real-world data is costly and labor-intensive. Simulation-based robot learning and imitation learning have recently drawn attention.
- Gap: Existing simulation frameworks rarely offer realistic physics, diverse scenes/assets, room-scale environments, and large-scale datasets all at once. No framework integrates large-scale asset and task generation with generative AI.
- Why: Scaling robot learning requires large-scale synthetic data from realistic and diverse environments, which in turn improves the performance of real-world robot deployment.
- Approach: RoboCasa combines a MuJoCo-based modular framework with generative AI tools (text-to-3D, text-to-image) to build 2,500+ 3D objects, 120 kitchen scenes, and 100 tasks, and it expands the dataset through human demonstrations and automated trajectory generation with MimicGen.
Achievement
- Framework: Developed a simulation framework with realistic kitchen scenes, diverse interactable furniture/appliances, and cross-embodiment support (mobile manipulator, humanoid, quadruped)
- AI-assisted asset/task generation: Built 2,500+ objects (150+ categories) and 100 tasks (25 atomic + 75 composite) using text-to-3D, text-to-image, and LLMs
- Large-scale dataset: Obtained 100K+ trajectories from human demonstrations and MimicGen-based automated generation
- Scaling validation: Demonstrated empirically that performance improves as synthetic data increases, and showed the effectiveness of simulation co-training in a real-world kitchen
How
Fig. 3: Kitchen Floor Plans. We consult home planning and architecture magazines and compile a list of common kitchen fl
- Kitchen design: Modeled diverse kitchen layouts and styles with reference to architecture/home design magazines
- Object assets: Generated 2,500+ 3D objects with text-to-3D models, covering 150+ categories
- Environment textures: Improved rendering realism with text-to-image models
- Task design: 25 atomic tasks (picking, placing, opening door, twisting knob) and 75 composite tasks based on LLM suggestions (washing dishes, frying, restocking)
- Dataset augmentation: Automatically generated 100K trajectories for the atomic tasks with an extension of MimicGen
- Learning: Trained policies with behavioral cloning on human demonstrations + generated data
- Real-world validation: Evaluated the transfer performance of sim-trained policies in a real kitchen environment
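The Learning step above trains on a mix of human demonstrations and generated data. As a rough illustration, one simple way to control that mix is to fix the share of human demonstrations in each training batch. The sketch below is a minimal example under stated assumptions, not the RoboCasa training code: the function name `sample_mixed_batch`, the `human_fraction` parameter, the dictionary-based trajectory records, and the pool sizes are all hypothetical.

```python
import random
from typing import List, Sequence

# Hypothetical trajectory record; a real dataset would hold observations and actions.
Trajectory = dict


def sample_mixed_batch(
    human: Sequence[Trajectory],
    generated: Sequence[Trajectory],
    batch_size: int,
    human_fraction: float,
    rng: random.Random,
) -> List[Trajectory]:
    """Draw one batch with a fixed share of human demonstrations (assumed scheme)."""
    if not 0.0 <= human_fraction <= 1.0:
        raise ValueError("human_fraction must be in [0, 1]")
    if not human or not generated:
        raise ValueError("both pools must be non-empty")
    n_human = round(batch_size * human_fraction)
    n_generated = batch_size - n_human
    # Sample with replacement so a small human pool can still fill its share.
    batch = rng.choices(human, k=n_human) + rng.choices(generated, k=n_generated)
    rng.shuffle(batch)
    return batch


# Illustrative pool sizes only; they are not taken from the paper.
human_pool = [{"source": "human", "id": i} for i in range(50)]
generated_pool = [{"source": "generated", "id": i} for i in range(3000)]

batch = sample_mixed_batch(human_pool, generated_pool, 32, 0.25, random.Random(0))
n_human_in_batch = sum(t["source"] == "human" for t in batch)
print(n_human_in_batch, len(batch) - n_human_in_batch)
```

Fixing the ratio per batch keeps the much smaller human pool from being drowned out by generated trajectories. Whether the paper weights the two sources this way is not stated in this review.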
Originality
- First framework, compared with prior ones, to integrate realistic object physics + room-scale scenes + AI-generated assets/tasks + a large-scale dataset
- Naturalistic task design with LLMs: extracts ecological statistics from human-centered Internet content
- Demonstrates that a simulator can be scaled by systematically using generative AI (text-to-3D, text-to-image, LLM)
- Adapts MimicGen to develop an automated trajectory generation pipeline for kitchen manipulation tasks
Limitation & Further Study
- Currently focuses only on kitchen environments; needs extension to other spaces in the home (bedroom, bathroom, living room)
- Real-world transfer was validated only in a limited kitchen environment; evaluation in more diverse real environments is needed
- LLM-based task generation may not fully cover the diversity of actual human behavior
- No validation of the physical properties (mass, friction, shape accuracy) of text-to-3D generated assets
- Cross-embodiment support exists, but comparisons of sim-to-real transfer performance across multiple real embodiments are lacking
Evaluation
Novelty: 4/5 | Technical Soundness: 3/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5
Overall assessment: RoboCasa is a meaningful contribution that uses generative AI to build a large-scale realistic simulation for robot learning, and by demonstrating successful real-world transfer it offers a practical path for sim-to-real robot learning. However, the current focus on kitchen environments and the limited real-world validation need improvement in future work.