Authors: Albert Wilcox, Mohamed Ghanem, Masoud Moghani, Pierre Barroso, Benjamin Joffe, Animesh Garg | Date: 2025-03-06 | URL: https://arxiv.org/abs/2503.04877
Figure 2: Adapt3R extracts scene representations from RGBD inputs for use with a variety of imitation learning
Adapt3R is an observation encoder that extracts a 3D scene representation from calibrated RGBD cameras and uses it to condition imitation learning (IL) algorithms. It extracts semantic information with a pretrained 2D backbone and uses 3D information only to localize that information relative to the end-effector, which enables zero-shot transfer to novel embodiments and camera viewpoints.
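The localization step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a pinhole camera model, 4x4 homogeneous poses that map camera and end-effector coordinates into a shared world frame, and hypothetical function names (`backproject_depth`, `transform`, `points_in_ee_frame`) chosen for this example. It shows only how a depth map can be lifted to 3D points and re-expressed relative to the end-effector; the 2D backbone features would be attached to these points per pixel.

```python
import numpy as np


def backproject_depth(depth, K):
    """Lift an (H, W) depth map to (H*W, 3) points in the camera frame.

    Assumes a pinhole camera with 3x3 intrinsics matrix K.
    """
    h, w = depth.shape
    # u indexes columns (x direction), v indexes rows (y direction).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - K[0, 2]) * depth / K[0, 0]
    y = (v - K[1, 2]) * depth / K[1, 1]
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)


def transform(points, T):
    """Apply a 4x4 homogeneous transform T to (N, 3) points."""
    return points @ T[:3, :3].T + T[:3, 3]


def points_in_ee_frame(depth, K, T_world_cam, T_world_ee):
    """Express back-projected points relative to the end-effector.

    T_world_cam and T_world_ee are assumed to map camera and
    end-effector coordinates into a shared world frame.
    """
    pts_cam = backproject_depth(depth, K)
    pts_world = transform(pts_cam, T_world_cam)
    return transform(pts_world, np.linalg.inv(T_world_ee))
```

Because the output depends only on the pose of the camera relative to the end-effector, a given scene point maps to the same end-effector-frame coordinates regardless of which calibrated camera observes it. This is one way to read the viewpoint robustness described above.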
Figure 1: (a) Adapt3R facilitates zero-shot transfer to novel embodiments and viewpoints. (b) Adapt3R can
Overall assessment: By cleanly separating semantic information from 3D localization, Adapt3R systematically addresses the limitations of prior 3D-based methods. Its extensive experiments and real-world results mark significant progress in embodiment and viewpoint generalization for multitask imitation learning.