AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents

Authors: Michael Ahn, Debidatta Dwibedi, Chelsea Finn, Montse Gonzalez Arenas, Keerthana Gopalakrishnan, Karol Hausman, Brian Ichter, Alex Irpan, Nikhil Joshi, Ryan Julian, Sean Kirmani, Isabel Leal, Edward Lee, Sergey Levine, Yao Lu, Sharath Maddineni, Kanishka Rao, Dorsa Sadigh, Pannag Sanketi, Pierre Sermanet, Quan Vuong, Stefan Welker, Fei Xia, Ted Xiao, Peng Xu, Steve Xu, Zhuo Xu | Date: 2024-01-23 | URL: https://arxiv.org/abs/2401.12963


Essence

Figure 5

Fig. 5 shows the visual diversity across each of AutoRT's data collection policies, along with the

AutoRT is a system that uses VLMs and LLMs to orchestrate large-scale autonomous data collection by a fleet of robots; it collected 77,000 real robot episodes across diverse, previously unseen environments.

Motivation

Achievement

Figure 3

Figure 3: On the left is AutoRT robot usage and on the right is t-SNE visualization of tasks, colored by collect

How

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 3/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

Overall assessment: AutoRT is the first demonstration of large-scale robot fleet orchestration built on foundation models, and an innovative system that balances autonomy and safety in real-world environments. Collecting 77,000 episodes of real data while making efficient use of human operators is a significant contribution to scaling embodied AI.
