Robot Learning from Human Videos: A Survey

Authors: Junyi Ma, Erhang Zhang, Haoran Yang, Ditao Li, Chenyang Xu, Guangming Wang, Hesheng Wang | Date: 2026-04-30 | URL: https://arxiv.org/abs/2604.27621


Essence

Figure 2. Taxonomy of robot learning from human videos.

This paper is a comprehensive review of how robots acquire manipulation skills from human video demonstrations. It presents hierarchical transfer pathways at the task, observation, and action levels and systematically analyzes the underlying data foundations. It emphasizes that learning from human videos offers 5-10x or greater data efficiency compared with conventional robot teleoperation.

Motivation

Achievement

Hierarchical transfer mechanisms: a clear taxonomy at the task, observation, and action levels, with an analysis of the design principles and trade-offs of each pathway.
Comparative analysis of data configurations and learning paradigms: identifies the methodological couplings between different transfer families.
Systematization of human-object interaction analysis tools: synthesizes existing methods such as hand detection, object tracking, and pose estimation.
Large-scale statistical analysis of human video data: examines dataset development trends and the data preferences of LfHV methods (the most comprehensive such analysis relative to prior work).
Future research directions: identifies opportunity areas in modeling paradigms, data modalities, benchmarks, and ecosystem collaboration.

How

Originality

Limitation & Further Study

Follow-up research:

Evaluation

Novelty: 4/5 | Technical Soundness: 4/5 | Significance: 5/5 | Clarity: 4/5 | Overall: 4/5

์ดํ‰: ๋ณธ survey๋Š” ๋กœ๋ด‡ ํ•™์Šต ๋ถ„์•ผ์—์„œ ์ธ๊ฐ„ ์˜์ƒ ๊ธฐ๋ฐ˜ ์Šคํ‚ฌ ํš๋“์ด๋ผ๋Š” ๊ธ‰์„ฑ์žฅํ•˜๋Š” ๋ถ„์•ผ์— ๋Œ€ํ•ด ์ฒ˜์Œ์œผ๋กœ ์ฒด๊ณ„์ ์ด๊ณ  ํฌ๊ด„์ ์ธ ๋ถ„๋ฅ˜ ์ฒด๊ณ„๋ฅผ ์ œ์‹œํ•˜๋ฉฐ, ๋‹ค๊ฐ์ ์ธ ๋น„๊ต ๋ถ„์„๊ณผ ๋Œ€๊ทœ๋ชจ ๋ฐ์ดํ„ฐ ํ†ต๊ณ„๋ฅผ ๋ฐ”ํƒ•์œผ๋กœ ํ˜„์žฌ ์—ฐ๊ตฌ ๊ฒฝ๊ด€์„ ๋ช…ํ™•ํžˆ ์กฐ๋งํ•œ๋‹ค. ์‹ค์ œ ๋ฐ์ดํ„ฐ ํšจ์œจ์„ฑ ๊ฐœ์„ (5-10๋ฐฐ)์ด ์‹ค์ฆ๋˜์–ด ์žˆ์–ด ํ•™์ˆ ์ ยท์‹ค๋ฌด์  ์ค‘์š”์„ฑ์ด ๋†’์œผ๋‚˜, ์ •๋Ÿ‰์  ์„ฑ๋Šฅ ๋น„๊ต์™€ ์ƒˆ๋กœ์šด ๋ฉ”์„œ๋“œ ์ œ์‹œ๊ฐ€ ์—†๋Š” ์ˆœ์ˆ˜ ๋ฆฌ๋ทฐ ๋…ผ๋ฌธ์ด๋ผ๋Š” ํ•œ๊ณ„๊ฐ€ ์žˆ๋‹ค.
