UniSkill: Imitating Human Videos via Cross-Embodiment Skill Representations

Authors: Hanjung Kim, Jaehyun Kang, Hyolim Kang, Meedeum Cho, Seon Joo Kim, Youngwoon Lee | Date: 2025-05-13 | URL: https://arxiv.org/abs/2505.08787


Essence

Figure 2: The overview of UniSkill. (a) Inverse Skill Dynamics (ISD) and Forward Skill Dynamics

UniSkill is a framework that learns embodiment-agnostic skill representations from large-scale, unlabeled cross-embodiment video data, so that skills extracted from human video demonstrations can be transferred directly to robot policies.

Motivation

Achievement

Figure 3: Overview of our tabletop experiments. (a) Average results on the tabletop benchmark using

How

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 3/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

Overall assessment: UniSkill removes the data-alignment constraint and proposes a new paradigm for cross-embodiment skill learning that draws on web-scale video. It is a meaningful study that experimentally demonstrates the feasibility of human-to-robot imitation. The evaluation scope should be broadened, however, and the method still needs validation on more complex tasks.
