ECHO: Edge-Cloud Humanoid Orchestration for Language-to-Motion Control

Authors: Haozhe Jia, Jianfei Song, Yuan Zhang, Honglei Jin, Youcheng Fan, Wenshuo Chen, Wei Zhang, Yutao Yue | Date: 2026-03-17 | URL: https://arxiv.org/abs/2603.16188


Essence

Figure 1

ECHO is an edge-cloud framework for controlling humanoid robots with natural-language commands. It connects a diffusion-based text-to-motion generator in the cloud to an RL tracker on the edge through a robot-native 38-dimensional representation, achieving real-time closed-loop execution.
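The split described above can be illustrated with a toy sketch. It only mirrors the data flow (cloud generator, 38-dimensional frames, edge tracker running in a feedback loop) and is not the paper's implementation. The names `cloud_generate`, `EdgeTracker`, and `run_closed_loop` are hypothetical, the generator is a seeded random stub rather than a diffusion model, and the tracker is a simple proportional controller rather than a learned RL policy. The review does not say what each of the 38 dimensions encodes, so the frames are treated as opaque vectors.

```python
import random
from dataclasses import dataclass
from typing import List

# Frame width taken from the review; the per-dimension layout is not specified there.
FRAME_DIM = 38

Frame = List[float]


def cloud_generate(prompt: str, num_frames: int = 8) -> List[Frame]:
    """Hypothetical stand-in for the cloud-side text-to-motion generator.

    Produces deterministic pseudo-random frames seeded by the prompt.
    No diffusion model is involved; only the output shape matters here.
    """
    rng = random.Random(prompt)
    return [[rng.uniform(-1.0, 1.0) for _ in range(FRAME_DIM)]
            for _ in range(num_frames)]


@dataclass
class EdgeTracker:
    """Hypothetical stand-in for the edge-side tracker (assumed proportional control)."""
    gain: float = 0.5

    def step(self, state: Frame, reference: Frame) -> Frame:
        # Reject frames that do not match the shared representation width.
        if len(reference) != FRAME_DIM or len(state) != FRAME_DIM:
            raise ValueError(f"expected {FRAME_DIM}-dimensional frames")
        # Feedback step: move the measured state part of the way toward the reference.
        return [s + self.gain * (r - s) for s, r in zip(state, reference)]


def run_closed_loop(frames: List[Frame], tracker: EdgeTracker,
                    steps_per_frame: int = 4) -> Frame:
    """Track each reference frame for a few control steps and return the final state."""
    state = [0.0] * FRAME_DIM
    for reference in frames:
        for _ in range(steps_per_frame):
            state = tracker.step(state, reference)
    return state


if __name__ == "__main__":
    motion = cloud_generate("wave the right hand")
    final_state = run_closed_loop(motion, EdgeTracker())
    error = max(abs(a - b) for a, b in zip(final_state, motion[-1]))
    print(f"frames: {len(motion)}, dims: {len(motion[0])}, final tracking error: {error:.4f}")
```

The point of the sketch is the interface: the only thing the two sides share is a sequence of fixed-width frames, so either side could be swapped out without touching the other.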

Motivation

Achievement

How

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 3/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

Overall assessment: ECHO is a meaningful study that sets a new standard for modularity and deployability in language-based humanoid control, through a clear separation of generation and execution, a robot-native representation design, and demonstrated real-world deployment.
