Contrastive Representation Learning for Robust Sim-to-Real Transfer of Adaptive Humanoid Locomotion

Authors: Yidan Lu, Rurui Yang, Qiran Kou, Mengting Chen, Tao Fan, Peter Cui, Yinzhao Dong, Peng Lu | Date: 2025-09-16 | URL: https://arxiv.org/abs/2509.12858


Essence

Figure 2

Fig. 2: Overview of our proposed training framework. An asymmetric Actor-

Contrastive learning is used to distill privileged simulation information (the terrain heightmap) into a purely proprioceptive policy, so the policy gains the foresight of perception while avoiding the cost of perception sensors at deployment. An adaptive gait clock resolves the fundamental trade-off between fixed-clock gaits and unstable clock-free gaits.
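
The summary above does not state the exact contrastive objective. As a minimal sketch, assuming an InfoNCE-style loss (a common choice for contrastive learning, not confirmed here as the paper's loss), the snippet below pulls each proprioceptive latent toward the terrain latent from the same time step and pushes it away from the other terrain latents in the batch. The names `info_nce_loss`, `proprio_latents`, `terrain_latents`, and `temperature` are hypothetical and chosen only for illustration.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def info_nce_loss(proprio_latents, terrain_latents, temperature=0.1):
    """InfoNCE over a batch (illustrative assumption, not the paper's code).

    Row i of each batch forms a positive pair; every other terrain
    latent in the batch serves as a negative for anchor i.
    """
    if len(proprio_latents) != len(terrain_latents) or not proprio_latents:
        raise ValueError("need two non-empty batches of equal size")
    z_p = [l2_normalize(v) for v in proprio_latents]
    z_t = [l2_normalize(v) for v in terrain_latents]
    total = 0.0
    for i, anchor in enumerate(z_p):
        # Cosine similarity of this anchor to every terrain latent,
        # scaled by the temperature.
        logits = [sum(a * b for a, b in zip(anchor, t)) / temperature for t in z_t]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-probability of the positive pair.
        total += log_denom - logits[i]
    return total / len(z_p)


aligned = info_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
mismatched = info_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < mismatched)
```

In this toy batch the aligned pairs give a loss near zero, while the mismatched pairs give a much larger one. In a training setup of the kind the summary describes, the terrain latents would come from an encoder that sees the heightmap only in simulation, and only the proprioceptive encoder would be needed at deployment.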

Motivation

Achievement

Figure 1

Fig. 1: Our policy, trained via contrastive knowledge distillation, enables

How

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 3/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

์ดํ‰: ์ด ๋…ผ๋ฌธ์€ contrastive learning์„ ํ†ตํ•ด ์‹œ๋ฎฌ๋ ˆ์ด์…˜ ํŠน๊ถŒ ์ •๋ณด๋ฅผ proprioceptive policy์— ํšจ๊ณผ์ ์œผ๋กœ ์ฆ๋ฅ˜ํ•˜์—ฌ ์ง€๊ฐ ์„ผ์„œ ์—†์ด๋„ ์„ ๊ฒฌ์„ฑ ์žˆ๋Š” ์ œ์–ด๋ฅผ ๋‹ฌ์„ฑํ•˜๋Š” ์ฐฝ์˜์  ํ•ด๊ฒฐ์ฑ…์„ ์ œ์‹œํ•œ๋‹ค. Zero-shot sim-to-real ์ „์ด๋กœ ๊ทน๋„๋กœ ๋„์ „์ ์ธ ์ง€ํ˜•์—์„œ์˜ ๊ฐ•๊ฑดํ•œ ๋ณดํ–‰์„ ์‹ค์ฆํ•จ์œผ๋กœ์จ ์ธ๊ฐ„ํ˜• ๋กœ๋ด‡ ์‹ค์šฉํ™”์˜ ์ค‘์š”ํ•œ ์ง„์ „์„ ๋ณด์—ฌ์ค€๋‹ค.
