A comprehensive survey of cross-domain policy transfer for embodied agents

์ €์ž: Haoyi Niu, Jianming Hu, Guyue Zhou, Xianyuan Zhan | ๋‚ ์งœ: 2024 | URL: https://arxiv.org/abs/2402.04580 📄 PDF


Essence

Figure 1

Figure 1: The main architecture of the survey: domain gap taxonomy, overarching insights on methodologies, and future tr

๊ตฌํ˜„ ๋กœ๋ด‡(embodied agents)์„ ์œ„ํ•œ ํฌ๋กœ์Šค ๋„๋ฉ”์ธ ์ •์ฑ… ์ „์ด(cross-domain policy transfer) ๋ฐฉ๋ฒ•๋“ค์„ ์ฒด๊ณ„์ ์œผ๋กœ ๊ฒ€ํ† ํ•œ ์ข…ํ•ฉ ์„œ๋ฒ ์ด. ์‹œ๋ฎฌ๋ ˆ์ด์…˜, ์‹คํ—˜์‹ค ๋“ฑ ์ €๋น„์šฉ ์†Œ์Šค ๋„๋ฉ”์ธ์˜ ๋ฐ์ดํ„ฐ๋ฅผ ์‹ค์ œ ํ™˜๊ฒฝ(ํƒ€๊ฒŸ ๋„๋ฉ”์ธ)์— ํšจ๊ณผ์ ์œผ๋กœ ์ „์ดํ•˜๋Š” ๊ธฐ์ˆ ๋“ค์„ ๋ถ„๋ฅ˜ ๋ฐ ๋ถ„์„.

Motivation

Achievement

Figure 1

Figure 1: The main architecture of the survey: domain gap taxonomy, overarching insights on methodologies, and future tr

How

Figure 1

Figure 1: The main architecture of the survey: domain gap taxonomy, overarching insights on methodologies, and future tr

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 Technical Soundness: 3/5 Significance: 4/5 Clarity: 4/5 Overall: 4/5

์ดํ‰: ์ด ์„œ๋ฒ ์ด๋Š” ํฌ๋กœ์Šค ๋„๋ฉ”์ธ ์ •์ฑ… ์ „์ด ๋ถ„์•ผ์˜ ์ฒซ ์ฒด๊ณ„์  ๊ฒ€ํ† ๋กœ์„œ, ๋ถ„์‚ฐ๋œ ์—ฐ๊ตฌ๋“ค์„ ํ†ตํ•ฉํ•˜๊ณ  ๋„๋ฉ”์ธ ๊ฐญ์„ ๋ช…ํ™•ํžˆ ๋ถ„๋ฅ˜ํ•˜์—ฌ ํ•ด๋‹น ๋ถ„์•ผ์— ์ค‘์š”ํ•œ ๊ธฐ์ดˆ ์ž๋ฃŒ๋ฅผ ์ œ๊ณตํ•œ๋‹ค. ๋กœ๋ด‡ ํ•™์Šต๊ณผ ๊ตฌํ˜„ AI์˜ ์‹ค์„ธ๊ณ„ ๋ฐฐํฌ๋ฅผ ์œ„ํ•œ ํ•„์ˆ˜์ ์ธ ๊ธฐ์ˆ  ์˜์—ญ์„ ํฌ๊ด„์ ์œผ๋กœ ์ •๋ฆฌํ•˜์—ฌ ํ–ฅํ›„ ์—ฐ๊ตฌ ๋ฐฉํ–ฅ์„ ์ œ์‹œํ•˜๋Š” ๊ฐ€์น˜ ์žˆ๋Š” ๊ธฐ์—ฌ์ด๋‹ค.

๊ฐ™์ด ๋ณด๋ฉด ์ข‹์€ ๋…ผ๋ฌธ

๊ธฐ๋ฐ˜ ์—ฐ๊ตฌ
ํฌ๋กœ์Šค ๋„๋ฉ”์ธ ์ •์ฑ… ์ „์ด๋ฅผ ์œ„ํ•œ ๊ธฐ๊ณ„์  ํ•ด์„๊ฐ€๋Šฅ์„ฑ ๊ธฐ๋ฐ˜ ๋ฐฉ๋ฒ•๋ก ์„ ์ œ๊ณตํ•˜๋Š” ๊ธฐ์ดˆ ์—ฐ๊ตฌ์ด๋‹ค.
๊ธฐ๋ฐ˜ ์—ฐ๊ตฌ
LLM๊ธฐ๋ฐ˜ ์—์ด์ „ํŠธ์˜ ์†Œํ”„ํŠธ์›จ์–ด/์‹œ๋ฎฌ๋ ˆ์ด์…˜ ์ž๋™ํ™” ์—ฐ๊ตฌ๋ฅผ ๋‹ค๋ฃจ๋ฉฐ, cross-domain transfer์˜ ์†Œํ”„ํŠธ์›จ์–ด์  ํ™•์žฅ์— ๋Œ€ํ•œ ์ด๋ก ์  ๋ฐฐ๊ฒฝ์„ ์ œ๊ณตํ•œ๋‹ค.
๊ธฐ๋ฐ˜ ์—ฐ๊ตฌ
๊ตฌํ˜„ํ˜• ์—์ด์ „ํŠธ์˜ ๋„๋ฉ”์ธ ์ „์ด ๋ฐ ์‹ค์ œ ํ™˜๊ฒฝ ์ ์šฉ ๋ฐฉ๋ฒ•(์‹œ๋ฎฌ-์‹คํ—˜์‹ค-ํ˜„์žฅ)์˜ ์ด๋ก ยท๋ถ„๋ฅ˜๊ฐ€ ์˜๋ฃŒ ๋ถ„์•ผ์˜ LLM ์—์ด์ „ํŠธ ์‹ค์šฉ์„ฑ ๋…ผ์˜์— ๊ธฐ์ดˆ๋ฅผ ์ œ๊ณตํ•จ.
๊ธฐ๋ฐ˜ ์—ฐ๊ตฌ
๊ตฌํ˜„ํ˜• ๋กœ๋ด‡ ์—์ด์ „ํŠธ์˜ ์ •์ฑ…์ „์ด ๋ฐ ์‹ค์ œ ํ™˜๊ฒฝ ์ ์šฉ์— ๊ด€ํ•œ ์ด๋ก ยท๋ถ„๋ฅ˜๋ฅผ ์ œ๊ณตํ•ด 010์˜ ๊ณ„์ธต์  ์ œ์–ด ํ”„๋ ˆ์ž„์›Œํฌ์˜ ์ด๋ก ์  ๊ธฐ๋ฐ˜์ž„.
๊ธฐ๋ฐ˜ ์—ฐ๊ตฌ
003๋ฒˆ ๋…ผ๋ฌธ์€ ๋‹ค์–‘ํ•œ ๋„๋ฉ”์ธ์—์„œ์˜ ์ •์ฑ… ์ „์ด์™€ ์—์ด์ „ํŠธ ์ ์šฉ์„ ์ข…ํ•ฉ์ ์œผ๋กœ ์ •๋ฆฌํ•˜์˜€์œผ๋ฏ€๋กœ, 793๋ฒˆ ์‚ฌ์šฉ์ž ๊ธฐ๋ฐ˜ ์—์ด์ „ํŠธ ์ฑ„ํƒ ๋ถ„์„์˜ ์ด๋ก ์  ๋ฐฐ๊ฒฝ์ด ๋ฉ๋‹ˆ๋‹ค.
๊ธฐ๋ฐ˜ ์—ฐ๊ตฌ
003์€ AI ๊ธฐ๋ฐ˜ ๊ณผํ•™ ์—ฐ๊ตฌ์—์„œ ๊ต์ฐจ ๋„๋ฉ”์ธ ์˜ค์ผ€์ŠคํŠธ๋ ˆ์ด์…˜ ์ •์ฑ… ์ˆ˜๋ฆฝ ๋ฐ ์ „์ด ํ•™์Šต ์‚ฌ๋ก€๋ฅผ ์ข…ํ•ฉ์ ์œผ๋กœ ๊ฒ€ํ† ํ•ด, 354์˜ ๋ณ‘๋ ฌ ์ฒ˜๋ฆฌ์™€ ์ ‘๋ชฉํ•ด ์ฝ์„ ๋งŒํ•˜๋‹ค.
๋‹ค๋ฅธ ์ ‘๊ทผ
๊ตฌํ˜„ ๋กœ๋ด‡์„ ์œ„ํ•œ ๋„๋ฉ”์ธ ์ ์‘ ๋ฐ ์ •์ฑ… ์ „์ด ๋ฐฉ๋ฒ•์„ ๋‹ค๋ฃจ๋Š” ์œ ์‚ฌํ•œ ์—ฐ๊ตฌ์ด๋‹ค.
๋‹ค๋ฅธ ์ ‘๊ทผ
์‹คํ—˜ ์ž๋™ํ™” ์ •์ฑ… ์ „์ด ๋ฐ ๋‹ค์ค‘ ๋„๋ฉ”์ธ ์ตœ์ ํ™”์— ๋Œ€ํ•œ ์„œ๋ฒ ์ด๋กœ, ๋ชฉํ‘œ์ง€ํ–ฅ์  ๋ฐ์ดํ„ฐ ์ˆ˜์ง‘ ๋ฐ ์ตœ์ ํ™” ์ „๋žต๊ณผ ๊ด€๋ จ์ด ์žˆ์Šต๋‹ˆ๋‹ค.
๋‹ค๋ฅธ ์ ‘๊ทผ
๊ตฌํ˜„ ์—์ด์ „ํŠธ์˜ ๋„๋ฉ”์ธ ์ „์ด ์ •์ฑ… ์„œ๋ฒ ์ด์™€ ๋‹ค๊ตญ์–ด ์›น ์—์ด์ „ํŠธ ํ‰๊ฐ€ ๋ฒค์น˜๋งˆํฌ๋Š” ์—์ด์ „ํŠธ์˜ ๋ฒ”์šฉ์„ฑ๊ณผ ํ™•์žฅ์„ฑ ์ธก๋ฉด์—์„œ ์ƒํ˜ธ๋ณด์™„์ ์ž…๋‹ˆ๋‹ค.
๋‹ค๋ฅธ ์ ‘๊ทผ
์‹œ๋ฎฌ๋ ˆ์ด์…˜์—์„œ ์‹ค์ œ ํ™˜๊ฒฝ์œผ๋กœ์˜ ์ •์ฑ… ์ „์ด ๋ฌธ์ œ๋ฅผ ๋‹ค๋ฃจ๋Š” ๊ด€๋ จ ์—ฐ๊ตฌ์ด๋‹ค.
ํ›„์† ์—ฐ๊ตฌ
LLM ๊ธฐ๋ฐ˜ ์ž์œจ ์—์ด์ „ํŠธ์˜ ์„ค๊ณ„์™€ ํ‰๊ฐ€๋ฅผ ๋‹ค๋ฃจ๋ฉฐ, embodied agent์—์„œ LLM์„ ํ†ตํ•œ ์ •์ฑ…์ „์ด์˜ ์ตœ์‹  ๋™ํ–ฅ๊ณผ ์‹œ๋„ˆ์ง€๋ฅผ ๋ณด์™„์ ์œผ๋กœ ๋ณด์—ฌ์คŒ.
ํ›„์† ์—ฐ๊ตฌ
ํฌ๋กœ์Šค ๋„๋ฉ”์ธ ์ •์ฑ… ์ „์ด์˜ ํŠน์ • ์ธก๋ฉด์„ ํ™•์žฅํ•˜์—ฌ ๋‹ค๋ฃจ๋Š” ์—ฐ๊ตฌ์ด๋‹ค.
ํ›„์† ์—ฐ๊ตฌ
LLM ์—์ด์ „ํŠธ์˜ ์ž์œจ์  ์ •๋ณด ํƒ์ƒ‰ ๋ฐ ์‹คํ—˜ํ™˜๊ฒฝ ์ ์‘ ์‚ฌ๋ก€๋ฅผ ๊ณต์œ ํ•˜์—ฌ, ๊ตฌํ˜„ ๋กœ๋ด‡ ๋ฐ ์—์ด์ „ํŠธ์˜ ๊ต์ฐจ๋„๋ฉ”์ธ ์ „์ด ํ™•์žฅ์— ์ฐธ๊ณ ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
ํ›„์† ์—ฐ๊ตฌ
LLM ๊ธฐ๋ฐ˜ ๊ณผํ•™ ์—์ด์ „ํŠธ์˜ ์‹ ๋ขฐ์„ฑ, ํ‰๊ฐ€, ์œ„ํ—˜๊ด€๋ฆฌ ๋“ฑ cross-domain policy transfer์˜ ์‹ค์ œ ์ ์šฉ ์‹œ ๊ณ ๋ ค์ ๊ณผ ํ‰๊ฐ€ ๋ฐฉ๋ฒ•์— ๋Œ€ํ•ด ์‹ฌ๋„ ์žˆ๊ฒŒ ๋‹ค๋ฃฌ๋‹ค.
์‘์šฉ ์‚ฌ๋ก€
๋‹ค์ค‘ ์—์ด์ „ํŠธ์˜ ๊ฒฝํ—˜์ถ•์  ๋ฐ ์ •์ฑ… ์ „์ดํ•™์Šต ํ”„๋ ˆ์ž„์›Œํฌ๊ฐ€ ๋กœ๋ด‡ ๊ตฌํ˜„ ๋“ฑ ํƒ€ ๋„๋ฉ”์ธ์œผ๋กœ ๋‹ค์–‘ํ•˜๊ฒŒ ์ ์šฉ ๊ฐ€๋Šฅํ•ฉ๋‹ˆ๋‹ค.
์‘์šฉ ์‚ฌ๋ก€
๊ฐ•ํ™”ํ•™์Šต๊ณผ ์‹œ๋ฎฌ๋ ˆ์ด์…˜-์‹คํ—˜ ๊ฐ„ ์ •์ฑ… ์ „์ด ์—ฐ๊ตฌ์˜ ์‹ค์ œ ๋ฌผ๋ฆฌยท๋กœ๋ณดํ‹ฑ์Šค ์ ์šฉ ์‚ฌ๋ก€๋กœ, embodied agents์˜ ์‹คํ™˜๊ฒฝ(์‹œ๋ฎฌโ†’๋ฆฌ์–ผ) ์ „์ด๋ฅผ ์ดํ•ดํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
์‘์šฉ ์‚ฌ๋ก€
A comprehensive survey of cross-domain policy transfer ๋…ผ๋ฌธ์€ ์ž๋™ํ™”๋œ ๊ณผํ•™ ๊ณต์‹ ๋ฐ ์ตœ์ ํ™” ๋ฐœ๊ฒฌ ํ”„๋ ˆ์ž„์›Œํฌ์˜ ์‹ค์ œ ๋‹ค๋ถ„์•ผ ์‘์šฉ ๊ฐ€๋Šฅ์„ฑ์„ ๋“œ๋Ÿฌ๋ƒ…๋‹ˆ๋‹ค.
์‘์šฉ ์‚ฌ๋ก€
์ž์œจ ์‹คํ—˜์‹ค์—์„œ ์—์ด์ „ํŠธ ๊ธฐ๋ฐ˜ ์ •์ฑ… ์ „์ด๋ฅผ ์‹ค์ œ workflow์— ์ ์šฉํ•˜๋Š” ์˜ˆ์‹œ๋กœ ํฌ๋กœ์Šค๋„๋ฉ”์ธ ์ด์‹์˜ ๊ตฌํ˜„ ๋ฐฉ๋ฒ•์„ ํ™•์ธํ•  ์ˆ˜ ์žˆ๋‹ค.
← ๋ชฉ๋ก์œผ๋กœ ๋Œ์•„๊ฐ€๊ธฐ

๐ŸŽง Audio Overview

์ด ๋…ผ๋ฌธ ๋ฆฌ๋ทฐ๋ฅผ ํŒŸ์บ์ŠคํŠธํ˜• ์˜ค๋””์˜ค๋กœ ์ƒ์„ฑํ•ฉ๋‹ˆ๋‹ค. (Gemini ยท ํ‚ค๋Š” ๋ธŒ๋ผ์šฐ์ €์—๋งŒ ์ €์žฅ ยท ์™„์„ฑ๋ณธ์€ ์ด๋ฉ”์ผ๋กœ๋„ ์ „์†ก)
โ–ธ ๊ณ ๊ธ‰: ๊ตฌ์„ฑ ๋ฐฉํ–ฅ(๋Œ€๋ณธ ์ž‘์„ฑ ์ง€์นจ) ์ง์ ‘ ์ˆ˜์ •