CANVAS: Commonsense-Aware Navigation System for Intuitive Human-Robot Interaction

Authors: Suhwan Choi, Yongjun Cho, Minchan Kim, Jaeyoon Jung, Myunchul Joe, Yubeen Park, Minseo Kim, Sungwoong Kim, Sungjae Lee, Hwiseong Park, Jiwan Chung, Youngjae Yu | Date: 2024-10-02 | URL: https://arxiv.org/abs/2410.01273


Essence

Figure 1

Fig. 1: Humans often give abstract navigation directions using simple instruction, relying on the recipient's commonsens

CANVAS is an imitation-learning-based framework that takes ambiguous or noisy human linguistic and visual instructions (sketches, text) as multimodal input and, drawing on commonsense understanding, lets a robot navigate in line with human expectations.

Motivation

Achievement

Figure 2

Fig. 2: Data collection pipeline for COMMAND dataset. (a) First, we create diverse navigation environments and extract m

Performance advantage over ROS NavStack: CANVAS outperforms ROS NavStack in every environment; in the orchard environment in particular, it achieves a 67% success rate where ROS NavStack records 0%.

Large-scale, high-quality dataset: COMMAND contains 48 hours of driving data, roughly 3 times the size of GoStanford, and provides 3,343 human-annotated navigation results across 3 environments (office, street, orchard).

Strong Sim2Real transfer: although trained only in simulation, CANVAS reaches a 69% success rate in real-robot deployment.

Adherence to commonsense constraints: the paper shows quantitatively that CANVAS follows trajectories similar to human demonstrations and commits few commonsense-constraint violations.

How

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 3/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

์ดํ‰: CANVAS๋Š” ์ถ”์ƒ์ ์ด๊ณ  ์žก์Œ์ด ์žˆ๋Š” ์ธ๊ฐ„ ์ง€์‹œ๋ฅผ ์ƒ์‹ ๊ธฐ๋ฐ˜์œผ๋กœ ํ•ด์„ํ•˜์—ฌ ๋กœ๋ด‡ ๋„ค๋น„๊ฒŒ์ด์…˜์„ ์ˆ˜ํ–‰ํ•˜๋Š” ํ˜์‹ ์ ์ธ ํ”„๋ ˆ์ž„์›Œํฌ์ด๋ฉฐ, ๋Œ€๊ทœ๋ชจ COMMAND ๋ฐ์ดํ„ฐ์…‹๊ณผ ํ•จ๊ป˜ ๊ฐ•๋ ฅํ•œ ์„ฑ๋Šฅ(ํŠนํžˆ ์–ด๋ ค์šด ํ™˜๊ฒฝ์—์„œ 67% vs 0%), ๊ทธ๋ฆฌ๊ณ  ์šฐ์ˆ˜ํ•œ Sim2Real ์ „์ด(69%)๋ฅผ ์ž…์ฆํ•จ์œผ๋กœ์จ ์ธ๊ฐ„-๋กœ๋ด‡ ์ƒํ˜ธ์ž‘์šฉ์˜ ์ž์—ฐ์„ฑ ํ–ฅ์ƒ๊ณผ ํ˜„์‹ค ์ ์šฉ ๊ฐ€๋Šฅ์„ฑ์„ ํšจ๊ณผ์ ์œผ๋กœ ์ œ์‹œํ•œ๋‹ค.
