FocusNav: Spatial Selective Attention with Waypoint Guidance for Humanoid Local Navigation

Authors: Yang Zhang, Jianming Ma, Liyun Yan, Zhanxiang Cao, Yazhou Zhang, Haoyang Li, Yue Gao | Date: 2026-01-19 | URL: https://arxiv.org/abs/2601.12790


Essence

Fig. 4: Overview of the FocusNav framework. (a) Multi-modal perception encoder fuses spatially aligned LiDAR and depth

FocusNav proposes a spatial selective attention framework for humanoid local navigation that combines two modules: Waypoint-Guided Spatial Cross-Attention (WGSCA) and Stability-Aware Selective Gating (SASG). It dynamically adjusts environmental perception around predicted collision-free waypoints and discards distant information when the robot is unstable, which yields robust navigation in dynamic and complex environments.
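The two ideas in this summary can be illustrated with a toy sketch. This is not the authors' implementation: the learned cross-attention is replaced here by a plain distance-based softmax, and every name and parameter (`focus_features`, `min_radius`, `max_radius`, `temperature`, the scalar `stability` score) is a hypothetical choice made for illustration only.

```python
import numpy as np


def focus_features(features, positions, waypoint, stability,
                   min_radius=1.5, max_radius=6.0, temperature=1.0):
    """Toy stand-in for waypoint-guided attention with stability gating.

    features:  (N, D) array, one feature vector per spatial cell (assumed layout)
    positions: (N, 2) array, cell centres in the robot frame, in metres
    waypoint:  (2,) array, a predicted collision-free waypoint
    stability: float in [0, 1]; 1.0 means fully stable (hypothetical score)
    """
    # Waypoint guidance: cells nearer the waypoint receive larger logits.
    dist_to_waypoint = np.linalg.norm(positions - waypoint, axis=1)
    logits = -dist_to_waypoint / temperature

    # Stability gating: the perception radius shrinks as stability drops,
    # so distant cells are discarded when the robot is unstable.
    radius = min_radius + (max_radius - min_radius) * float(stability)
    dist_to_robot = np.linalg.norm(positions, axis=1)
    keep = dist_to_robot <= radius
    keep[np.argmin(dist_to_robot)] = True  # always keep the nearest cell

    # Masked softmax: gated-out cells get exactly zero weight.
    logits = np.where(keep, logits, -np.inf)
    weights = np.exp(logits - logits.max())
    weights = weights / weights.sum()

    # Attention-weighted summary of the retained cells.
    return weights @ features, weights


if __name__ == "__main__":
    positions = np.array([[0.5, 0.0], [1.0, 0.0], [3.0, 0.0], [5.0, 0.0]])
    features = np.eye(4)
    waypoint = np.array([1.0, 0.0])
    for s in (1.0, 0.0):
        _, w = focus_features(features, positions, waypoint, stability=s)
        print(f"stability={s}: weights={np.round(w, 3)}")
```

With `stability=1.0` all four cells contribute and the cell at the waypoint dominates; with `stability=0.0` the two cells beyond `min_radius` receive zero weight, mirroring the removal of distant information described above.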

Motivation

Achievement

Fig. 1: Snapshots of dynamic obstacle avoidance on stairs.

How

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 3/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

Overall assessment: FocusNav combines biological inspiration with technical innovation to systematically address the significant challenge of humanoid navigation in complex dynamic environments. The WGSCA and SASG modules are well designed and validated in real-robot experiments, but the work is limited by single-platform experiments and manual parameter tuning.
