OpenFly: A Comprehensive Platform for Aerial Vision-Language Navigation

Authors: Yunpeng Gao, Chenhui Li, Zhongrui You, Junli Liu, Zhen Li, Pengan Chen, Qizhi Chen, Zhonghan Tang, Liansheng Wang, Penghui Yang, Yiwen Tang, Yuhang Tang, Shuai Liang, Songyi Zhu, Ziqin Xiong, Yifei Su, Xinyi Ye, Jianan Li, Yan Ding, Dong Wang, Xuelong Li, Zhigang Wang, Bin Zhao | Date: 2025-02-25 | URL: https://arxiv.org/abs/2502.18041


Essence

Figure 1: Overview of OpenFly. This work consists of (1) the integration of 4 rendering engines, significantly

OpenFly is a comprehensive platform for aerial Vision-Language Navigation (VLN). It provides four rendering engines, an automated data-generation toolchain, a large-scale dataset of 100k trajectories, and a keyframe-aware VLN model.

Motivation

Achievement

How

Figure 2: Framework of the automatic data generation. Multiple rendering engines are integrated

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 4/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

Overall assessment: OpenFly is a comprehensive platform that makes a major advance on the data-scarcity problem in aerial VLN research. Through its integration of multiple rendering engines, a fully automated pipeline, and a 100k-scale benchmark, it makes an important contribution to embodied AI. The proposed keyframe-aware model is also an effective approach that reflects the particular characteristics of aerial VLN.
