Sentinel-VLA: A Metacognitive VLA Model with Active Status Monitoring for Dynamic Reasoning and Error Recovery

Authors: Wenhao Li, Xiu Su, Dan Niu, Yichao Cao, Hongyan Xu, Zhe Qu, Lei Fan, Shan You, Chang Xu | Date: 2026 | DOI: 10.48550/ARXIV.2605.01191


Essence

Figure 1. The performance and mechanism of Sentinel-VLA.

This paper proposes Sentinel-VLA, a metacognitive VLA model for embodied manipulation. Its defining feature is an on-demand reasoning mechanism: a sentinel module monitors execution status in real time, and the model performs dynamic reasoning and error recovery only when that module indicates they are needed.
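To make the on-demand idea concrete, here is a minimal Python sketch of a control loop in which a status monitor decides, at every step, whether the slower reasoning path runs. It is an illustration under stated assumptions, not the authors' implementation: the names `Status`, `run_episode`, `toy_monitor`, `toy_reason`, and `toy_act`, the three status labels, and the string observations are all hypothetical stand-ins for the model's actual components.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class Status(Enum):
    """Hypothetical status labels a monitor might emit."""
    NORMAL = "normal"
    NEEDS_REASONING = "needs_reasoning"
    ERROR = "error"


@dataclass
class StepLog:
    step: int
    status: Status
    reasoned: bool
    action: str


def run_episode(
    observations: List[str],
    monitor: Callable[[str], Status],
    reason: Callable[[str, Status], str],
    act: Callable[[str, str], str],
) -> List[StepLog]:
    """Run the monitor every step; call `reason` only when status is not NORMAL."""
    logs: List[StepLog] = []
    plan = "default plan"
    for step, obs in enumerate(observations):
        status = monitor(obs)
        reasoned = status is not Status.NORMAL
        if reasoned:
            # The expensive path: re-plan or recover, then keep the new plan.
            plan = reason(obs, status)
        logs.append(StepLog(step, status, reasoned, act(obs, plan)))
    return logs


# Toy stand-ins (assumptions for illustration only).
def toy_monitor(obs: str) -> Status:
    if obs == "dropped":
        return Status.ERROR
    if obs == "new_subtask":
        return Status.NEEDS_REASONING
    return Status.NORMAL


def toy_reason(obs: str, status: Status) -> str:
    return "recover" if status is Status.ERROR else "replan"


def toy_act(obs: str, plan: str) -> str:
    return f"{plan}|{obs}"


episode = ["ok", "ok", "dropped", "ok"]
for log in run_episode(episode, toy_monitor, toy_reason, toy_act):
    print(log.step, log.status.value, log.reasoned, log.action)
```

In this toy run the reasoning function is called only at the step where the monitor reports an error, so the cost of reasoning is paid once rather than at every step.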

Motivation

Achievement

Figure 3. Left: Pipeline of Sentinel-VLA. The Status Monitor Expert activates on-demand Adaptive Thought. Right: Pipelin

Performance improvement: relative gains over PI0 of more than 22% on RLBench and more than 30% in real-world environments.

Automatic data generation: EC-Gen automatically generates more than 2.6M transitions across 44 tasks without manual labeling.

Continual learning: SECL and OC-Adapter identify the model's capability boundaries and collect data automatically while preventing catastrophic forgetting.

Open-source release: all code, weights, and the data generation pipeline are planned for release.
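The review attributes the protection against catastrophic forgetting to an orthogonal constraint but does not state its exact form. As a rough illustration of the general idea only, the sketch below computes a generic orthogonality penalty between a previously learned adapter matrix and a new one: the squared Frobenius norm of their cross product, which is zero exactly when every column of the old matrix is orthogonal to every column of the new one. The function names and the toy matrices are assumptions, and this is not claimed to be the loss used by OC-Adapter.

```python
from typing import List

Matrix = List[List[float]]


def transpose_times(a: Matrix, b: Matrix) -> Matrix:
    """Return a^T @ b for row-major matrices with the same number of rows."""
    rows, cols_a, cols_b = len(a), len(a[0]), len(b[0])
    return [
        [sum(a[k][i] * b[k][j] for k in range(rows)) for j in range(cols_b)]
        for i in range(cols_a)
    ]


def orthogonality_penalty(old: Matrix, new: Matrix) -> float:
    """Squared Frobenius norm of old^T @ new (hypothetical regularizer)."""
    cross = transpose_times(old, new)
    return sum(value * value for row in cross for value in row)


# Toy 3x1 adapter directions (illustrative values only).
old_adapter = [[1.0], [0.0], [0.0]]
new_orthogonal = [[0.0], [1.0], [0.0]]
new_overlapping = [[1.0], [1.0], [0.0]]

print(orthogonality_penalty(old_adapter, new_orthogonal))   # 0.0
print(orthogonality_penalty(old_adapter, new_overlapping))  # 1.0
```

Adding such a term to the training loss would discourage a new adapter from reusing directions already occupied by earlier ones, which is one common way to limit interference between tasks.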

How

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 4/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

Overall assessment: Sentinel-VLA takes a metacognitive approach to three core problems of VLA models (reasoning, status monitoring, and error recovery) and addresses them in a single, creative design. In particular, the combination of the on-demand reasoning mechanism with an automated large-scale data generation pipeline, together with continual learning under an orthogonal constraint, is technically sound and is backed by the reported real-world performance gain (30%). The work would be more complete with an analysis of the limits of error detection and a clear definition of the criteria that trigger reasoning.
