GenerativeMPC: VLM-RAG-guided Whole-Body MPC with Virtual Impedance for Bimanual Mobile Manipulation

Authors: Marcelino Julio Fernando, Miguel Altamirano Cabrera, Jeffrin Sam, Yara Mahmoud, Konstantin Gubernatorov, Dzmitry Tsetserukou | Date: 2026-04-21 | URL: https://arxiv.org/abs/2604.19522


Essence

Fig. 3.

GenerativeMPC uses a Vision-Language Model (VLM) and Retrieval-Augmented Generation (RAG) to translate semantic scene understanding into physical control parameters. It then combines Whole-Body MPC with a unified impedance-admittance controller to achieve safe, context-aware control of a bimanual mobile manipulation robot.
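To make the idea of turning scene semantics into control parameters concrete, here is a minimal sketch. It is not the paper's implementation: the two-dimensional embeddings, the `MEMORY` table, the stiffness and damping values, and the function names are all hypothetical and chosen only for illustration. It assumes a VLM has already produced an embedding of the scene, retrieves the most similar stored experience by cosine similarity, and feeds the retrieved gains into a one-dimensional spring-damper (virtual impedance) law.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ImpedanceParams:
    stiffness: float  # N/m (illustrative value, not from the paper)
    damping: float    # N*s/m (illustrative value, not from the paper)


# Hypothetical experience memory: (scene embedding, stored gains).
# A real system would use high-dimensional VLM embeddings.
MEMORY = [
    ((1.0, 0.0), ImpedanceParams(stiffness=800.0, damping=60.0)),  # e.g. rigid box
    ((0.0, 1.0), ImpedanceParams(stiffness=150.0, damping=25.0)),  # e.g. fragile glass
]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve_params(query):
    """Return the gains of the stored experience most similar to the query."""
    _, params = max(MEMORY, key=lambda entry: cosine(query, entry[0]))
    return params


def impedance_force(params, pos_err, vel_err):
    """1-D virtual impedance: F = K * position error + D * velocity error."""
    return params.stiffness * pos_err + params.damping * vel_err


if __name__ == "__main__":
    scene = (0.1, 0.9)  # assumed embedding of a scene containing a fragile object
    gains = retrieve_params(scene)
    print(gains, impedance_force(gains, pos_err=0.02, vel_err=-0.1))
```

In this toy setup a scene that resembles the fragile-object entry retrieves the softer gains, so the same tracking error produces a smaller commanded force. In a full pipeline such gains would presumably enter the MPC cost or constraints rather than being applied directly.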

Motivation

Achievement

Fig. 2. Bimanual manipulation in IsaacSim. Left: the robot performs a pick-

How

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 4/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

์ดํ‰: GenerativeMPC๋Š” ์˜๋ฏธ๋ก ์  ์ดํ•ด์™€ ๋ฌผ๋ฆฌ์  ์•ˆ์ „์„ฑ์„ ์ฒด๊ณ„์ ์œผ๋กœ ํ†ตํ•ฉํ•˜๋Š” ์ฐฝ์˜์  ์ ‘๊ทผ์œผ๋กœ, VLM-RAG ๊ธฐ๋ฐ˜ ํŒŒ๋ผ๋ฏธํ„ฐ ์ƒ์„ฑ๊ณผ ๊ฒฝํ—˜ ๋ฉ”๋ชจ๋ฆฌ์˜ ์‹ ๊ทœ ํ™œ์šฉ์„ ํ†ตํ•ด ์–‘ํŒ” ์ด๋™ํ˜• ์กฐ์ž‘ ๋กœ๋ด‡์˜ ์ธ๊ฐ„์ค‘์‹ฌ ์ž์œจ์„ฑ์„ ํฌ๊ฒŒ ํ–ฅ์ƒ์‹œํ‚จ๋‹ค. ๊ด‘๋ฒ”์œ„ํ•œ ์‹œ๋ฎฌ๋ ˆ์ด์…˜ ๋ฐ ์‹ค์ œ ๊ฒ€์ฆ์œผ๋กœ ์‹ ๋ขฐ์„ฑ์„ ์ž…์ฆํ–ˆ์œผ๋‚˜, ์‹ค์ œ ํ”Œ๋žซํผ ์‹คํ—˜ ํ™•๋Œ€์™€ ๋ถ„ํฌ ์™ธ robustness ๋ถ„์„์ด ์ถ”๊ฐ€ ํ•„์š”ํ•˜๋‹ค.
