On the Vulnerability of LLM/VLM-Controlled Robotics

Authors: Xiyang Wu, Souradip Chakraborty, Ruiqi Xian, Jing Liang, Tianrui Guan, Fuxiao Liu, Brian M. Sadler, Dinesh Manocha, Amrit Singh Bedi | Date: 2024-02-15 | URL: https://arxiv.org/abs/2402.10340


Essence

Figure 1

Fig. 1: Vulnerability-Triggering Perturbations. We showcase perturbations inducing misalignment-related vulnerabilities

The paper analyzes how LLM/VLM-based robot systems are highly vulnerable to small changes in their input modalities: even slight rewordings of semantically equivalent instructions can substantially change the robot's behavior.

Motivation

Achievement

How

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 3/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

Overall assessment: This paper is the first to systematically analyze input-modality sensitivity, an issue that matters for the safe deployment of LLM/VLM-controlled robots, and it makes an important contribution to robot safety research by presenting clear empirical results. However, it offers few concrete solutions, and the experimental scope needs to be broadened.
