SE(3)-Equivariant Robot Learning and Control: A Tutorial Survey

Authors: Joohwan Seo, Soochul Yoo, Junwoo Chang, Hyunseok An, Hyunwoo Ryu | Date: 2025.03 | DOI: N/A


Essence

Figure 4

Fig. 4. Coordinate frames {A} and {B} for specifying

This paper is an in-depth tutorial survey on exploiting SE(3) equivariance in robot learning and control. It covers the topic end to end, from the mathematical foundations (group theory, Lie groups, SE(3)) to robotic applications of equivariant neural networks.

Motivation

Achievement

Key achievements:

• Presents a unified mathematical framework for SE(3)-equivariance

• Gives a clear account of the definition of SE(3) and of group actions, starting from Lie groups and Lie algebras

• Provides a design methodology for equivariant neural networks based on group convolutional networks and steerability on SE(3)

• Reviews applications of SE(3)-equivariant models in imitation learning and reinforcement learning

• Introduces control design techniques on the SE(3) manifold from a geometric control perspective

• Summarizes recent trends in end-to-end SE(3)-equivariant energy-based models and diffusion-based methods
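To make the notion of SE(3)-equivariance concrete, the following is a minimal sketch, not taken from the paper. It assumes the standard definition f(g · X) = g · f(X), with g = (R, p) acting on a point x as R x + p, and uses the centroid of a point cloud as a simple equivariant map f. The helper names (`rot_z`, `act`, `centroid`) and the toy data are hypothetical and chosen only for illustration.

```python
import math

def rot_z(theta):
    """Rotation about the z-axis by theta radians, as a 3x3 nested list."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def act(R, p, x):
    """Action of g = (R, p) in SE(3) on a point x: g . x = R x + p."""
    return [sum(R[i][j] * x[j] for j in range(3)) + p[i] for i in range(3)]

def centroid(points):
    """A simple SE(3)-equivariant map: point cloud -> its mean point."""
    n = len(points)
    return [sum(pt[i] for pt in points) / n for i in range(3)]

# Hypothetical toy point cloud and rigid transform (illustration only).
cloud = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0], [1.0, 1.0, 1.0]]
R, p = rot_z(0.7), [0.5, -1.0, 2.0]

lhs = centroid([act(R, p, x) for x in cloud])  # f(g . X): transform, then map
rhs = act(R, p, centroid(cloud))               # g . f(X): map, then transform

# Largest coordinate-wise difference; zero up to floating-point rounding.
equivariance_gap = max(abs(a - b) for a, b in zip(lhs, rhs))
```

The same check, with `centroid` swapped for a learned model, is a common sanity test for whether a network respects the symmetry it was designed for.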

How

Figure 1

Fig. 1. Illustration of a Lie group G and two of its tan-

• Clearly defines the basic concepts of group theory (group, subgroup, group action) and introduces the key groups, such as SO(3) and SE(3)

• Explains the relationship between Lie groups and Lie algebras, and uses the exponential and logarithmic maps to interpret the local structure of the group

• Gives a mathematical representation of rigid body transformations via matrix Lie groups

• Explains the design principles of group-equivariant neural networks (fiber bundles, representation theory)

• Shows how to construct convolutional filters and linear maps that satisfy the equivariance constraint

• Describes concrete implementations of SE(3)-equivariant deep learning models (point cloud processing, energy-based models)

• Discusses how equivariance is combined with imitation learning and reinforcement learning algorithms

• Explains the principles of geometric control design through error functions, Riemannian metrics, and velocity errors
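As an illustration of the exponential and logarithmic maps mentioned in the list above, here is a minimal sketch for the rotation group SO(3) using the standard Rodrigues formula. It is not the paper's notation or code: the function names (`hat`, `exp_so3`, `log_so3`) are hypothetical, and the logarithm is assumed to be used only for rotation angles in [0, π).

```python
import math

def hat(w):
    """Map a vector w in R^3 to the skew-symmetric matrix hat(w) in so(3)."""
    x, y, z = w
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]

def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def exp_so3(w):
    """Rodrigues: exp(hat(w)) = I + (sin t / t) K + ((1 - cos t) / t^2) K^2."""
    t = math.sqrt(sum(c * c for c in w))
    K = hat(w)
    K2 = matmul(K, K)
    if t < 1e-9:
        a, b = 1.0, 0.5  # limits of the two coefficients as t -> 0
    else:
        a, b = math.sin(t) / t, (1.0 - math.cos(t)) / (t * t)
    return [[(1.0 if i == j else 0.0) + a * K[i][j] + b * K2[i][j]
             for j in range(3)] for i in range(3)]

def log_so3(R):
    """Inverse of exp_so3, assuming the rotation angle lies in [0, pi)."""
    trace = R[0][0] + R[1][1] + R[2][2]
    t = math.acos(max(-1.0, min(1.0, (trace - 1.0) / 2.0)))
    if t < 1e-9:
        return [0.0, 0.0, 0.0]
    s = t / (2.0 * math.sin(t))
    return [s * (R[2][1] - R[1][2]), s * (R[0][2] - R[2][0]), s * (R[1][0] - R[0][1])]

w = [0.3, -0.2, 0.5]   # hypothetical tangent vector (axis times angle)
R = exp_so3(w)         # Lie algebra -> Lie group
w_back = log_so3(R)    # Lie group -> Lie algebra
roundtrip_error = max(abs(a - b) for a, b in zip(w, w_back))
```

The round trip recovers `w` up to rounding, which reflects the sense in which the Lie algebra describes the group locally around the identity. The SE(3) case adds a translational part but follows the same pattern.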

Originality

• Organizes previously scattered research by providing a comprehensive tutorial that integrates SE(3)-equivariance into robot learning and control

• Unifies the varied notations used across the robotics community, lowering the barrier to understanding

• Explains everything from the basics of group theory to recent deep learning applications from one consistent mathematical viewpoint

• Offers a perspective that connects geometric control with equivariant deep learning

Limitation & Further Study

• Because the emphasis is on theoretical exposition, implementations on real robot systems and comparative performance experiments are lacking

• The use of equivariance in robot tasks beyond manipulation (navigation, perception, etc.) is covered only to a limited extent

• Computational complexity and scalability are not analyzed in sufficient depth

• Future work: multi-modal sensor fusion, lifelong learning, and robust equivariant models for dynamic environments need to be developed

Evaluation

Novelty: 3/5 Technical Soundness: 4/5 Significance: 4/5 Clarity: 4/5 Overall: 4/5

Overall: This paper is a comprehensive and systematic tutorial survey of robot learning and control from the perspective of SE(3)-equivariance. By explaining everything from the basics of group theory to recent deep learning applications in a unified mathematical notation, it can make a significant contribution to the robotics community. Its impact would be greater still with stronger practical implementation and experimental validation.
