Compose Your Policies! Improving Diffusion-based or Flow-based Robot Policies via Test-time Distribution-level Composition

Authors: Jiahang Cao, Yize Huang, Hanzhong Guo, Rui Zhang, Mu Nan, Weijian Mai, Jiaxu Wang, Hao Cheng, Jingkai Sun, Gang Han, Wen Zhao, Qiang Zhang, Yijie Guo, Qihao Zheng, Chunfeng Song, Xiao Li, Ping Luo, Andrew F. Luo | Date: 2025-10-01 | URL: https://arxiv.org/abs/2510.01068


Essence

This paper proposes General Policy Composition (GPC), which combines the distribution-level scores of pretrained diffusion-based or flow-based robot policies through a convex combination, and thereby outperforms each individual policy without any additional training.
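
The convex combination of scores described above can be illustrated with a minimal sketch. It assumes two toy isotropic Gaussians standing in for the pretrained policies; the function names `gaussian_score` and `compose_scores` and the weights 0.7 / 0.3 are hypothetical and chosen only for illustration, and this review does not specify how GPC selects its weights or which sampler it uses.

```python
import numpy as np


def gaussian_score(x, mean, std):
    """Score (gradient of the log-density) of an isotropic Gaussian.

    Toy stand-in for a pretrained policy's score output (assumption).
    """
    return -(x - mean) / std**2


def compose_scores(scores, weights):
    """Convex combination of per-policy scores.

    Weights must be non-negative and sum to 1.
    """
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * s for w, s in zip(weights, scores))


# Evaluate both toy "policies" at the same action sample x.
x = np.array([0.0, 0.0])
s1 = gaussian_score(x, mean=np.array([1.0, 0.0]), std=1.0)
s2 = gaussian_score(x, mean=np.array([0.0, 1.0]), std=1.0)

# Hypothetical weights; the combined score would replace a single
# policy's score inside each denoising step at test time.
combined = compose_scores([s1, s2], weights=[0.7, 0.3])
print(combined)
```

Setting the weights to (1, 0) or (0, 1) recovers either individual policy, so in this sketch the composition can only match or extend what a single policy provides, depending on how the weights are chosen.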

Motivation

Achievement

How

Figure 2: Overview of our proposed General Policy Composition. Combining distributional

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 3/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

Overall assessment: This paper addresses the practical problem of improving performance by reusing existing policies, and it grounds the approach in theory. GPC is a simple yet effective method that offers a new perspective on the data-efficiency problem in robot learning. The extensive experimental validation and strong performance gains make a substantial contribution to the field of robot control.
