Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning

์ €์ž: , , Alisson Azzolini, Junjie Bai, Hannah Brandon, Jiaxin Cao, Prithvijit Chattopadhyay, Huayu Chen, Jinju Chu, Yin Cui, Jenna Diamond, Yifan Ding, Liang Feng, Francesco Ferroni, Rama Govindaraju, Jinwei Gu, Siddharth Gururani, Imad El Hanafi, Zekun Hao, Jacob Huffman, Jingyi Jin, Brendan Johnson, Rizwan Khan, George Kurian, Elena Lantz, Nayeon Lee, Zhaoshuo Li, Xuan Li, Maosheng Liao, Tsung-Yi Lin, Yen-Chen Lin, Ming-Yu Liu, Xiangyu Lu, Alice Luo, Andrew Mathau, Yun Ni, Lindsey Pavao, Wei Ping, David W. Romero, Misha Smelyanskiy, Shuran Song, Lyne Tchapmi, Andrew Z. Wang, Boxin Wang, Haoxiang Wang, Fangyin Wei, Jiashu Xu, Yao Xu, Dinghao Yang, Xiaodong Yang, Zhuolin Yang, Jingxu Zhang, Xiaohui Zeng, Zhe Zhang | ๋‚ ์งœ: 2025-03-18 | URL: https://arxiv.org/abs/2503.15558 📄 PDF


Essence

Figure 1

Figure 1: An overview of Cosmos-Reason1. Cosmos-Reason1 contains two multimodal large language models of

NVIDIA์—์„œ ์ œ์‹œํ•œ Cosmos-Reason1์€ ๋น„๋””์˜ค๋ฅผ ์ž…๋ ฅ์œผ๋กœ ๋ฐ›์•„ ๋ฌผ๋ฆฌ์  ์ƒ์‹๊ณผ ๊ตฌ์ฒดํ™”๋œ ์ถ”๋ก (embodied reasoning)์„ ํ†ตํ•ด ์ž์—ฐ์–ธ์–ด๋กœ ์‹ ์ฒด์  ์˜์‚ฌ๊ฒฐ์ •์„ ์ƒ์„ฑํ•˜๋Š” ๋ฉ€ํ‹ฐ๋ชจ๋‹ฌ LLM์ž…๋‹ˆ๋‹ค. ๊ณ„์ธต์  ์˜จํ†จ๋กœ์ง€ ๊ธฐ๋ฐ˜ ๋ฐ์ดํ„ฐ ํ๋ ˆ์ด์…˜๊ณผ Physical AI SFT ๋ฐ RL ํ•™์Šต์œผ๋กœ ๋ฌผ๋ฆฌ์  AI ์ถ”๋ก  ๋Šฅ๋ ฅ์„ ๊ฐ•ํ™”ํ•ฉ๋‹ˆ๋‹ค.

Motivation

Achievement

Figure 1

Figure 1: An overview of Cosmos-Reason1. Cosmos-Reason1 contains two multimodal large language models of

How

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 Technical Soundness: 3/5 Significance: 4/5 Clarity: 4/5 Overall: 4/5

์ดํ‰: Cosmos-Reason1์€ ๋ฌผ๋ฆฌ์  AI ์ถ”๋ก ์˜ ๊ทผ๋ณธ์ ์ธ ๊ฐœ๋…ํ™”์—์„œ๋ถ€ํ„ฐ ๋ฒค์น˜๋งˆํฌ ๊ตฌ์ถ•, ๋ชจ๋ธ ํ•™์Šต๊นŒ์ง€ ์ผ๊ด€์„ฑ ์žˆ๊ฒŒ ์ ‘๊ทผํ•œ ํฌ๊ด„์  ์—ฐ๊ตฌ์ž…๋‹ˆ๋‹ค. ๋ฌผ๋ฆฌ ์ƒ์‹๊ณผ embodied reasoning์„ ์œ„ํ•œ ์ฒซ ์ฒด๊ณ„์  ์˜จํ†จ๋กœ์ง€ ์ •์˜์™€ rule-based RL ๋ณด์ƒ์˜ ์ž๋™ ์ƒ์„ฑ์ด๋ผ๋Š” ๋‘ ๊ฐ€์ง€ ์ฃผ์š” ๊ธฐ์—ฌ๊ฐ€ ๋‹๋ณด์ด๋ฉฐ, ์˜คํ”ˆ์†Œ์Šค ๊ณต๊ฐœ๋กœ ๋ฌผ๋ฆฌ์  AI ์ปค๋ฎค๋‹ˆํ‹ฐ์— ์ฆ‰๊ฐ์ ์ธ ์˜ํ–ฅ์„ ๋ฏธ์น  ๊ฐ€๋Šฅ์„ฑ์ด ๋†’์Šต๋‹ˆ๋‹ค.

← ๋ชฉ๋ก์œผ๋กœ ๋Œ์•„๊ฐ€๊ธฐ

๐ŸŽง Audio Overview

์ด ๋…ผ๋ฌธ ๋ฆฌ๋ทฐ๋ฅผ ํŒŸ์บ์ŠคํŠธํ˜• ์˜ค๋””์˜ค๋กœ ์ƒ์„ฑํ•ฉ๋‹ˆ๋‹ค. (Gemini ยท ํ‚ค๋Š” ๋ธŒ๋ผ์šฐ์ €์—๋งŒ ์ €์žฅ ยท ์™„์„ฑ๋ณธ์€ ์ด๋ฉ”์ผ๋กœ๋„ ์ „์†ก)
โ–ธ ๊ณ ๊ธ‰: ๊ตฌ์„ฑ ๋ฐฉํ–ฅ(๋Œ€๋ณธ ์ž‘์„ฑ ์ง€์นจ) ์ง์ ‘ ์ˆ˜์ •