Genie: Generative Interactive Environments

Authors: Jake Bruce, Michael Dennis, Ashley Edwards, Jack Parker-Holder, Yuge Shi, Edward Hughes, Matthew Lai, Aditi Mavalankar, Richie Steigerwald, Chris Apps, Yusuf Aytar, Sarah Bechtle, Feryal Behbahani, Stephanie Chan, Nicolas Heess, Lucy Gonzalez, Simon Osindero, Sherjil Ozair, Scott Reed, Jingwei Zhang, Konrad Zolna, Jeff Clune, Nando de Freitas, Satinder Singh, Tim Rocktäschel | Date: 2024-02-23 | URL: https://arxiv.org/abs/2402.15391


Essence

Figure 1

Figure 1 | A whole new world: Genie is capable of converting a variety of different prompts into

Genie is the first generative interactive environment trained in a fully unsupervised manner from Internet videos. From a variety of prompts such as text, images, and sketches, it can generate virtual worlds that are controllable frame by frame.

Motivation

Achievement

Figure 2

Figure 2 | Diverse trajectories: Genie is a gen-

How

Figure 3

Figure 3 | Genie model training: Genie takes in 𝑇 frames of video as input, tokenizes them into

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 4/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

Overall assessment: Genie is a highly innovative study that presents a new paradigm for unsupervised action learning and interactive environment generation. It achieves frame-level controllability at foundation-model scale and lays important groundwork for training future generalist agents.
