Authors: Jake Bruce, Michael Dennis, Ashley Edwards, Jack Parker-Holder, Yuge Shi, Edward Hughes, Matthew Lai, Aditi Mavalankar, Richie Steigerwald, Chris Apps, Yusuf Aytar, Sarah Bechtle, Feryal Behbahani, Stephanie Chan, Nicolas Heess, Lucy Gonzalez, Simon Osindero, Sherjil Ozair, Scott Reed, Jingwei Zhang, Konrad Zolna, Jeff Clune, Nando de Freitas, Satinder Singh, Tim Rocktäschel | Date: 2024-02-23 | URL: https://arxiv.org/abs/2402.15391
Figure 1 | A whole new world: Genie is capable of converting a variety of different prompts into
Genie is the first generative interactive environment trained in a fully unsupervised manner from Internet videos. It can generate virtual worlds that are controllable frame by frame from a variety of prompts, including text, images, and sketches.
Figure 2 | Diverse trajectories: Genie is a gen-
Figure 3 | Genie model training: Genie takes in T frames of video as input, tokenizes them into
Overall assessment: Genie is a highly innovative work that introduces a new paradigm for unsupervised action learning and interactive environment generation. It achieves frame-level controllability at foundation-model scale and lays important groundwork for training future generalist agents.