Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture

Authors: Mahmoud Assran, Quentin Duval, Ishan Misra, Piotr Bojanowski, Pascal Vincent, Michael Rabbat, Yann LeCun, Nicolas Ballas | Date: 2023-01-19 | URL: https://arxiv.org/abs/2301.08243


Essence

Figure 3. I-JEPA. The Image-based Joint-Embedding Predictive

I-JEPA is a self-supervised learning method based on the Joint-Embedding Predictive Architecture. It learns semantic image representations by predicting the representations of target blocks from a context block of the same image, without relying on hand-crafted data augmentations.
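The context-block and target-block idea can be illustrated with a small sketch. This is not the authors' implementation: the function names (`sample_block`, `sample_masks`, `l2_loss`), the 14x14 patch grid, the scale and aspect-ratio ranges, and the squared-L2 loss are all assumptions chosen for illustration. It samples several target blocks and one large context block on a patch grid, removes the target patches from the context so the prediction is non-trivial, and compares predicted and target representations in representation space rather than pixel space.

```python
import random


def sample_block(grid, scale, aspect, rng):
    """Sample a rectangular block of patch indices on a grid x grid layout.

    scale is the fraction of the grid area to cover and aspect is the
    height/width ratio (both hypothetical parameters for this sketch).
    """
    area = scale * grid * grid
    h = max(1, min(grid, round((area * aspect) ** 0.5)))
    w = max(1, min(grid, round((area / aspect) ** 0.5)))
    top = rng.randint(0, grid - h)
    left = rng.randint(0, grid - w)
    return {r * grid + c
            for r in range(top, top + h)
            for c in range(left, left + w)}


def sample_masks(grid=14, num_targets=4, seed=0):
    """Return (context, targets) as sets of patch indices.

    The ranges below are illustrative assumptions, not values taken
    from this review.
    """
    rng = random.Random(seed)
    targets = [
        sample_block(grid, rng.uniform(0.15, 0.2), rng.uniform(0.75, 1.5), rng)
        for _ in range(num_targets)
    ]
    context = sample_block(grid, rng.uniform(0.85, 1.0), 1.0, rng)
    # Drop every target patch from the context so the predictor cannot
    # simply copy what it is asked to predict.
    for block in targets:
        context -= block
    return context, targets


def l2_loss(pred, target):
    """Squared L2 distance between predicted and target patch
    representations, averaged over patches (assumed loss form)."""
    total = sum(
        sum((p - q) ** 2 for p, q in zip(p_vec, t_vec))
        for p_vec, t_vec in zip(pred, target)
    )
    return total / len(pred)


if __name__ == "__main__":
    context, targets = sample_masks()
    print("context patches:", len(context))
    print("target block sizes:", [len(t) for t in targets])
```

In a full model, a context encoder would see only the `context` patches, a target encoder would see the whole image, and a predictor would output one representation per target patch; `l2_loss` stands in for the comparison between those two sets of vectors.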

Motivation

Achievement

Figure 1. ImageNet Linear Evaluation. The I-JEPA method

How

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 3/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

์ดํ‰: I-JEPA๋Š” ํ‘œํ˜„ ๊ณต๊ฐ„์—์„œ์˜ ์˜ˆ์ธก์ด๋ผ๋Š” ์ฐฝ์˜์  ์•„์ด๋””์–ด๋กœ ์†์œผ๋กœ ๋งŒ๋“  ์ฆ๊ฐ•์„ ์ œ๊ฑฐํ•˜๋ฉด์„œ๋„ ๋†’์€ ์˜๋ฏธ๋ก ์  ํ‘œํ˜„์„ ํ•™์Šตํ•˜๊ณ , ๋›ฐ์–ด๋‚œ ๊ณ„์‚ฐ ํšจ์œจ์„ฑ์œผ๋กœ ์ž๊ธฐ ์ง€๋„ ํ•™์Šต์˜ ์‹ค์šฉ์„ฑ์„ ํฌ๊ฒŒ ํ–ฅ์ƒ์‹œํ‚จ ์ค‘์š”ํ•œ ๊ธฐ์—ฌ์ด๋‹ค.
