LERF: Language Embedded Radiance Fields

์ €์ž: Justin Kerr, Chung Min Kim, Ken Goldberg, Angjoo Kanazawa, Matthew Tancik | ๋‚ ์งœ: 2023-03-16 | URL: https://arxiv.org/abs/2303.09553 📄 PDF


Essence

Figure 1: Language Embedded Radiance Fields (LERF). LERF grounds CLIP representations in a dense, multi-scale 3D field. A

LERF is a method that grounds CLIP embeddings in a NeRF so that a 3D scene can be queried with natural language. By learning a multi-scale language field, it responds in real time to many kinds of natural-language queries, including visual attributes, semantics, abstract concepts, and long-tail objects.
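To make the idea of querying a multi-scale language field concrete, the toy sketch below scores 3D points against a text query by cosine similarity and keeps the best scale for each point. It is only an illustration under stated assumptions: random vectors stand in for CLIP embeddings, the names `normalize` and `best_scale_similarity` are hypothetical, and this is not the paper's actual relevancy computation or training procedure.

```python
import numpy as np

def normalize(v, axis=-1):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return v / np.linalg.norm(v, axis=axis, keepdims=True)

def best_scale_similarity(field_embeds, query_embed):
    """Score each point against the query at every scale and keep the best scale.

    field_embeds: array of shape (num_points, num_scales, dim), a stand-in for
                  language embeddings sampled from the field (assumption).
    query_embed:  array of shape (dim,), a stand-in for a text embedding.
    Returns (max similarity per point, index of the best scale per point).
    """
    sims = normalize(field_embeds) @ normalize(query_embed)  # (num_points, num_scales)
    return sims.max(axis=1), sims.argmax(axis=1)

# Toy data: 4 points, 3 scales, 8-dimensional embeddings (all hypothetical).
rng = np.random.default_rng(0)
field = rng.normal(size=(4, 3, 8))
query = field[2, 1].copy()  # force point 2 at scale 1 to match the query exactly

scores, scales = best_scale_similarity(field, query)
best_point = int(scores.argmax())
print(best_point, int(scales[best_point]))
```

Because the query is copied from point 2 at scale 1, that point reaches a cosine similarity of 1 and is selected. In a real system the embeddings would come from a trained field and a text encoder rather than from random numbers.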

Motivation

Achievement

Figure 3: Results with LERF for 5 in-the-wild scenes. Each image shows a visual rendering of the LERF (Sec. 3), along wi

How

Figure 2: LERF Optimization: Left: LERF represents a field of 3D volumes, parameterized by position x, y, z and scale s (

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 3/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

์ดํ‰: LERF๋Š” NeRF์™€ CLIP์„ ์ฐฝ์˜์ ์œผ๋กœ ๊ฒฐํ•ฉํ•˜์—ฌ 3D ์žฅ๋ฉด์˜ ๋ฐ€์ง‘ ์ž์—ฐ์–ด ์ฟผ๋ฆฌ๋ฅผ ์‹คํ˜„ํ•œ ์šฐ์ˆ˜ํ•œ ๋…ผ๋ฌธ์ด๋‹ค. ๋‹ค์ค‘ ์Šค์ผ€์ผ ์–ธ์–ด ํ•„๋“œ, ๋งˆ์Šคํฌ ๋น„์˜์กด ์„ค๊ณ„, ์‹ค์‹œ๊ฐ„ ์„ฑ๋Šฅ์€ ์‹ค์šฉ์  ๊ฐ€์น˜๊ฐ€ ํฌ๋ฉฐ, ๋กœ๋ด‡๊ณตํ•™ ๋ฐ 3D UI ๋ถ„์•ผ์—์„œ ์ฆ‰๊ฐ์ ์ธ ์˜ํ–ฅ์„ ๋ฏธ์น  ์ˆ˜ ์žˆ๋‹ค.
