MMSD2.0: Towards a Reliable Multi-modal Sarcasm Detection System

Authors: Mayur Wankhade, Annavarapu Chandra Sekhara Rao, Chaitanya Kulkarni | Date: 2023 | URL: https://arxiv.org/abs/2307.07135


Essence

Figure 2: Overall process of construction MMSD2.0 dataset. Given the example in (a), Spurious Cues Removal

The paper proposes MMSD2.0, a dataset that removes the spurious cues and unreasonable annotations found in the MMSD benchmark, together with a multi-view CLIP framework, with the aim of building a reliable multi-modal sarcasm detection system.
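As a rough illustration of what removing spurious cues from text can look like, the sketch below strips hashtag tokens from a caption. Treating hashtags as the spurious cue is an assumption made for this example, and the name `remove_spurious_cues` is hypothetical; this is not the authors' preprocessing code.

```python
import re

# Assumption: hashtag tokens such as "#sarcasm" act as spurious cues,
# because a model can read the label off them without using the image.
HASHTAG = re.compile(r"#\w+")


def remove_spurious_cues(text: str) -> str:
    """Drop hashtag tokens and collapse the whitespace they leave behind."""
    stripped = HASHTAG.sub("", text)
    return " ".join(stripped.split())
```

For instance, calling `remove_spurious_cues("great weather today #sarcasm")` leaves only the plain caption, so a classifier has to rely on the text and image content rather than the tag.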

Motivation

Achievement

How

Originality

Limitation & Further Study

Evaluation

Novelty: 4/5 | Technical Soundness: 3/5 | Significance: 4/5 | Clarity: 4/5 | Overall: 4/5

Overall assessment: The paper is highly valuable for being the first to point out fundamental problems in the benchmark that underlies multi-modal sarcasm detection research and for presenting MMSD2.0, which systematically corrects them. The proposed Multi-view CLIP is simple yet performs strongly, and by providing a sufficiently reliable benchmark and method it is expected to contribute substantially to future research.

Related Papers

Foundational work
Provides the research on which benchmark dataset construction for multi-modal sarcasm detection is based.
Alternative approach
Bio-SIEVE is a multi-modal medical QA benchmark that, like MMSD2.0, addresses reliability and bias removal in multi-modal data.
Alternative approach
Proposes a robust multi-modal learning methodology to address the problem of spurious cues in datasets.
Alternative approach
Builds a Chinese-language complex fact-checking benchmark and proposes detection metrics, offering an additional perspective on reliable multi-modal sarcasm detection systems.
Alternative approach
551 proposes a chart-understanding model built on a large-scale multi-modal chart dataset, approaching a problem similar to MMSD2.0's sarcasm detection from the standpoint of multi-modal information understanding.
Alternative approach
The DEFAME paper uses a dynamic fact-checking approach based on multi-modal evidence, which allows comparison with a visual approach different from that of MMSD2.0.
Follow-up work
Applies a method that supplements fact verification of scientific claims with weakly supervised data, showing the potential of weakly supervised models for multi-modal parody and sarcasm detection.
Counterargument / critique
541 analyzes the lack of refuting evidence in fact verification and can be discussed in contrast to MMSD2.0, which centers on spurious cue removal.