Img2Vocab: Explore Words Tied to Your Life With LLMs and Social Media Images

Bibliographic Details
Main Authors: Kanta Yamaoka, Ko Watanabe, Koichi Kise, Andreas Dengel, Shoya Ishimaru
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10851279/
Description
Summary: Psychological studies highlight the importance of combining new knowledge with one’s prior experience. Hence, personalization for a learner plays a key role in vocabulary acquisition. However, personalization faces two challenges: probing a learner’s life experiences and crafting tailored material for each individual. With the prevalence of visual social media such as Instagram, people share photos of their favorite moments, which provide rich contexts, and emerging generative AI could create learning material with little effort. We prototyped an online vocabulary exploration system that displays a learner’s selected Instagram photos along with a sentence generated using image recognition and a language model, GPT-3. The system lets a learner find new words that are strongly tied to their daily life, with the approximated context. We evaluated the system in a within-subject design with 23 participants under three conditions: contexts grounded in the learner’s own Instagram photos, contexts grounded in general images, and a text-only modality. From learners’ recall task accuracy, we found that having a context grounded in a learner’s own social media image allowed them to find difficult words and learn them more quickly than having a context generated from someone else’s image or the text-only modality, although this finding is not statistically significant. The Zipf frequency comparisons revealed that, in general, having an image-based context allowed learners to extract more difficult vocabulary than having a text-only context. We also discuss quantitative and qualitative results regarding participants’ acceptance of a personalization system that uses their personal photos from social media. In general, they reported positive impressions of our system, such as high engagement. While our system prioritizes user privacy with opt-in data control and secure design, we explore additional ethical considerations. This paves the way for a future where personalized language learning, grounded in real-world experiences and generative AI, benefits learners.
ISSN: 2169-3536