SauDial: The Saudi Arabic dialects game localization dataset
Mendeley Data

Bibliographic Details
Main Authors: Naif Alanazi, Mohammed Al-Batineh, Hussein Abu-Rayyash
Format: Article
Language: English
Published: Elsevier 2025-10-01
Series: Data in Brief
Online Access: http://www.sciencedirect.com/science/article/pii/S2352340925006304
Description
Summary: Content creation and localization for video games demand substantial effort from script writers and localization teams. Consequently, we present SauDial, the Saudi Arabic Dialects Game Localization Parallel Dataset, a collection of Saudi dialectal expressions tailored for localization-related tasks. The corpus features samples from four Saudi dialects, namely Najdi, Hijazi, Janoubi, and Eastern. The dataset was first produced through an AI‑driven process informed by cultural knowledge, linguistic expertise, and game‑specific context, then manually cleaned, refined, and revised to ensure dialectal accuracy, tonal appropriateness, and cultural and semantic fidelity. Each entry contains an English source line, a Modern Standard Arabic (MSA) translation, and a dialectal counterpart together with context clues, age ratings, and linguistic notes. The dataset spans a broad array of scenarios relevant to multiple game genres and tonal indicators, and it aligns with the General Authority of Media Regulation (GCAM) official rating system. In addition, it opens avenues for research in Translation, Cultural, Localization, and Game Studies, while in educational settings it can support translation and localization courses and serve as a translation memory that aids professional translators and localizers. To the best of our knowledge, SauDial is the first dataset of its kind in game localization and offers a foundation that can strengthen the authenticity and cultural resonance of games localized for the Saudi market.
ISSN: 2352-3409