PCGOD: Enhancing Object Detection With Synthetic Data for Scarce and Sensitive Computer Vision Tasks

Bibliographic Details
Main Authors: Walid Remmas, Martin Lints, Jaak Joonas Uudmae
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/11009168/
Description
Summary: Object detection models rely on large-scale, high-quality annotated datasets, which are often expensive, scarce, or restricted due to privacy concerns. Synthetic data generation has emerged as an alternative, yet existing approaches have limitations: generative models lack structured annotations and precise spatial control, while game-engine-based datasets suffer from inaccuracies due to 3D bounding box projections, limited scene diversity, and poor handling of articulated objects. We propose PCGOD, an Unreal Engine-based framework that combines photorealistic rendering with comprehensive domain randomization to bridge the synthetic-to-real (sim2real) domain gap. PCGOD employs a marker-based extremity projection method that places markers at key points on object geometries and projects only visible markers to create tight-fitting bounding boxes. For articulated objects, our approach dynamically tracks skeletal pose changes, ensuring annotations adapt to varied configurations. The framework addresses sim2real transfer through six-dimensional randomization: background environments, model textures and poses, landscape textures, weather conditions, camera perspectives, and procedural scene composition. Evaluations using YOLOv11 and Salience-DETR in an object detection task demonstrate that our marker-based approach achieves up to 41.61% improvement in annotation accuracy over conventional methods. Models trained with just 10% real data supplemented by our synthetic data achieve over 80% of the performance of models trained on 100% real data. Moreover, mixed datasets containing 25% synthetic and 75% real data outperform pure real-data training by up to 5.1%. These results confirm that our approach significantly enhances synthetic data utility for object detection, offering an effective solution for domains with limited training data availability.
ISSN: 2169-3536
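
The summary above describes a marker-based extremity projection: 3D markers placed at key points on an object's geometry are projected into the image, and only the markers that are actually visible define the 2D bounding box. The sketch below is a minimal illustration of that idea under a standard pinhole camera model; the function names and the caller-supplied visibility mask are assumptions for illustration, not code or APIs from the paper.

import numpy as np

def project_markers(markers_3d, K, R, t):
    # Project Nx3 world-space marker points to pixel coordinates.
    # K: 3x3 camera intrinsics; R, t: world-to-camera rotation and translation.
    cam = R @ markers_3d.T + t.reshape(3, 1)   # world frame -> camera frame
    pix = K @ cam                              # camera frame -> image plane
    return (pix[:2] / pix[2]).T, cam[2]        # (u, v) per marker, plus depths

def tight_bbox(markers_3d, visible, K, R, t):
    # Tight (x_min, y_min, x_max, y_max) over the *visible* markers only,
    # so occluded extremities never inflate the box. `visible` is a boolean
    # mask, e.g. from a ray-cast occlusion test against the rendered scene.
    pts, depth = project_markers(markers_3d, K, R, t)
    pts = pts[visible & (depth > 0)]           # drop hidden or behind-camera points
    if len(pts) == 0:
        return None                            # object fully occluded in this frame
    (x0, y0), (x1, y1) = pts.min(axis=0), pts.max(axis=0)
    return x0, y0, x1, y1

For articulated objects the same routine applies if the 3D marker positions are re-read from the posed skeleton at each frame, which corresponds to the dynamic skeletal pose tracking the summary mentions.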