“Play by play”: A dataset of handball and basketball game situations in a standardized space

This paper presents a synthetic dataset of labeled game situations in recordings of federated handball and basketball matches played in Galicia, Spain. The dataset consists of synthetic data generated from real video frames, including 308,805 labeled handball frames and 56,578 labeled basketball fra...

Bibliographic Details
Main Authors: Bruno Cabado, Bertha Guijarro-Berdiñas, Emilio J. Padrón
Format: Article
Language: English
Published: Elsevier 2025-02-01
Series: Data in Brief
Subjects: Sports, Players, Ball, Position, Velocity, Normalized
Online Access: http://www.sciencedirect.com/science/article/pii/S2352340924012277
collection DOAJ
description This paper presents a synthetic dataset of labeled game situations in recordings of federated handball and basketball matches played in Galicia, Spain. The dataset consists of synthetic data generated from real video frames, including 308,805 labeled handball frames and 56,578 labeled basketball frames extracted from 2105 handball and 383 basketball 5-s video clips.
Experts manually labeled the video clips based on the respective sports, while the individual frames were automatically labeled using computer vision and machine learning techniques. The dataset encompasses seven classes of game situations: left attack, left counterattack, left penalty, right attack, right counterattack, right penalty, and timeout. In basketball, the penalty class refers to the free throws attempted by players after they have been fouled by an opposing player.
Each frame in the dataset is assigned to one of these classes, considering the game situation and specific context. Importantly, the dataset does not contain actual video frames; instead, it provides a synthetic, normalized representation of each frame in JSON format. This tabular data includes player, referee, and ball positions on a normalized field, player and referee velocities, and key regions on the court. Positions of players, referees, and the ball were automatically inferred in each frame by an object detector, followed by a tracking step to detect object positions across frames and compute the velocity vectors. Finally, the obtained coordinates underwent normalization through a perspective transformation, ensuring that the data remained unaffected by variations in camera configurations across different arenas and camera setups. We refer to this standardized coordinate space as the 'unified space'.
The dataset holds significant potential for reuse in various domains related to sports analytics and machine learning research. It can serve as a valuable resource for researchers, coaches, and sports enthusiasts, contributing to improvements in player performance, game strategies, match retransmissions, and sports-related technologies.
id doaj-art-7ed0c1c5cf66487dbcbfa7e83fb9f57c
institution Kabale University
issn 2352-3409
spelling doaj-art-7ed0c1c5cf66487dbcbfa7e83fb9f57c
2025-01-31T05:11:44Z
eng
Elsevier
Data in Brief
2352-3409
2025-02-01
58
111265
“Play by play”: A dataset of handball and basketball game situations in a standardized space
Bruno Cabado (0), Bertha Guijarro-Berdiñas (1), Emilio J. Padrón (2)
(0) Universidade da Coruña, CITIC Research Center, A Coruña 15071, Spain; CINFO CONTENIDOS INFORMATIVOS PERSONALIZADOS SL, Ciudad de las TIC, A Coruña 15008, Spain; Corresponding author at: Universidade da Coruña, CITIC Research Center, A Coruña 15071, Spain.
(1) Universidade da Coruña, CITIC Research Center, A Coruña 15071, Spain
(2) Universidade da Coruña, CITIC Research Center, A Coruña 15071, Spain
http://www.sciencedirect.com/science/article/pii/S2352340924012277
Sports, Players, Ball, Position, Velocity, Normalized
topic Sports
Players
Ball
Position
Velocity
Normalized
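
The description in this record states that detected positions were normalized through a perspective transformation into a 'unified space' and that velocity vectors were computed from tracked positions. The following is a minimal sketch of those two ideas only, not the authors' implementation. The function names, the example pixel coordinates, the choice of a unit square as the target space, and the 30 frames-per-second rate are all hypothetical assumptions made for illustration; the record does not specify them.

```python
import numpy as np


def estimate_homography(src, dst):
    """Estimate a 3x3 perspective transform mapping src points to dst points.

    Uses the direct linear transform (DLT); src and dst are (N, 2) arrays
    with N >= 4 corresponding points.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]


def to_unified_space(h, points):
    """Apply the perspective transform h to an (N, 2) array of image points."""
    pts = np.asarray(points, dtype=float)
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homogeneous @ h.T
    # Divide by the third homogeneous coordinate to return to 2D.
    return mapped[:, :2] / mapped[:, 2:3]


def velocities(track, fps):
    """Finite-difference velocity for one tracked object.

    track is a (T, 2) array of positions in consecutive frames; the result
    is a (T - 1, 2) array in position units per second.
    """
    return np.diff(np.asarray(track, dtype=float), axis=0) * fps


# Hypothetical pixel coordinates of the four court corners in one camera view.
image_corners = np.array([[120, 80], [1180, 95], [1270, 690], [40, 670]])
# Assumed target: the unit square (this choice of normalized field is an assumption).
unified_corners = np.array([[0, 0], [1, 0], [1, 1], [0, 1]])

H = estimate_homography(image_corners, unified_corners)

# A hypothetical player track in pixels over three consecutive frames.
player_pixels = np.array([[600, 400], [610, 402], [621, 405]])
player_unified = to_unified_space(H, player_pixels)
player_velocity = velocities(player_unified, fps=30)  # fps is an assumption
```

Because the transform is estimated per camera view from known court landmarks, positions from different arenas and camera setups end up in the same coordinate frame, which is the property the description attributes to the unified space.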