Improving internet of vehicles research: A systematic preprocessing framework for the VeReMi dataset

Bibliographic Details
Main Authors:	Aparup Roy, Debotosh Bhattacharjee, Ondrej Krejcar
Format:	Article
Language:	English
Published:	Elsevier 2025-06-01
Series:	Data in Brief
Subjects:	Vehicular reference misbehavior dataset (VeReMi); Intelligent transportation systems (ITS); Internet of vehicles (IoV); Intrusion detection systems (IDS); Data preprocessing; Dataset optimization
Online Access:	http://www.sciencedirect.com/science/article/pii/S2352340925003312

Description
Summary:	The Vehicular Reference Misbehavior Dataset (VeReMi) is a vital resource for advancing Intelligent Transportation Systems (ITS) and the Internet of Vehicles (IoV). However, its large size (∼7 GB) and inherent class imbalance pose significant challenges for machine learning model development. This paper presents a preprocessing framework to enhance VeReMi’s usability and relevance. Through 10% down-sampling, the dataset was reduced to ∼724 MB, making it computationally manageable. Biases were addressed by balancing benign and malicious samples through synthesis and identifying benign instances using predefined criteria. A refined feature set, including key attributes like rcvTime, pos_0, pos_1, and attack_type (renamed attacker_type), was selected to improve machine learning compatibility. This preprocessing pipeline effectively maintains data integrity and preserves the representativeness of malicious patterns. The optimized dataset is well-suited for ITS and IoV applications, such as anomaly detection and network security, underscoring the crucial role of preprocessing in overcoming real-world constraints and enhancing model performance.
ISSN:	2352-3409
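
The steps named in the summary (10% down-sampling, feature selection with the attack_type to attacker_type rename, and class balancing) can be illustrated with a minimal standard-library sketch. It is not the authors' pipeline: the function name `preprocess`, the convention that a label of 0 marks a benign message, and the use of random duplication in place of the paper's unspecified synthesis method and benign-identification criteria are all assumptions made for illustration.

```python
import random

# Feature names taken from the summary; the full refined set is not listed there.
KEEP = ["rcvTime", "pos_0", "pos_1", "attack_type"]


def preprocess(records, fraction=0.10, seed=0):
    """Down-sample, select features, rename the label, and balance classes.

    `records` is a list of dicts, one per message. Hypothetical helper,
    not code from the paper.
    """
    rng = random.Random(seed)

    # 1. Random down-sampling to the requested fraction (10% in the summary).
    k = max(1, round(len(records) * fraction))
    sample = rng.sample(records, k)

    # 2. Keep only the selected features and rename attack_type.
    rows = []
    for rec in sample:
        row = {name: rec[name] for name in KEEP}
        row["attacker_type"] = row.pop("attack_type")
        rows.append(row)

    # 3. Balance benign vs. malicious rows.
    #    Assumption: label 0 means benign. Random duplication of the
    #    minority class stands in for the paper's synthesis step.
    benign = [r for r in rows if r["attacker_type"] == 0]
    malicious = [r for r in rows if r["attacker_type"] != 0]
    minority, majority = sorted([benign, malicious], key=len)
    if minority:
        extra = rng.choices(minority, k=len(majority) - len(minority))
        rows = rows + [dict(r) for r in extra]

    rng.shuffle(rows)
    return rows
```

Fixing the seed keeps the sample reproducible. Duplicating minority rows only equalizes class counts; a real synthesis step would generate new samples rather than copies.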

Similar Items