Violence Detection From Industrial Surveillance Videos Using Deep Learning

The integration of Internet of Things (IoT) technology in industrial surveillance and the proliferation of surveillance cameras in smart cities have empowered the development of real-time activity recognition and violence detection systems, respectively. These systems are crucial in enhancing safety...

Bibliographic Details
Main Authors:	Hamza Khan, Xiaohong Yuan, Letu Qingge, Kaushik Roy
Format:	Article
Language:	English
Published:	IEEE 2025-01-01
Series:	IEEE Access
Subjects:	Activity detection; industrial surveillance; violence detection; computer vision; deep learning
Online Access:	https://ieeexplore.ieee.org/document/10844266/

_version_	1832584007720632320
author	Hamza Khan Xiaohong Yuan Letu Qingge Kaushik Roy
author_facet	Hamza Khan Xiaohong Yuan Letu Qingge Kaushik Roy
author_sort	Hamza Khan
collection	DOAJ
description	The integration of Internet of Things (IoT) technology in industrial surveillance and the proliferation of surveillance cameras in smart cities have empowered the development of real-time activity recognition and violence detection systems, respectively. These systems are crucial in enhancing safety measures, improving operational efficiency, reducing accident risks, and providing automatic monitoring in dynamic environments. In this paper, we propose a three-stage deep learning-based end-to-end framework for violence detection. A lightweight convolutional neural network (CNN) model first identifies individuals in the video stream to minimize the processing of irrelevant frames. Subsequently, a sequence of 50 frames with identified persons is directed to a 3D-CNN model, where the spatiotemporal features of these sequences are extracted and passed to the classifier. Unlike traditional methods that process all frames indiscriminately, this targeted filtering mechanism allows computational resources to be allocated more effectively. Next, a SoftMax classifier processes the extracted features to categorize frame sequences as violent or non-violent. The classifier’s predictions trigger real-time alerts, enabling rapid intervention. The modularity of this stage supports adaptability to new datasets, as it can leverage transfer learning to generalize across diverse surveillance contexts. Unlike traditional systems constrained by hand-crafted features, this design dynamically learns from data, reducing reliance on prior domain knowledge and improving generalizability. We conducted experiments on violence detection across four datasets, comparing the performance of our model with other CNN models. A computation time analysis revealed that our lightweight model requires significantly less computation time, demonstrating its efficiency. We also conducted cross-dataset experiments to assess the model’s capacity to perform consistently across various datasets. Experiments show that our proposed model outperforms the methods mentioned in the existing literature. These experiments demonstrate that the model’s adaptability and robustness need to be improved.
format	Article
id	doaj-art-f8df41443fa649a1b1b3eaf1d9b49b46
institution	Kabale University
issn	2169-3536
language	English
publishDate	2025-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj-art-f8df41443fa649a1b1b3eaf1d9b49b46; 2025-01-28T00:01:41Z; eng; IEEE; IEEE Access; 2169-3536; 2025-01-01; vol. 13; pp. 15363-15375; doi:10.1109/ACCESS.2025.3531213; 10844266
	Violence Detection From Industrial Surveillance Videos Using Deep Learning
	Hamza Khan (https://orcid.org/0000-0003-1857-7468); Xiaohong Yuan (https://orcid.org/0000-0002-1295-9812); Letu Qingge (https://orcid.org/0000-0002-9745-9584); Kaushik Roy (https://orcid.org/0000-0002-9026-5322)
	Department of Computer Science, North Carolina Agricultural and Technical State University, Greensboro, NC, USA (all four authors)
	https://ieeexplore.ieee.org/document/10844266/
	Activity detection; industrial surveillance; violence detection; computer vision; deep learning
spellingShingle	Hamza Khan Xiaohong Yuan Letu Qingge Kaushik Roy Violence Detection From Industrial Surveillance Videos Using Deep Learning IEEE Access Activity detection industrial surveillance violence detection computer vision deep learning
title	Violence Detection From Industrial Surveillance Videos Using Deep Learning
title_full	Violence Detection From Industrial Surveillance Videos Using Deep Learning
title_fullStr	Violence Detection From Industrial Surveillance Videos Using Deep Learning
title_full_unstemmed	Violence Detection From Industrial Surveillance Videos Using Deep Learning
title_short	Violence Detection From Industrial Surveillance Videos Using Deep Learning
title_sort	violence detection from industrial surveillance videos using deep learning
topic	Activity detection; industrial surveillance; violence detection; computer vision; deep learning
url	https://ieeexplore.ieee.org/document/10844266/
work_keys_str_mv	AT hamzakhan violencedetectionfromindustrialsurveillancevideosusingdeeplearning AT xiaohongyuan violencedetectionfromindustrialsurveillancevideosusingdeeplearning AT letuqingge violencedetectionfromindustrialsurveillancevideosusingdeeplearning AT kaushikroy violencedetectionfromindustrialsurveillancevideosusingdeeplearning
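
The three-stage flow summarized in the description field (person filtering, 50-frame sequences, SoftMax classification) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: `has_person` and `clip_logits` are hypothetical stand-ins for the lightweight CNN detector and the 3D-CNN, and the non-overlapping windows, the class order (index 1 = violent), and the 0.5 decision threshold are all assumptions that the record does not specify.

```python
import math
from typing import Callable, Iterable, Iterator, List, Sequence, Tuple

SEQ_LEN = 50  # sequence length stated in the abstract


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def detect_violence(
    frames: Iterable[object],
    has_person: Callable[[object], bool],                    # hypothetical stage-1 detector
    clip_logits: Callable[[List[object]], Sequence[float]],  # hypothetical stage-2 3D-CNN
    seq_len: int = SEQ_LEN,
) -> Iterator[Tuple[str, float]]:
    """Yield (label, violent_probability) for each full clip of person frames."""
    clip: List[object] = []
    for frame in frames:
        # Stage 1: skip frames in which no person is detected.
        if not has_person(frame):
            continue
        clip.append(frame)
        if len(clip) == seq_len:
            # Stage 2 produces class scores; stage 3 turns them into probabilities.
            probs = softmax(clip_logits(clip))
            violent = probs[1]  # assumption: index 1 is the "violent" class
            yield ("violent" if violent >= 0.5 else "non-violent", violent)
            clip = []  # assumption: non-overlapping windows


if __name__ == "__main__":
    # Toy stand-ins: integer "frames", a person in every even frame,
    # and fixed scores that depend only on where the clip starts.
    demo = detect_violence(
        range(200),
        has_person=lambda f: f % 2 == 0,
        clip_logits=lambda c: [0.0, 1.0] if c[0] == 0 else [1.0, 0.0],
    )
    for label, prob in demo:
        print(label, round(prob, 3))
```

Because frames without a detected person never reach the buffer, the costlier sequence model runs only once per 50 person-containing frames, which is the resource-saving idea the abstract describes.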
