Angus: efficient active learning strategies for provenance based intrusion detection

Abstract As modern attack methods become more concealed and complex, obtaining many labeled samples in big data streams is difficult. Active learning has long been used to achieve better intrusion detection performance by using only a small number of training samples. Intrusion behaviors can be desc...

Bibliographic Details
Main Authors:	Lin Wu, Yulai Xie, Jin Li, Dan Feng, Jinyuan Liang, Yafeng Wu
Format:	Article
Language:	English
Published:	SpringerOpen 2025-01-01
Series:	Cybersecurity
Subjects:	Provenance, Intrusion detection, Active learning, The most similar graph query strategy, The maximum difference query strategy
Online Access:	https://doi.org/10.1186/s42400-024-00311-y

_version_	1832571577546309632
author	Lin Wu, Yulai Xie, Jin Li, Dan Feng, Jinyuan Liang, Yafeng Wu
author_facet	Lin Wu, Yulai Xie, Jin Li, Dan Feng, Jinyuan Liang, Yafeng Wu
author_sort	Lin Wu
collection	DOAJ
description	Abstract As modern attack methods become more concealed and complex, obtaining many labeled samples in big data streams is difficult. Active learning has long been used to achieve better intrusion detection performance by using only a small number of training samples. Intrusion behaviors can be described by provenance graphs that record the dependency relationships between intrusion processes and the infected files. It is a challenge to develop active learning strategies that consider defining and selecting the most valuable provenance and ensure that the strategy for querying provenance is efficient. We present Angus, an active learning framework for provenance-based intrusion detection. We propose two novel active learning strategies: the most similar graph query strategy and the maximum difference query strategy. They either select samples to update the training set according to similarities of provenance graphs or preferentially select samples with low redundancy and large differences from the current training set. Besides, we also improve the above query strategies by using the parallel query to reduce detection time overheads. The experiments on various real-world applications demonstrate their performance and efficiency.
format	Article
id	doaj-art-4301928d28ed4ecd9744b08b5c68894e
institution	Kabale University
issn	2523-3246
language	English
publishDate	2025-01-01
publisher	SpringerOpen
record_format	Article
series	Cybersecurity
spelling	doaj-art-4301928d28ed4ecd9744b08b5c68894e 2025-02-02T12:30:05Z eng SpringerOpen Cybersecurity 2523-3246 2025-01-01 81117 10.1186/s42400-024-00311-y Angus: efficient active learning strategies for provenance based intrusion detection. Lin Wu (0), Yulai Xie (1), Jin Li (2), Dan Feng (3), Jinyuan Liang (4), Yafeng Wu (5). Affiliations: (0) Hubei Engineering Research Center on Big Data Security, School of Cyber Science and Engineering, Huazhong University of Science and Technology; (1) Hubei Engineering Research Center on Big Data Security, School of Cyber Science and Engineering, Huazhong University of Science and Technology; (2) Hubei Engineering Research Center on Big Data Security, School of Cyber Science and Engineering, Huazhong University of Science and Technology; (3) School of Science and Technology, Wuhan National Laboratory for Optoelectronics, Key Laboratory of Information Storage, Huazhong University of Science and Technology; (4) University of British Columbia Vancouver British Columbia; (5) Hubei Engineering Research Center on Big Data Security, School of Cyber Science and Engineering, Huazhong University of Science and Technology. Abstract As modern attack methods become more concealed and complex, obtaining many labeled samples in big data streams is difficult. Active learning has long been used to achieve better intrusion detection performance by using only a small number of training samples. Intrusion behaviors can be described by provenance graphs that record the dependency relationships between intrusion processes and the infected files. It is a challenge to develop active learning strategies that consider defining and selecting the most valuable provenance and ensure that the strategy for querying provenance is efficient. We present Angus, an active learning framework for provenance-based intrusion detection. We propose two novel active learning strategies: the most similar graph query strategy and the maximum difference query strategy. They either select samples to update the training set according to similarities of provenance graphs or preferentially select samples with low redundancy and large differences from the current training set. Besides, we also improve the above query strategies by using the parallel query to reduce detection time overheads. The experiments on various real-world applications demonstrate their performance and efficiency. https://doi.org/10.1186/s42400-024-00311-y Provenance, Intrusion detection, Active learning, The most similar graph query strategy, The maximum difference query strategy
spellingShingle	Lin Wu Yulai Xie Jin Li Dan Feng Jinyuan Liang Yafeng Wu Angus: efficient active learning strategies for provenance based intrusion detection Cybersecurity Provenance Intrusion detection Active learning The most similar graph query strategy The maximum difference query strategy
title	Angus: efficient active learning strategies for provenance based intrusion detection
title_full	Angus: efficient active learning strategies for provenance based intrusion detection
title_fullStr	Angus: efficient active learning strategies for provenance based intrusion detection
title_full_unstemmed	Angus: efficient active learning strategies for provenance based intrusion detection
title_short	Angus: efficient active learning strategies for provenance based intrusion detection
title_sort	angus efficient active learning strategies for provenance based intrusion detection
topic	Provenance, Intrusion detection, Active learning, The most similar graph query strategy, The maximum difference query strategy
url	https://doi.org/10.1186/s42400-024-00311-y
work_keys_str_mv	AT linwu angusefficientactivelearningstrategiesforprovenancebasedintrusiondetection AT yulaixie angusefficientactivelearningstrategiesforprovenancebasedintrusiondetection AT jinli angusefficientactivelearningstrategiesforprovenancebasedintrusiondetection AT danfeng angusefficientactivelearningstrategiesforprovenancebasedintrusiondetection AT jinyuanliang angusefficientactivelearningstrategiesforprovenancebasedintrusiondetection AT yafengwu angusefficientactivelearningstrategiesforprovenancebasedintrusiondetection

Similar Items