Application research of sample data generation based on improved Cycle-GAN in intrusion detection

Bibliographic Details
Main Authors: ZENG Qingpeng, GUO Hangkai
Format: Article
Language: English
Published: POSTS&TELECOM PRESS Co., LTD 2025-04-01
Series: 网络与信息安全学报
Online Access: http://www.cjnis.com.cn/thesisDetails#10.11959/j.issn.2096-109x.2025019
Description
Summary: Standard intrusion detection data sets suffer from slow data updates, too few samples for certain intrusion categories, and an imbalance between normal and abnormal data. These issues were addressed through both data sample augmentation and detection model optimization. The intrusion sample data was converted into graph data to serve as input for Cycle-GAN, and a spatial attention mechanism was introduced into the Cycle-GAN generator. This approach preserved data traffic characteristics while extracting key feature information, and used unsupervised learning to optimize the distribution of the original data set. A classification network based on global attention and a residual structure was proposed, taking the preprocessed graph data as input. Channel attention and spatial attention were applied in series to obtain global attention, which was used to weight the input features, and the model then output the intrusion classification. Experiments on the CIC-IDS2017 and NSL-KDD data sets showed that, compared with similar models trained on the original data, the F1 score increased from 0.8532 to 0.9786 and the recall from 0.9148 to 0.9842 on CIC-IDS2017, and the F1 score increased from 0.6462 to 0.8443 and the recall from 0.726 to 0.8768 on NSL-KDD. These results indicate that the proposed method effectively mitigates slow data updates and insufficient samples for certain intrusion categories in intrusion detection.
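
The summary says that channel attention and spatial attention are serialized to obtain a global attention that weights the input features, but it gives no layer details. The following is a minimal NumPy sketch of one common way to chain the two, not the authors' implementation. The pooling choices, the two-layer channel MLP, the reduction ratio `r`, the kernel size `k`, the random weights, and all function names are assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x, w1, w2):
    """x: (C, H, W). Returns per-channel weights of shape (C, 1, 1)."""
    avg = x.mean(axis=(1, 2))  # average-pooled channel descriptor, shape (C,)
    mx = x.max(axis=(1, 2))    # max-pooled channel descriptor, shape (C,)

    def mlp(v):
        # Shared two-layer bottleneck with ReLU (assumed design).
        return w2 @ np.maximum(w1 @ v, 0.0)

    return sigmoid(mlp(avg) + mlp(mx))[:, None, None]

def spatial_attention(x, kernel):
    """x: (C, H, W); kernel: (2, k, k) with odd k. Returns weights (1, H, W)."""
    pooled = np.stack([x.mean(axis=0), x.max(axis=0)])  # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))
    h, w = x.shape[1:]
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            # Sliding-window weighted sum over both pooled maps.
            out[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return sigmoid(out)[None, :, :]

def global_attention(x, w1, w2, kernel):
    """Apply channel attention first, then spatial attention, in series."""
    x = x * channel_attention(x, w1, w2)
    return x * spatial_attention(x, kernel)

# Hypothetical sizes and random weights, purely for demonstration.
rng = np.random.default_rng(0)
C, H, W, r, k = 8, 6, 6, 2, 3
features = rng.normal(size=(C, H, W))
w1 = rng.normal(size=(C // r, C))
w2 = rng.normal(size=(C, C // r))
kernel = rng.normal(size=(2, k, k))
weighted = global_attention(features, w1, w2, kernel)
```

Because both attention maps pass through a sigmoid, every feature is scaled by a factor between 0 and 1, so the output keeps the input's shape and never exceeds it in magnitude. In a trained network the weights would be learned, and the residual structure mentioned in the summary would add the block's input back to its output.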
ISSN: 2096-109X