Windows Malware Detection via Enhanced Graph Representations with Node2Vec and Graph Attention Network

Bibliographic Details
Main Authors: Nisa Vuran Sarı, Mehmet Acı, Çiğdem İnan Acı
Format: Article
Language: English
Published: MDPI AG 2025-04-01
Series: Applied Sciences
Online Access: https://www.mdpi.com/2076-3417/15/9/4775
Description
Summary: As malware grows more complex, existing detection methods struggle to capture the detailed, contextual relationships in modern software behavior, which poses significant challenges for cybersecurity. Developing detection frameworks that can analyze and interpret these complex patterns has therefore become critical. This work presents a novel framework that integrates API call sequences and DLL information into a unified, graph-based representation for comprehensive analysis of malware behavior. The proposed model generates initial embeddings with Node2Vec, which uses random walks to capture structural relationships between nodes. A Graph Attention Network (GAT) then enhances these initial embeddings, using attention mechanisms to incorporate contextual dependencies and enrich the semantic representations. Finally, the enhanced embeddings are classified by CNN-GRU-3, a custom hybrid deep learning model that combines a Convolutional Neural Network (CNN) with Gated Recurrent Units (GRUs) and is capable of effectively modeling sequential patterns. The dual role of GAT as a classifier and as a feature extractor is also analyzed to evaluate its impact on embedding quality and classification accuracy. Experimental results show that the proposed model achieves an accuracy of 0.9961, outperforming state-of-the-art approaches such as ensemble learning and standalone GAT. This result highlights the framework's ability to exploit contextual information for malware detection. The real-world dataset used provides a benchmark for future work, and this research lays a comprehensive foundation for advancing graph-based malware analysis.
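The Node2Vec step named in the summary can be illustrated with a minimal sketch. The summary does not say how the API/DLL graph is built, so the construction here is an assumption: consecutive API calls are linked, and each API is linked to the DLL that exports it. The function names, the sample trace, and the p and q values are hypothetical and are not taken from the article. Only the biased second-order random walk itself follows the general Node2Vec idea; in a full pipeline the walks would be passed to a skip-gram model to produce the initial embeddings.

```python
import random


def build_graph(samples):
    """Build an undirected adjacency map from (dll, api) call traces.

    Assumed construction (not from the article): each API node is linked
    to its DLL node, and consecutive API calls in a trace are linked.
    """
    adj = {}

    def link(u, v):
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    for calls in samples:
        for dll, api in calls:
            link(dll, api)
        for (_, a), (_, b) in zip(calls, calls[1:]):
            if a != b:  # skip self-loops from repeated calls
                link(a, b)
    return adj


def node2vec_walk(adj, start, length, p=1.0, q=1.0, rng=random):
    """Generate one biased second-order random walk of up to `length` nodes."""
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        nbrs = sorted(adj[cur])
        if not nbrs:
            break
        if len(walk) == 1:
            # First step has no previous node, so sample uniformly.
            walk.append(rng.choice(nbrs))
            continue
        prev = walk[-2]
        weights = []
        for n in nbrs:
            if n == prev:
                weights.append(1.0 / p)  # step back to the previous node
            elif n in adj[prev]:
                weights.append(1.0)      # stay close to the previous node
            else:
                weights.append(1.0 / q)  # move further away
        walk.append(rng.choices(nbrs, weights=weights, k=1)[0])
    return walk


# Hypothetical trace of (DLL, API) pairs, for illustration only.
SAMPLES = [[
    ("kernel32.dll", "CreateFileW"),
    ("kernel32.dll", "WriteFile"),
    ("advapi32.dll", "RegSetValueExW"),
]]

adj = build_graph(SAMPLES)
walk = node2vec_walk(adj, "CreateFileW", 6, p=0.5, q=2.0, rng=random.Random(0))
print(walk)
```

A low p makes the walk more likely to revisit the previous node, while a high q keeps it near its starting neighborhood; the values used above are arbitrary.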
ISSN: 2076-3417