Interpretable Detection of Malicious Behavior in Windows Portable Executables Using Multi-Head 2D Transformers
Windows malware is becoming an increasingly pressing problem as the amount of malware continues to grow and more sensitive information is stored on systems. One of the major challenges in tackling this problem is the complexity of malware analysis, which requires expertise from human analysts. Recen...
Main Authors: | Sohail Khan, Mohammad Nauman |
---|---|
Format: | Article |
Language: | English |
Published: | Tsinghua University Press, 2024-06-01 |
Series: | Big Data Mining and Analytics |
Subjects: | malware; Windows portable executable (PE); machine learning; vision transformers |
Online Access: | https://www.sciopen.com/article/10.26599/BDMA.2023.9020025 |
_version_ | 1832544896000458752 |
---|---|
author | Sohail Khan; Mohammad Nauman |
author_facet | Sohail Khan; Mohammad Nauman |
author_sort | Sohail Khan |
collection | DOAJ |
description | Windows malware is becoming an increasingly pressing problem as the volume of malware continues to grow and more sensitive information is stored on systems. One of the major challenges in tackling this problem is the complexity of malware analysis, which requires expertise from human analysts. Recent developments in machine learning have led to the creation of deep models for malware detection. However, these models often lack transparency, making it difficult to understand the reasoning behind the model’s decisions, a limitation known as the black-box problem. To address this limitation, this paper presents a novel model for malware detection that uses vision transformers to analyze the Operation Code (OpCode) sequences of more than 350,000 Windows portable executable malware samples from real-world datasets. The model achieves an accuracy of 0.9864, surpassing previous results while also providing insight into the reasoning behind each classification. The model is able to pinpoint specific instructions that lead to malicious behavior in malware samples, aiding human experts in their analysis and driving further advancements in the field. We report our findings and show how causality can be established between malicious code and the classification produced by a deep learning model, thus opening up this black-box problem for deeper analysis. |
format | Article |
id | doaj-art-73abe221c8bf4929a4e5b86cf8f4127c |
institution | Kabale University |
issn | 2096-0654 |
language | English |
publishDate | 2024-06-01 |
publisher | Tsinghua University Press |
record_format | Article |
series | Big Data Mining and Analytics |
spelling | doaj-art-73abe221c8bf4929a4e5b86cf8f4127c; 2025-02-03T09:08:16Z; eng; Tsinghua University Press; Big Data Mining and Analytics; 2096-0654; 2024-06-01; 7(2), 485-499; 10.26599/BDMA.2023.9020025; Interpretable Detection of Malicious Behavior in Windows Portable Executables Using Multi-Head 2D Transformers; Sohail Khan and Mohammad Nauman, both of the Computer Science Department, Effat College of Engineering, Effat University, Jeddah 23341, Kingdom of Saudi Arabia; https://www.sciopen.com/article/10.26599/BDMA.2023.9020025; malware; Windows portable executable (PE); machine learning; vision transformers |
spellingShingle | Sohail Khan; Mohammad Nauman; Interpretable Detection of Malicious Behavior in Windows Portable Executables Using Multi-Head 2D Transformers; Big Data Mining and Analytics; malware; Windows portable executable (PE); machine learning; vision transformers |
title | Interpretable Detection of Malicious Behavior in Windows Portable Executables Using Multi-Head 2D Transformers |
title_sort | interpretable detection of malicious behavior in windows portable executables using multi head 2d transformers |
topic | malware; Windows portable executable (PE); machine learning; vision transformers |
url | https://www.sciopen.com/article/10.26599/BDMA.2023.9020025 |
work_keys_str_mv | AT sohailkhan interpretabledetectionofmaliciousbehaviorinwindowsportableexecutablesusingmultihead2dtransformers AT mohammadnauman interpretabledetectionofmaliciousbehaviorinwindowsportableexecutablesusingmultihead2dtransformers |