Malware recognition approach based on self‐similarity and an improved clustering algorithm

Bibliographic Details
Main Authors:	Jinfu Chen, Chi Zhang, Saihua Cai, Zufa Zhang, Lu Liu, Longxia Huang
Format:	Article
Language:	English
Published:	Wiley 2022-10-01
Series:	IET Software
Online Access:	https://doi.org/10.1049/sfw2.12067

collection	DOAJ
description	The recognition of malware in network traffic is an important research problem. However, existing solutions rely heavily on source code and, in some cases, misrecognise vulnerabilities (i.e. incur a high false positive rate (FPR)). In this paper, we first use the K‐means clustering algorithm to extract malware patterns under user‐to‐root attacks in network traffic. Because the traditional K‐means algorithm requires the number of clusters to be fixed in advance and is easily affected by the initial cluster centres, we propose an improved K‐means clustering algorithm (the NIKClustering algorithm) for cluster analysis. Furthermore, we propose using self‐similarity together with our improved clustering algorithm to recognise buffer overflow vulnerabilities for malware in network traffic. On this basis, we design and implement a recognition approach for buffer overflow vulnerabilities, called Reliable Self‐Similarity with Improved K‐means Clustering (RSS‐IKClustering). Extensive experiments on two different datasets demonstrate that RSS‐IKClustering produces far fewer false positives than other notable approaches while increasing accuracy. We further apply RSS‐IKClustering to a public dataset (Center for Applied Internet Data Analysis), where it also achieves a high accuracy of 96% and a low FPR of 1.5%.
id	doaj-art-06bb424d6c03469493ff71e7c550c615
institution	Kabale University
issn	1751-8806 1751-8814
affiliations	Jinfu Chen, Chi Zhang, Saihua Cai, Zufa Zhang, Longxia Huang: School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, China. Lu Liu: School of Computing and Mathematical Sciences, University of Leicester, Leicester, UK.
citation	IET Software, vol. 16, no. 5, pp. 527-541, 2022-10-01. https://doi.org/10.1049/sfw2.12067

Similar Items