Cross-project software defect prediction based on the reduction and hybridization of software metrics

Cross-project defect prediction (CPDP) plays an essential role in identifying potential defects in target projects, especially those with limited historical data, using relevant information from similar source projects. Current studies focus on three main types of software metrics for CPDP: st...

Bibliographic Details
Main Authors: Ahmed Abdu, Zhengjun Zhai, Hakim A. Abdo, Sungon Lee, Mohammed A. Al-masni, Yeong Hyeon Gu, Redhwan Algabri
Format: Article
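Language: English
Published: Elsevier 2025-01-01
Series: Alexandria Engineering Journal
Subjects: Cross-project defect prediction; Software metrics; Features reduction; Principal component analysis; Deep neural network
Online Access: http://www.sciencedirect.com/science/article/pii/S111001682401189X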
author Ahmed Abdu
Zhengjun Zhai
Hakim A. Abdo
Sungon Lee
Mohammed A. Al-masni
Yeong Hyeon Gu
Redhwan Algabri
collection DOAJ
description Cross-project defect prediction (CPDP) plays an essential role in identifying potential defects in target projects, especially those with limited historical data, by using relevant information from similar source projects. Current studies focus on three main types of software metrics for CPDP: static metrics, code-change metrics, and semantic features. However, existing CPDP studies encounter two primary challenges: class overlap due to reduced feature dimensions, and multicollinearity from integrating various software metrics. To address these challenges, we propose a CPDP model based on both reduction and hybridization techniques (RH-CPDP). The proposed model uses hybrid deep neural networks as the hybridization technique, combining the essential metrics from all metric categories to address class overlap and improve prediction efficiency. Principal component analysis (PCA) is used as the reduction method to keep the number of metrics small, focusing on influential relationships between metrics and fault proneness and avoiding multicollinearity. Experiments on nine open-source projects from the PROMISE dataset show that RH-CPDP outperforms current CPDP methods (TCSBoost, TPTL, DA-KTSVMO, DBN, and 3SW-MSTL) in terms of area under the curve (AUC) and F1-measure. These findings highlight the effectiveness of RH-CPDP in improving CPDP performance.
format Article
id doaj-art-d0094f25b95d4311ba5c955aeec4c762
institution Kabale University
issn 1110-0168
language English
publishDate 2025-01-01
publisher Elsevier
record_format Article
series Alexandria Engineering Journal
spelling doaj-art-d0094f25b95d4311ba5c955aeec4c762; 2025-01-29T05:00:04Z; eng; Elsevier; Alexandria Engineering Journal; 1110-0168; 2025-01-01; vol. 112, pp. 161-176
Cross-project software defect prediction based on the reduction and hybridization of software metrics
Ahmed Abdu (0): School of Software, Northwestern Polytechnical University, Xi’an, 710072, China
Zhengjun Zhai (1): School of Software, Northwestern Polytechnical University, Xi’an, 710072, China; School of Computer Science, Northwestern Polytechnical University, Xi’an, 710072, China; corresponding author
Hakim A. Abdo (2): Dr. Babasaheb Ambedkar Marathwada University, Aurangabad, India
Sungon Lee (3): Department of Robotics, Hanyang University, Ansan 15588, Republic of Korea
Mohammed A. Al-masni (4): Department of Artificial Intelligence and Data Science, College of AI Convergence, Sejong University, Seoul 05006, Republic of Korea
Yeong Hyeon Gu (5): Department of Artificial Intelligence and Data Science, College of AI Convergence, Sejong University, Seoul 05006, Republic of Korea; corresponding author
Redhwan Algabri (6): Research Institute of Engineering and Technology, Hanyang University, Ansan 15588, Republic of Korea; corresponding author
title Cross-project software defect prediction based on the reduction and hybridization of software metrics
topic Cross-project defect prediction
Software metrics
Features reduction
Principal component analysis
Deep neural network
url http://www.sciencedirect.com/science/article/pii/S111001682401189X