Cross-project software defect prediction based on the reduction and hybridization of software metrics
Cross-project defect prediction (CPDP) plays an essential role in identifying potential defects in target projects, especially those with limited historical data, using relevant information from similar source projects. Current studies focus on three main types of software metrics for CPDP: st...
Main Authors: | Ahmed Abdu, Zhengjun Zhai, Hakim A. Abdo, Sungon Lee, Mohammed A. Al-masni, Yeong Hyeon Gu, Redhwan Algabri |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier, 2025-01-01 |
Series: | Alexandria Engineering Journal |
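Subjects: | Cross-project defect prediction; Software metrics; Features reduction; Principal component analysis; Deep neural network |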
Online Access: | http://www.sciencedirect.com/science/article/pii/S111001682401189X |
_version_ | 1832583120462807040 |
---|---|
author | Ahmed Abdu; Zhengjun Zhai; Hakim A. Abdo; Sungon Lee; Mohammed A. Al-masni; Yeong Hyeon Gu; Redhwan Algabri |
author_facet | Ahmed Abdu; Zhengjun Zhai; Hakim A. Abdo; Sungon Lee; Mohammed A. Al-masni; Yeong Hyeon Gu; Redhwan Algabri |
author_sort | Ahmed Abdu |
collection | DOAJ |
description | Cross-project defect prediction (CPDP) plays an essential role in identifying potential defects in target projects, especially those with limited historical data, using relevant information from similar source projects. Current studies focus on three main types of software metrics for CPDP: static metrics, code-change metrics, and semantic features. However, these studies encounter two primary challenges: class overlap due to reduced feature dimensions and multicollinearity from integrating various software metrics. To address these challenges, we propose a CPDP model based on both reduction and hybridization techniques (RH-CPDP). The proposed model uses hybrid deep neural networks as a hybridization technique to combine the essential metrics from all metric categories, addressing the issue of class overlap to enhance prediction model efficiency. Principal component analysis (PCA) is used as a reduction method to keep the number of metrics small, focusing on influential relationships between metrics and fault proneness and avoiding the multicollinearity problem. An experimental analysis of nine open-source projects from the PROMISE dataset demonstrates that RH-CPDP surpasses current CPDP methods (TCSBoost, TPTL, DA-KTSVMO, DBN, and 3SW-MSTL) in terms of area under the curve (AUC) and F1-measure. These findings highlight the effectiveness of RH-CPDP in improving the performance of CPDP techniques. |
format | Article |
id | doaj-art-d0094f25b95d4311ba5c955aeec4c762 |
institution | Kabale University |
issn | 1110-0168 |
language | English |
publishDate | 2025-01-01 |
publisher | Elsevier |
record_format | Article |
series | Alexandria Engineering Journal |
spelling | doaj-art-d0094f25b95d4311ba5c955aeec4c762; 2025-01-29T05:00:04Z; eng; Elsevier; Alexandria Engineering Journal; 1110-0168; 2025-01-01; 112; 161; 176; Cross-project software defect prediction based on the reduction and hybridization of software metrics; Ahmed Abdu (0), Zhengjun Zhai (1), Hakim A. Abdo (2), Sungon Lee (3), Mohammed A. Al-masni (4), Yeong Hyeon Gu (5), Redhwan Algabri (6); (0) School of Software, Northwestern Polytechnical University, Xi’an, 710072, China; (1) School of Software, Northwestern Polytechnical University, Xi’an, 710072, China, and School of Computer Science, Northwestern Polytechnical University, Xi’an, 710072, China, corresponding author; (2) Dr. Babasaheb Ambedkar Marathwada University, Aurangabad, India; (3) Department of Robotics, Hanyang University, Ansan 15588, Republic of Korea; (4) Department of Artificial Intelligence and Data Science, College of AI Convergence, Sejong University, Seoul 05006, Republic of Korea; (5) Department of Artificial Intelligence and Data Science, College of AI Convergence, Sejong University, Seoul 05006, Republic of Korea, corresponding author; (6) Research Institute of Engineering and Technology, Hanyang University, Ansan 15588, Republic of Korea, corresponding author; Cross-project defect prediction (CPDP) plays an essential role in identifying potential defects in target projects, especially those with limited historical data, using relevant information from similar source projects. Current studies focus on three main types of software metrics for CPDP: static metrics, code-change metrics, and semantic features. However, these studies encounter two primary challenges: class overlap due to reduced feature dimensions and multicollinearity from integrating various software metrics. To address these challenges, we propose a CPDP model based on both reduction and hybridization techniques (RH-CPDP). The proposed model uses hybrid deep neural networks as a hybridization technique to combine the essential metrics from all metric categories, addressing the issue of class overlap to enhance prediction model efficiency. Principal component analysis (PCA) is used as a reduction method to keep the number of metrics small, focusing on influential relationships between metrics and fault proneness and avoiding the multicollinearity problem. An experimental analysis of nine open-source projects from the PROMISE dataset demonstrates that RH-CPDP surpasses current CPDP methods (TCSBoost, TPTL, DA-KTSVMO, DBN, and 3SW-MSTL) in terms of area under the curve (AUC) and F1-measure. These findings highlight the effectiveness of RH-CPDP in improving the performance of CPDP techniques.; http://www.sciencedirect.com/science/article/pii/S111001682401189X; Cross-project defect prediction; Software metrics; Features reduction; Principal component analysis; Deep neural network |
title | Cross-project software defect prediction based on the reduction and hybridization of software metrics |
topic | Cross-project defect prediction; Software metrics; Features reduction; Principal component analysis; Deep neural network |
url | http://www.sciencedirect.com/science/article/pii/S111001682401189X |
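
The description in this record credits PCA with avoiding multicollinearity among software metrics. The sketch below illustrates that general idea only and is not the authors' RH-CPDP implementation: it assumes two hypothetical, strongly correlated static metrics (`loc` and `wmc`, both synthetic) and uses the closed-form PCA of two standardized variables, whose correlation matrix [[1, r], [r, 1]] has eigenvectors (1, 1)/sqrt(2) and (1, -1)/sqrt(2) with eigenvalues 1 + r and 1 - r.

```python
import math
import random


def standardize(values):
    """Rescale a sequence to zero mean and unit (population) variance."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]


def correlation(a, b):
    """Pearson correlation of two equally long sequences."""
    za, zb = standardize(a), standardize(b)
    return sum(x * y for x, y in zip(za, zb)) / len(za)


def pca_two_metrics(a, b):
    """Closed-form PCA of two standardized metrics.

    Returns the two principal-component score lists, ordered by
    explained variance, plus the explained-variance ratios.
    """
    za, zb = standardize(a), standardize(b)
    r = sum(x * y for x, y in zip(za, zb)) / len(za)
    s = math.sqrt(2)
    pc_sum = [(x + y) / s for x, y in zip(za, zb)]   # eigenvalue 1 + r
    pc_diff = [(x - y) / s for x, y in zip(za, zb)]  # eigenvalue 1 - r
    if r >= 0:
        return pc_sum, pc_diff, [(1 + r) / 2, (1 - r) / 2]
    return pc_diff, pc_sum, [(1 - r) / 2, (1 + r) / 2]


# Synthetic data (an assumption for illustration, not from the PROMISE dataset).
random.seed(0)
loc = [random.gauss(0, 1) for _ in range(200)]        # hypothetical size metric
wmc = [0.9 * x + 0.1 * random.gauss(0, 1) for x in loc]  # nearly collinear with loc

pc1, pc2, explained = pca_two_metrics(loc, wmc)
corr_before = correlation(loc, wmc)   # close to 1: the metrics are redundant
corr_after = correlation(pc1, pc2)    # numerically zero: components are uncorrelated
```

Because the first component carries almost all of the variance here, a downstream classifier could keep `pc1` alone and drop the redundant dimension. With more than two metrics, the same steps would use a general eigendecomposition of the covariance matrix rather than this closed form.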