IAPCP: An Effective Cross-Project Defect Prediction Model via Intra-Domain Alignment and Programming-Based Distribution Adaptation

Cross-project defect prediction (CPDP) aims to identify defect-prone software instances in one project (target) using historical data collected from other software projects (source), which can help maintainers allocate limited testing resources reasonably. Unfortunately, the feature distribution dis...

Bibliographic Details
Main Authors: Nana Zhang, Kun Zhu, Dandan Zhu
Format: Article
Language: English
Published: Wiley 2024-01-01
Series: IET Software
Online Access: http://dx.doi.org/10.1049/2024/5358773
author Nana Zhang
Kun Zhu
Dandan Zhu
collection DOAJ
description Cross-project defect prediction (CPDP) aims to identify defect-prone software instances in one project (the target) using historical data collected from other software projects (the source), which helps maintainers allocate limited testing resources sensibly. Unfortunately, the feature distribution discrepancy between the source and target projects makes it difficult to transfer a matching feature representation and severely hinders CPDP performance. In addition, existing CPDP models require an expensive and time-consuming process to tune many parameters. To address these limitations, this study proposes IAPCP, an effective CPDP model based on distribution adaptation that consists of two stages: correlation alignment and intra-domain programming. Correlation alignment first calculates the covariance matrices of the source and target projects, then whitens the source features (i.e., removes their correlations) and fills them with the covariance of the target project, thereby aligning the source and target feature distributions and reducing the distribution discrepancy across projects. Intra-domain programming directly learns a nonparametric linear transfer defect predictor with strong discriminative capacity by solving for a probabilistic annotation matrix (PAM) based on the adjusted features of the source project. The model requires neither model selection nor parameter tuning. Extensive experiments on 82 cross-project pairs from 16 software projects demonstrate that IAPCP achieves competitive CPDP effectiveness and efficiency compared with multiple state-of-the-art baseline models.
format Article
id doaj-art-14687fb101a44a17a6e7956d4a4443a4
institution Kabale University
issn 1751-8814
language English
publishDate 2024-01-01
publisher Wiley
record_format Article
series IET Software
spelling doaj-art-14687fb101a44a17a6e7956d4a4443a4; 2025-02-03T10:21:31Z; eng; Wiley; IET Software; 1751-8814; 2024-01-01; 2024; 10.1049/2024/5358773; IAPCP: An Effective Cross-Project Defect Prediction Model via Intra-Domain Alignment and Programming-Based Distribution Adaptation; Nana Zhang [0]; Kun Zhu [1]; Dandan Zhu [2]; School of Computer Science and Technology; Key Laboratory of Embedded System and Service Computing; Institute of AI Education, Shanghai; http://dx.doi.org/10.1049/2024/5358773
title IAPCP: An Effective Cross-Project Defect Prediction Model via Intra-Domain Alignment and Programming-Based Distribution Adaptation
url http://dx.doi.org/10.1049/2024/5358773