Revisiting ‘revisiting supervised methods for effort‐aware cross‐project defect prediction’

Abstract: Effort‐aware cross‐project defect prediction (EACPDP), which uses cross‐project software modules to build a model that ranks within‐project software modules by defect density, has been suggested as a way to allocate limited testing resources efficiently. Recently, Ni et al. proposed an EACPDP...

Bibliographic Details
Main Authors:	Fuyang Li, Peixin Yang, Jacky Wai Keung, Wenhua Hu, Haoyu Luo, Xiao Yu
Format:	Article
Language:	English
Published:	Wiley 2023-08-01
Series:	IET Software
Subjects:	data mining; quality assurance; software engineering; software maintenance; software metrics; software quality
Online Access:	https://doi.org/10.1049/sfw2.12133

_version_	1832547386795229184
author	Fuyang Li Peixin Yang Jacky Wai Keung Wenhua Hu Haoyu Luo Xiao Yu
collection	DOAJ
description	Abstract: Effort‐aware cross‐project defect prediction (EACPDP), which uses cross‐project software modules to build a model that ranks within‐project software modules by defect density, has been suggested as a way to allocate limited testing resources efficiently. Recently, Ni et al. proposed an EACPDP method called EASC, which trains a model on all cross‐project modules without considering the difference in data distribution between cross‐project and within‐project data. In addition, Ni et al. used different defect density calculation strategies when comparing EASC with the baseline methods. To identify effective defect density calculation strategies and methods for EACPDP, the authors compare four data filtering methods and five transfer learning methods with EASC under four commonly used defect density calculation strategies. The authors use three classification evaluation metrics and seven effort‐aware metrics to assess the methods comprehensively on 11 PROMISE datasets. The results show that (1) the classification before sorting (CBS+) defect density calculation strategy achieves the best overall performance; (2) using balanced distribution adaptation (BDA) and joint distribution adaptation (JDA) with the K‐nearest neighbour classifier to build the EACPDP model finds 15% and 14.3% more defective modules and 11.6% and 8.9% more defects, respectively, while keeping initial false alarms (IFA) at an acceptable level; (3) better overall classification performance brings better EACPDP performance to some extent; and (4) flexibly adjusting the defect threshold λ of the CBS+ strategy serves different goals. In summary, the authors recommend that researchers and practitioners use BDA and JDA with the CBS+ strategy to build the EACPDP model.
format	Article
id	doaj-art-bd87505daf4a402d8c6b7dbd54b7a924
institution	Kabale University
issn	1751-8806 1751-8814
language	English
publishDate	2023-08-01
publisher	Wiley
record_format	Article
series	IET Software
spelling	doaj-art-bd87505daf4a402d8c6b7dbd54b7a924; 2025-02-03T06:45:11Z; eng; Wiley; IET Software; 1751-8806; 1751-8814; 2023-08-01; 17; 4; 472; 495; 10.1049/sfw2.12133; Revisiting ‘revisiting supervised methods for effort‐aware cross‐project defect prediction’; Fuyang Li (School of Computer Science and Artificial Intelligence, Wuhan University of Technology, Wuhan, China); Peixin Yang (School of Computer Science and Artificial Intelligence, Wuhan University of Technology, Wuhan, China); Jacky Wai Keung (Department of Computer Science, City University of Hong Kong, Hong Kong, China); Wenhua Hu (School of Computer Science and Artificial Intelligence, Wuhan University of Technology, Wuhan, China); Haoyu Luo (College of Mathematics and Informatics, South China Agricultural University, Guangzhou, China); Xiao Yu (School of Computer Science and Artificial Intelligence, Wuhan University of Technology, Wuhan, China)
title	Revisiting ‘revisiting supervised methods for effort‐aware cross‐project defect prediction’
topic	data mining quality assurance software engineering software maintenance software metrics software quality
url	https://doi.org/10.1049/sfw2.12133


Similar Items