Surprise Bug Report Prediction Utilizing Optimized Integration with Imbalanced Learning Strategy

In software projects, a large number of bugs are usually reported to bug repositories. Because budget and workforce are limited, developers often lack the time and capacity to inspect every reported bug, and thus they often focus on inspecting and repairing high-impact bugs....

Bibliographic Details
Main Authors: Hui Li, Yang Qu, Shikai Guo, Guofeng Gao, Rong Chen, Guo Chen
Format: Article
Language: English
Published: Wiley 2020-01-01
Series: Complexity
Online Access: http://dx.doi.org/10.1155/2020/8509821
author Hui Li
Yang Qu
Shikai Guo
Guofeng Gao
Rong Chen
Guo Chen
collection DOAJ
description In software projects, a large number of bugs are usually reported to bug repositories. Because budget and workforce are limited, developers often lack the time and capacity to inspect every reported bug, and thus they often focus on inspecting and repairing high-impact bugs. Among the high-impact bugs, surprise bugs are reported to be a fatal threat to software systems, though they account for only a small proportion. Therefore, identifying surprise bugs is an important task in practice. In recent years, researchers have proposed several methods to identify surprise bugs. Unfortunately, the performance of these methods is still not satisfactory for software projects. The main reason is that surprise bugs make up only a small percentage of all bugs, so they are difficult to identify from the imbalanced distribution. To overcome the imbalanced category distribution of the bugs, this paper presents a machine-learning-based method to predict surprise bugs. The method takes into account the textual features of the bug reports and employs an imbalanced learning strategy to balance the bug report datasets. The balanced datasets are then used to train three classifiers, each built with a different classification algorithm, which predict the type of reports in datasets whose type is unknown. In particular, an ensemble method named optimization integration is proposed to generate a single best result from the results produced by the three classifiers. This ensemble method can adjust the classifiers' ability to detect different categories based on the characteristics of different projects and integrates the advantages of the three classifiers. Experiments on datasets from 4 software projects show that this method performs better than previous methods at detecting surprise bugs.
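The description above outlines a pipeline: balance the training data, obtain predictions from three classifiers, then combine them with tuned weights. The following is only a minimal sketch of that general idea, not the authors' implementation. The abstract does not name the resampling technique, the classifiers, or the optimization objective, so random oversampling, weighted probability averaging, grid search, and the F1 score are all assumptions chosen for illustration, and every function name here is hypothetical.

```python
import itertools
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples until every class matches the largest one.
    (Assumed balancing strategy; the abstract does not say which one is used.)"""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(samples), list(labels)
    for cls, n in counts.items():
        pool = [x for x, y in zip(samples, labels) if y == cls]
        for _ in range(target - n):
            out_x.append(rng.choice(pool))
            out_y.append(cls)
    return out_x, out_y


def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive (surprise) class; returns 0.0 when there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def combine(prob_rows, weights, threshold=0.5):
    """Weighted average of per-classifier surprise probabilities, then threshold.
    Each row holds one probability per classifier for a single bug report."""
    total = sum(weights)
    return [
        1 if sum(w * p for w, p in zip(weights, row)) / total >= threshold else 0
        for row in prob_rows
    ]


def search_weights(prob_rows, y_true, steps=5):
    """Grid-search classifier weights that maximize F1 on a validation set.
    (Assumed stand-in for the 'optimization integration' step.)"""
    grid = [i / steps for i in range(steps + 1)]
    best_w, best_f1 = None, -1.0
    for w in itertools.product(grid, repeat=len(prob_rows[0])):
        if sum(w) == 0:
            continue  # an all-zero weighting cannot produce a prediction
        score = f1_score(y_true, combine(prob_rows, w))
        if score > best_f1:
            best_w, best_f1 = w, score
    return best_w, best_f1
```

In use, `random_oversample` would be applied to the training reports before fitting the three classifiers, and `search_weights` would be run per project on held-out predictions, which is one way the weighting could adapt to the characteristics of each project.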
format Article
id doaj-art-8d26cb8a1f0744e4b38a59d36026bdd7
institution Kabale University
issn 1076-2787
1099-0526
language English
publishDate 2020-01-01
publisher Wiley
record_format Article
series Complexity
spelling doaj-art-8d26cb8a1f0744e4b38a59d36026bdd7
2025-02-03T01:04:14Z
eng
Wiley
Complexity
1076-2787
1099-0526
2020-01-01
2020
10.1155/2020/8509821
8509821
Surprise Bug Report Prediction Utilizing Optimized Integration with Imbalanced Learning Strategy
Hui Li (0)
Yang Qu (1)
Shikai Guo (2)
Guofeng Gao (3)
Rong Chen (4)
Guo Chen (5)
(0-4) Information Science and Technology College, Dalian Maritime University, Dalian 116026, China
(5) Marine Electrical Engineering College, Dalian Maritime University, Dalian 116026, China
http://dx.doi.org/10.1155/2020/8509821
title Surprise Bug Report Prediction Utilizing Optimized Integration with Imbalanced Learning Strategy
url http://dx.doi.org/10.1155/2020/8509821