A Cost-Sensitive Ensemble Method for Class-Imbalanced Datasets

In imbalanced learning, resampling methods modify an imbalanced dataset to form a balanced one. Many base classifiers perform better on balanced datasets than on imbalanced ones. This paper proposes a cost-sensitive ensemble method based on cost-sensitive support vector machine (SVM),...

Bibliographic Details
Main Authors: Yong Zhang, Dapeng Wang
Format: Article
Language: English
Published: Wiley 2013-01-01
Series: Abstract and Applied Analysis
Online Access: http://dx.doi.org/10.1155/2013/196256
Collection: DOAJ
Description: In imbalanced learning, resampling methods modify an imbalanced dataset to form a balanced one. Many base classifiers perform better on balanced datasets than on imbalanced ones. This paper proposes a cost-sensitive ensemble method, based on a cost-sensitive support vector machine (SVM) and query-by-committee (QBC), for imbalanced data classification. The proposed method first divides the majority-class dataset into several sub-datasets according to the proportion of imbalanced samples and trains subclassifiers using the AdaBoost method. It then generates candidate training samples with the QBC active learning method and uses the cost-sensitive SVM to learn from these samples. Experimental results on 5 class-imbalanced datasets show that the proposed method achieves higher area under the ROC curve (AUC), F-measure, and G-mean than many existing class-imbalanced learning methods.
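As a rough illustration of two ideas named in the description, the sketch below splits a majority class into minority-sized subsets and computes the F-measure and G-mean from confusion-matrix counts. It is not the authors' implementation: the record does not specify how the split is made, so the random shuffle, the rounding of the imbalance ratio, and the names `split_majority`, `f_measure`, and `g_mean` are all assumptions. The F-measure is taken with equal weight on precision and recall, and the G-mean as the geometric mean of sensitivity and specificity, which are the usual definitions.

```python
import math
import random


def split_majority(majority, n_minority, seed=0):
    """Split majority-class samples into roughly minority-sized subsets.

    Assumption: the number of subsets is the rounded imbalance ratio
    (majority size / minority size), and samples are assigned at random.
    """
    rng = random.Random(seed)
    shuffled = list(majority)
    rng.shuffle(shuffled)
    k = max(1, round(len(shuffled) / n_minority))
    # Striding by k gives k disjoint subsets of near-equal size.
    return [shuffled[i::k] for i in range(k)]


def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall for the positive (minority) class."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def g_mean(tp, fp, tn, fn):
    """Geometric mean of sensitivity (minority recall) and specificity."""
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sensitivity * specificity)


# Hypothetical toy data: 90 majority samples and 10 minority samples.
majority = list(range(90))
minority = list(range(90, 100))

subsets = split_majority(majority, len(minority))
# Each balanced training set pairs one majority subset with all minority samples.
balanced_sets = [chunk + minority for chunk in subsets]
```

Each entry of `balanced_sets` could then be used to train one subclassifier. On the metrics: a classifier that labels every sample as majority is 90% accurate on this toy data, yet `g_mean(0, 0, 90, 10)` is 0, which is why G-mean and F-measure are reported alongside AUC for imbalanced problems.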
Record ID: doaj-art-b912f077a2bc48ec8d723fd692eeec9a
Institution: Kabale University
ISSN: 1085-3375, 1687-0409
Author Affiliation (Yong Zhang, Dapeng Wang): School of Computer and Information Technology, Liaoning Normal University, No. 1, Liushu South Street, Ganjingzi, Dalian, Liaoning 116081, China