Cost-Sensitive Feature Selection of Numeric Data with Measurement Errors

Bibliographic Details
Main Authors: Hong Zhao, Fan Min, William Zhu
Format: Article
Language: English
Published: Wiley 2013-01-01
Series: Journal of Applied Mathematics
Online Access: http://dx.doi.org/10.1155/2013/754698
Collection: DOAJ
Description: Feature selection is an essential process in data mining applications since it reduces a model’s complexity. However, feature selection with various types of costs is still a new research topic. In this paper, we study the cost-sensitive feature selection problem of numeric data with measurement errors. The major contributions of this paper are fourfold. First, a new data model is built to address test costs and misclassification costs as well as error boundaries. It differs from existing models mainly in its error boundaries. Second, a covering-based rough set model with normally distributed measurement errors is constructed. With this model, coverings are constructed from data rather than assigned by users. Third, a new cost-sensitive feature selection problem is defined on this model. It is more realistic than existing feature selection problems. Fourth, both backtracking and heuristic algorithms are proposed to deal with the new problem. Experimental results show the efficiency of the pruning techniques for the backtracking algorithm and the effectiveness of the heuristic algorithm. This study is a step toward realistic applications of cost-sensitive learning.
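The ideas named in the description (test costs, misclassification costs, error boundaries, and a backtracking search with pruning) can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the function names, the rule that two values are indistinguishable when they differ by at most twice the error bound, the cost-minimising class assignment, and the sample data are all assumptions made for illustration.

```python
def neighborhood(data, i, features, err):
    """Objects that cannot be told apart from object i on the chosen features.

    Assumption: two readings may come from the same true value when they
    differ by at most twice the per-feature error bound err[a].
    """
    return [j for j in range(len(data))
            if all(abs(data[i][a] - data[j][a]) <= 2 * err[a] for a in features)]


def cost(mc, true_label, predicted):
    """Misclassification cost; a correct prediction costs nothing."""
    return 0.0 if true_label == predicted else mc[(true_label, predicted)]


def avg_misclassification_cost(data, labels, features, err, mc):
    """Average cost when each object gets the cheapest class for its neighborhood."""
    classes = sorted(set(labels))
    total = 0.0
    for i in range(len(data)):
        block = neighborhood(data, i, features, err)
        predicted = min(classes,
                        key=lambda c: sum(cost(mc, labels[j], c) for j in block))
        total += cost(mc, labels[i], predicted)
    return total / len(data)


def select_features(data, labels, tc, err, mc):
    """Backtracking search for the subset minimising test cost + average misclassification cost."""
    best = {"features": (),
            "cost": avg_misclassification_cost(data, labels, (), err, mc)}

    def search(start, chosen):
        for a in range(start, len(tc)):
            candidate = chosen + (a,)
            spent = sum(tc[f] for f in candidate)
            # Prune: costs are non-negative, so no superset of `candidate`
            # can have a total cost below `spent`.
            if spent >= best["cost"]:
                continue
            total = spent + avg_misclassification_cost(data, labels, candidate, err, mc)
            if total < best["cost"]:
                best["features"], best["cost"] = candidate, total
            search(a + 1, candidate)

    search(0, ())
    return best["features"], best["cost"]


# Hypothetical data: feature 0 is cheap and separates the classes; feature 1 does not.
data = [(1.0, 5.0), (1.1, 9.0), (3.0, 5.1), (3.1, 9.1)]
labels = ["a", "a", "b", "b"]
tc = [1.0, 4.0]                                  # test cost per feature
err = [0.1, 0.1]                                 # error bound per feature
mc = {("a", "b"): 50.0, ("b", "a"): 50.0}        # (true, predicted) -> cost

print(select_features(data, labels, tc, err, mc))
```

In this toy setting the pruning step skips every subset whose test cost alone already reaches the best total cost found so far, which is the kind of saving a backtracking search relies on.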
Institution: Kabale University
ISSN: 1110-757X, 1687-0042
Author affiliations: Hong Zhao, Fan Min, and William Zhu are all at the Laboratory of Granular Computing, Zhangzhou Normal University, Zhangzhou 363000, China.