A Novel Feature Selection Method for Classification of Medical Data Using Filters, Wrappers, and Embedded Approaches

Feature selection is the process of identifying the most relevant features from the given data having a large feature space. Microarray datasets are comprised of high-quality features and very few samples of data. Feature selection is performed on such datasets to identify the optimal feature subset...

Bibliographic Details
Main Authors: Saba Bashir, Irfan Ullah Khattak, Aihab Khan, Farhan Hassan Khan, Abdullah Gani, Muhammad Shiraz
Format: Article
Language: English
Published: Wiley 2022-01-01
Series: Complexity
Online Access: http://dx.doi.org/10.1155/2022/8190814
collection DOAJ
description Feature selection is the process of identifying the most relevant features in data with a large feature space. Microarray datasets comprise high-quality features and very few samples. Feature selection is performed on such datasets to identify the optimal feature subset. The major goal of feature selection is to improve accuracy by identifying a minimal feature subset. To this end, this research focuses on analyzing and identifying effective feature selection algorithms. A novel framework is proposed that combines feature selection methods from the filter, wrapper, and embedded families. Classification is then performed on the selected features using a support vector machine (SVM) classifier. Two publicly available benchmark datasets, the Microarray dataset and the Cleveland Heart Disease dataset, both obtained from the UCI data repository, are used for experimentation and analysis. The performance of the SVM is analyzed using accuracy, sensitivity, specificity, and f-measure. Accuracies of 94.45% and 91% are achieved on the two datasets, respectively.
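As a rough illustration of two ideas named in the description, the sketch below shows a filter-style feature ranking and the four evaluation metrics computed from binary predictions. It is not the authors' framework: the scoring function (absolute Pearson correlation with the label), the function names `filter_select` and `binary_metrics`, and the toy data are all assumptions made for this example, and the wrapper, embedded, and SVM steps are left out.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    # A constant column carries no signal; score it as 0 instead of dividing by 0.
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def filter_select(rows, labels, k):
    """Filter method (assumed scoring): keep the k features whose absolute
    correlation with the label is highest. Returns sorted column indices."""
    n_features = len(rows[0])
    scores = [abs(pearson([r[j] for r in rows], labels)) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, and F-measure for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0  # true positive rate
    specificity = tn / (tn + fp) if tn + fp else 0.0  # true negative rate
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f_measure = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": f_measure,
    }


# Toy data (made up): 4 samples, 3 features, binary labels.
rows = [[1.0, 5.0, 0.2], [2.0, 5.1, 0.9], [3.0, 4.9, 0.1], [4.0, 5.0, 0.8]]
labels = [0, 0, 1, 1]
selected = filter_select(rows, labels, k=2)

# Toy predictions (made up) to exercise the metric definitions.
metrics = binary_metrics([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1, 0, 1])
```

A filter scores each feature independently of any classifier, which keeps it cheap when features far outnumber samples. Wrapper and embedded methods, by contrast, involve a learning algorithm in the selection itself.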
id doaj-art-f72419fec23c41b7b11f8b816f17e4f1
institution Kabale University
issn 1099-0526
author_affiliation Saba Bashir, Department of Computer Science
Irfan Ullah Khattak, Department of Computing and Technology
Aihab Khan, Department of Computing and Technology
Farhan Hassan Khan, Knowledge & Data Science Research Center (KDRC)
Abdullah Gani, Faculty of Computing and Informatics
Muhammad Shiraz, Department of Computer Science