Comparison of Feature Selection and Feature Extraction Role in Dimensionality Reduction of Big Data

Recently, researchers have intensified their efforts on datasets with a large number of features, known as Big Data, because of the technological revolution and developments in the data science sector. Dimensionality reduction offers efficient and effective methods for analyzing thi...

Bibliographic Details
Main Authors: Haidar Khalid Malik, Nashaat Jasim Al-Anber, Fuad AbdoEsmail Al- Mekhlafi
Format: Article
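Language: English
Published: Middle Technical University 2023-03-01
Series: Journal of Techniques
Subjects: Feature Extraction; Feature Selection; Principal Component Analysis (PCA); Dimensionality Reduction
Online Access: https://journal.mtu.edu.iq/index.php/MTU/article/view/1027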
collection DOAJ
description Recently, researchers have intensified their efforts on datasets with a large number of features, known as Big Data, because of the technological revolution and developments in the data science sector. Dimensionality reduction offers efficient and effective methods for analyzing such data, which contains many variables. It is important in several fields, including data processing, pattern recognition, machine learning, and data mining. This paper compares two essential approaches to dimensionality reduction, Feature Extraction and Feature Selection, which machine learning models frequently employ. We applied several classifiers (support vector machines, k-nearest neighbors, decision tree, and Naive Bayes) to data from the Anthropometric Survey of US Army Personnel (ANSUR 2) to classify the data and test the relevance of features by predicting a specific feature of US Army personnel. The results show that k-nearest neighbors achieved high prediction accuracy (83%). We then reduced the dimensions using several techniques (Highly Correlated Filter, Recursive Feature Elimination, and Principal Component Analysis); the results show that Recursive Feature Elimination had the best accuracy (66%). From these results, it is clear that the efficiency of dimensionality reduction techniques varies according to the nature of the data: some techniques are more efficient on text data, while others are more efficient on images.
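
As a rough illustration of the Highly Correlated Filter named in the description, the sketch below drops any feature whose absolute Pearson correlation with an already-kept feature exceeds a threshold. It is not the authors' implementation: the function names (`pearson`, `drop_highly_correlated`), the 0.95 threshold, and the toy measurements are all hypothetical and are not taken from the paper or from ANSUR 2.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant column: correlation undefined, treated as 0 here
    return cov / math.sqrt(var_x * var_y)


def drop_highly_correlated(columns, threshold=0.95):
    """Return the names of the features to keep, in their original order.

    columns: dict mapping feature name -> list of numeric values.
    A feature is dropped when its absolute correlation with any
    already-kept feature exceeds `threshold` (an assumed cut-off).
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical toy measurements, NOT real ANSUR 2 data.
toy = {
    "stature_cm": [160.0, 170.0, 180.0, 175.0, 165.0],
    "stature_in": [62.99, 66.93, 70.87, 68.90, 64.96],  # same quantity in inches
    "weight_kg": [55.0, 82.0, 70.0, 90.0, 60.0],
}
selected = drop_highly_correlated(toy, threshold=0.95)
print(selected)
```

Because this filter keeps original columns rather than building new ones, it is a feature selection method; Principal Component Analysis, by contrast, is feature extraction and replaces the columns with linear combinations of them.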
id doaj-art-a2fa07de466d4de2ba02b7da6830efb4
institution Kabale University
issn 1818-653X
2708-8383
spelling doaj-art-a2fa07de466d4de2ba02b7da6830efb4; 2025-01-19T11:01:58Z; eng; Middle Technical University; Journal of Techniques, vol. 5, no. 1 (2023-03-01); ISSN 1818-653X, 2708-8383; DOI 10.51173/jt.v5i1.1027
Haidar Khalid Malik: Technical College of Management - Baghdad, Middle Technical University, Baghdad, Iraq
Nashaat Jasim Al-Anber: Technical College of Management - Baghdad, Middle Technical University, Baghdad, Iraq
Fuad AbdoEsmail Al- Mekhlafi: Sana'a University, Sana'a, Yemen
topic Feature Extraction
Feature Selection
Principal Component Analysis (PCA)
Dimensionality Reduction