Novel and Efficient Randomized Algorithms for Feature Selection

Feature selection is a crucial problem in efficient machine learning, and it also greatly contributes to the explainability of machine-driven decisions. Methods like decision trees and the Least Absolute Shrinkage and Selection Operator (LASSO) can select features during training. However, these embed...

Bibliographic Details
Main Authors: Zigeng Wang, Xia Xiao, Sanguthevar Rajasekaran
Format: Article
Language:English
Published: Tsinghua University Press 2020-09-01
Series:Big Data Mining and Analytics
Subjects: feature selection; randomized algorithms; efficient selection
Online Access:https://www.sciopen.com/article/10.26599/BDMA.2020.9020005
collection DOAJ
description Feature selection is a crucial problem in efficient machine learning, and it also greatly contributes to the explainability of machine-driven decisions. Methods like decision trees and the Least Absolute Shrinkage and Selection Operator (LASSO) can select features during training. However, these embedded approaches can be applied only to a small subset of machine learning models. Wrapper-based methods can select features independently of the machine learning model, but they often suffer from a high computational cost. To enhance their efficiency, many randomized algorithms have been designed. In this paper, we propose automatic breadth searching and attention searching adjustment approaches to further speed up randomized wrapper-based feature selection. We conduct a theoretical computational complexity analysis and further explain our algorithms' generic parallelizability. We conduct experiments on both synthetic and real datasets with different machine learning base models. Results show that, compared with existing approaches, our proposed techniques can locate a more meaningful set of features with higher efficiency.
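The abstract contrasts embedded methods with wrapper-based selection, where a base model is trained and evaluated on candidate feature subsets. As a generic illustration of randomized wrapper-based feature selection (not the paper's breadth searching or attention searching adjustment algorithms, which the record does not detail), here is a minimal sketch; the synthetic data, the 1-NN base model, and all function names are assumptions made for the example:

```python
import random

random.seed(0)

# Synthetic data: y depends only on features 0 and 1; the rest are noise.
n, d = 160, 8
X = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]
y = [3 * x[0] - 2 * x[1] + random.gauss(0, 0.1) for x in X]
train, val = range(120), range(120, n)

def wrapper_error(features):
    """Score a feature subset by the validation MSE of a 1-NN
    regressor (the "wrapped" base model) restricted to those features."""
    err = 0.0
    for i in val:
        # nearest training point, measuring distance only on the subset
        j = min(train, key=lambda t: sum((X[i][f] - X[t][f]) ** 2 for f in features))
        err += (y[i] - y[j]) ** 2
    return err / len(val)

def randomized_wrapper_search(trials=200, max_size=3):
    """Randomized wrapper feature selection: repeatedly score random
    subsets with the base model and keep the best-scoring one."""
    best, best_err = None, float("inf")
    for _ in range(trials):
        k = random.randint(1, max_size)
        subset = tuple(sorted(random.sample(range(d), k)))
        e = wrapper_error(subset)
        if e < best_err:
            best, best_err = subset, e
    return best, best_err

subset, err = randomized_wrapper_search()
print(subset)  # a good run recovers the informative features 0 and 1
```

The high cost the abstract mentions is visible here: every trial retrains/re-evaluates the base model, which is exactly what randomized strategies try to spend more wisely.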
id doaj-art-84ef381e566e41bebf74107fc29eb08a
institution Kabale University
issn 2096-0654
spelling Big Data Mining and Analytics (Tsinghua University Press), ISSN 2096-0654, Vol. 3, No. 3 (2020-09-01), pp. 208-224, DOI 10.26599/BDMA.2020.9020005
Author affiliations: Zigeng Wang, Xia Xiao, and Sanguthevar Rajasekaran are all with the Department of Computer Science and Engineering, University of Connecticut, Storrs, CT 06269, USA.
topic feature selection
randomized algorithms
efficient selection