Feature Selection with Graph Mining Technology

Many real-world applications suffer from high dimensionality, which existing algorithms cannot overcome. Feature selection is a critical data-preprocessing problem, and its poor scalability degrades both the efficiency and the performance of big data applications. In this research...

Bibliographic Details
Main Authors:	Thosini Bamunu Mudiyanselage, Yanqing Zhang
Format:	Article
Language:	English
Published:	Tsinghua University Press 2019-06-01
Series:	Big Data Mining and Analytics
Subjects:	graph mining; network embedding; big data analysis; feature selection; high-dimensional data
Online Access:	https://www.sciopen.com/article/10.26599/BDMA.2018.9020032

_version_	1832572804679073792
author	Thosini Bamunu Mudiyanselage, Yanqing Zhang
collection	DOAJ
description	Many real-world applications suffer from high dimensionality, which existing algorithms cannot overcome. Feature selection is a critical data-preprocessing problem, and its poor scalability degrades both the efficiency and the performance of big data applications. In this research, we developed a new algorithm that reduces the dimensionality of a problem using graph-based analysis while retaining the physical meaning of the original high-dimensional feature space. Most existing feature-selection methods rest on the strong assumption that features are independent of each other. However, if a feature-selection algorithm ignores the interdependencies within the feature space, the selected data fail to represent the original data correctly. We developed a new feature-selection method to address this challenge. Our aim was to examine the dependencies between features and select the optimal feature set with respect to the original data structure. Another important property of the proposed method is that it works even in the absence of class labels. This is a more difficult problem that many feature-selection algorithms fail to address; in this case, they rely only on wrapper techniques that require a learning algorithm to select features. Our experimental results indicate that the proposed simple ranking method performs better than other methods, independent of the particular learning algorithm used.
format	Article
id	doaj-art-51216d57c7ae4f39aa729063d30bc32f
institution	Kabale University
issn	2096-0654
language	English
publishDate	2019-06-01
publisher	Tsinghua University Press
record_format	Article
series	Big Data Mining and Analytics
spelling	doaj-art-51216d57c7ae4f39aa729063d30bc32f; 2025-02-02T06:50:33Z; eng; Tsinghua University Press; Big Data Mining and Analytics; 2096-0654; 2019-06-01; volume 2, issue 2, pages 73-82; 10.26599/BDMA.2018.9020032; Feature Selection with Graph Mining Technology; Thosini Bamunu Mudiyanselage (0), Yanqing Zhang (1); (0, 1) Department of Computer Science, Georgia State University, Atlanta, GA 30302, USA; https://www.sciopen.com/article/10.26599/BDMA.2018.9020032; graph mining; network embedding; big data analysis; feature selection; high-dimensional data
title	Feature Selection with Graph Mining Technology
topic	graph mining; network embedding; big data analysis; feature selection; high-dimensional data
url	https://www.sciopen.com/article/10.26599/BDMA.2018.9020032

Similar Items