Group Feature Screening Based on Information Gain Ratio for Ultrahigh-Dimensional Data

Most model-free feature screening approaches focus on individual predictors and therefore cannot incorporate structured predictors such as grouped variables. In this article, we propose a group screening procedure via the information gain ratio for a classification model, which is a dir...

Bibliographic Details
Main Authors: Zhongzheng Wang, Guangming Deng, Jianqi Yu
Format: Article
Language: English
Published: Wiley 2022-01-01
Series: Journal of Mathematics
Online Access: http://dx.doi.org/10.1155/2022/1600986
collection DOAJ
description Most model-free feature screening approaches focus on individual predictors and therefore cannot incorporate structured predictors such as grouped variables. In this article, we propose a group screening procedure based on the information gain ratio for classification models; it is a direct extension of the original sure independence screening procedure and is also model-free. The proposed method yields better screening performance and classification accuracy. We show that the proposed group screening method possesses the sure screening and ranking consistency properties under certain regularity conditions. Simulation studies and real-world data analysis demonstrate the finite-sample performance of the proposed method.
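The abstract names the information gain ratio as the screening statistic but the record does not give its formula. As a rough illustration only, the sketch below scores each group by the textbook gain ratio, (H(Y) - H(Y | G)) / H(G), where G is the joint value of all categorical predictors in a group, and keeps the top-ranked groups. Treating a group as one joint categorical variable, the function names `entropy`, `gain_ratio`, and `screen_groups`, and the toy data are all assumptions made for this example; the authors' estimator may differ (for instance in how continuous predictors are discretized).

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (natural log) of a sequence of hashable values."""
    n = len(values)
    return -sum((c / n) * math.log(c / n) for c in Counter(values).values())


def gain_ratio(group_rows, y):
    """Gain ratio of labels y given the joint value of one group.

    group_rows holds one tuple per sample with the categorical values of
    every predictor in the group (an assumption of this sketch).
    """
    n = len(y)
    h_group = entropy(group_rows)  # split information H(G)
    if h_group == 0.0:             # a constant group cannot separate classes
        return 0.0
    # Conditional entropy H(Y | G): weighted entropy of y within each group value.
    labels_by_value = {}
    for g, label in zip(group_rows, y):
        labels_by_value.setdefault(g, []).append(label)
    h_cond = sum(len(v) / n * entropy(v) for v in labels_by_value.values())
    return (entropy(y) - h_cond) / h_group


def screen_groups(X, y, groups, keep):
    """Rank groups by gain ratio; return the top `keep` names and all scores."""
    scores = {}
    for name, cols in groups.items():
        rows = [tuple(row[c] for c in cols) for row in X]
        scores[name] = gain_ratio(rows, y)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:keep], scores


# Toy data (hypothetical): the label is the XOR of columns 0 and 1,
# so group "A" is informative and group "B" is not.
X = [
    (0, 0, 0, 1), (0, 1, 1, 0), (1, 0, 0, 1), (1, 1, 1, 0),
    (0, 0, 1, 0), (0, 1, 0, 1), (1, 0, 1, 0), (1, 1, 0, 1),
]
y = [row[0] ^ row[1] for row in X]
top, scores = screen_groups(X, y, {"A": [0, 1], "B": [2, 3]}, keep=1)
print(top, scores)
```

In this toy case neither column of group "A" is informative on its own, yet the group as a whole determines the label, which is the kind of structure a per-predictor screen would miss.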
id doaj-art-95cca40a4a24404fa7e5424050be1848
institution Kabale University
issn 2314-4785
affiliation College of Science (Zhongzheng Wang, Guangming Deng, Jianqi Yu)