Effects of Pooling Samples on the Performance of Classification Algorithms: A Comparative Study

A pooling design can be used as a powerful strategy to compensate for limited amounts of samples or high biological variation. In this paper, we perform a comparative study to model and quantify the effects of virtual pooling on the performance of the widely applied classifiers, support vector machi...

Bibliographic Details
Main Authors:	Kanthida Kusonmano, Michael Netzer, Christian Baumgartner, Matthias Dehmer, Klaus R. Liedl, Armin Graber
Format:	Article
Language:	English
Published:	Wiley 2012-01-01
Series:	The Scientific World Journal
Online Access:	http://dx.doi.org/10.1100/2012/278352

author	Kanthida Kusonmano, Michael Netzer, Christian Baumgartner, Matthias Dehmer, Klaus R. Liedl, Armin Graber
author_sort	Kanthida Kusonmano
collection	DOAJ
description	A pooling design can be used as a powerful strategy to compensate for limited amounts of samples or high biological variation. In this paper, we perform a comparative study to model and quantify the effects of virtual pooling on the performance of the widely applied classifiers, support vector machines (SVMs), random forest (RF), k-nearest neighbors (k-NN), penalized logistic regression (PLR), and prediction analysis for microarrays (PAMs). We evaluate a variety of experimental designs using mock omics datasets with varying levels of pool sizes and considering effects from feature selection. Our results show that feature selection significantly improves classifier performance for non-pooled and pooled data. All investigated classifiers yield lower misclassification rates with smaller pool sizes. RF mainly outperforms other investigated algorithms, while accuracy levels are comparable among all the remaining ones. Guidelines are derived to identify an optimal pooling scheme for obtaining adequate predictive power and, hence, to motivate a study design that meets best experimental objectives and budgetary conditions, including time constraints.
format	Article
id	doaj-art-21307f9d6ba94534911143bde69cdf57
institution	Kabale University
issn	1537-744X
language	English
publishDate	2012-01-01
publisher	Wiley
record_format	Article
series	The Scientific World Journal
spelling	doaj-art-21307f9d6ba94534911143bde69cdf57; 2025-02-03T07:24:47Z; eng; Wiley; The Scientific World Journal; 1537-744X; 2012-01-01; 2012; 10.1100/2012/278352; 278352; Effects of Pooling Samples on the Performance of Classification Algorithms: A Comparative Study; Kanthida Kusonmano (Institute for Bioinformatics and Translational Research, UMIT, 6060 Hall in Tyrol, Austria); Michael Netzer (Institute of Electrical and Biomedical Engineering, UMIT, 6060 Hall in Tyrol, Austria); Christian Baumgartner (Institute of Electrical and Biomedical Engineering, UMIT, 6060 Hall in Tyrol, Austria); Matthias Dehmer (Institute for Bioinformatics and Translational Research, UMIT, 6060 Hall in Tyrol, Austria); Klaus R. Liedl (Faculty of Chemistry and Pharmacy, Leopold-Franzens-University Innsbruck, 6020 Innsbruck, Austria); Armin Graber (Institute for Bioinformatics and Translational Research, UMIT, 6060 Hall in Tyrol, Austria); http://dx.doi.org/10.1100/2012/278352
title	Effects of Pooling Samples on the Performance of Classification Algorithms: A Comparative Study
title_sort	effects of pooling samples on the performance of classification algorithms a comparative study
url	http://dx.doi.org/10.1100/2012/278352

Similar Items