Attribute Selection Impact on Linear and Nonlinear Regression Models for Crop Yield Prediction

Efficient cropping requires yield estimation for each crop involved, and data-driven models are commonly applied for this purpose. In recent years, several comparisons of data-driven modeling techniques have been made in search of the best model for yield prediction. However, attributes are usually selected based on exper...

Bibliographic Details
Main Authors:	Alberto Gonzalez-Sanchez, Juan Frausto-Solis, Waldo Ojeda-Bustamante
Format:	Article
Language:	English
Published:	Wiley 2014-01-01
Series:	The Scientific World Journal
Online Access:	http://dx.doi.org/10.1155/2014/509429

_version_	1832551870484185088
author	Alberto Gonzalez-Sanchez Juan Frausto-Solis Waldo Ojeda-Bustamante
author_sort	Alberto Gonzalez-Sanchez
collection	DOAJ
description	Efficient cropping requires yield estimation for each crop involved, and data-driven models are commonly applied for this purpose. In recent years, several comparisons of data-driven modeling techniques have been made in search of the best model for yield prediction. However, attributes are usually selected based on expert assessment or on dimensionality reduction algorithms. A fairer comparison should use the best subset of features for each regression technique, and an evaluation covering several crops is preferable. This paper evaluates the most common data-driven modeling techniques applied to yield prediction, using a complete method to define the best attribute subset for each model. Multiple linear regression, stepwise linear regression, M5′ regression trees, and artificial neural networks (ANN) were ranked. The models were built using real data from eight crops sown in an irrigation module in Mexico. To validate the models, three accuracy metrics were used: the root relative square error (RRSE), relative mean absolute error (RMAE), and correlation factor (R). The results show that ANNs are more consistent in the best attribute subset composition between the learning and the training stages, obtaining the lowest average RRSE (86.04%), lowest average RMAE (8.75%), and the highest average correlation factor (0.63).
format	Article
id	doaj-art-41538ddf4aff4d0fa7c44b0ec681c7f4
institution	Kabale University
issn	2356-6140 1537-744X
language	English
publishDate	2014-01-01
publisher	Wiley
record_format	Article
series	The Scientific World Journal
spelling	doaj-art-41538ddf4aff4d0fa7c44b0ec681c7f4 | 2025-02-03T06:00:19Z | eng | Wiley | The Scientific World Journal | 2356-6140 1537-744X | 2014-01-01 | 2014 | 10.1155/2014/509429 | 509429 | Attribute Selection Impact on Linear and Nonlinear Regression Models for Crop Yield Prediction | Alberto Gonzalez-Sanchez [0]; Juan Frausto-Solis [1]; Waldo Ojeda-Bustamante [2] | [0] IMTA, Boulevard Cuauhnáhuac 8532, Colonia Progreso, 62550 Jiutepec, MOR, Mexico; [1] UPEMOR, Boulevard Cuauhnáhuac 566, Colonia Lomas del Texcal, 62550 Jiutepec, MOR, Mexico; [2] IMTA, Boulevard Cuauhnáhuac 8532, Colonia Progreso, 62550 Jiutepec, MOR, Mexico | Efficient cropping requires yield estimation for each crop involved, and data-driven models are commonly applied for this purpose. In recent years, several comparisons of data-driven modeling techniques have been made in search of the best model for yield prediction. However, attributes are usually selected based on expert assessment or on dimensionality reduction algorithms. A fairer comparison should use the best subset of features for each regression technique, and an evaluation covering several crops is preferable. This paper evaluates the most common data-driven modeling techniques applied to yield prediction, using a complete method to define the best attribute subset for each model. Multiple linear regression, stepwise linear regression, M5′ regression trees, and artificial neural networks (ANN) were ranked. The models were built using real data from eight crops sown in an irrigation module in Mexico. To validate the models, three accuracy metrics were used: the root relative square error (RRSE), relative mean absolute error (RMAE), and correlation factor (R). The results show that ANNs are more consistent in the best attribute subset composition between the learning and the training stages, obtaining the lowest average RRSE (86.04%), lowest average RMAE (8.75%), and the highest average correlation factor (0.63). | http://dx.doi.org/10.1155/2014/509429
title	Attribute Selection Impact on Linear and Nonlinear Regression Models for Crop Yield Prediction
url	http://dx.doi.org/10.1155/2014/509429
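
The description above names three accuracy metrics (RRSE, RMAE, and the correlation factor R) without giving their formulas. The sketch below shows one plausible way to compute them. It assumes common textbook definitions: RRSE as the root of the squared error relative to a predict-the-mean baseline, RMAE as the mean absolute error relative to each observed value, and R as the Pearson correlation coefficient. These definitions, the function names, and the sample numbers are assumptions made for illustration; the article's own formulas are not part of this record and may differ.

```python
import math


def rrse(actual, predicted):
    """Root relative squared error (assumed definition):
    model squared error divided by the squared error of always
    predicting the mean of the observed values, then square-rooted."""
    mean_a = sum(actual) / len(actual)
    num = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    den = sum((a - mean_a) ** 2 for a in actual)
    return math.sqrt(num / den)


def rmae(actual, predicted):
    """Relative mean absolute error (assumed definition):
    mean of |predicted - actual| / |actual|. Requires nonzero actuals."""
    n = len(actual)
    return sum(abs(p - a) / abs(a) for a, p in zip(actual, predicted)) / n


def pearson_r(actual, predicted):
    """Pearson correlation between observed and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


if __name__ == "__main__":
    # Made-up yields (t/ha), for illustration only; not data from the article.
    observed = [5.1, 6.3, 4.8, 7.0, 5.9]
    estimated = [5.4, 6.0, 5.0, 6.5, 6.1]
    print(f"RRSE = {100 * rrse(observed, estimated):.2f}%")
    print(f"RMAE = {100 * rmae(observed, estimated):.2f}%")
    print(f"R    = {pearson_r(observed, estimated):.2f}")
```

Under these assumed definitions, an RRSE of 100% means the model does no better than predicting the mean yield, so lower values indicate a better fit, while R closer to 1 indicates stronger agreement between observed and predicted yields.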

Similar Items