ϵ-Confidence Approximately Correct (ϵ-CoAC) Learnability and Hyperparameter Selection in Linear Regression Modeling

Bibliographic Details
Main Authors: Soosan Beheshti, Mahdi Shamsi
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects: Statistical learning theory; sample complexity; hypothesis class complexity; Kullback-Leibler divergence
Online Access: https://ieeexplore.ieee.org/document/10840229/
author Soosan Beheshti
Mahdi Shamsi
collection DOAJ
description In a data-based learning process, a training data set is used to produce a hypothesis that generalizes to all data points from a domain set. The hypothesis is chosen from classes of potentially different complexities. Linear regression modeling is an important category of learning algorithms. The practical uncertainty of the label samples in the training data set has a major effect on the generalization ability of the learned model, and failing to choose a proper model or hypothesis class can lead to serious issues such as underfitting or overfitting. These issues have mostly been addressed by altering the modeling cost function or by using cross-validation methods. Drawbacks of these approaches include introducing new hyperparameters with their own challenges and uncertainties, potentially increasing the computational complexity, or requiring large training data sets. On the other hand, the theory of probably approximately correct (PAC) learning aims to define learnability in a probabilistic setting. Despite its theoretical value, PAC bounds cannot be used in practical regression learning applications where only the training data set is available. This work is motivated by practical issues in regression learning generalization and is inspired by the foundations of statistical learning theory. The proposed approach, denoted ϵ-Confidence Approximately Correct (ϵ-CoAC), uses the conventional Kullback-Leibler divergence (relative entropy) and defines new related typical sets to develop a method of probabilistic statistical learning for practical regression learning and generalization. ϵ-CoAC learnability validates the learning process as a function of the training sample size as well as of the hypothesis class complexity order. Consequently, it enables the learner to automatically compare hypothesis classes of different complexity orders and to choose among them the optimum class with the minimum ϵ in the ϵ-CoAC framework. ϵ-CoAC learnability overcomes the issues of overfitting and underfitting, and it shows advantages over the well-known cross-validation method in terms of accuracy and the data length required for convergence. Simulation results, for both synthetic and real data, confirm not only the strength of ϵ-CoAC in providing learning measurements as a function of data length and/or hypothesis complexity, but also the superiority of the method over existing approaches in hypothesis complexity and model selection.
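For reference, the Kullback-Leibler divergence (relative entropy) that the abstract builds on is the standard information-theoretic quantity; for probability densities p and q over the label space it is (textbook definition, not specific to this article)

    D_{\mathrm{KL}}(p \,\|\, q) = \int p(y)\,\log\frac{p(y)}{q(y)}\,dy \;\ge\; 0,

with equality if and only if p = q almost everywhere. How ϵ-CoAC turns this divergence into typical sets and a confidence measure is developed in the full article and is not reproduced in this record.

As an illustration of the cross-validation baseline that the abstract compares against (not of the ϵ-CoAC criterion itself), the sketch below selects the complexity order of a linear-in-parameters regressor by K-fold cross-validation on synthetic data; the data-generating polynomial, noise level, sample size, and fold count are hypothetical choices made only for this example.

    # Illustrative sketch: hypothesis-class (model-order) selection for
    # linear-in-parameters regression via K-fold cross-validation, the baseline
    # mentioned in the abstract. The true polynomial order, noise level, and
    # sample size below are assumed, not taken from the paper.
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import PolynomialFeatures

    rng = np.random.default_rng(0)
    n = 100                                  # training sample size (assumed)
    x = rng.uniform(-1.0, 1.0, size=(n, 1))
    y = 1.0 - 2.0 * x[:, 0] + 0.5 * x[:, 0] ** 3 + rng.normal(0.0, 0.2, size=n)

    # Candidate hypothesis classes: polynomial regressors of increasing complexity order.
    scores = {}
    for degree in range(1, 10):
        model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
        # Mean negative MSE over 5 folds; higher (closer to zero) is better.
        scores[degree] = cross_val_score(model, x, y, cv=5,
                                         scoring="neg_mean_squared_error").mean()

    best_degree = max(scores, key=scores.get)
    print(f"cross-validation selects complexity order {best_degree}")
    # Degrees that are too low underfit (high bias); degrees that are too high
    # overfit (high variance). The abstract's claim is that epsilon-CoAC reaches
    # such complexity decisions with better accuracy and shorter data lengths
    # than this cross-validation procedure.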
format Article
id doaj-art-564182b53fb743f0898df366b989c2bb
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-564182b53fb743f0898df366b989c2bb 2025-01-25T00:00:39Z eng. IEEE, IEEE Access, ISSN 2169-3536, 2025-01-01, vol. 13, pp. 14273-14289, doi: 10.1109/ACCESS.2025.3529622, IEEE document 10840229. ϵ-Confidence Approximately Correct (ϵ-CoAC) Learnability and Hyperparameter Selection in Linear Regression Modeling. Soosan Beheshti (https://orcid.org/0000-0001-7161-5887) and Mahdi Shamsi (https://orcid.org/0000-0002-0795-6238), Department of Electrical, Computer, and Biomedical Engineering, Toronto Metropolitan University, Toronto, ON, Canada. https://ieeexplore.ieee.org/document/10840229/ Keywords: Statistical learning theory; sample complexity; hypothesis class complexity; Kullback-Leibler divergence.
title ϵ-Confidence Approximately Correct (ϵ-CoAC) Learnability and Hyperparameter Selection in Linear Regression Modeling
topic Statistical learning theory
sample complexity
hypothesis class complexity
Kullback-Leibler divergence
url https://ieeexplore.ieee.org/document/10840229/