Fast binary logistic regression

This study presents a novel numerical approach that improves the training efficiency of binary logistic regression, a popular statistical model in the machine learning community. Our method achieves training times an order of magnitude faster than traditional logistic regression by employing a novel Soft-Plus approximation, which enables reformulation of logistic regression parameter estimation into matrix-vector form. We also adopt the Lf-norm penalty, which allows using fractional norms, including the L2-norm, L1-norm, and L0-norm, to regularize the model parameters. We put the Lf-norm formulation in matrix-vector form, providing flexibility to include or exclude penalization of the intercept term when applying regularization. Furthermore, to address the common problem of collinear features, we apply singular value decomposition (SVD), resulting in a low-rank representation commonly used to reduce computational complexity while preserving essential features and mitigating noise. Moreover, our approach incorporates a randomized SVD alongside a newly developed SVD with row reduction (SVD-RR) method, which aims to manage datasets with many rows and features efficiently. This computational efficiency is crucial in developing a generalized model that requires repeated training over various parameters to balance bias and variance. We also demonstrate the effectiveness of our fast binary logistic regression (FBLR) method on various datasets from the OpenML repository in addition to synthetic datasets.
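
The abstract above names the main ingredients of the method: a soft-plus form of the logistic loss, an Lf-norm penalty that can optionally skip the intercept, and a low-rank feature representation obtained with (randomized) SVD. The sketch below is not the authors' FBLR implementation; their Soft-Plus approximation, matrix-vector solution, and SVD-RR routine are not reproduced, and a generic L-BFGS optimizer stands in for their reformulated solver. It is only a minimal NumPy/SciPy illustration of those ingredients, and every function name and default value in it is an assumption made for this sketch.

# Illustrative sketch only (not the authors' FBLR code): binary logistic
# regression written with the soft-plus loss, an Lf-norm penalty, and a
# randomized-SVD low-rank projection of the features. All names, defaults,
# and the use of a generic L-BFGS solver are assumptions for this sketch.
import numpy as np
from scipy.optimize import minimize

def softplus(z):
    # softplus(z) = log(1 + exp(z)), computed stably for large |z|
    return np.logaddexp(0.0, z)

def randomized_svd(X, rank, n_oversamples=10, seed=0):
    # Basic randomized SVD: sketch the column space with a Gaussian test
    # matrix, orthonormalize, then take an exact SVD of the small matrix.
    rng = np.random.default_rng(seed)
    G = rng.standard_normal((X.shape[1], rank + n_oversamples))
    Q, _ = np.linalg.qr(X @ G)
    Ub, s, Vt = np.linalg.svd(Q.T @ X, full_matrices=False)
    return (Q @ Ub)[:, :rank], s[:rank], Vt[:rank]

def fit_logreg_lowrank(X, y, rank=10, lam=1e-2, f=2.0, penalize_intercept=False):
    # Project the features onto the top right singular vectors (Z = X @ Vt.T),
    # then fit a penalized logistic regression in that reduced space.
    _, _, Vt = randomized_svd(X, rank)
    Z = X @ Vt.T                                    # n x rank
    Z1 = np.hstack([np.ones((Z.shape[0], 1)), Z])   # prepend intercept column

    def objective(w):
        margins = Z1 @ w
        # Logistic negative log-likelihood for labels y in {0, 1}, written
        # through the soft-plus function: sum_i softplus(m_i) - y_i * m_i.
        nll = np.sum(softplus(margins) - y * margins)
        coef = w if penalize_intercept else w[1:]
        return nll + lam * np.sum(np.abs(coef) ** f)

    w0 = np.zeros(Z1.shape[1])
    res = minimize(objective, w0, method="L-BFGS-B")
    return res.x, Vt   # weights in the reduced space, plus the projection

# Small synthetic example
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 40))
y = (X[:, 0] - 0.5 * X[:, 1] + 0.1 * rng.standard_normal(500) > 0).astype(float)
w, Vt = fit_logreg_lowrank(X, y, rank=5)
print(w.shape)   # (6,) = intercept + rank coefficients

With f = 2.0 the penalty is the familiar ridge term; for f <= 1 (the L1- and L0-like cases mentioned in the abstract) the penalty is no longer smooth, which is where the paper's dedicated Lf-norm formulation, rather than a generic solver, would matter.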

Bibliographic Details
Main Authors: Nurdan Ayse Saran (Department of Computer Engineering, Cankaya University, Ankara, Türkiye), Fatih Nar (Department of Computer Engineering, Ankara Yildirim Beyazit University, Ankara, Türkiye)
Format: Article
Language: English
Published: PeerJ Inc., 2025-01-01
Series: PeerJ Computer Science, 11:e2579
DOI: 10.7717/peerj-cs.2579
ISSN: 2376-5992
Subjects: Logistic regression; Low-rank; Singular value decomposition; Lf-norm regularization
Online Access: https://peerj.com/articles/cs-2579.pdf
Collection: DOAJ
Institution: Kabale University
Record ID: doaj-art-a30209b30fc344df8a321d788295ec23