Comparison of the efficiency of zero and first order minimization methods in neural networks

Bibliographic Details
Main Authors: E. A. Gubareva (State University of Management), S. I. Khashin (Ivanovo State University), E. S. Shemyakova (University of Toledo)
Format: Article
Language: English
Published: Publishing House of the State University of Management, 2022-12-01
Series: Вестник университета
Subjects: neural networks; objective function; minimization; gradient; gradient descent; coordinate descent; convergence rate
Online Access: https://vestnik.guu.ru/jour/article/view/3952

Collection: DOAJ
Description: To minimize the objective function in neural networks, first-order methods are usually used, which require repeated computation of the gradient. The number of variables in modern neural networks can reach many thousands or even millions, and numerous experiments show that analytically computing the gradient of a function of N variables takes approximately N/5 times as long as computing the function itself. The article considers the possibility of using zero-order methods to minimize the objective function. In particular, a new zero-order minimization method, descent over two-dimensional subspaces, is proposed (a hypothetical sketch of such a method appears at the end of this record). The convergence rates of three methods are compared: standard gradient descent with automatic step selection, coordinate descent with a step selected for each coordinate, and descent over two-dimensional subspaces. It is shown that, for the neural network training problems considered, the efficiency of properly organized zero-order methods is no lower than that of gradient methods.
Record ID: doaj-art-62c9c840bf864afda4e91a670dd9e973
Institution: Kabale University
ISSN: 1816-4277; 2686-8415
DOI: 10.26425/1816-4277-2022-11-48-55
Citation: Вестник университета, 2022, no. 11, pp. 48–55
Affiliations: E. A. Gubareva (State University of Management); S. I. Khashin (Ivanovo State University); E. S. Shemyakova (University of Toledo)
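
The article's algorithm is not reproduced in this record, so the following is a minimal sketch, in Python, of what one zero-order step over a two-dimensional coordinate subspace could look like. Every concrete choice here (pairing coordinates as (i, i+1), a six-point quadratic model per plane, the steepest-descent fallback and acceptance test, and the name descend_2d_subspaces) is an assumption made for illustration, not the method published by the authors.

# A minimal sketch of zero-order descent over two-dimensional coordinate
# subspaces. Illustration only: the coordinate pairing, the six-point
# quadratic model, and the acceptance test are assumptions, not the
# algorithm from the article.
import numpy as np

def descend_2d_subspaces(f, x0, h=1e-3, sweeps=20):
    """Minimize f using only function values: sweep over coordinate pairs
    (i, i+1), fit a quadratic model of f on each 2-D coordinate plane from
    six evaluations, and move to the model minimizer if it decreases f.
    An odd trailing coordinate is left untouched in this sketch."""
    x = np.asarray(x0, dtype=float).copy()
    n = x.size
    for _ in range(sweeps):
        for i in range(0, n - 1, 2):
            j = i + 1
            ei = np.zeros(n)
            ei[i] = h
            ej = np.zeros(n)
            ej[j] = h
            # Six function values determine the quadratic model on the plane.
            f0 = f(x)
            fip, fim = f(x + ei), f(x - ei)
            fjp, fjm = f(x + ej), f(x - ej)
            fij = f(x + ei + ej)
            # Central differences for the restricted gradient and Hessian.
            g = np.array([(fip - fim) / (2 * h), (fjp - fjm) / (2 * h)])
            hij = (fij - fip - fjp + f0) / h**2
            H = np.array([[(fip - 2 * f0 + fim) / h**2, hij],
                          [hij, (fjp - 2 * f0 + fjm) / h**2]])
            # Newton step on the model; fall back to a small steepest-descent
            # step if the model is degenerate or not descending.
            try:
                step = np.linalg.solve(H, -g)
                if g @ step >= 0:
                    step = -h * g
            except np.linalg.LinAlgError:
                step = -h * g
            x_trial = x.copy()
            x_trial[[i, j]] += step
            if f(x_trial) < f0:  # accept only strict improvement
                x = x_trial
    return x

# Toy usage: a shifted quadratic with minimum at (1, 2, 3, 4).
if __name__ == "__main__":
    target = np.array([1.0, 2.0, 3.0, 4.0])
    f = lambda v: float(np.sum((v - target) ** 2))
    print(descend_2d_subspaces(f, np.zeros(4)))  # approx. [1. 2. 3. 4.]

As written, each coordinate pair costs seven function evaluations per step (six for the model, one for the acceptance check), about 3.5N evaluations per sweep over N variables; the abstract's point of comparison is that an analytical gradient already costs roughly N/5 times the function's own evaluation time.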