Comparison of the efficiency of zero and first order minimization methods in neural networks
To minimize the objective function in neural networks, first-order methods are usually used, which involve the repeated calculation of the gradient. The number of variables in modern neural networks can be many thousands and even millions. Numerous experiments show that the analytical calculation time of an N-variable function's gradient is approximately N/5 times longer than the calculation time of the function itself. The article considers the possibility of using zero-order methods to minimize the function. In particular, a new zero-order method for function minimization, descent over two-dimensional subspaces, is proposed. The convergence rates of three different methods are compared: standard gradient descent with automatic step selection, coordinate descent with step selection for each coordinate, and descent over two-dimensional subspaces. It is shown that the efficiency of properly organized zero-order methods in the considered problems of training neural networks is not lower than that of gradient methods.
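The three methods are described here only at the level of the abstract; the paper's exact step-selection rules and its two-dimensional subspace construction are not reproduced in this record. As a rough illustration of the zero-order idea (optimization using only function evaluations, no gradients), the sketch below implements a generic coordinate descent with a separate adaptive step per coordinate. The function name, the grow/shrink factors, and the toy objective are assumptions made for illustration, not the authors' algorithm.

```python
import numpy as np

def coordinate_descent(f, x0, n_sweeps=100, init_step=0.1, grow=1.2, shrink=0.5):
    """Zero-order coordinate descent with a separate adaptive step per coordinate.

    Illustrative only: uses function values f(x) alone (no gradients).
    The step for a coordinate grows after a successful move and shrinks
    after a failed one.
    """
    x = np.asarray(x0, dtype=float).copy()
    n = x.size
    steps = np.full(n, init_step)
    fx = f(x)
    for _ in range(n_sweeps):
        for i in range(n):
            for direction in (+1.0, -1.0):
                trial = x.copy()
                trial[i] += direction * steps[i]
                f_trial = f(trial)
                if f_trial < fx:            # improvement: accept and enlarge the step
                    x, fx = trial, f_trial
                    steps[i] *= grow
                    break
            else:                           # no improvement in either direction: reduce the step
                steps[i] *= shrink
    return x, fx

# Toy usage on a quadratic objective standing in for a network loss.
if __name__ == "__main__":
    A = np.diag(np.linspace(1.0, 10.0, 20))
    f = lambda w: float(w @ A @ w)
    w_opt, loss = coordinate_descent(f, x0=np.ones(20))
    print(loss)
```

Gradient descent with automatic step selection would instead move along the analytically computed gradient at each iteration; the abstract's claim is that, with well-chosen per-coordinate (or per-subspace) steps, such zero-order schemes are no less efficient on the considered neural-network training problems.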
Main Authors: | E. A. Gubareva, S. I. Khashin, E. S. Shemyakova |
---|---|
Format: | Article |
Language: | English |
Published: | Publishing House of the State University of Management, 2022-12-01 |
Series: | Вестник университета |
Subjects: | neural networks; objective function; minimization; gradient; gradient descent; coordinate descent; convergence rate |
Online Access: | https://vestnik.guu.ru/jour/article/view/3952 |
_version_ | 1832541614929608704 |
---|---|
author | E. A. Gubareva; S. I. Khashin; E. S. Shemyakova |
author_facet | E. A. Gubareva; S. I. Khashin; E. S. Shemyakova |
author_sort | E. A. Gubareva |
collection | DOAJ |
description | To minimize the objective function in neural networks, first-order methods are usually used, which involve the repeated calculation of the gradient. The number of variables in modern neural networks can be many thousands and even millions. Numerous experiments show that the analytical calculation time of an N-variable function's gradient is approximately N/5 times longer than the calculation time of the function itself. The article considers the possibility of using zero-order methods to minimize the function. In particular, a new zero-order method for function minimization, descent over two-dimensional subspaces, is proposed. The convergence rates of three different methods are compared: standard gradient descent with automatic step selection, coordinate descent with step selection for each coordinate, and descent over two-dimensional subspaces. It is shown that the efficiency of properly organized zero-order methods in the considered problems of training neural networks is not lower than that of gradient methods. |
format | Article |
id | doaj-art-62c9c840bf864afda4e91a670dd9e973 |
institution | Kabale University |
issn | 1816-4277 2686-8415 |
language | English |
publishDate | 2022-12-01 |
publisher | Publishing House of the State University of Management |
record_format | Article |
series | Вестник университета |
spelling | Вестник университета, 2022, no. 11, pp. 48–55. DOI: 10.26425/1816-4277-2022-11-48-55. E. A. Gubareva (State University of Management), S. I. Khashin (Ivanovo State University), E. S. Shemyakova (University of Toledo). |
spellingShingle | E. A. Gubareva; S. I. Khashin; E. S. Shemyakova; Comparison of the efficiency of zero and first order minimization methods in neural networks; Вестник университета; neural networks; objective function; minimization; gradient; gradient descent; coordinate descent; convergence rate |
title | Comparison of the efficiency of zero and first order minimization methods in neural networks |
title_full | Comparison of the efficiency of zero and first order minimization methods in neural networks |
title_fullStr | Comparison of the efficiency of zero and first order minimization methods in neural networks |
title_full_unstemmed | Comparison of the efficiency of zero and first order minimization methods in neural networks |
title_short | Comparison of the efficiency of zero and first order minimization methods in neural networks |
title_sort | comparison of the efficiency of zero and first order minimization methods in neural networks |
topic | neural networks; objective function; minimization; gradient; gradient descent; coordinate descent; convergence rate |
url | https://vestnik.guu.ru/jour/article/view/3952 |
work_keys_str_mv | AT eagubareva comparisonoftheefficiencyofzeroandfirstorderminimizationmethodsinneuralnetworks AT sikhashin comparisonoftheefficiencyofzeroandfirstorderminimizationmethodsinneuralnetworks AT esshemyakova comparisonoftheefficiencyofzeroandfirstorderminimizationmethodsinneuralnetworks |