QR-DeepONet: resolve abnormal convergence issue in deep operator network
Deep operator network (DeepONet) has been proven to be highly successful in operator learning tasks. Theoretical analysis indicates that the generalization error of DeepONet should decrease as the basis dimension increases, thus providing a systematic way to reduce its generalization errors (GEs) by var...
Main Authors: | Jie Zhao, Biwei Xie, Xingquan Li |
---|---|
Format: | Article |
Language: | English |
Published: | IOP Publishing 2024-01-01 |
Series: | Machine Learning: Science and Technology |
Subjects: | operator learning; QR decomposition; machine learning; neural network |
Online Access: | https://doi.org/10.1088/2632-2153/ada0a5 |
_version_ | 1832575963120009216 |
---|---|
author | Jie Zhao Biwei Xie Xingquan Li |
author_facet | Jie Zhao Biwei Xie Xingquan Li |
author_sort | Jie Zhao |
collection | DOAJ |
description | Deep operator network (DeepONet) has been proven to be highly successful in operator learning tasks. Theoretical analysis indicates that the generalization error of DeepONet should decrease as the basis dimension increases, thus providing a systematic way to reduce its generalization errors (GEs) by varying the network hyperparameters. However, in practice, we found that, depending on the problem being solved and the activation function used, the GEs fluctuate unpredictably, contrary to theoretical expectations. Upon analyzing the output matrix of the trunk net, we determined that this behavior stems from the learned basis functions being highly linearly dependent, which limits the expressivity of the vanilla DeepONet. To address these limitations, we propose QR decomposition enhanced DeepONet (QR-DeepONet), an enhanced version of DeepONet using QR decomposition. These modifications ensure that the learned basis functions are linearly independent and orthogonal to each other. The numerical results demonstrate that the GEs of QR-DeepONet follow the theoretical prediction, decreasing monotonically as the basis dimension increases, and outperform those of vanilla DeepONet. Consequently, the proposed method successfully fills the gap between theory and practice. |
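The mechanism the abstract describes can be sketched in a few lines. This is an illustrative reconstruction, not the authors' implementation: random matrices stand in for the trunk-net outputs and branch-net coefficients, and the trunk matrix is factored as T = QR so that the orthonormal columns of Q replace the (possibly nearly linearly dependent) raw basis, while the branch coefficients are mapped through R so the network output is unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)

n_points, p = 200, 16                 # evaluation points, basis dimension
T = rng.normal(size=(n_points, p))    # stand-in for trunk-net outputs t_k(y_j)
T[:, 1] = T[:, 0] + 1e-8 * rng.normal(size=n_points)  # nearly dependent columns
b = rng.normal(size=p)                # stand-in for branch-net coefficients b_k(x)

# Vanilla DeepONet prediction: u(y_j) = sum_k b_k(x) * t_k(y_j) = T @ b.
u_vanilla = T @ b

# Reduced QR factorization: T = Q @ R, with Q having orthonormal columns.
Q, R = np.linalg.qr(T)

# Re-express the prediction in the orthonormal basis: T @ b = Q @ (R @ b),
# so the output is unchanged while the basis functions become orthogonal.
u_qr = Q @ (R @ b)

assert np.allclose(u_vanilla, u_qr)
assert np.allclose(Q.T @ Q, np.eye(p), atol=1e-8)  # columns of Q are orthonormal
```

The key point the abstract makes is visible here: orthogonalizing the basis does not change what the network can represent at a fixed dimension, but it removes the near-linear dependence among basis functions that limits expressivity in practice.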
format | Article |
id | doaj-art-9152ce1838134caf8b9eb19e6fc7175e |
institution | Kabale University |
issn | 2632-2153 |
language | English |
publishDate | 2024-01-01 |
publisher | IOP Publishing |
record_format | Article |
series | Machine Learning: Science and Technology |
spelling | doaj-art-9152ce1838134caf8b9eb19e6fc7175e2025-01-31T15:51:50ZengIOP PublishingMachine Learning: Science and Technology2632-21532024-01-015404507510.1088/2632-2153/ada0a5QR-DeepONet: resolve abnormal convergence issue in deep operator networkJie Zhao0https://orcid.org/0000-0003-0500-242XBiwei Xie1Xingquan Li2Pengcheng Laboratory , Shenzhen 518055, People’s Republic of ChinaPengcheng Laboratory , Shenzhen 518055, People’s Republic of China; Institute of Computing Technology , Chinese Academy of Sciences, Beijing, People’s Republic of ChinaPengcheng Laboratory , Shenzhen 518055, People’s Republic of China; School of Mathematics and Statistics, Minnan Normal University , Zhangzhou, People’s Republic of ChinaDeep operator network (DeepONet) has been proven to be highly successful in operator learning tasks. Theoretical analysis indicates that the generalization error of DeepONet should decrease as the basis dimension increases, thus providing a systematic way to reduce its generalization errors (GEs) by varying the network hyperparameters. However, in practice, we found that, depending on the problem being solved and the activation function used, the GEs fluctuate unpredictably, contrary to theoretical expectations. Upon analyzing the output matrix of the trunk net, we determined that this behavior stems from the learned basis functions being highly linearly dependent, which limits the expressivity of the vanilla DeepONet. To address these limitations, we propose QR decomposition enhanced DeepONet (QR-DeepONet), an enhanced version of DeepONet using QR decomposition. These modifications ensure that the learned basis functions are linearly independent and orthogonal to each other. The numerical results demonstrate that the GEs of QR-DeepONet follow the theoretical prediction, decreasing monotonically as the basis dimension increases, and outperform those of vanilla DeepONet. 
Consequently, the proposed method successfully fills the gap between theory and practice.https://doi.org/10.1088/2632-2153/ada0a5operator learningQR decompositionmachine learningneural network |
spellingShingle | Jie Zhao Biwei Xie Xingquan Li QR-DeepONet: resolve abnormal convergence issue in deep operator network Machine Learning: Science and Technology operator learning QR decomposition machine learning neural network |
title | QR-DeepONet: resolve abnormal convergence issue in deep operator network |
title_full | QR-DeepONet: resolve abnormal convergence issue in deep operator network |
title_fullStr | QR-DeepONet: resolve abnormal convergence issue in deep operator network |
title_full_unstemmed | QR-DeepONet: resolve abnormal convergence issue in deep operator network |
title_short | QR-DeepONet: resolve abnormal convergence issue in deep operator network |
title_sort | qr deeponet resolve abnormal convergence issue in deep operator network |
topic | operator learning QR decomposition machine learning neural network |
url | https://doi.org/10.1088/2632-2153/ada0a5 |
work_keys_str_mv | AT jiezhao qrdeeponetresolveabnormalconvergenceissueindeepoperatornetwork AT biweixie qrdeeponetresolveabnormalconvergenceissueindeepoperatornetwork AT xingquanli qrdeeponetresolveabnormalconvergenceissueindeepoperatornetwork |