QR-DeepONet: resolve abnormal convergence issue in deep operator network

Deep operator networks (DeepONets) have proven highly successful in operator learning tasks. Theoretical analysis indicates that the generalization error (GE) of a DeepONet should decrease as the basis dimension increases, suggesting a systematic way to reduce the GE by varying the network hyperparameters. In practice, however, we found that, depending on the problem being solved and the activation function used, the GE fluctuates unpredictably, contrary to theoretical expectations. By analyzing the output matrix of the trunk net, we determined that this behavior stems from the learned basis functions being highly linearly dependent, which limits the expressivity of the vanilla DeepONet. To address this limitation, we propose QR-decomposition-enhanced DeepONet (QR-DeepONet), a variant of DeepONet in which a QR decomposition ensures that the learned basis functions are linearly independent and mutually orthogonal. The numerical results demonstrate that the GE of QR-DeepONet decreases monotonically as the basis dimension increases, in line with the theoretical prediction, and that QR-DeepONet outperforms vanilla DeepONet. The proposed method thus fills the gap between theory and practice.
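The orthogonalization idea described in the abstract can be sketched numerically. The snippet below is an illustrative sketch, not the authors' implementation: a synthetic near-degenerate matrix `T` stands in for the trunk-net output (its columns mimic highly linearly dependent basis functions), and a QR decomposition replaces that degenerate basis with orthonormal columns while leaving the network's prediction unchanged.

```python
# Illustrative sketch of QR-orthogonalizing a trunk-net output matrix.
# T is a stand-in for the trunk net's output (rows: evaluation points,
# columns: learned basis functions), built to be nearly rank-deficient.
import numpy as np

rng = np.random.default_rng(0)

n_points, p = 200, 10  # number of evaluation points, basis dimension
base = rng.standard_normal((n_points, 2))
mix = rng.standard_normal((2, p))
# Columns are near-linear combinations of only 2 directions plus tiny noise,
# mimicking the degenerate basis observed in vanilla DeepONet.
T = base @ mix + 1e-3 * rng.standard_normal((n_points, p))

# Reduced QR decomposition: Q (n_points x p) has orthonormal columns that
# span the same space as the columns of T; R (p x p) is upper triangular.
Q, R = np.linalg.qr(T)

# Since T = Q @ R, a DeepONet-style prediction T @ b can be rewritten as
# Q @ (R @ b): the orthonormal columns of Q serve as the new basis, and the
# branch-net coefficients b are transformed by R.
b = rng.standard_normal(p)
pred_vanilla = T @ b
pred_qr = Q @ (R @ b)

print(np.allclose(pred_vanilla, pred_qr))            # same prediction
print(np.allclose(Q.T @ Q, np.eye(p), atol=1e-8))    # orthonormal basis
```

In the paper's setting the decomposition is applied to the trunk net's learned basis during training; here it is shown post hoc on a fixed matrix purely to illustrate why orthogonalization restores an independent basis without changing the represented operator.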

Bibliographic Details
Main Authors: Jie Zhao, Biwei Xie, Xingquan Li
Format: Article
Language: English
Published: IOP Publishing, 2024-01-01
Series: Machine Learning: Science and Technology
Subjects: operator learning; QR decomposition; machine learning; neural network
Online Access: https://doi.org/10.1088/2632-2153/ada0a5
Record Details
Record ID: doaj-art-9152ce1838134caf8b9eb19e6fc7175e
Collection: DOAJ
Institution: Kabale University
ISSN: 2632-2153
Published in: Machine Learning: Science and Technology, vol. 5, no. 4, article 045075, 2024-01-01, IOP Publishing
DOI: https://doi.org/10.1088/2632-2153/ada0a5
Author affiliations:
Jie Zhao (ORCID: 0000-0003-0500-242X): Pengcheng Laboratory, Shenzhen 518055, People's Republic of China
Biwei Xie: Pengcheng Laboratory, Shenzhen 518055, People's Republic of China; Institute of Computing Technology, Chinese Academy of Sciences, Beijing, People's Republic of China
Xingquan Li: Pengcheng Laboratory, Shenzhen 518055, People's Republic of China; School of Mathematics and Statistics, Minnan Normal University, Zhangzhou, People's Republic of China