QR-DeepONet: resolve abnormal convergence issue in deep operator network
Deep operator network (DeepONet) has been proven to be highly successful in operator learning tasks. Theoretical analysis indicates that the generalization error of DeepONet should decrease as the basis dimension increases, thus providing a systematic way to reduce its generalization errors (GEs) by var...
Main Authors: | Jie Zhao, Biwei Xie, Xingquan Li |
---|---|
Format: | Article |
Language: | English |
Published: | IOP Publishing 2024-01-01 |
Series: | Machine Learning: Science and Technology |
Subjects: | operator learning; QR decomposition; machine learning; neural network |
Online Access: | https://doi.org/10.1088/2632-2153/ada0a5 |
_version_ | 1832575963120009216 |
---|---|
author | Jie Zhao Biwei Xie Xingquan Li |
author_facet | Jie Zhao Biwei Xie Xingquan Li |
author_sort | Jie Zhao |
collection | DOAJ |
description | Deep operator network (DeepONet) has been proven to be highly successful in operator learning tasks. Theoretical analysis indicates that the generalization error of DeepONet should decrease as the basis dimension increases, thus providing a systematic way to reduce its generalization errors (GEs) by varying the network hyperparameters. However, in practice, we found that, depending on the problem being solved and the activation function used, the GEs fluctuate unpredictably, contrary to theoretical expectations. Upon analyzing the output matrix of the trunk net, we determined that this behavior stems from the learned basis functions being highly linearly dependent, which limits the expressivity of the vanilla DeepONet. To address these limitations, we propose QR decomposition enhanced DeepONet (QR-DeepONet), an enhanced version of DeepONet using QR decomposition. These modifications ensure that the learned basis functions are linearly independent and orthogonal to each other. The numerical results demonstrate that the GEs of QR-DeepONet follow the theoretical prediction, decreasing monotonically as the basis dimension increases, and outperform those of vanilla DeepONet. Consequently, the proposed method successfully fills the gap between theory and practice. |
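The mechanism the abstract describes can be sketched in a few lines. This is an illustrative reconstruction, not the authors' implementation: random matrices stand in for the trunk-net outputs and branch-net coefficients, and the trunk matrix is factored as T = QR so that the orthonormal columns of Q replace the (possibly nearly linearly dependent) raw basis, while the branch coefficients are mapped through R so the network output is unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)

n_points, p = 200, 16                 # evaluation points, basis dimension
T = rng.normal(size=(n_points, p))    # stand-in for trunk-net outputs t_k(y_j)
T[:, 1] = T[:, 0] + 1e-8 * rng.normal(size=n_points)  # nearly dependent columns
b = rng.normal(size=p)                # stand-in for branch-net coefficients b_k(x)

# Vanilla DeepONet prediction: u(y_j) = sum_k b_k(x) * t_k(y_j) = T @ b.
u_vanilla = T @ b

# Reduced QR factorization: T = Q @ R, with Q having orthonormal columns.
Q, R = np.linalg.qr(T)

# Re-express the prediction in the orthonormal basis: T @ b = Q @ (R @ b),
# so the output is unchanged while the basis functions become orthogonal.
u_qr = Q @ (R @ b)

assert np.allclose(u_vanilla, u_qr)
assert np.allclose(Q.T @ Q, np.eye(p), atol=1e-8)  # columns of Q are orthonormal
```

The key point the abstract makes is visible here: orthogonalizing the basis does not change what the network can represent at a fixed dimension, but it removes the near-linear dependence among basis functions that limits expressivity in practice.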
format | Article |
id | doaj-art-9152ce1838134caf8b9eb19e6fc7175e |
institution | Kabale University |
issn | 2632-2153 |
language | English |
publishDate | 2024-01-01 |
publisher | IOP Publishing |
record_format | Article |
series | Machine Learning: Science and Technology |
spelling | doaj-art-9152ce1838134caf8b9eb19e6fc7175e2025-01-31T15:51:50ZengIOP PublishingMachine Learning: Science and Technology2632-21532024-01-015404507510.1088/2632-2153/ada0a5QR-DeepONet: resolve abnormal convergence issue in deep operator networkJie Zhao0https://orcid.org/0000-0003-0500-242XBiwei Xie1Xingquan Li2Pengcheng Laboratory , Shenzhen 518055, People’s Republic of ChinaPengcheng Laboratory , Shenzhen 518055, People’s Republic of China; Institute of Computing Technology , Chinese Academy of Sciences, Beijing, People’s Republic of ChinaPengcheng Laboratory , Shenzhen 518055, People’s Republic of China; School of Mathematics and Statistics, Minnan Normal University , Zhangzhou, People’s Republic of ChinaDeep operator network (DeepONet) has been proven to be highly successful in operator learning tasks. Theoretical analysis indicates that the generalization error of DeepONet should decrease as the basis dimension increases, thus providing a systematic way to reduce its generalization errors (GEs) by varying the network hyperparameters. However, in practice, we found that, depending on the problem being solved and the activation function used, the GEs fluctuate unpredictably, contrary to theoretical expectations. Upon analyzing the output matrix of the trunk net, we determined that this behavior stems from the learned basis functions being highly linearly dependent, which limits the expressivity of the vanilla DeepONet. To address these limitations, we propose QR decomposition enhanced DeepONet (QR-DeepONet), an enhanced version of DeepONet using QR decomposition. These modifications ensure that the learned basis functions are linearly independent and orthogonal to each other. The numerical results demonstrate that the GEs of QR-DeepONet follow the theoretical prediction, decreasing monotonically as the basis dimension increases, and outperform those of vanilla DeepONet. 
Consequently, the proposed method successfully fills the gap between theory and practice.https://doi.org/10.1088/2632-2153/ada0a5operator learningQR decompositionmachine learningneural network |
spellingShingle | Jie Zhao Biwei Xie Xingquan Li QR-DeepONet: resolve abnormal convergence issue in deep operator network Machine Learning: Science and Technology operator learning QR decomposition machine learning neural network |
title | QR-DeepONet: resolve abnormal convergence issue in deep operator network |
title_full | QR-DeepONet: resolve abnormal convergence issue in deep operator network |
title_fullStr | QR-DeepONet: resolve abnormal convergence issue in deep operator network |
title_full_unstemmed | QR-DeepONet: resolve abnormal convergence issue in deep operator network |
title_short | QR-DeepONet: resolve abnormal convergence issue in deep operator network |
title_sort | qr deeponet resolve abnormal convergence issue in deep operator network |
topic | operator learning QR decomposition machine learning neural network |
url | https://doi.org/10.1088/2632-2153/ada0a5 |
work_keys_str_mv | AT jiezhao qrdeeponetresolveabnormalconvergenceissueindeepoperatornetwork AT biweixie qrdeeponetresolveabnormalconvergenceissueindeepoperatornetwork AT xingquanli qrdeeponetresolveabnormalconvergenceissueindeepoperatornetwork |