Characterizing out-of-distribution generalization of neural networks: application to the disordered Su–Schrieffer–Heeger model
Machine learning (ML) is a promising tool for the detection of phases of matter. However, ML models are also known for their black-box construction, which hinders understanding of what they learn from the data and makes their application to novel data risky. Moreover, the central challenge of ML is...
Main Authors: | Kacper Cybiński, Marcin Płodzień, Michał Tomza, Maciej Lewenstein, Alexandre Dauphin, Anna Dawid |
---|---|
Format: | Article |
Language: | English |
Published: | IOP Publishing, 2025-01-01 |
Series: | Machine Learning: Science and Technology |
Subjects: | out-of-distribution generalization; interpretability; disorder; topological phases of matter |
Online Access: | https://doi.org/10.1088/2632-2153/ad9079 |
author | Kacper Cybiński; Marcin Płodzień; Michał Tomza; Maciej Lewenstein; Alexandre Dauphin; Anna Dawid |
author_facet | Kacper Cybiński; Marcin Płodzień; Michał Tomza; Maciej Lewenstein; Alexandre Dauphin; Anna Dawid |
author_sort | Kacper Cybiński |
collection | DOAJ |
description | Machine learning (ML) is a promising tool for the detection of phases of matter. However, ML models are also known for their black-box construction, which hinders understanding of what they learn from the data and makes their application to novel data risky. Moreover, the central challenge of ML is to ensure its good generalization abilities, i.e. good performance on data outside the training set. Here, we show how the informed use of an interpretability method called class activation mapping, and the analysis of the latent representation of the data with the principal component analysis can increase trust in predictions of a neural network (NN) trained to classify quantum phases. In particular, we show that we can ensure better out-of-distribution (OOD) generalization in the complex classification problem by choosing such an NN that, in the simplified version of the problem, learns a known characteristic of the phase. We also discuss the characteristics of the data representation learned by a network that are predictors of its good OOD generalization. We show this on an example of the topological Su–Schrieffer–Heeger model with and without disorder, which turned out to be surprisingly challenging for NNs trained in a supervised way. This work is an example of how the systematic use of interpretability methods can improve the performance of NNs in scientific problems. |
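The abstract describes analyzing the latent representation of the data with principal component analysis (PCA). As a rough illustration only (not the paper's actual code), the sketch below applies PCA, implemented directly with NumPy, to synthetic stand-ins for a phase classifier's penultimate-layer activations; the two clusters, their separation, and the names `phase_a`, `phase_b`, and `latent` are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical latent activations for two phases: two clusters separated
# along one latent direction mimic a representation that has learned a
# low-dimensional characteristic of the phase.
phase_a = rng.normal(0.0, 0.3, size=(100, 16))
phase_b = rng.normal(0.0, 0.3, size=(100, 16))
phase_b[:, 0] += 3.0  # separation along a single latent direction

latent = np.vstack([phase_a, phase_b])

# PCA via eigendecomposition of the covariance of the centered activations.
centered = latent - latent.mean(axis=0)
cov = centered.T @ centered / (len(centered) - 1)
eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending order
order = np.argsort(eigvals)[::-1]        # reorder: largest variance first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Fraction of variance captured by each principal component, and the
# projection of every sample onto the first component.
explained = eigvals / eigvals.sum()
pc1 = centered @ eigvecs[:, 0]
```

If the representation has learned the phase cleanly, the first principal component captures most of the variance and the projections `pc1` of the two classes form well-separated groups; a diffuse, class-mixed projection would be a warning sign for out-of-distribution use.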
format | Article |
id | doaj-art-60a2d36ceaa6463c832f109832b87d89 |
institution | Kabale University |
issn | 2632-2153 |
language | English |
publishDate | 2025-01-01 |
publisher | IOP Publishing |
record_format | Article |
series | Machine Learning: Science and Technology |
spelling | Record ID: doaj-art-60a2d36ceaa6463c832f109832b87d89 (indexed 2025-01-22T07:24:20Z). Language: English. Publisher: IOP Publishing. Journal: Machine Learning: Science and Technology, ISSN 2632-2153, 2025-01-01, vol. 6, no. 1, art. 015014. DOI: 10.1088/2632-2153/ad9079.
Title: Characterizing out-of-distribution generalization of neural networks: application to the disordered Su–Schrieffer–Heeger model.
Authors: Kacper Cybiński (ORCID 0000-0003-2600-7473), Faculty of Physics, University of Warsaw, Pasteura 5, 02-093 Warsaw, Poland; Marcin Płodzień (ORCID 0000-0002-0835-1644), ICFO—Institut de Ciències Fotòniques, The Barcelona Institute of Science and Technology, Av. Carl Friedrich Gauss 3, 08860 Castelldefels (Barcelona), Spain; Michał Tomza (ORCID 0000-0003-1792-8043), Faculty of Physics, University of Warsaw, Pasteura 5, 02-093 Warsaw, Poland; Maciej Lewenstein (ORCID 0000-0002-0210-7800), ICFO—Institut de Ciències Fotòniques, The Barcelona Institute of Science and Technology, Av. Carl Friedrich Gauss 3, 08860 Castelldefels (Barcelona), Spain, and ICREA, Pg. Lluís Campanys 23, 08010 Barcelona, Spain; Alexandre Dauphin (ORCID 0000-0003-4996-2561), ICFO—Institut de Ciències Fotòniques, The Barcelona Institute of Science and Technology, Av. Carl Friedrich Gauss 3, 08860 Castelldefels (Barcelona), Spain, and PASQAL SAS, 7 rue Léonard de Vinci, 91300 Massy, Paris, France; Anna Dawid (ORCID 0000-0001-9498-1732), Center for Computational Quantum Physics, Flatiron Institute, 162 Fifth Avenue, New York, NY 10010, United States of America.
Abstract: Machine learning (ML) is a promising tool for the detection of phases of matter. However, ML models are also known for their black-box construction, which hinders understanding of what they learn from the data and makes their application to novel data risky. Moreover, the central challenge of ML is to ensure its good generalization abilities, i.e. good performance on data outside the training set. Here, we show how the informed use of an interpretability method called class activation mapping, and the analysis of the latent representation of the data with the principal component analysis can increase trust in predictions of a neural network (NN) trained to classify quantum phases. In particular, we show that we can ensure better out-of-distribution (OOD) generalization in the complex classification problem by choosing such an NN that, in the simplified version of the problem, learns a known characteristic of the phase. We also discuss the characteristics of the data representation learned by a network that are predictors of its good OOD generalization. We show this on an example of the topological Su–Schrieffer–Heeger model with and without disorder, which turned out to be surprisingly challenging for NNs trained in a supervised way. This work is an example of how the systematic use of interpretability methods can improve the performance of NNs in scientific problems.
Keywords: out-of-distribution generalization; interpretability; disorder; topological phases of matter. URL: https://doi.org/10.1088/2632-2153/ad9079 |
title | Characterizing out-of-distribution generalization of neural networks: application to the disordered Su–Schrieffer–Heeger model |
title_full | Characterizing out-of-distribution generalization of neural networks: application to the disordered Su–Schrieffer–Heeger model |
title_fullStr | Characterizing out-of-distribution generalization of neural networks: application to the disordered Su–Schrieffer–Heeger model |
title_full_unstemmed | Characterizing out-of-distribution generalization of neural networks: application to the disordered Su–Schrieffer–Heeger model |
title_short | Characterizing out-of-distribution generalization of neural networks: application to the disordered Su–Schrieffer–Heeger model |
title_sort | characterizing out of distribution generalization of neural networks application to the disordered su schrieffer heeger model |
topic | out-of-distribution generalization; interpretability; disorder; topological phases of matter |
url | https://doi.org/10.1088/2632-2153/ad9079 |