Characterizing out-of-distribution generalization of neural networks: application to the disordered Su–Schrieffer–Heeger model

Machine learning (ML) is a promising tool for the detection of phases of matter. However, ML models are also known for their black-box construction, which hinders understanding of what they learn from the data and makes their application to novel data risky. Moreover, a central challenge of ML is to ensure good generalization, i.e. good performance on data outside the training set. Here, we show how the informed use of an interpretability method called class activation mapping, together with the analysis of the latent representation of the data via principal component analysis, can increase trust in the predictions of a neural network (NN) trained to classify quantum phases. In particular, we show that better out-of-distribution (OOD) generalization in a complex classification problem can be ensured by choosing an NN that, in a simplified version of the problem, learns a known characteristic of the phase. We also discuss which characteristics of the learned data representation are predictors of good OOD generalization. We demonstrate this on the example of the topological Su–Schrieffer–Heeger model with and without disorder, which turned out to be surprisingly challenging for NNs trained in a supervised way. This work illustrates how the systematic use of interpretability methods can improve the performance of NNs in scientific problems.


Bibliographic Details
Main Authors: Kacper Cybiński, Marcin Płodzień, Michał Tomza, Maciej Lewenstein, Alexandre Dauphin, Anna Dawid
Format: Article
Language: English
Published: IOP Publishing 2025-01-01
Series:Machine Learning: Science and Technology
Subjects: out-of-distribution generalization, interpretability, disorder, topological phases of matter
Online Access: https://doi.org/10.1088/2632-2153/ad9079
author Kacper Cybiński
Marcin Płodzień
Michał Tomza
Maciej Lewenstein
Alexandre Dauphin
Anna Dawid
collection DOAJ
description Machine learning (ML) is a promising tool for the detection of phases of matter. However, ML models are also known for their black-box construction, which hinders understanding of what they learn from the data and makes their application to novel data risky. Moreover, a central challenge of ML is to ensure good generalization, i.e. good performance on data outside the training set. Here, we show how the informed use of an interpretability method called class activation mapping, together with the analysis of the latent representation of the data via principal component analysis, can increase trust in the predictions of a neural network (NN) trained to classify quantum phases. In particular, we show that better out-of-distribution (OOD) generalization in a complex classification problem can be ensured by choosing an NN that, in a simplified version of the problem, learns a known characteristic of the phase. We also discuss which characteristics of the learned data representation are predictors of good OOD generalization. We demonstrate this on the example of the topological Su–Schrieffer–Heeger model with and without disorder, which turned out to be surprisingly challenging for NNs trained in a supervised way. This work illustrates how the systematic use of interpretability methods can improve the performance of NNs in scientific problems.
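The abstract names two analysis tools: class activation mapping (CAM) over a convolutional network's feature maps, and principal component analysis (PCA) of the network's latent representation. The core of both operations can be sketched in a few lines of NumPy; all array shapes, function names, and parameters below are illustrative assumptions, not the authors' code.

```python
import numpy as np

def class_activation_map(feature_maps, class_weights):
    """CAM for one input: a weighted sum of the last convolutional
    layer's feature maps, weighted by the output-layer weights of
    the class of interest.
    feature_maps: (C, H, W) activations; class_weights: (C,)."""
    return np.tensordot(class_weights, feature_maps, axes=1)  # (H, W)

def latent_pca(latents, n_components=2):
    """Project latent vectors (N, D) onto their top principal
    components. Clustering by phase label in this low-dimensional
    plane is the kind of latent-space structure the abstract links
    to good OOD generalization."""
    centered = latents - latents.mean(axis=0)
    # SVD of the centered data: rows of vt are principal directions
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T  # (N, n_components)

rng = np.random.default_rng(0)
cam = class_activation_map(rng.normal(size=(8, 4, 4)), rng.normal(size=8))
proj = latent_pca(rng.normal(size=(100, 16)))
```

In practice the feature maps and latent vectors would be read out of the trained NN with forward hooks; random arrays stand in for them here only to make the sketch self-contained.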
format Article
id doaj-art-60a2d36ceaa6463c832f109832b87d89
institution Kabale University
issn 2632-2153
language English
publishDate 2025-01-01
publisher IOP Publishing
record_format Article
series Machine Learning: Science and Technology
spelling Machine Learning: Science and Technology 6(1), 015014 (2025). Published 2025-01-01 by IOP Publishing. ISSN 2632-2153. https://doi.org/10.1088/2632-2153/ad9079
Kacper Cybiński (ORCID 0000-0003-2600-7473): Faculty of Physics, University of Warsaw, Pasteura 5, 02-093 Warsaw, Poland
Marcin Płodzień (ORCID 0000-0002-0835-1644): ICFO—Institut de Ciències Fotòniques, The Barcelona Institute of Science and Technology, Av. Carl Friedrich Gauss 3, 08860 Castelldefels (Barcelona), Spain
Michał Tomza (ORCID 0000-0003-1792-8043): Faculty of Physics, University of Warsaw, Pasteura 5, 02-093 Warsaw, Poland
Maciej Lewenstein (ORCID 0000-0002-0210-7800): ICFO—Institut de Ciències Fotòniques, The Barcelona Institute of Science and Technology, Av. Carl Friedrich Gauss 3, 08860 Castelldefels (Barcelona), Spain; ICREA, Pg. Lluís Campanys 23, 08010 Barcelona, Spain
Alexandre Dauphin (ORCID 0000-0003-4996-2561): ICFO—Institut de Ciències Fotòniques, The Barcelona Institute of Science and Technology, Av. Carl Friedrich Gauss 3, 08860 Castelldefels (Barcelona), Spain; PASQAL SAS, 7 rue Léonard de Vinci, 91300 Massy, Paris, France
Anna Dawid (ORCID 0000-0001-9498-1732): Center for Computational Quantum Physics, Flatiron Institute, 162 Fifth Avenue, New York, NY 10010, United States of America
Keywords: out-of-distribution generalization; interpretability; disorder; topological phases of matter
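For readers less familiar with the model named in the title, the Su–Schrieffer–Heeger (SSH) chain is a tight-binding model with alternating bond strengths; hopping disorder that preserves chiral symmetry keeps the single-particle spectrum symmetric about zero energy, and in the topological phase (inter-cell hopping stronger than intra-cell) an open chain hosts near-zero-energy edge modes. A minimal, illustrative NumPy sketch follows; function and parameter names are our own, not taken from the article's code.

```python
import numpy as np

def ssh_hamiltonian(cells, v, w, disorder=0.0, rng=None):
    """Single-particle SSH chain with 2*cells sites, open boundaries.
    v: intra-cell hopping, w: inter-cell hopping. `disorder` adds
    uniform random noise to every bond, which preserves chiral
    symmetry because only off-diagonal (hopping) terms are touched."""
    rng = rng or np.random.default_rng(0)
    n = 2 * cells
    hop = np.empty(n - 1)
    hop[0::2] = v          # intra-cell bonds
    hop[1::2] = w          # inter-cell bonds
    hop += disorder * rng.uniform(-1.0, 1.0, size=n - 1)
    h = np.zeros((n, n))
    idx = np.arange(n - 1)
    h[idx, idx + 1] = hop  # upper diagonal
    h[idx + 1, idx] = hop  # lower diagonal (Hermitian)
    return h

# Topological parameters (w > v): expect two near-zero edge modes
h = ssh_hamiltonian(20, v=0.5, w=1.0)
evals = np.linalg.eigvalsh(h)  # sorted ascending
```

Eigenvectors of such Hamiltonians, sampled across the (v, w, disorder) parameter space, are the kind of input data an NN phase classifier for this model would be trained on.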
title Characterizing out-of-distribution generalization of neural networks: application to the disordered Su–Schrieffer–Heeger model
topic out-of-distribution generalization
interpretability
disorder
topological phases of matter
url https://doi.org/10.1088/2632-2153/ad9079