TransformerPayne: Enhancing Spectral Emulation Accuracy and Data Efficiency by Capturing Long-range Correlations

Stellar spectra emulators often rely on large grids and tend to reach a plateau in emulation accuracy, leading to significant systematic errors when inferring stellar properties. Our study explores the use of Transformer models to capture long-range information in spectra, comparing their performanc...

Bibliographic Details
Main Authors: Tomasz Różański, Yuan-Sen Ting, Maja Jabłońska
Format: Article
Language: English
Published: IOP Publishing 2025-01-01
Series: The Astrophysical Journal
Subjects: Stellar atmospheres; Galactic archaeology; Astroinformatics; Astrostatistics
Online Access: https://doi.org/10.3847/1538-4357/ad9b99
author Tomasz Różański
Yuan-Sen Ting
Maja Jabłońska
collection DOAJ
description Stellar spectra emulators often rely on large grids and tend to reach a plateau in emulation accuracy, leading to significant systematic errors when inferring stellar properties. Our study explores the use of Transformer models to capture long-range information in spectra, comparing their performance to the Payne emulator (a fully connected multilayer perceptron), an expanded version of The Payne, and a convolutional-based emulator. We tested these models on synthetic spectral grids, evaluating their performance by analyzing emulation residuals and assessing the quality of spectral parameter inference. The newly introduced TransformerPayne emulator outperformed all other tested models, achieving a mean absolute error (MAE) of approximately 0.15% when trained on the full grid. The most significant improvements were observed in grids containing between 1000 and 10,000 spectra, with TransformerPayne showing 2–5 times better performance than the scaled-up version of The Payne. Additionally, TransformerPayne demonstrated superior fine-tuning capabilities, allowing for pretraining on one spectral model grid before transferring to another. This fine-tuning approach enabled up to a 10-fold reduction in training grid size compared to models trained from scratch. Analysis of TransformerPayne's attention maps revealed that they encode interpretable features common across many spectral lines of chosen elements. While scaling up The Payne to a larger network reduced its MAE from 1.2% to 0.3% when trained on the full data set, TransformerPayne consistently achieved the lowest MAE across all tests. The inductive biases of the TransformerPayne emulator enhance accuracy, data efficiency, and interpretability for spectral emulation compared to existing methods.
format Article
id doaj-art-3111d8e9584f4838a31ea189373ba308
institution Kabale University
issn 1538-4357
language English
publishDate 2025-01-01
publisher IOP Publishing
record_format Article
series The Astrophysical Journal
spelling doaj-art-3111d8e9584f4838a31ea189373ba308
2025-02-04T13:05:37Z
eng
IOP Publishing
The Astrophysical Journal
1538-4357
2025-01-01
980 1 66
10.3847/1538-4357/ad9b99
TransformerPayne: Enhancing Spectral Emulation Accuracy and Data Efficiency by Capturing Long-range Correlations
Tomasz Różański (0) https://orcid.org/0000-0002-5819-3023
Yuan-Sen Ting (1) https://orcid.org/0000-0001-5082-9536
Maja Jabłońska (2) https://orcid.org/0000-0001-6962-4979
(0) Research School of Astronomy & Astrophysics, The Australian National University, Cotter Rd., Weston, ACT 2611, Australia; Astronomical Institute, University of Wrocław, Kopernika 11, 51-622 Wrocław, Poland
(1) Department of Astronomy, The Ohio State University, Columbus, OH 45701, USA; Center for Cosmology and AstroParticle Physics (CCAPP), The Ohio State University, Columbus, OH 43210, USA
(2) Research School of Astronomy & Astrophysics, The Australian National University, Cotter Rd., Weston, ACT 2611, Australia
https://doi.org/10.3847/1538-4357/ad9b99
Stellar atmospheres
Galactic archaeology
Astroinformatics
Astrostatistics
title TransformerPayne: Enhancing Spectral Emulation Accuracy and Data Efficiency by Capturing Long-range Correlations
topic Stellar atmospheres
Galactic archaeology
Astroinformatics
Astrostatistics
url https://doi.org/10.3847/1538-4357/ad9b99