Transformer-generated atomic embeddings to enhance prediction accuracy of crystal properties with machine learning


Bibliographic Details
Main Authors: Luozhijie Jin, Zijian Du, Le Shu, Yan Cen, Yuanfeng Xu, Yongfeng Mei, Hao Zhang
Format: Article
Language: English
Published: Nature Portfolio 2025-01-01
Series: Nature Communications
Online Access: https://doi.org/10.1038/s41467-025-56481-x
Collection: DOAJ
Abstract: Accelerating the discovery of novel crystal materials by machine learning is crucial for advancing technologies ranging from clean energy to information processing. Machine-learning models for predicting materials properties require atomic information to be embedded, but traditional methods are of limited effectiveness in improving prediction accuracy. Here, we propose an atomic embedding strategy called universal atomic embeddings (UAEs), named for their broad applicability as atomic fingerprints, and generate the UAE tensors with the proposed CrystalTransformer model. In experiments on widely used materials databases, our CrystalTransformer-based UAEs (ct-UAEs) are shown to accurately capture complex atomic features, leading to a 14% improvement in prediction accuracy on CGCNN and 18% on ALIGNN when formation energy is the target, based on the Materials Project database. We also demonstrate good transferability of ct-UAEs across various databases. Clustering analysis of multi-task ct-UAEs shows that the elements of the periodic table can be categorized with reasonable connections between atomic features and the targeted crystal properties. Applying ct-UAEs to predict formation energy in a hybrid perovskites database improves accuracy by 34% in MEGNET and 16% in CGCNN, showcasing their potential as atomic fingerprints for addressing data scarcity.
ID: doaj-art-4c853ad0e0584ab9896e32b197bae37e
Institution: Kabale University
ISSN: 2041-1723
Volume 16, issue 1, pages 1-11
Author Affiliations:
Luozhijie Jin: School of Information Science and Technology, Fudan University
Zijian Du: Department of Physics, Fudan University
Le Shu: School of Information Science and Technology, Fudan University
Yan Cen: Department of Physics, Fudan University
Yuanfeng Xu: School of Science, Shandong Jianzhu University
Yongfeng Mei: Department of Materials, Fudan University
Hao Zhang: School of Information Science and Technology, Fudan University