A labeled dataset for AI-based cryo-EM map enhancement

Cryogenic electron microscopy (cryo-EM) has transformed structural biology by enabling near-atomic resolution imaging of macromolecular complexes. However, cryo-EM density maps suffer from intrinsic noise arising from structural sources, shot noise, and digital recording, which complicates accurate...

Bibliographic Details
Main Authors: Nabin Giri, Xiao Chen, Liguo Wang, Jianlin Cheng
Format: Article
Language: English
Published: Elsevier 2025-01-01
Series: Computational and Structural Biotechnology Journal
Subjects: Cryo-EM; Cryo-EM map enhancement; Protein structure; Dataset
Online Access: http://www.sciencedirect.com/science/article/pii/S2001037025002570
collection DOAJ
description Cryogenic electron microscopy (cryo-EM) has transformed structural biology by enabling near-atomic resolution imaging of macromolecular complexes. However, cryo-EM density maps suffer from intrinsic noise arising from structural sources, shot noise, and digital recording, which complicates accurate model building. While various methods for denoising cryo-EM density maps exist, there is a lack of standardized datasets for benchmarking artificial intelligence (AI) approaches. Here, we present an open-source dataset for cryo-EM density map denoising comprising 650 high-resolution (1-4 Å) experimental maps paired with three types of generated label maps: regression maps capturing idealized density distributions, binary classification maps distinguishing structural elements from background, and atom-type classification maps. Each map is standardized to 1 Å voxel size and validated through Fourier Shell Correlation analysis, demonstrating substantial resolution improvements in label maps compared to experimental maps. This resource bridges the gap between the structural biology and artificial intelligence communities, allowing researchers to develop and benchmark innovative methods for enhancing cryo-EM density maps.
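The description above mentions validation through Fourier Shell Correlation (FSC) but does not say how the authors computed it. As a rough illustration only, the following minimal sketch implements the textbook FSC definition with NumPy; the function name `fourier_shell_correlation`, the restriction to cubic volumes, and the integer-radius shell binning are assumptions made for this example and are not taken from the article.

```python
import numpy as np


def fourier_shell_correlation(vol_a, vol_b, voxel_size=1.0):
    """Textbook FSC between two cubic 3D volumes of identical shape.

    Hypothetical helper for illustration; not the authors' pipeline.
    Returns (spatial_frequency in 1/Angstrom, fsc) per shell up to Nyquist.
    """
    if vol_a.shape != vol_b.shape or len(set(vol_a.shape)) != 1 or vol_a.ndim != 3:
        raise ValueError("expected two cubic 3D volumes of the same shape")

    n = vol_a.shape[0]
    fa = np.fft.fftn(vol_a)
    fb = np.fft.fftn(vol_b)

    # Integer frequency index along each axis, then radial shell index per voxel.
    freq = np.fft.fftfreq(n) * n
    kz, ky, kx = np.meshgrid(freq, freq, freq, indexing="ij")
    shell = np.rint(np.sqrt(kx**2 + ky**2 + kz**2)).astype(int)

    n_shells = n // 2 + 1          # keep shells from DC up to Nyquist
    mask = shell < n_shells
    idx = shell[mask]

    # Per-shell sums of the cross term and of each volume's power.
    cross = np.bincount(idx, weights=(fa * np.conj(fb)).real[mask], minlength=n_shells)
    power_a = np.bincount(idx, weights=(np.abs(fa) ** 2)[mask], minlength=n_shells)
    power_b = np.bincount(idx, weights=(np.abs(fb) ** 2)[mask], minlength=n_shells)

    denom = np.sqrt(power_a * power_b)
    fsc = np.divide(cross, denom, out=np.zeros_like(cross), where=denom > 0)

    spatial_freq = np.arange(n_shells) / (n * voxel_size)
    return spatial_freq, fsc
```

With maps resampled to 1 Å voxels, as the description states, `voxel_size=1.0` puts the Nyquist shell at 0.5 Å⁻¹. A resolution estimate is then commonly read off where the curve drops below a chosen threshold; which threshold the authors used is not stated in this record.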
format Article
id doaj-art-e9a4142e5c7e48be9f1d3dcdc3dff1ce
institution DOAJ
issn 2001-0370
language English
publishDate 2025-01-01
publisher Elsevier
record_format Article
series Computational and Structural Biotechnology Journal
spelling doaj-art-e9a4142e5c7e48be9f1d3dcdc3dff1ce
Timestamp: 2025-08-20T02:43:50Z
Language: eng
Publisher: Elsevier
Journal: Computational and Structural Biotechnology Journal
ISSN: 2001-0370
Published: 2025-01-01
Volume: 27
Pages: 2843-2850
DOI: 10.1016/j.csbj.2025.06.041
Title: A labeled dataset for AI-based cryo-EM map enhancement
Authors and affiliations:
Nabin Giri (0): Electrical Engineering and Computer Science, University of Missouri, Columbia, 65211, MO, USA; NextGen Precision Health Institute, University of Missouri, Columbia, 65211, MO, USA
Xiao Chen (1): Computer Science Department, Hamilton College, Clinton, 13323, NY, USA
Liguo Wang (2): Laboratory for BioMolecular Structure, Brookhaven National Laboratory, Upton, 11973, NY, USA
Jianlin Cheng (3): Electrical Engineering and Computer Science, University of Missouri, Columbia, 65211, MO, USA; NextGen Precision Health Institute, University of Missouri, Columbia, 65211, MO, USA; Corresponding author at: Electrical Engineering and Computer Science, University of Missouri, Columbia, 65211, MO, USA.
URL: http://www.sciencedirect.com/science/article/pii/S2001037025002570
Keywords: Cryo-EM; Cryo-EM map enhancement; Protein structure; Dataset
topic Cryo-EM
Cryo-EM map enhancement
Protein structure
Dataset