Protein identification using Cryo-EM and artificial intelligence guides improved sample purification

Protein purification is essential in protein biochemistry, structural biology, and protein design, enabling the determination of protein structures, the study of biological mechanisms, and the characterization of both natural and de novo designed proteins. However, standard purification strategies o...

Bibliographic Details
Main Authors: Kenneth D. Carr, Dane Evan D. Zambrano, Connor Weidle, Alex Goodson, Helen E. Eisenach, Harley Pyles, Alexis Courbet, Neil P. King, Andrew J. Borst
Format: Article
Language: English
Published: Elsevier 2025-06-01
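Series: Journal of Structural Biology: X
Subjects: Protein Purification; Contamination; Cryo-Electron Microscopy; Cryo-EM; DLST; Dihydrolipoamide Succinyltransferase
Online Access: http://www.sciencedirect.com/science/article/pii/S2590152425000017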
_version_ 1832576405843476480
author Kenneth D. Carr
Dane Evan D. Zambrano
Connor Weidle
Alex Goodson
Helen E. Eisenach
Harley Pyles
Alexis Courbet
Neil P. King
Andrew J. Borst
author_sort Kenneth D. Carr
collection DOAJ
description Protein purification is essential in protein biochemistry, structural biology, and protein design, enabling the determination of protein structures, the study of biological mechanisms, and the characterization of both natural and de novo designed proteins. However, standard purification strategies often encounter challenges, such as unintended co-purification of contaminants alongside the target protein. This issue is particularly problematic for self-assembling protein nanomaterials, where unexpected geometries may reflect novel assembly states, cross-contamination, or native proteins originating from the expression host. Here, we used an automated structure-to-sequence pipeline to first identify an unknown co-purifying protein found in several purified designed protein samples. By integrating cryo-electron microscopy (Cryo-EM), ModelAngelo’s sequence-agnostic model-building, and Protein BLAST, we identified the contaminant as dihydrolipoamide succinyltransferase (DLST). This identification was validated through comparisons with DLST structures in the Protein Data Bank, AlphaFold 3 predictions based on the DLST sequence from our E. coli expression vector, and traditional biochemical methods. The identification informed subsequent modifications to our purification protocol, which successfully excluded DLST from future preparations. To explore the potential broader utility of this approach, we benchmarked four computational methods for DLST identification across varying resolution ranges. This study demonstrates the successful application of a structure-to-sequence protein identification workflow, integrating Cryo-EM, ModelAngelo, Protein BLAST, and AlphaFold 3 predictions, to identify and ultimately help guide the removal of DLST from sample purification efforts. It highlights the potential of combining Cryo-EM with AI-driven tools for accurate protein identification and addressing purification challenges across diverse contexts in protein science.
format Article
id doaj-art-55fc077729064c51b385b732f9b92aa5
institution Kabale University
issn 2590-1524
language English
publishDate 2025-06-01
publisher Elsevier
record_format Article
series Journal of Structural Biology: X
spelling doaj-art-55fc077729064c51b385b732f9b92aa5
2025-01-31T05:12:21Z
eng
Elsevier
Journal of Structural Biology: X
2590-1524
2025-06-01
11
100120
Protein identification using Cryo-EM and artificial intelligence guides improved sample purification
Kenneth D. Carr (0)
Dane Evan D. Zambrano (1)
Connor Weidle (2)
Alex Goodson (3)
Helen E. Eisenach (4)
Harley Pyles (5)
Alexis Courbet (6)
Neil P. King (7)
Andrew J. Borst (8)
Department of Biochemistry, University of Washington, Seattle, WA 98195, USA; Institute for Protein Design, University of Washington, Seattle, WA 98195, USA (listed for each of the nine authors)
Corresponding author at: Department of Biochemistry, University of Washington, Seattle, WA 98195, USA.
title Protein identification using Cryo-EM and artificial intelligence guides improved sample purification
topic Protein Purification
Contamination
Cryo-Electron Microscopy
Cryo-EM
DLST
Dihydrolipoamide Succinyltransferase
url http://www.sciencedirect.com/science/article/pii/S2590152425000017