AlphaFold 2, but not AlphaFold 3, predicts confident but unrealistic β-solenoid structures for repeat proteins

AlphaFold 2 (AF2) has revolutionised protein structure prediction but, like any new tool, its performance on specific classes of targets, especially those potentially under-represented in its training data, merits attention. Prompted by a highly confident prediction for a biologically meaningless, r...

Bibliographic Details
Main Authors: Olivia S. Pratt, Luc G. Elliott, Margaux Haon, Shahram Mesdaghi, Rebecca M. Price, Adam J. Simpkin, Daniel J. Rigden
Format: Article
Language: English
Published: Elsevier 2025-01-01
Series: Computational and Structural Biotechnology Journal
Subjects: Alphafold; Structure prediction; Beta-solenoid; Model confidence; Repeat proteins
Online Access: http://www.sciencedirect.com/science/article/pii/S2001037025000200
_version_ 1832585130625990656
author Olivia S. Pratt
Luc G. Elliott
Margaux Haon
Shahram Mesdaghi
Rebecca M. Price
Adam J. Simpkin
Daniel J. Rigden
author_sort Olivia S. Pratt
collection DOAJ
description AlphaFold 2 (AF2) has revolutionised protein structure prediction but, like any new tool, its performance on specific classes of targets, especially those potentially under-represented in its training data, merits attention. Prompted by a highly confident prediction for a biologically meaningless, randomly permuted repeat sequence, we assessed AF2 performance on sequences composed of perfect repeats of random sequences of different lengths. AF2 frequently folds such sequences into β-solenoids which, while ascribed high confidence, contain unusual and implausible features such as internally stacked and uncompensated charged residues. A number of sequences confidently predicted as β-solenoids are predicted by other advanced methods as intrinsically disordered. The instability of some predictions is demonstrated by molecular dynamics. Importantly, other deep learning-based structure prediction tools predict different structures, or β-solenoids with much lower confidence, suggesting that AF2 alone has an unreasonable tendency to predict confident but unrealistic β-solenoids for perfect repeat sequences. The potential implications for structure prediction of natural (near-)perfect sequence repeat proteins are also explored.
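The abstract describes test sequences built as perfect repeats of a random unit. A minimal sketch of how such inputs could be constructed is given here for illustration only; it is not the authors' code. The unit lengths (5, 10, 20), the copy number (10), the fixed seed and all function names are hypothetical choices that do not come from the record.

```python
import random

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def random_unit(length, rng):
    """Return a random repeat unit of the given length (uniform residue choice)."""
    return "".join(rng.choice(AMINO_ACIDS) for _ in range(length))


def perfect_repeat(unit, n_copies):
    """Tile the unit n_copies times with no mutations, insertions or linkers."""
    return unit * n_copies


def is_perfect_repeat(seq, unit_len):
    """Check that seq consists solely of exact copies of its first unit_len residues."""
    if unit_len <= 0 or len(seq) % unit_len != 0:
        return False
    return seq == seq[:unit_len] * (len(seq) // unit_len)


# Hypothetical settings: three unit lengths, ten copies each, fixed seed.
rng = random.Random(0)
designs = {n: perfect_repeat(random_unit(n, rng), 10) for n in (5, 10, 20)}

for unit_len, seq in designs.items():
    print(f">repeat_unit{unit_len}_x10\n{seq}")
```

The FASTA-style output could then be passed to any structure predictor; comparing per-residue confidence across tools is the kind of check the abstract reports, but that step is not sketched here.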
format Article
id doaj-art-63d5de881a024e22aeea0a1005d94bc0
institution Kabale University
issn 2001-0370
language English
publishDate 2025-01-01
publisher Elsevier
record_format Article
series Computational and Structural Biotechnology Journal
spelling doaj-art-63d5de881a024e22aeea0a1005d94bc0, 2025-01-27T04:21:49Z, eng
Elsevier, Computational and Structural Biotechnology Journal, ISSN 2001-0370, 2025-01-01, volume 27, pages 467-477
AlphaFold 2, but not AlphaFold 3, predicts confident but unrealistic β-solenoid structures for repeat proteins
Olivia S. Pratt (0), Luc G. Elliott (1), Margaux Haon (2), Shahram Mesdaghi (3), Rebecca M. Price (4), Adam J. Simpkin (5), Daniel J. Rigden (6)
(0), (1), (4), (5): Department of Biochemistry, Cell and Systems, Biology, Institute of Structural, Molecular and Integrative Biology, University of Liverpool, Crown Street, Liverpool L69 7ZB, United Kingdom
(2): Department of Biochemistry, Cell and Systems, Biology, Institute of Structural, Molecular and Integrative Biology, University of Liverpool, Crown Street, Liverpool L69 7ZB, United Kingdom; Department of Chemistry, University of Liverpool, Crown Street, Liverpool L69 7ZD, United Kingdom
(3): Department of Biochemistry, Cell and Systems, Biology, Institute of Structural, Molecular and Integrative Biology, University of Liverpool, Crown Street, Liverpool L69 7ZB, United Kingdom; Computational Biology Facility, MerseyBio, University of Liverpool, Crown Street, Liverpool L69 7ZB, United Kingdom
(6): Department of Biochemistry, Cell and Systems, Biology, Institute of Structural, Molecular and Integrative Biology, University of Liverpool, Crown Street, Liverpool L69 7ZB, United Kingdom; Corresponding author.
http://www.sciencedirect.com/science/article/pii/S2001037025000200
Alphafold; Structure prediction; Beta-solenoid; Model confidence; Repeat proteins
title AlphaFold 2, but not AlphaFold 3, predicts confident but unrealistic β-solenoid structures for repeat proteins
topic Alphafold
Structure prediction
Beta-solenoid
Model confidence
Repeat proteins
url http://www.sciencedirect.com/science/article/pii/S2001037025000200