One size does not fit all: revising traditional paradigms for assessing accuracy of QSAR models used for virtual screening
Abstract Traditional best practices for quantitative structure activity relationship (QSAR) modeling recommend dataset balancing and balanced accuracy (BA) as the key desired objective of model development. This study explores the value of the conventional norms in the context of using QSAR models f...
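Main Authors: | James Wellnitz, Sankalp Jain, Joshua E. Hochuli, Travis Maxfield, Eugene N. Muratov, Alexander Tropsha, Alexey V. Zakharov |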
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2025-01-01 |
Series: | Journal of Cheminformatics |
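Subjects: | Computer-assisted drug discovery, QSAR modeling, Imbalanced datasets, Virtual screening, Positive predictive value, Hit rate |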
Online Access: | https://doi.org/10.1186/s13321-025-00948-y |
_version_ | 1832594501160402944 |
---|---|
author | James Wellnitz Sankalp Jain Joshua E. Hochuli Travis Maxfield Eugene N. Muratov Alexander Tropsha Alexey V. Zakharov |
author_sort | James Wellnitz |
collection | DOAJ |
description | Abstract Traditional best practices for quantitative structure activity relationship (QSAR) modeling recommend dataset balancing and balanced accuracy (BA) as the key desired objective of model development. This study explores the value of the conventional norms in the context of using QSAR models for virtual screening of modern large and ultra-large chemical libraries. For this increasingly common task, we now recommend the use of models with the highest positive predictive value (PPV) built on imbalanced training sets as preferred virtual screening tools. This recommendation stems from practical considerations of how the results of virtual screening are used in experimental laboratories where only a small fraction of virtually screened molecules can be tested using standard well plates. As a proof of concept, we have developed QSAR models for five expansive datasets with different ratios of active and inactive molecules and compared model performance in virtual screening using BA, PPV, and other metrics. We show that training on imbalanced datasets achieves a hit rate at least 30% higher than using balanced datasets, and that the PPV metric captured this difference of performance with no parameter tuning. Importantly, hit rates were estimated for top scoring compounds organized in batches of the size of plates (for instance, 128 molecules) used in the experimental high throughput screening. Based on the results of our studies, we posit that QSAR models trained on imbalanced datasets with the highest PPV should be relied upon to identify and test hit compounds in early drug discovery studies. |
format | Article |
id | doaj-art-aa3da52fd7ce446eb33b88c524bc4f5e |
institution | Kabale University |
issn | 1758-2946 |
language | English |
publishDate | 2025-01-01 |
publisher | BMC |
record_format | Article |
series | Journal of Cheminformatics |
spelling | doaj-art-aa3da52fd7ce446eb33b88c524bc4f5e 2025-01-19T12:37:02Z eng BMC Journal of Cheminformatics 1758-2946 2025-01-01 17118 10.1186/s13321-025-00948-y One size does not fit all: revising traditional paradigms for assessing accuracy of QSAR models used for virtual screening. James Wellnitz (Division of Chemical Biology and Medicinal Chemistry, Laboratory for Molecular Modeling, UNC Eshelman School of Pharmacy, University of North Carolina); Sankalp Jain (National Center for Advancing Translational Sciences (NCATS), National Institutes of Health); Joshua E. Hochuli (Division of Chemical Biology and Medicinal Chemistry, Laboratory for Molecular Modeling, UNC Eshelman School of Pharmacy, University of North Carolina); Travis Maxfield (Division of Chemical Biology and Medicinal Chemistry, Laboratory for Molecular Modeling, UNC Eshelman School of Pharmacy, University of North Carolina); Eugene N. Muratov (Division of Chemical Biology and Medicinal Chemistry, Laboratory for Molecular Modeling, UNC Eshelman School of Pharmacy, University of North Carolina); Alexander Tropsha (Division of Chemical Biology and Medicinal Chemistry, Laboratory for Molecular Modeling, UNC Eshelman School of Pharmacy, University of North Carolina); Alexey V. Zakharov (National Center for Advancing Translational Sciences (NCATS), National Institutes of Health). https://doi.org/10.1186/s13321-025-00948-y Computer-assisted drug discovery; QSAR modeling; Imbalanced datasets; Virtual screening; Positive predictive value; Hit rate |
title | One size does not fit all: revising traditional paradigms for assessing accuracy of QSAR models used for virtual screening |
topic | Computer-assisted drug discovery QSAR modeling Imbalanced datasets Virtual screening Positive predictive value Hit rate |
url | https://doi.org/10.1186/s13321-025-00948-y |
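
The abstract in this record contrasts balanced accuracy (BA) with positive predictive value (PPV) and reports hit rates for top-scoring compounds grouped into plate-sized batches (for instance, 128 molecules). The following is a minimal sketch of how those three quantities could be computed from binary labels and model scores. The function names, the 0/1 label encoding, and the choice to return 0.0 when a denominator is empty are illustrative assumptions, not the authors' implementation.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels encoded as 1 = active, 0 = inactive."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == 1 and t == 1:
            tp += 1
        elif p == 1:
            fp += 1
        elif t == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of sensitivity and specificity."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    # Assumption: an empty class contributes 0.0 rather than raising an error.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return (sensitivity + specificity) / 2


def positive_predictive_value(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predicted actives that are truly active: tp / (tp + fp)."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if (tp + fp) else 0.0


def top_batch_hit_rate(y_true: Sequence[int], scores: Sequence[float], batch_size: int = 128) -> float:
    """Fraction of true actives among the batch_size highest-scoring compounds.

    The default of 128 mirrors the plate size given as an example in the abstract.
    Ties keep their original order because Python's sort is stable.
    """
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top = ranked[:batch_size]
    return sum(y_true[i] for i in top) / len(top) if top else 0.0


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    labels = [1, 0, 1, 0, 0, 0]
    scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.1]
    predictions = [1 if s >= 0.5 else 0 for s in scores]
    print("BA:", balanced_accuracy(labels, predictions))
    print("PPV:", positive_predictive_value(labels, predictions))
    print("Top-2 hit rate:", top_batch_hit_rate(labels, scores, batch_size=2))
```

BA weights errors on actives and inactives equally across the whole library, whereas PPV and the top-batch hit rate look only at the compounds that would actually be selected for testing, which is the practical consideration the abstract emphasizes.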