Enhancing uncertainty quantification in drug discovery with censored regression labels
| Main Authors: | Emma Svensson, Hannah Rosa Friesacher, Susanne Winiwarter, Lewis Mervin, Adam Arany, Ola Engkvist |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2025-06-01 |
| Series: | Artificial Intelligence in the Life Sciences |
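| Subjects: | Uncertainty quantification; Censored regression; Temporal evaluation; Distribution shift; Deep learning; Drug discovery |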
| Online Access: | http://www.sciencedirect.com/science/article/pii/S2667318525000042 |
| _version_ | 1850128691240108032 |
|---|---|
| author | Emma Svensson; Hannah Rosa Friesacher; Susanne Winiwarter; Lewis Mervin; Adam Arany; Ola Engkvist |
| author_sort | Emma Svensson |
| collection | DOAJ |
| description | In the early stages of drug discovery, decisions regarding which experiments to pursue can be influenced by computational models for quantitative structure–activity relationships (QSAR). These decisions are critical due to the time-consuming and expensive nature of the experiments. Therefore, it is becoming essential to accurately quantify the uncertainty in machine learning predictions, such that resources can be used optimally and trust in the models improves. While computational methods for QSAR modeling often suffer from limited data and sparse experimental observations, additional information can exist in the form of censored labels that provide thresholds rather than precise values of observations. However, the standard approaches that quantify uncertainty in machine learning cannot fully utilize censored labels. In this work, we adapt ensemble-based, Bayesian, and Gaussian models with tools to learn from censored labels by using the Tobit model from survival analysis. Our results demonstrate that despite the partial information available in censored labels, they are essential to reliably estimate uncertainties in real pharmaceutical settings where approximately one-third or more of experimental labels are censored. |
| format | Article |
| id | doaj-art-2e7739b9e77f42d18d4391a28b7dddfd |
| institution | OA Journals |
| issn | 2667-3185 |
| language | English |
| publishDate | 2025-06-01 |
| publisher | Elsevier |
| record_format | Article |
| series | Artificial Intelligence in the Life Sciences |
| spelling | doaj-art-2e7739b9e77f42d18d4391a28b7dddfd, 2025-08-20T02:33:12Z, eng, Elsevier, Artificial Intelligence in the Life Sciences, 2667-3185, 2025-06-01, 7, 100128, 10.1016/j.ailsci.2025.100128. Enhancing uncertainty quantification in drug discovery with censored regression labels. Emma Svensson [0] (Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, 431 83, Sweden; ELLIS Unit Linz & Institute for Machine Learning, Johannes Kepler University Linz, Linz, 4040, Austria; corresponding author at: Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, 431 83, Sweden), Hannah Rosa Friesacher [1] (Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, 431 83, Sweden; ESAT-STADIUS, KU Leuven, Leuven, 3000, Belgium), Susanne Winiwarter [2] (Drug Metabolism and Pharmacokinetics, Research and Early Development Cardiovascular, Renal and Metabolism (CVRM), BioPharmaceuticals R&D, AstraZeneca, Gothenburg, 431 83, Sweden), Lewis Mervin [3] (Molecular AI, Discovery Sciences, R&D, AstraZeneca, Cambridge, CB2 0AA, UK), Adam Arany [4] (ESAT-STADIUS, KU Leuven, Leuven, 3000, Belgium), Ola Engkvist [5] (Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, 431 83, Sweden; Department of Computer Science and Engineering, Chalmers University of Technology, Gothenburg, 412 96, Sweden). http://www.sciencedirect.com/science/article/pii/S2667318525000042. Uncertainty quantification; Censored regression; Temporal evaluation; Distribution shift; Deep learning; Drug discovery |
| title | Enhancing uncertainty quantification in drug discovery with censored regression labels |
| topic | Uncertainty quantification; Censored regression; Temporal evaluation; Distribution shift; Deep learning; Drug discovery |
| url | http://www.sciencedirect.com/science/article/pii/S2667318525000042 |
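
The description above says the models learn from censored labels through the Tobit model, but this record does not give the loss itself. As a rough illustration only, the following is a minimal sketch of the textbook Gaussian Tobit negative log-likelihood for a single prediction. The function names, the `censor` encoding (0, -1, +1), and the probability floor are assumptions made for this example and are not taken from the article.

```python
import math

# Illustrative sketch, not the authors' implementation.
# Assumed encoding (hypothetical): censor = 0 exact label,
# -1 left-censored (true value <= y), +1 right-censored (true value >= y).

def normal_logpdf(y, mu, sigma):
    """Log density of N(mu, sigma^2) evaluated at y."""
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def normal_logcdf(z):
    """Log of the standard normal CDF at z, floored to avoid log(0)."""
    p = 0.5 * math.erfc(-z / math.sqrt(2.0))
    return math.log(max(p, 1e-300))

def tobit_nll(y, mu, sigma, censor):
    """Negative log-likelihood of one label under a Gaussian Tobit model.

    y      : observed value, or the threshold if the label is censored
    mu     : predicted mean
    sigma  : predicted standard deviation (must be > 0)
    censor : 0, -1 or +1 as described above
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (y - mu) / sigma
    if censor == 0:
        # Exact observation: ordinary Gaussian likelihood.
        return -normal_logpdf(y, mu, sigma)
    if censor < 0:
        # Left-censored: probability that the true value lies below y.
        return -normal_logcdf(z)
    # Right-censored: probability that the true value lies above y.
    return -normal_logcdf(-z)

# A right-censored label ">= 5.0" costs little when the prediction is well
# above the threshold and much more when it is well below it.
loss_consistent = tobit_nll(5.0, mu=7.0, sigma=1.0, censor=+1)
loss_inconsistent = tobit_nll(5.0, mu=3.0, sigma=1.0, censor=+1)
```

In this form a censored label only penalizes predictions that put probability mass on the wrong side of the threshold, which is how partial information of this kind can still shape both the predicted mean and the predicted uncertainty.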