Context-dependent similarity analysis of analogue series for structure–activity relationship transfer based on a concept from natural language processing
Abstract Analogue series (AS) are generated during compound optimization in medicinal chemistry and are the major source of structure–activity relationship (SAR) information. Pairs of active AS consisting of compounds with corresponding substituents and comparable potency progression represent SAR t...
Main Authors: | Atsushi Yoshimori, Jürgen Bajorath |
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2025-01-01 |
Series: | Journal of Cheminformatics |
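Subjects: | Active analogue series; Potency progression; Series alignment; SAR transfer; Fragment pair relationships; Context-dependent similarity |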
Online Access: | https://doi.org/10.1186/s13321-025-00951-3 |
_version_ | 1832594459996454912 |
---|---|
author | Atsushi Yoshimori Jürgen Bajorath |
author_facet | Atsushi Yoshimori Jürgen Bajorath |
author_sort | Atsushi Yoshimori |
collection | DOAJ |
description | Abstract: Analogue series (AS) are generated during compound optimization in medicinal chemistry and are the major source of structure–activity relationship (SAR) information. Pairs of active AS consisting of compounds with corresponding substituents and comparable potency progression represent SAR transfer events for the same target or across different targets. We report a new computational approach to systematically search for SAR transfer series that combines an AS alignment algorithm with context-dependent similarity assessment based on vector embeddings adapted from natural language processing. The methodology comprehensively accounts for substituent similarity, identifies non-classical bioisosteres, captures substituent-property relationships, and generates accurate AS alignments. Context-dependent similarity assessment is conceptually novel in computational medicinal chemistry and should also be of interest for other applications. Scientific contribution: A method is reported to systematically search for and align analogue series with SAR transfer potential. Central to the approach is the assessment of context-dependent similarity for substituents, a new concept in cheminformatics, which is based upon vector embeddings and word pair relationships adapted from natural language processing. |
format | Article |
id | doaj-art-164fe480377d4bec853914b3386b4936 |
institution | Kabale University |
issn | 1758-2946 |
language | English |
publishDate | 2025-01-01 |
publisher | BMC |
record_format | Article |
series | Journal of Cheminformatics |
spelling | doaj-art-164fe480377d4bec853914b3386b4936; 2025-01-19T12:37:01Z; eng; BMC; Journal of Cheminformatics; 1758-2946; 2025-01-01; 17; 1; 1; 14; 10.1186/s13321-025-00951-3; Context-dependent similarity analysis of analogue series for structure–activity relationship transfer based on a concept from natural language processing; Atsushi Yoshimori (Institute for Theoretical Medicine, Inc.); Jürgen Bajorath (Department of Life Science Informatics and Data Science, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, University of Bonn); Abstract: Analogue series (AS) are generated during compound optimization in medicinal chemistry and are the major source of structure–activity relationship (SAR) information. Pairs of active AS consisting of compounds with corresponding substituents and comparable potency progression represent SAR transfer events for the same target or across different targets. We report a new computational approach to systematically search for SAR transfer series that combines an AS alignment algorithm with context-dependent similarity assessment based on vector embeddings adapted from natural language processing. The methodology comprehensively accounts for substituent similarity, identifies non-classical bioisosteres, captures substituent-property relationships, and generates accurate AS alignments. Context-dependent similarity assessment is conceptually novel in computational medicinal chemistry and should also be of interest for other applications. Scientific contribution: A method is reported to systematically search for and align analogue series with SAR transfer potential. Central to the approach is the assessment of context-dependent similarity for substituents, a new concept in cheminformatics, which is based upon vector embeddings and word pair relationships adapted from natural language processing.; https://doi.org/10.1186/s13321-025-00951-3; Active analogue series; Potency progression; Series alignment; SAR transfer; Fragment pair relationships; Context-dependent similarity |
spellingShingle | Atsushi Yoshimori Jürgen Bajorath Context-dependent similarity analysis of analogue series for structure–activity relationship transfer based on a concept from natural language processing Journal of Cheminformatics Active analogue series Potency progression Series alignment SAR transfer Fragment pair relationships Context-dependent similarity |
title | Context-dependent similarity analysis of analogue series for structure–activity relationship transfer based on a concept from natural language processing |
title_sort | context dependent similarity analysis of analogue series for structure activity relationship transfer based on a concept from natural language processing |
topic | Active analogue series Potency progression Series alignment SAR transfer Fragment pair relationships Context-dependent similarity |
url | https://doi.org/10.1186/s13321-025-00951-3 |
work_keys_str_mv | AT atsushiyoshimori contextdependentsimilarityanalysisofanalogueseriesforstructureactivityrelationshiptransferbasedonaconceptfromnaturallanguageprocessing AT jurgenbajorath contextdependentsimilarityanalysisofanalogueseriesforstructureactivityrelationshiptransferbasedonaconceptfromnaturallanguageprocessing |