Context-dependent similarity analysis of analogue series for structure–activity relationship transfer based on a concept from natural language processing
Main Authors: | , |
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2025-01-01 |
Series: | Journal of Cheminformatics |
Subjects: | |
Online Access: | https://doi.org/10.1186/s13321-025-00951-3 |
Summary: | Analogue series (AS) are generated during compound optimization in medicinal chemistry and are the major source of structure–activity relationship (SAR) information. Pairs of active AS consisting of compounds with corresponding substituents and comparable potency progression represent SAR transfer events for the same target or across different targets. We report a new computational approach to systematically search for SAR transfer series that combines an AS alignment algorithm with context-dependent similarity assessment based on vector embeddings adapted from natural language processing. The methodology comprehensively accounts for substituent similarity, identifies non-classical bioisosteres, captures substituent-property relationships, and generates accurate AS alignments. Context-dependent similarity assessment is conceptually novel in computational medicinal chemistry and should also be of interest for other applications. Scientific contribution: A method is reported to systematically search for and align analogue series with SAR transfer potential. Central to the approach is the assessment of context-dependent similarity for substituents, a new concept in cheminformatics, which is based upon vector embeddings and word pair relationships adapted from natural language processing. |
ISSN: | 1758-2946 |
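
The summary describes combining an AS alignment algorithm with embedding-based substituent similarity. The record gives no implementation details, so the following is only a minimal sketch of the general idea and not the authors' method: the embedding values in `EMBEDDINGS` are made up, the names `similarity` and `align` are hypothetical, a standard Needleman-Wunsch global alignment is swapped in for the alignment step, and the static vectors used here do not reproduce the context-dependent part of the published approach.

```python
import math

# Hypothetical 3-dimensional substituent embeddings (made-up values). In an
# NLP-style setup such vectors would be learned from how substituents
# co-occur within analogue series; here they are hard-coded for illustration.
EMBEDDINGS = {
    "H": (0.1, 0.0, 0.2),
    "F": (0.8, 0.1, 0.3),
    "Cl": (0.9, 0.2, 0.3),
    "CH3": (0.2, 0.9, 0.1),
    "OCH3": (0.3, 0.8, 0.4),
}


def similarity(a, b):
    """Cosine similarity between the embeddings of substituents a and b."""
    u, v = EMBEDDINGS[a], EMBEDDINGS[b]
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm


def align(series_a, series_b, gap=-0.5):
    """Needleman-Wunsch global alignment of two potency-ordered substituent lists.

    Returns (total score, list of aligned pairs); None marks a gap.
    """
    n, m = len(series_a), len(series_b)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + similarity(series_a[i - 1], series_b[j - 1]),
                score[i - 1][j] + gap,  # substituent of series_a left unmatched
                score[i][j - 1] + gap,  # substituent of series_b left unmatched
            )

    # Trace back from the bottom-right cell to recover the aligned pairs.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            diag = score[i - 1][j - 1] + similarity(series_a[i - 1], series_b[j - 1])
            if score[i][j] == diag:
                pairs.append((series_a[i - 1], series_b[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and (j == 0 or score[i][j] == score[i - 1][j] + gap):
            pairs.append((series_a[i - 1], None))
            i -= 1
        else:
            pairs.append((None, series_b[j - 1]))
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


if __name__ == "__main__":
    total, pairs = align(["H", "F", "CH3"], ["H", "Cl", "OCH3"])
    print(round(total, 3), pairs)
```

With these toy vectors, halogens pair with halogens and the methyl group pairs with methoxy, which is the kind of correspondence between series that an SAR transfer search looks for. The gap penalty of -0.5 is an arbitrary choice for the example.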