Comprehensive Analysis of Masking Techniques in Molecular Graph Representation Learning
Molecule representation learning is a primary area of focus in drug discovery and molecular property prediction. In previous studies, molecules have been modeled as graphs, enabling graph neural networks (GNNs) to capture essential structural information. Recent approaches have enhanced molecular re...
Main Authors: | Bonyou Koo, Sunyoung Kwon |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2025-01-01 |
Series: | IEEE Access |
Subjects: | Graph neural network; masking; molecular graph; representation learning; machine learning |
Online Access: | https://ieeexplore.ieee.org/document/10844080/ |
_version_ | 1832586867765149696 |
---|---|
author | Bonyou Koo; Sunyoung Kwon |
author_facet | Bonyou Koo; Sunyoung Kwon |
author_sort | Bonyou Koo |
collection | DOAJ |
description | Molecule representation learning is a primary area of focus in drug discovery and molecular property prediction. In previous studies, molecules have been modeled as graphs, enabling graph neural networks (GNNs) to capture essential structural information. Recent approaches have enhanced molecular representations by introducing advanced masking strategies, such as extending granularity from nodes to subgraphs, shifting masking locations, and applying masking during downstream tasks. However, comprehensive analyses of these strategies remain limited. In this study, we systematically evaluate masking techniques across various phases, granularities, locations, feature types, and ratios. Our findings reveal that node feature masking during pre-training achieves high performance, while rich features may reduce gains, and the commonly used 25% masking ratio is not universally optimal, with alternative ratios performing better depending on the dataset. Our study provides deeper insights into the benefits of masking techniques in molecular graphs and highlights their potential to improve semantic understanding and predictive accuracy in graph-based learning. |
format | Article |
id | doaj-art-7a1e57a47f044483a745e51bbf7e879c |
institution | Kabale University |
issn | 2169-3536 |
language | English |
publishDate | 2025-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj-art-7a1e57a47f044483a745e51bbf7e879c; 2025-01-25T00:01:35Z; eng; IEEE; IEEE Access; 2169-3536; 2025-01-01; vol. 13; pp. 14290-14303; DOI 10.1109/ACCESS.2025.3531302; 10844080; Comprehensive Analysis of Masking Techniques in Molecular Graph Representation Learning; Bonyou Koo (https://orcid.org/0009-0007-9008-9772), Department of Information Convergence Engineering, Pusan National University, Yangsan-si, South Korea; Sunyoung Kwon (https://orcid.org/0000-0003-3433-1409), Department of Information Convergence Engineering, Pusan National University, Yangsan-si, South Korea; Molecule representation learning is a primary area of focus in drug discovery and molecular property prediction. In previous studies, molecules have been modeled as graphs, enabling graph neural networks (GNNs) to capture essential structural information. Recent approaches have enhanced molecular representations by introducing advanced masking strategies, such as extending granularity from nodes to subgraphs, shifting masking locations, and applying masking during downstream tasks. However, comprehensive analyses of these strategies remain limited. In this study, we systematically evaluate masking techniques across various phases, granularities, locations, feature types, and ratios. Our findings reveal that node feature masking during pre-training achieves high performance, while rich features may reduce gains, and the commonly used 25% masking ratio is not universally optimal, with alternative ratios performing better depending on the dataset. Our study provides deeper insights into the benefits of masking techniques in molecular graphs and highlights their potential to improve semantic understanding and predictive accuracy in graph-based learning.; https://ieeexplore.ieee.org/document/10844080/; Graph neural network; masking; molecular graph; representation learning; machine learning |
title | Comprehensive Analysis of Masking Techniques in Molecular Graph Representation Learning |
title_sort | comprehensive analysis of masking techniques in molecular graph representation learning |
topic | Graph neural network; masking; molecular graph; representation learning; machine learning |
url | https://ieeexplore.ieee.org/document/10844080/ |