Auto-Probabilistic Mining Method for Siamese Neural Network Training
| Main Authors: | |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-04-01 |
| Series: | Mathematics |
| Subjects: | |
| Online Access: | https://www.mdpi.com/2227-7390/13/8/1270 |
| Summary: | Training deep learning models for classification with limited data and computational resources remains a challenge when the number of classes is large. Metric learning offers an effective solution to this problem, but it has its own shortcomings due to the known imperfections of widely used loss functions, such as contrastive loss and triplet loss, and of sample mining methods. This paper addresses these issues by proposing a novel mining method and metric loss function. First, it presents an auto-probabilistic mining method designed to automatically select the most informative training samples for Siamese neural networks. Combined with a previously proposed auto-clustering technique, the method improves model training by making better use of the available data and reducing computational overhead. Second, it proposes a novel cluster-aware triplet-based metric loss function that addresses the limitations of contrastive and triplet loss, enhancing the overall training process. To evaluate the proposed methods, experiments were conducted on an optical character recognition task using the PHD08 and Omniglot datasets. The proposed loss function with random mining achieved 82.6% classification accuracy on the PHD08 dataset with full training on the Korean alphabet, surpassing the known baseline. The same experiment with a reduced training alphabet set a new baseline of 88.6% on PHD08. The novel mining method further raised accuracy to 90.6% (+2.0%) and, combined with auto-clustering, to 92.3% (+3.7%) relative to the new baseline. On the Omniglot dataset, the proposed mining method reached 92.32%, rising to 93.17% with auto-clustering. These findings highlight the potential of the developed loss function and mining method for a wide range of pattern recognition challenges. |
| ISSN: | 2227-7390 |
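For context, the summary contrasts the proposed loss with the standard contrastive and triplet losses. The formulations below are the widely used textbook versions those names refer to, not the paper's cluster-aware loss (which is not reproduced in this record); the symbols y, D, m, f, a, p, and n follow common metric-learning notation and are assumptions, not taken from the article.

```latex
% Standard pairwise contrastive loss (Hadsell et al., 2006):
% y = 1 for a same-class pair, y = 0 otherwise; D is the embedding
% distance and m is the margin enforced between negative pairs.
\mathcal{L}_{\text{contrastive}} = y \, D^2 + (1 - y)\,\max(0,\; m - D)^2

% Standard triplet loss over anchor a, positive p, negative n,
% with embedding function f and margin m:
\mathcal{L}_{\text{triplet}} = \max\bigl(0,\; \|f(a) - f(p)\|_2 - \|f(a) - f(n)\|_2 + m\bigr)
```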
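To illustrate the general idea of sample mining that the summary contrasts with random mining, here is a minimal NumPy sketch that picks the hardest positive per anchor and samples a negative with probability decreasing in its distance. It is a generic distance-weighted mining sketch under assumed conventions, not the auto-probabilistic method of the paper; the function names and the `temperature` parameter are hypothetical.

```python
# Generic distance-weighted triplet mining sketch (illustrative only;
# NOT the paper's auto-probabilistic mining method).
import numpy as np

def pairwise_distances(emb: np.ndarray) -> np.ndarray:
    """Euclidean distance matrix for a batch of embeddings of shape (n, d)."""
    sq = np.sum(emb ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * emb @ emb.T
    return np.sqrt(np.maximum(d2, 0.0))

def mine_triplets(emb, labels, temperature=0.2, rng=None):
    """For each anchor: take the farthest (hardest) positive, then sample a
    negative with probability proportional to exp(-distance/temperature),
    so closer (harder, more informative) negatives are chosen more often."""
    rng = rng or np.random.default_rng()
    dist = pairwise_distances(emb)
    idx = np.arange(len(labels))
    triplets = []
    for a in idx:
        pos = idx[(labels == labels[a]) & (idx != a)]
        neg = idx[labels != labels[a]]
        if len(pos) == 0 or len(neg) == 0:
            continue
        p = pos[np.argmax(dist[a, pos])]        # hardest positive
        logits = -dist[a, neg] / temperature    # closer negatives weigh more
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        n = rng.choice(neg, p=probs)            # probabilistic negative pick
        triplets.append((int(a), int(p), int(n)))
    return triplets

# Toy usage: 8 random 16-d embeddings from 4 classes.
rng = np.random.default_rng(0)
emb = rng.normal(size=(8, 16))
labels = np.array([0, 0, 1, 1, 2, 2, 3, 3])
print(mine_triplets(emb, labels, rng=rng)[:3])
```

Compared with purely hard-negative mining, the softmax-style sampling above keeps some randomness, which is one common way to avoid collapsing onto label noise; whether the paper's method makes a similar trade-off is not stated in this record.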