Instance-Level Weighted Contrast Learning for Text Classification
| Main Authors: | , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-04-01 |
| Series: | Applied Sciences |
| Subjects: | |
| Online Access: | https://www.mdpi.com/2076-3417/15/8/4236 |
| Summary: | With the explosion of information, the amount of text data has increased significantly, making text classification a central area of research in natural language processing (NLP). Traditional machine learning methods are effective, but deep learning models excel at processing semantic information. Models such as CNN, RNN, LSTM, and GRU have emerged as powerful tools for text classification, and pre-trained models such as BERT and GPT have further advanced the field. Contrastive learning has become a key research focus, aiming to improve classification performance by training models to learn the similarities and differences between samples. However, existing contrastive learning methods have notable shortcomings, primarily insufficient data utilization. This study applies data-enhancement techniques that expand the text data through symbol insertion, affirmative auxiliary verbs, double negation, and punctuation repetition, aiming to improve the generalization and robustness of the pre-trained model. Two data-enhancement strategies, affirmative enhancement and negative transformation, are introduced to deepen the data’s meaning and increase the volume of training data. Because augmentation can introduce false data, an instance-weighting method is employed to penalize false negative samples, while complementary models generate sample weights to mitigate the impact of sampling bias. Finally, the effectiveness of the proposed method is demonstrated through several experiments. |
|---|---|
| ISSN: | 2076-3417 |
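The abstract names four text-augmentation operations (symbol insertion, affirmative auxiliary verbs, double negation, and punctuation repetition) but does not specify their exact rules. The sketch below is one plausible interpretation of those operations, not the authors' implementation; the concrete wordings and symbols used are assumptions.

```python
# Illustrative sketch of the four augmentation operations named in the
# abstract. All surface forms below (the "#" symbol, the "It is true that"
# and double-negation templates) are assumptions for demonstration only.
import random


def insert_symbol(text, symbol="#", seed=0):
    """Insert a neutral symbol at a random word boundary."""
    rng = random.Random(seed)
    words = text.split()
    pos = rng.randrange(len(words) + 1)
    return " ".join(words[:pos] + [symbol] + words[pos:])


def affirmative_auxiliary(text):
    """Prepend an affirmative auxiliary construction (assumed template)."""
    return "It is true that " + text[0].lower() + text[1:]


def double_negation(text):
    """Wrap the sentence in a meaning-preserving double negation (assumed template)."""
    return "It is not untrue that " + text[0].lower() + text[1:]


def repeat_punctuation(text, times=3):
    """Repeat the trailing punctuation mark to vary the surface form."""
    if text and text[-1] in ".!?":
        return text + text[-1] * (times - 1)
    return text + "." * times


sentence = "The movie was enjoyable."
augmented = [
    insert_symbol(sentence),
    affirmative_auxiliary(sentence),
    double_negation(sentence),
    repeat_punctuation(sentence),
]
for s in augmented:
    print(s)
```

Each operation preserves (or, for double negation, intentionally flips and restores) the sentence's label while changing its surface form, which is what makes the variants usable as extra training data.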
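The abstract also mentions an instance-weighting method that penalizes false negative samples using weights from complementary models. A minimal way to realize such an idea is an InfoNCE-style contrastive loss whose negative terms carry per-pair weights; the sketch below shows that mechanism under assumed inputs (the weight matrix `w` stands in for whatever the complementary models would produce) and is an illustrative interpretation, not the paper's actual loss.

```python
# Minimal sketch of an instance-weighted contrastive (InfoNCE-style) loss.
# The per-pair weight matrix is assumed to come from complementary models;
# suspected false negatives would receive weights below 1, shrinking their
# contribution to the denominator.
import numpy as np


def weighted_contrastive_loss(z, pos_idx, weights, temperature=0.1):
    """z: (N, d) L2-normalized embeddings; pos_idx[i] is the index of the
    positive (augmented) view of sample i; weights: (N, N) weights applied
    to each pair's term in the denominator."""
    sim = z @ z.T / temperature                # pairwise similarities
    np.fill_diagonal(sim, -np.inf)             # exclude self-pairs (exp -> 0)
    exp_sim = np.exp(sim)
    denom = (weights * exp_sim).sum(axis=1)    # weighted partition function
    numer = exp_sim[np.arange(len(z)), pos_idx]
    return float(-np.mean(np.log(numer / denom)))


rng = np.random.default_rng(0)
z = rng.normal(size=(4, 8))
z /= np.linalg.norm(z, axis=1, keepdims=True)
pos = np.array([1, 0, 3, 2])                   # (0,1) and (2,3) are positive views

# Uniform weights: plain InfoNCE behaviour.
w_uniform = np.ones((4, 4))
loss = weighted_contrastive_loss(z, pos, w_uniform)

# Down-weight all non-positive pairs (as if flagged as possible false
# negatives); the denominator shrinks, so the loss decreases.
w_down = np.full((4, 4), 0.5)
w_down[np.arange(4), pos] = 1.0
loss_down = weighted_contrastive_loss(z, pos, w_down)

print(loss, loss_down)
```

Down-weighting suspected false negatives reduces the penalty those pairs impose, which is the qualitative effect the abstract attributes to its instance-weighting scheme.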