Advanced Text Summarization Model Incorporating NLP Techniques and Feature-Based Scoring

The most common traditional approaches to summarizing large texts while retaining their importance are TF-IDF and TextRank. However, these methods often fail to retain narrative coherence and accuracy. This study’s improved summarization methodology overcomes these limitations by combinin...

Bibliographic Details
Main Authors: Estabraq Abdulreda Kadhim, Mohammad-Reza Feizi-Derakhshi, Hadi S. Aghdasi
Format: Article
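Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects: Extractive text summarization; sentences scoring; weighted features; BBC-news dataset; rouge metric; BERTScore metric
Online Access: https://ieeexplore.ieee.org/document/10838534/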
_version_ 1832576791134339072
author Estabraq Abdulreda Kadhim
Mohammad-Reza Feizi-Derakhshi
Hadi S. Aghdasi
collection DOAJ
description TF-IDF and TextRank are the most common traditional approaches to summarizing large texts while retaining their key content. However, these methods often fail to preserve narrative coherence and accuracy. This study’s improved summarization methodology addresses these limitations by combining linguistic and semantic resources. Although it is more computationally complex, it combines higher summary quality with faster summarization. Specifically, the method relies on a weighted feature-scoring scheme in which textual features such as Named Entity Counts, Noun Counts, and Sentence Position each contribute, with an appropriate weight, to summarization quality. The summarization algorithm was tested on the CNN, XSum, and BBC Summarization datasets, which aggregate documents from different domains. The methodology was compared with traditional methods using ROUGE-1, ROUGE-2, ROUGE-L, and BERTScore; the last of these evaluates the semantic similarity between the generated summaries and the references. The results show that the proposed methodology generates summaries that are not only informative but also semantically faithful to the original text; it achieves high F1 scores across the evaluation metrics: BERTScore (0.8857), ROUGE-1 (0.6388), ROUGE-2 (0.5662), and ROUGE-L (0.6421). This suggests that the approach is applicable in real-life situations and deserves further research.
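The weighted feature scoring named in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the weights, the normalization, the linear position score, and the names `SentenceFeatures`, `score_sentences`, and `summarize` are all assumptions made for illustration, and the named-entity and noun counts are assumed to be precomputed by an NER tool and a POS tagger of the reader's choice.

```python
from dataclasses import dataclass


@dataclass
class SentenceFeatures:
    text: str
    named_entities: int  # assumed precomputed by any NER tool
    nouns: int           # assumed precomputed by any POS tagger


# Hypothetical weights; the record does not state the values used in the paper.
WEIGHTS = {"named_entities": 0.4, "nouns": 0.3, "position": 0.3}


def _normalize(values):
    """Scale counts to [0, 1] by their maximum; all-zero input stays zero."""
    top = max(values, default=0)
    return [v / top if top else 0.0 for v in values]


def score_sentences(sentences, weights=WEIGHTS):
    """Return one weighted feature score per sentence."""
    n = len(sentences)
    ne = _normalize([s.named_entities for s in sentences])
    nn = _normalize([s.nouns for s in sentences])
    # Assumed position feature: 1.0 for the first sentence, decreasing linearly.
    pos = [1.0 - i / n for i in range(n)]
    return [
        weights["named_entities"] * a + weights["nouns"] * b + weights["position"] * c
        for a, b, c in zip(ne, nn, pos)
    ]


def summarize(sentences, k=2, weights=WEIGHTS):
    """Pick the k highest-scoring sentences and return them in document order."""
    scores = score_sentences(sentences, weights)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i].text for i in sorted(top)]
```

Because the selection is extractive, the chosen sentences are returned in their original order, which helps preserve the narrative flow of the source text. Swapping in different features or weights only requires changing `WEIGHTS` and the feature lists in `score_sentences`.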
format Article
id doaj-art-e29b4e39dbb0453a8e6ffe7ffc185b8d
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-e29b4e39dbb0453a8e6ffe7ffc185b8d
2025-01-31T00:02:01Z
eng
IEEE
IEEE Access
2169-3536
2025-01-01
13
19302
19319
10.1109/ACCESS.2025.3528830
10838534
Advanced Text Summarization Model Incorporating NLP Techniques and Feature-Based Scoring
Estabraq Abdulreda Kadhim (0) https://orcid.org/0000-0002-9657-2032
Mohammad-Reza Feizi-Derakhshi (1) https://orcid.org/0000-0002-8548-976X
Hadi S. Aghdasi (2) https://orcid.org/0000-0003-1613-7370
(0) Department of Computer Engineering, Computerized Intelligence Systems Laboratory, University of Tabriz, Tabriz, Iran
(1) Department of Computer Engineering, Computerized Intelligence Systems Laboratory, University of Tabriz, Tabriz, Iran
(2) Department of Computer Engineering, University of Tabriz, Tabriz, Iran
https://ieeexplore.ieee.org/document/10838534/
Extractive text summarization
sentences scoring
weighted features
BBC-news dataset
rouge metric
BERTScore metric
title Advanced Text Summarization Model Incorporating NLP Techniques and Feature-Based Scoring
topic Extractive text summarization
sentences scoring
weighted features
BBC-news dataset
rouge metric
BERTScore metric
url https://ieeexplore.ieee.org/document/10838534/