Reducing Data Volume in News Topic Classification: Deep Learning Framework and Dataset
With the rise of smart devices and technological advancements, accessing vast amounts of information has become easier than ever before. However, sorting and categorising such an overwhelming volume of content has become increasingly challenging. This article introduces a new framework for classifyin...
Main Authors: | Luigi Serreli, Claudio Marche, Michele Nitti |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2025-01-01 |
Series: | IEEE Open Journal of the Computer Society |
Subjects: | Data volume, deep learning, natural language processing, topic classification |
Online Access: | https://ieeexplore.ieee.org/document/10806791/ |
_version_ | 1832592843733991424 |
---|---|
author | Luigi Serreli; Claudio Marche; Michele Nitti |
author_facet | Luigi Serreli; Claudio Marche; Michele Nitti |
author_sort | Luigi Serreli |
collection | DOAJ |
description | With the rise of smart devices and technological advancements, accessing vast amounts of information has become easier than ever before. However, sorting and categorising such an overwhelming volume of content has become increasingly challenging. This article introduces a new framework for classifying news articles based on a Bidirectional LSTM (BiLSTM) network and an attention mechanism. The article also presents a new dataset of 60 000 news articles from various global sources. Furthermore, it proposes a methodology for reducing data volume by extracting key sentences using an algorithm resulting in inference times that are, on average, 50% shorter than the original document without compromising the system's accuracy. Experimental evaluations demonstrate that our framework outperforms existing methodologies in terms of accuracy. Our system's accuracy has been compared with various works using two popular datasets, AG News and BBC News, and has achieved excellent results of 99.7% and 94.55%, respectively. |
format | Article |
id | doaj-art-2a41c73583a54d43b499be4c7bbf2bcf |
institution | Kabale University |
issn | 2644-1268 |
language | English |
publishDate | 2025-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Open Journal of the Computer Society |
spelling | doaj-art-2a41c73583a54d43b499be4c7bbf2bcf; 2025-01-21T00:02:38Z; eng; IEEE; IEEE Open Journal of the Computer Society; 2644-1268; 2025-01-01; 6; 153; 164; 10.1109/OJCS.2024.3519747; 10806791; Reducing Data Volume in News Topic Classification: Deep Learning Framework and Dataset; Luigi Serreli 0 https://orcid.org/0000-0002-8793-9015; Claudio Marche 1 https://orcid.org/0000-0002-1017-9046; Michele Nitti 2 https://orcid.org/0000-0002-7832-7121; Department of Electrical and Electronic Engineering (DIEE), University of Cagliari, Cagliari, Italy; Department of Electrical and Electronic Engineering (DIEE), University of Cagliari, Cagliari, Italy; Department of Electrical and Electronic Engineering (DIEE), University of Cagliari, Cagliari, Italy; With the rise of smart devices and technological advancements, accessing vast amounts of information has become easier than ever before. However, sorting and categorising such an overwhelming volume of content has become increasingly challenging. This article introduces a new framework for classifying news articles based on a Bidirectional LSTM (BiLSTM) network and an attention mechanism. The article also presents a new dataset of 60 000 news articles from various global sources. Furthermore, it proposes a methodology for reducing data volume by extracting key sentences using an algorithm resulting in inference times that are, on average, 50% shorter than the original document without compromising the system's accuracy. Experimental evaluations demonstrate that our framework outperforms existing methodologies in terms of accuracy. Our system's accuracy has been compared with various works using two popular datasets, AG News and BBC News, and has achieved excellent results of 99.7% and 94.55%, respectively.; https://ieeexplore.ieee.org/document/10806791/; Data volume; deep learning; natural language processing; topic classification |
title | Reducing Data Volume in News Topic Classification: Deep Learning Framework and Dataset |
topic | Data volume; deep learning; natural language processing; topic classification |
url | https://ieeexplore.ieee.org/document/10806791/ |