A Multi-Faceted Approach to Trending Topic Attack Detection Using Semantic Similarity and Large-Scale Datasets

Bibliographic Details
Main Authors:	Insaf Kraidia, Afifa Ghenai, Samir Brahim Belhaouari
Format:	Article
Language:	English
Published:	IEEE 2025-01-01
Series:	IEEE Access
Subjects:	Trending topic attacks; semantic similarity; detection; twitter; hashtag
Online Access:	https://ieeexplore.ieee.org/document/10857330/

collection	DOAJ
description	Twitter’s widespread popularity has made it a prime target for malicious actors exploiting trending hashtags to disseminate harmful content. This study marks the first systematic exploration of semantic consistency in tweets to detect trending topic attacks. Unlike previous approaches, we emphasize the semantic aspect of tweets, leveraging advanced techniques such as semantic similarity estimation using WordNet and contextual understanding through Sentence-Transformers. To support this methodology, we curated large-scale, high-quality datasets comprising 7,000 Arabic and 28,000 English tweets, applying tailored preprocessing steps to ensure efficiency and accuracy. A novel data augmentation technique further enriched the quality and diversity of these datasets. We evaluated our approach using a comprehensive framework that assessed textual, image, and overall similarity. Five machine learning models—Random Forest, Decision Tree, K-Neighbors, Gradient Boosting, and XGBoost—were tested, with results benchmarked against nine baseline methods across different linguistic datasets and learning scenarios. Our approach demonstrated superior performance, achieving F1-scores of 96% for English and 97% for Arabic, with accuracy improvements ranging from 2% to 14% for English and 5% to 28% for Arabic. These results establish a new benchmark for detecting trending topic attacks across languages, highlighting the robustness and effectiveness of our method in combating malicious activities on social platforms.
id	doaj-art-8df8a742fb9440d1b903df02fb5d24ad
institution	Kabale University
issn	2169-3536
spelling	Record ID: doaj-art-8df8a742fb9440d1b903df02fb5d24ad
	Record timestamp: 2025-02-05T00:01:09Z
	Language: eng
	Publisher: IEEE
	Journal: IEEE Access (ISSN 2169-3536)
	Publication date: 2025-01-01
	Volume: 13
	Pages: 21005-21028
	DOI: 10.1109/ACCESS.2025.3535996
	IEEE Xplore document number: 10857330
	Title: A Multi-Faceted Approach to Trending Topic Attack Detection Using Semantic Similarity and Large-Scale Datasets
	Author 0: Insaf Kraidia, https://orcid.org/0000-0001-5538-9883, LIRE Laboratory, University of Constantine 2–Abdelhamid Mehri, Ali Mendjeli Campus, Constantine, Algeria
	Author 1: Afifa Ghenai, LIRE Laboratory, University of Constantine 2–Abdelhamid Mehri, Ali Mendjeli Campus, Constantine, Algeria
	Author 2: Samir Brahim Belhaouari, https://orcid.org/0000-0003-2336-0490, Division of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Ar-Rayyan, Doha, Qatar
	Online access: https://ieeexplore.ieee.org/document/10857330/
	Keywords: Trending topic attacks; semantic similarity; detection; twitter; hashtag

Similar Items