Advancing Hate Speech Detection in Indonesian Language Using Graph Neural Networks and TF-IDF
Hate speech and abusive content on social media, particularly in the Indonesian language, pose significant challenges for content moderation systems. Previous research has applied machine learning models such as Recurrent Neural Networks (RNN), Support Vector Machines (SVM), and Conv...
| Main Authors: | Syaikha Amirah Zikrina, Fitriyani |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Ikatan Ahli Informatika Indonesia, 2025-02-01 |
| Series: | Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) |
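| Subjects: | context-aware sentiment analysis; hate speech detection; graph neural network (GNN); social media X; TF-IDF |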
| Online Access: | https://jurnal.iaii.or.id/index.php/RESTI/article/view/6179 |
| _version_ | 1850222615321378816 |
|---|---|
| author | Syaikha Amirah Zikrina Fitriyani |
| collection | DOAJ |
| description | Hate speech and abusive content on social media, particularly in the Indonesian language, pose significant challenges for content moderation systems. Previous research has applied machine learning models such as Recurrent Neural Networks (RNN), Support Vector Machines (SVM), and Convolutional Neural Networks (CNN) to this problem. However, these approaches are limited in their ability to capture the relational and contextual nuances inherent in the data, which results in suboptimal performance. This study introduces an approach that combines Graph Neural Networks (GNN) with Term Frequency-Inverse Document Frequency (TF-IDF) feature extraction to improve hate speech detection on Twitter (now X). The dataset consists of 13,169 Indonesian tweets, manually labeled for the hate speech and abusive categories. Preprocessing steps include text cleaning, stemming, stop-word removal, and normalization. The GNN model achieved accuracy scores of 92.90% for Abusive and 89.78% for Hate Speech, outperforming the RNN model, which achieved 86.09% and 86.15%, respectively. These results highlight the advantage of graph-based approaches in capturing complex relationships within text data. Future research can expand the dataset to include regional dialects and integrate more advanced feature extraction techniques such as Word2Vec or BERT. The study establishes a framework for improving hate speech detection and contributes to safer digital environments. |
| format | Article |
| id | doaj-art-8f9f08ec16e641c3b85257b8f054581a |
| institution | OA Journals |
| issn | 2580-0760 |
| language | English |
| publishDate | 2025-02-01 |
| publisher | Ikatan Ahli Informatika Indonesia |
| record_format | Article |
| series | Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) |
| spelling | doaj-art-8f9f08ec16e641c3b85257b8f054581a; 2025-08-20T02:06:16Z; eng; Ikatan Ahli Informatika Indonesia; Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi); ISSN 2580-0760; 2025-02-01; vol. 9, no. 1, pp. 137-145; DOI 10.29207/resti.v9i1.6179; Advancing Hate Speech Detection in Indonesian Language Using Graph Neural Networks and TF-IDF; Syaikha Amirah Zikrina (Telkom University), Fitriyani (Telkom University); https://jurnal.iaii.or.id/index.php/RESTI/article/view/6179; context-aware sentiment analysis; hate speech detection; graph neural network (GNN); social media X; TF-IDF |
| title | Advancing Hate Speech Detection in Indonesian Language Using Graph Neural Networks and TF-IDF |
| topic | context-aware sentiment analysis; hate speech detection; graph neural network (GNN); social media X; TF-IDF |
| url | https://jurnal.iaii.or.id/index.php/RESTI/article/view/6179 |
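
The description above pairs TF-IDF features with a graph neural network but does not say how the graph is built or which GNN architecture is used. The following is a minimal sketch under stated assumptions, not the authors' method: it treats each text as a node, links texts whose TF-IDF cosine similarity passes a threshold, and applies one mean-aggregation step, which is the basic operation a GNN layer builds on. The function names, the smoothed IDF variant, the threshold value, and the placeholder sentences are all hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocab, vectors); each vector is an L2-normalised TF-IDF row."""
    tokenised = [d.lower().split() for d in docs]
    vocab = sorted({t for toks in tokenised for t in toks})
    index = {t: i for i, t in enumerate(vocab)}
    n = len(docs)
    # Document frequency: number of texts containing each term.
    df = Counter(t for toks in tokenised for t in set(toks))
    # Assumption: smoothed IDF, ln((1 + n) / (1 + df)) + 1.
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for toks in tokenised:
        vec = [0.0] * len(vocab)
        for term, count in Counter(toks).items():
            vec[index[term]] = count * idf[term]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vocab, vectors


def cosine(a, b):
    """Cosine similarity of two unit-length vectors (a plain dot product)."""
    return sum(x * y for x, y in zip(a, b))


def similarity_graph(vectors, threshold=0.2):
    """Undirected adjacency sets; an edge joins texts with cosine >= threshold."""
    adj = {i: set() for i in range(len(vectors))}
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def propagate(vectors, adj):
    """One message-passing step: average each node with its neighbours."""
    out = []
    for i, vec in enumerate(vectors):
        group = [vec] + [vectors[j] for j in sorted(adj[i])]
        out.append([sum(col) / len(group) for col in zip(*group)])
    return out


# Made-up neutral placeholder sentences, not taken from the dataset.
docs = [
    "makan nasi goreng enak",
    "nasi goreng pedas enak",
    "hujan deras di jakarta",
]
vocab, vectors = tfidf_vectors(docs)
adj = similarity_graph(vectors)
smoothed = propagate(vectors, adj)
```

In a trained model, the averaged features in `smoothed` would pass through a learned weight matrix and a nonlinearity before classification; that part is omitted here because the record gives no architectural details.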