The BERT Uncased and LSTM Multiclass Classification Model for Traffic Violation Text Classification

The growing volume of internet content makes it difficult for users to find information with the search function. Classifying news by its context addresses this problem by avoiding material that is open to many interpretations. This research combines the Uncased model BiDirectional Encoder Repr...

Bibliographic Details
Main Authors:	Komang Ayu Triana Indah, I Ketut Gede Darma Putra, I Made Sudarma, Rukmi Sari Hartati, Minho Jo
Format:	Article
Language:	English
Published:	Udayana University, Institute for Research and Community Services 2025-01-01
Series:	Lontar Komputer
Online Access:	https://ojs.unud.ac.id/index.php/lontar/article/view/116705

author	Komang Ayu Triana Indah, I Ketut Gede Darma Putra, I Made Sudarma, Rukmi Sari Hartati, Minho Jo
author_sort	Komang Ayu Triana Indah
collection	DOAJ
description	The growing volume of internet content makes it difficult for users to find information with the search function. Classifying news by its context addresses this problem by avoiding material that is open to many interpretations. This research combines the uncased Bidirectional Encoder Representations from Transformers (BERT) model with a Long Short-Term Memory (LSTM) architecture to build a text classification model that categorizes news articles about traffic violations. Data were collected by crawling an online media application API and organized into unmodified and modified datasets. With the best hyperparameter combination (batch size 16, learning rate 2e-5, and average pooling), the BERT Uncased-LSTM model obtained Precision, Recall, and F1 values of 97.25%, 96.90%, and 98.10%, respectively. The results show that test scores are higher on the unmodified dataset than on the modified dataset, because selecting words with high information value in the modified dataset makes it harder for the model to understand context during text classification.
format	Article
id	doaj-art-b4fc12c575424ecab039b52a5706a02c
institution	Kabale University
issn	2088-1541 2541-5832
language	English
publishDate	2025-01-01
publisher	Udayana University, Institute for Research and Community Services
record_format	Article
series	Lontar Komputer
spelling	doaj-art-b4fc12c575424ecab039b52a5706a02c; 2025-01-31T23:56:26Z; eng; Udayana University, Institute for Research and Community Services; Lontar Komputer; 2088-1541; 2541-5832; 2025-01-01; 15; 02; 112; 123; 10.24843/LKJITI.2024.v15.i02.p04; 116705; The BERT Uncased and LSTM Multiclass Classification Model for Traffic Violation Text Classification; Komang Ayu Triana Indah (Politeknik Negeri Bali); I Ketut Gede Darma Putra (Information Technology Department Udayana University); I Made Sudarma (Information Technology Department Udayana University); Rukmi Sari Hartati (Electrical Engineering Department Udayana University); Minho Jo (Department of Computer and Information Science, Korea University); https://ojs.unud.ac.id/index.php/lontar/article/view/116705
title	The BERT Uncased and LSTM Multiclass Classification Model for Traffic Violation Text Classification
url	https://ojs.unud.ac.id/index.php/lontar/article/view/116705
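
The description field names average pooling as part of the best hyperparameter combination but does not say how it is applied. The following is a minimal sketch of one common form, masked average pooling over per-token vectors, written in plain Python. The function name average_pool, the attention_mask argument, and the toy values are assumptions made for illustration; they are not taken from the article, and the record does not state whether pooling runs over BERT outputs or LSTM outputs.

```python
from typing import List, Sequence


def average_pool(token_vectors: Sequence[Sequence[float]],
                 attention_mask: Sequence[int]) -> List[float]:
    """Average the vectors of real tokens, skipping padded positions.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if len(token_vectors) != len(attention_mask):
        raise ValueError("one mask entry is required per token vector")
    # Keep only the positions the mask marks as real tokens (mask value 1).
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention mask selects no tokens")
    dim = len(kept[0])
    # Component-wise mean across the kept token vectors.
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy per-token outputs (3 tokens, hidden size 2); the last position is padding.
outputs = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
pooled = average_pool(outputs, mask)
print(pooled)  # [2.0, 3.0]
```

In this sketch the padded position is excluded from the mean, so the pooled vector has the same size as a single token vector regardless of sequence length.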

Similar Items