Study of Keyword Extraction Techniques for Electric Double-Layer Capacitor Domain Using Text Similarity Indexes: An Experimental Analysis

Bibliographic Details
Main Authors: M. Saef Ullah Miah, Junaida Sulaiman, Talha Bin Sarwar, Kamal Z. Zamli, Rajan Jose
Format: Article
Language: English
Published: Wiley 2021-01-01
Series: Complexity
Online Access: http://dx.doi.org/10.1155/2021/8192320
collection DOAJ
description Keywords play a significant role in locating topic-related documents. Topics or keywords assigned by human experts are accurate, but assigning them manually is expensive in both time and resources, which makes automated keyword extraction attractive. Before relying on an automated process, however, it is necessary to check how similar algorithm-generated keywords are to expert-provided ones. This paper presents an experimental analysis of similarity scores between expert-provided keywords from the electric double-layer capacitor (EDLC) domain and keywords generated by supervised and unsupervised automated keyword extraction algorithms. It also analyses which text yields better keywords: only the positive sentences or all sentences of a document. The unsupervised algorithms employed are YAKE, TopicRank, MultipartiteRank, and KPMiner; the supervised algorithms are KEA and WINGNUS. Similarity between extracted and expert-provided keywords is assessed with three indexes: Jaccard, cosine, and cosine with word vectors. In the experiment, MultipartiteRank measured with the cosine with word vectors index produces the best result, 92% similarity with the expert-provided keywords. The study can help NLP researchers working in the EDLC domain or on recommender systems to select suitable keyword extraction and similarity index techniques.
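As a rough illustration of two of the similarity indexes named in the description, the sketch below computes a Jaccard index and a bag-of-words cosine similarity between two keyword lists. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the whitespace tokenization, and the sample keywords are all hypothetical, and the word-vector variant of cosine similarity is not shown because it needs a pretrained embedding model.

```python
import math
from collections import Counter


def tokens(keywords):
    """Lowercase each keyword phrase and split it into word tokens."""
    return [word for phrase in keywords for word in phrase.lower().split()]


def jaccard(a, b):
    """Jaccard index: shared distinct tokens divided by all distinct tokens."""
    set_a, set_b = set(tokens(a)), set(tokens(b))
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def cosine(a, b):
    """Cosine similarity between token-count vectors of the two lists."""
    count_a, count_b = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(count_a[w] * count_b[w] for w in count_a.keys() & count_b.keys())
    norm_a = math.sqrt(sum(v * v for v in count_a.values()))
    norm_b = math.sqrt(sum(v * v for v in count_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical keyword lists, invented for illustration only.
expert = ["activated carbon", "specific capacitance", "electrolyte"]
extracted = ["activated carbon", "capacitance", "electrode"]

print(f"Jaccard: {jaccard(expert, extracted):.3f}")
print(f"Cosine:  {cosine(expert, extracted):.3f}")
```

Both functions compare word tokens rather than whole phrases, so a partial match such as "capacitance" against "specific capacitance" still contributes to the score; comparing whole phrases instead would be an equally reasonable choice and would give lower values here.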
id doaj-art-8ef1b6aae1514cdc82574ab141662fcd
institution Kabale University
issn 1099-0526
author_affiliation M. Saef Ullah Miah (Faculty of Computing); Junaida Sulaiman (Faculty of Computing); Talha Bin Sarwar (Department of Computer Science); Kamal Z. Zamli (Faculty of Computing); Rajan Jose (Faculty of Industrial Sciences & Technology)