A Grammar-Based Semantic Similarity Algorithm for Natural Language Sentences

Bibliographic Details
Main Authors: Ming Che Lee, Jia Wei Chang, Tung Cheng Hsieh
Format: Article
Language: English
Published: Wiley 2014-01-01
Series: The Scientific World Journal
Online Access: http://dx.doi.org/10.1155/2014/437162
author Ming Che Lee
Jia Wei Chang
Tung Cheng Hsieh
collection DOAJ
description This paper presents a grammar and semantic corpus based similarity algorithm for natural language sentences. Natural language, in opposition to “artificial language”, such as computer programming languages, is the language used by the general public for daily communication. Traditional information retrieval approaches, such as vector models, LSA, HAL, or even the ontology-based approaches that extend to include concept similarity comparison instead of cooccurrence terms/words, may not always determine the perfect matching while there is no obvious relation or concept overlap between two natural language sentences. This paper proposes a sentence similarity algorithm that takes advantage of corpus-based ontology and grammatical rules to overcome the addressed problems. Experiments on two famous benchmarks demonstrate that the proposed algorithm has a significant performance improvement in sentences/short-texts with arbitrary syntax and structure.
format Article
id doaj-art-215f6934fc3342c984ee6c2dd629822b
institution Kabale University
issn 2356-6140
1537-744X
language English
publishDate 2014-01-01
publisher Wiley
record_format Article
series The Scientific World Journal
spelling doaj-art-215f6934fc3342c984ee6c2dd629822b 2025-02-03T01:20:40Z eng Wiley The Scientific World Journal 2356-6140 1537-744X 2014-01-01 2014 10.1155/2014/437162 437162
Ming Che Lee (0), Jia Wei Chang (1), Tung Cheng Hsieh (2)
0: Department of Computer and Communication Engineering, Ming Chuan University, Taoyuan 333, Taiwan
1: Department of Engineering Science, National Cheng Kung University, Tainan 701, Taiwan
2: Department of Visual Communication Design, Hsuan Chuang University, Hsinchu 300, Taiwan
title A Grammar-Based Semantic Similarity Algorithm for Natural Language Sentences
url http://dx.doi.org/10.1155/2014/437162