A Novel DBSCAN Based on Binary Local Sensitive Hashing and Binary-KNN Representation

We revisit the classic DBSCAN algorithm by proposing a series of strategies to improve its robustness to various densities and its efficiency. Unlike the original DBSCAN, we first use the binary local sensitive hashing (LSH) which enables faster region query for the k neighbors of a data point. The...

Full description

Saved in:
Bibliographic Details
Main Authors: Qing He, Hai Xia Gu, Qin Wei, Xu Wang
Format: Article
Language:English
Published: Wiley 2017-01-01
Series:Advances in Multimedia
Online Access:http://dx.doi.org/10.1155/2017/3695323
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1832554283085594624
author Qing He
Hai Xia Gu
Qin Wei
Xu Wang
author_facet Qing He
Hai Xia Gu
Qin Wei
Xu Wang
author_sort Qing He
collection DOAJ
description We revisit the classic DBSCAN algorithm by proposing a series of strategies to improve its robustness to various densities and its efficiency. Unlike the original DBSCAN, we first use the binary local sensitive hashing (LSH) which enables faster region query for the k neighbors of a data point. The binary data representation method based on k neighborhood is then proposed to map the dataset into the Hamming space for faster cluster expansion. We define a core point based on binary influence space to enhance the robustness to various densities. Also, we propose a seed point selection method, which is based on influence space and k neighborhood similarity, to select some seed points instead of all the neighborhood during cluster expansion. Consequently, the number of region queries can be decreased. The experimental results show that the improved algorithm can greatly improve the clustering speed under the premise of ensuring better algorithm clustering accuracy, especially for large-scale datasets.
format Article
id doaj-art-5d3430abcef64a5cba9af71ce39c42a6
institution Kabale University
issn 1687-5680
1687-5699
language English
publishDate 2017-01-01
publisher Wiley
record_format Article
series Advances in Multimedia
spelling doaj-art-5d3430abcef64a5cba9af71ce39c42a62025-02-03T05:51:51ZengWileyAdvances in Multimedia1687-56801687-56992017-01-01201710.1155/2017/36953233695323A Novel DBSCAN Based on Binary Local Sensitive Hashing and Binary-KNN RepresentationQing He0Hai Xia Gu1Qin Wei2Xu Wang3College of Big Data and Information Engineering, Guizhou University, Guiyang 550025, ChinaCollege of Big Data and Information Engineering, Guizhou University, Guiyang 550025, ChinaGuizhou University, Guizhou Provincial Key Laboratory of Public Big Data, Guiyang, Guizhou 550025, ChinaCollege of Big Data and Information Engineering, Guizhou University, Guiyang 550025, ChinaWe revisit the classic DBSCAN algorithm by proposing a series of strategies to improve its robustness to various densities and its efficiency. Unlike the original DBSCAN, we first use the binary local sensitive hashing (LSH) which enables faster region query for the k neighbors of a data point. The binary data representation method based on k neighborhood is then proposed to map the dataset into the Hamming space for faster cluster expansion. We define a core point based on binary influence space to enhance the robustness to various densities. Also, we propose a seed point selection method, which is based on influence space and k neighborhood similarity, to select some seed points instead of all the neighborhood during cluster expansion. Consequently, the number of region queries can be decreased. The experimental results show that the improved algorithm can greatly improve the clustering speed under the premise of ensuring better algorithm clustering accuracy, especially for large-scale datasets.http://dx.doi.org/10.1155/2017/3695323
spellingShingle Qing He
Hai Xia Gu
Qin Wei
Xu Wang
A Novel DBSCAN Based on Binary Local Sensitive Hashing and Binary-KNN Representation
Advances in Multimedia
title A Novel DBSCAN Based on Binary Local Sensitive Hashing and Binary-KNN Representation
title_full A Novel DBSCAN Based on Binary Local Sensitive Hashing and Binary-KNN Representation
title_fullStr A Novel DBSCAN Based on Binary Local Sensitive Hashing and Binary-KNN Representation
title_full_unstemmed A Novel DBSCAN Based on Binary Local Sensitive Hashing and Binary-KNN Representation
title_short A Novel DBSCAN Based on Binary Local Sensitive Hashing and Binary-KNN Representation
title_sort novel dbscan based on binary local sensitive hashing and binary knn representation
url http://dx.doi.org/10.1155/2017/3695323
work_keys_str_mv AT qinghe anoveldbscanbasedonbinarylocalsensitivehashingandbinaryknnrepresentation
AT haixiagu anoveldbscanbasedonbinarylocalsensitivehashingandbinaryknnrepresentation
AT qinwei anoveldbscanbasedonbinarylocalsensitivehashingandbinaryknnrepresentation
AT xuwang anoveldbscanbasedonbinarylocalsensitivehashingandbinaryknnrepresentation
AT qinghe noveldbscanbasedonbinarylocalsensitivehashingandbinaryknnrepresentation
AT haixiagu noveldbscanbasedonbinarylocalsensitivehashingandbinaryknnrepresentation
AT qinwei noveldbscanbasedonbinarylocalsensitivehashingandbinaryknnrepresentation
AT xuwang noveldbscanbasedonbinarylocalsensitivehashingandbinaryknnrepresentation