Effective Density-Based Clustering Algorithms for Incomplete Data

Density-based clustering is an important category of clustering algorithms. In real applications, many datasets are incomplete. Traditional imputation techniques and other methods for handling missing values are not suitable for density-based clustering and decrease clustering res...

Bibliographic Details
Main Authors: Zhonghao Xue, Hongzhi Wang
Format: Article
Language: English
Published: Tsinghua University Press 2021-09-01
Series: Big Data Mining and Analytics
Subjects: density-based clustering; incomplete data; clustering algorithm
Online Access: https://www.sciopen.com/article/10.26599/BDMA.2021.9020001
author Zhonghao Xue
Hongzhi Wang
author_facet Zhonghao Xue
Hongzhi Wang
author_sort Zhonghao Xue
collection DOAJ
description Density-based clustering is an important category of clustering algorithms. In real applications, many datasets are incomplete. Traditional imputation techniques and other methods for handling missing values are not suitable for density-based clustering and reduce the quality of clustering results. To avoid these problems, we develop a novel density-based clustering approach for incomplete data based on Bayesian theory, which performs imputation and clustering concurrently and makes use of intermediate clustering results. To reduce the impact of low-density areas inside non-convex clusters, we introduce a local imputation clustering algorithm, which aims to impute points into high-density local areas. The performance of the proposed algorithms is evaluated on ten synthetic datasets and five real-world datasets with induced missing values. The experimental results show the effectiveness of the proposed algorithms.
format Article
id doaj-art-1c0dbb6bfa7b4787b96d5e0f7c459a9f
institution Kabale University
issn 2096-0654
language English
publishDate 2021-09-01
publisher Tsinghua University Press
record_format Article
series Big Data Mining and Analytics
spelling doaj-art-1c0dbb6bfa7b4787b96d5e0f7c459a9f; 2025-02-02T06:50:33Z; eng; Tsinghua University Press; Big Data Mining and Analytics; 2096-0654; 2021-09-01; vol. 4, no. 3, pp. 183-194; 10.26599/BDMA.2021.9020001; Effective Density-Based Clustering Algorithms for Incomplete Data; Zhonghao Xue (USC Viterbi School of Engineering, University of Southern California, Los Angeles, CA 90007, USA); Hongzhi Wang (Department of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China); https://www.sciopen.com/article/10.26599/BDMA.2021.9020001; density-based clustering; incomplete data; clustering algorithm
spellingShingle Zhonghao Xue
Hongzhi Wang
Effective Density-Based Clustering Algorithms for Incomplete Data
Big Data Mining and Analytics
density-based clustering
incomplete data
clustering algorithm
title Effective Density-Based Clustering Algorithms for Incomplete Data
title_sort effective density based clustering algorithms for incomplete data
topic density-based clustering
incomplete data
clustering algorithm
url https://www.sciopen.com/article/10.26599/BDMA.2021.9020001
work_keys_str_mv AT zhonghaoxue effectivedensitybasedclusteringalgorithmsforincompletedata
AT hongzhiwang effectivedensitybasedclusteringalgorithmsforincompletedata