Effective Density-Based Clustering Algorithms for Incomplete Data
Density-based clustering is an important category among clustering algorithms. In real applications, many datasets suffer from incompleteness. Traditional imputation technologies or other techniques for handling missing values are not suitable for density-based clustering and decrease clustering res...
Main Authors: | Zhonghao Xue, Hongzhi Wang |
---|---|
Format: | Article |
Language: | English |
Published: | Tsinghua University Press, 2021-09-01 |
Series: | Big Data Mining and Analytics |
Subjects: | density-based clustering; incomplete data; clustering algorithm |
Online Access: | https://www.sciopen.com/article/10.26599/BDMA.2021.9020001 |
_version_ | 1832572790717284352 |
---|---|
author | Zhonghao Xue Hongzhi Wang |
author_facet | Zhonghao Xue Hongzhi Wang |
author_sort | Zhonghao Xue |
collection | DOAJ |
description | Density-based clustering is an important category among clustering algorithms. In real applications, many datasets suffer from incompleteness. Traditional imputation technologies or other techniques for handling missing values are not suitable for density-based clustering and decrease clustering result quality. To avoid these problems, we develop a novel density-based clustering approach for incomplete data based on Bayesian theory, which conducts imputation and clustering concurrently and makes use of intermediate clustering results. To avoid the impact of low-density areas inside non-convex clusters, we introduce a local imputation clustering algorithm, which aims to impute points to high-density local areas. The performances of the proposed algorithms are evaluated using ten synthetic datasets and five real-world datasets with induced missing values. The experimental results show the effectiveness of the proposed algorithms. |
format | Article |
id | doaj-art-1c0dbb6bfa7b4787b96d5e0f7c459a9f |
institution | Kabale University |
issn | 2096-0654 |
language | English |
publishDate | 2021-09-01 |
publisher | Tsinghua University Press |
record_format | Article |
series | Big Data Mining and Analytics |
spelling | doaj-art-1c0dbb6bfa7b4787b96d5e0f7c459a9f; 2025-02-02T06:50:33Z; eng; Tsinghua University Press; Big Data Mining and Analytics; ISSN 2096-0654; 2021-09-01; vol. 4, no. 3, pp. 183-194; 10.26599/BDMA.2021.9020001; Effective Density-Based Clustering Algorithms for Incomplete Data; Zhonghao Xue (USC Viterbi School of Engineering, University of Southern California, Los Angeles, CA 90007, USA); Hongzhi Wang (Department of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China); https://www.sciopen.com/article/10.26599/BDMA.2021.9020001; density-based clustering; incomplete data; clustering algorithm |
spellingShingle | Zhonghao Xue Hongzhi Wang Effective Density-Based Clustering Algorithms for Incomplete Data Big Data Mining and Analytics density-based clustering incomplete data clustering algorithm |
title | Effective Density-Based Clustering Algorithms for Incomplete Data |
title_full | Effective Density-Based Clustering Algorithms for Incomplete Data |
title_fullStr | Effective Density-Based Clustering Algorithms for Incomplete Data |
title_full_unstemmed | Effective Density-Based Clustering Algorithms for Incomplete Data |
title_short | Effective Density-Based Clustering Algorithms for Incomplete Data |
title_sort | effective density based clustering algorithms for incomplete data |
topic | density-based clustering incomplete data clustering algorithm |
url | https://www.sciopen.com/article/10.26599/BDMA.2021.9020001 |
work_keys_str_mv | AT zhonghaoxue effectivedensitybasedclusteringalgorithmsforincompletedata AT hongzhiwang effectivedensitybasedclusteringalgorithmsforincompletedata |