A Robust k-Means Clustering Algorithm Based on Observation Point Mechanism

The k-means algorithm is sensitive to the outliers. In this paper, we propose a robust two-stage k-means clustering algorithm based on the observation point mechanism, which can accurately discover the cluster centers without the disturbance of outliers. In the first stage, a small subset of the ori...

Full description

Saved in:
Bibliographic Details
Main Authors: Xiaoliang Zhang, Yulin He, Yi Jin, Honglian Qin, Muhammad Azhar, Joshua Zhexue Huang
Format: Article
Language:English
Published: Wiley 2020-01-01
Series:Complexity
Online Access:http://dx.doi.org/10.1155/2020/3650926
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1832553740485263360
author Xiaoliang Zhang
Yulin He
Yi Jin
Honglian Qin
Muhammad Azhar
Joshua Zhexue Huang
author_facet Xiaoliang Zhang
Yulin He
Yi Jin
Honglian Qin
Muhammad Azhar
Joshua Zhexue Huang
author_sort Xiaoliang Zhang
collection DOAJ
description The k-means algorithm is sensitive to the outliers. In this paper, we propose a robust two-stage k-means clustering algorithm based on the observation point mechanism, which can accurately discover the cluster centers without the disturbance of outliers. In the first stage, a small subset of the original data set is selected based on a set of nondegenerate observation points. The subset is a good representation of the original data set because it only contains all those points that have a higher density of the original data set and does not include the outliers. In the second stage, we use the k-means clustering algorithm to cluster the selected subset and find the proper cluster centers as the true cluster centers of the original data set. Based on these cluster centers, the rest data points of the original data set are assigned to the clusters whose centers are the closest to the data points. The theoretical analysis and experimental results show that the proposed clustering algorithm has the lower computational complexity and better robustness in comparison with k-means clustering algorithm, thus demonstrating the feasibility and effectiveness of our proposed clustering algorithm.
format Article
id doaj-art-43085d98e54f4668ab71a343fc6201e6
institution Kabale University
issn 1076-2787
1099-0526
language English
publishDate 2020-01-01
publisher Wiley
record_format Article
series Complexity
spelling doaj-art-43085d98e54f4668ab71a343fc6201e62025-02-03T05:53:21ZengWileyComplexity1076-27871099-05262020-01-01202010.1155/2020/36509263650926A Robust k-Means Clustering Algorithm Based on Observation Point MechanismXiaoliang Zhang0Yulin He1Yi Jin2Honglian Qin3Muhammad Azhar4Joshua Zhexue Huang5College of Computer Science & Software Engineering, Shenzhen University, Shenzhen 518060, ChinaCollege of Computer Science & Software Engineering, Shenzhen University, Shenzhen 518060, ChinaDepartment of Trace Inspection Technology, Criminal Investigation Police University of China, Shenyang 110854, ChinaCollege of Computer Science & Software Engineering, Shenzhen University, Shenzhen 518060, ChinaCollege of Computer Science & Software Engineering, Shenzhen University, Shenzhen 518060, ChinaCollege of Computer Science & Software Engineering, Shenzhen University, Shenzhen 518060, ChinaThe k-means algorithm is sensitive to the outliers. In this paper, we propose a robust two-stage k-means clustering algorithm based on the observation point mechanism, which can accurately discover the cluster centers without the disturbance of outliers. In the first stage, a small subset of the original data set is selected based on a set of nondegenerate observation points. The subset is a good representation of the original data set because it only contains all those points that have a higher density of the original data set and does not include the outliers. In the second stage, we use the k-means clustering algorithm to cluster the selected subset and find the proper cluster centers as the true cluster centers of the original data set. Based on these cluster centers, the rest data points of the original data set are assigned to the clusters whose centers are the closest to the data points. The theoretical analysis and experimental results show that the proposed clustering algorithm has the lower computational complexity and better robustness in comparison with k-means clustering algorithm, thus demonstrating the feasibility and effectiveness of our proposed clustering algorithm.http://dx.doi.org/10.1155/2020/3650926
spellingShingle Xiaoliang Zhang
Yulin He
Yi Jin
Honglian Qin
Muhammad Azhar
Joshua Zhexue Huang
A Robust k-Means Clustering Algorithm Based on Observation Point Mechanism
Complexity
title A Robust k-Means Clustering Algorithm Based on Observation Point Mechanism
title_full A Robust k-Means Clustering Algorithm Based on Observation Point Mechanism
title_fullStr A Robust k-Means Clustering Algorithm Based on Observation Point Mechanism
title_full_unstemmed A Robust k-Means Clustering Algorithm Based on Observation Point Mechanism
title_short A Robust k-Means Clustering Algorithm Based on Observation Point Mechanism
title_sort robust k means clustering algorithm based on observation point mechanism
url http://dx.doi.org/10.1155/2020/3650926
work_keys_str_mv AT xiaoliangzhang arobustkmeansclusteringalgorithmbasedonobservationpointmechanism
AT yulinhe arobustkmeansclusteringalgorithmbasedonobservationpointmechanism
AT yijin arobustkmeansclusteringalgorithmbasedonobservationpointmechanism
AT honglianqin arobustkmeansclusteringalgorithmbasedonobservationpointmechanism
AT muhammadazhar arobustkmeansclusteringalgorithmbasedonobservationpointmechanism
AT joshuazhexuehuang arobustkmeansclusteringalgorithmbasedonobservationpointmechanism
AT xiaoliangzhang robustkmeansclusteringalgorithmbasedonobservationpointmechanism
AT yulinhe robustkmeansclusteringalgorithmbasedonobservationpointmechanism
AT yijin robustkmeansclusteringalgorithmbasedonobservationpointmechanism
AT honglianqin robustkmeansclusteringalgorithmbasedonobservationpointmechanism
AT muhammadazhar robustkmeansclusteringalgorithmbasedonobservationpointmechanism
AT joshuazhexuehuang robustkmeansclusteringalgorithmbasedonobservationpointmechanism