Hierarchical clustering methods in a task to find abnormal observations based on groups with broken symmetry

Bibliographic Details
Main Authors: A. N. Kislyakov, S. V. Polyakov
Format: Article
Language: English
Published: North-West institute of management of the Russian Presidential Academy of National Economy and Public Administration 2020-06-01
Series: Управленческое консультирование
Subjects: cluster analysis; network graphs; symmetry breaking; anomalous observations; decision trees
Online Access: https://www.acjournal.ru/jour/article/view/1423
collection DOAJ
description This work addresses the pressing problem of identifying and interpreting anomalous observations in studies of socio-economic processes. The proposed method takes a cluster-based approach to anomaly detection. Clustering is performed with hierarchical methods, a family of data-ordering algorithms that build dendrograms from groups of observed points. For mixed data containing both numeric and categorical variables, the Gower distance is proposed as the measure of distance between elements. Clustering quality is evaluated using the within-cluster sum of squared distances between objects and the average silhouette width. These indicators make it possible to select the optimal number of clusters and to assess the quality of the resulting partition. The dendrogram can also be used to study the symmetry groups of cluster systems and the causes of symmetry breaking. Anomalies are detected by analyzing the hierarchical clustering results and identifying dendrogram branches that split off at the initial levels of tree construction and have no sub-branches of their own. The implemented method allows clustering results to be interpreted more accurately with respect to errors of the first and second kind (Type I and Type II errors) that take the form of anomalous observations in the data set. The described method can be used to investigate socio-economic systems effectively and to manage their development.
id doaj-art-c231fd1be269415e93b2ca53769f9a57
institution OA Journals
issn 1726-1139
1816-8590
spelling doaj-art-c231fd1be269415e93b2ca53769f9a57 | 2025-08-20T02:04:25Z | eng | North-West institute of management of the Russian Presidential Academy of National Economy and Public Administration | Управленческое консультирование | ISSN 1726-1139, 1816-8590 | 2020-06-01 | issue 5, pp. 116-127 | DOI 10.22394/1726-1139-2020-5-116-127 | 1276 | Hierarchical clustering methods in a task to find abnormal observations based on groups with broken symmetry | A. N. Kislyakov, S. V. Polyakov (both: Russian Presidential Academy of National Economy and Public Administration (Vladimir Branch)) | https://www.acjournal.ru/jour/article/view/1423 | cluster analysis; network graphs; symmetry breaking; anomalous observations; decision trees
title Hierarchical clustering methods in a task to find abnormal observations based on groups with broken symmetry
topic cluster analysis
network graphs
symmetry breaking
anomalous observations
decision trees
url https://www.acjournal.ru/jour/article/view/1423
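
The procedure summarized in the description field above (Gower distance for mixed data, hierarchical clustering, choosing the number of clusters by average silhouette width, flagging dendrogram branches without sub-branches) might look roughly like the following minimal sketch. It is an illustration under stated assumptions, not the authors' implementation: the toy data, the helper names `gower_matrix`, `mean_silhouette` and `singleton_anomalies`, the use of average linkage, and the simplification of treating one-member clusters as the "branches without sub-branches" are all hypothetical choices made here. The within-cluster sum of squares criterion mentioned in the record is omitted for brevity.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform


def gower_matrix(numeric, categorical):
    """Pairwise Gower distances for mixed numeric/categorical data."""
    numeric = np.asarray(numeric, dtype=float)
    categorical = np.asarray(categorical)
    n = numeric.shape[0]
    total = np.zeros((n, n))
    # Numeric features: absolute difference scaled by the feature range.
    for j in range(numeric.shape[1]):
        col = numeric[:, j]
        value_range = col.max() - col.min()
        if value_range > 0:
            total += np.abs(col[:, None] - col[None, :]) / value_range
    # Categorical features: 0 if the labels match, 1 otherwise.
    for j in range(categorical.shape[1]):
        col = categorical[:, j]
        total += (col[:, None] != col[None, :]).astype(float)
    return total / (numeric.shape[1] + categorical.shape[1])


def mean_silhouette(dist, labels):
    """Average silhouette width from a precomputed distance matrix."""
    scores = []
    for i in range(len(labels)):
        same = labels == labels[i]
        if same.sum() == 1:
            scores.append(0.0)  # common convention for one-member clusters
            continue
        a = dist[i, same].sum() / (same.sum() - 1)  # mean distance inside own cluster
        b = min(dist[i, labels == k].mean()         # mean distance to nearest other cluster
                for k in np.unique(labels) if k != labels[i])
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return float(np.mean(scores))


def singleton_anomalies(dist, k, method="average"):
    """Cut the dendrogram into at most k clusters and flag one-member clusters."""
    tree = linkage(squareform(dist, checks=False), method=method)
    labels = fcluster(tree, t=k, criterion="maxclust")
    ids, counts = np.unique(labels, return_counts=True)
    lonely = ids[counts == 1]
    return labels, [int(i) for i in np.flatnonzero(np.isin(labels, lonely))]


# Hypothetical toy data: two tight groups plus one observation far from both.
numeric = [[1.0], [1.2], [0.9], [5.0], [5.3], [4.6], [20.0]]
categorical = [["a"], ["a"], ["a"], ["b"], ["b"], ["b"], ["c"]]

dist = gower_matrix(numeric, categorical)

# Choose the number of clusters with the largest average silhouette width.
scores = {}
for k in range(2, 5):
    labels_k, _ = singleton_anomalies(dist, k)
    scores[k] = mean_silhouette(dist, labels_k)
best_k = max(scores, key=scores.get)

labels, anomalies = singleton_anomalies(dist, best_k)
print("chosen number of clusters:", best_k)
print("indices flagged as anomalous:", anomalies)
```

Flagging one-member clusters is only a rough stand-in for inspecting the dendrogram itself; in practice the merge heights returned by `linkage` would also be examined, since an observation that joins the tree only at a very large height is the kind of isolated branch the record describes.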