Classification in Networked Data with Heterophily

In the real world, a large amount of data can be described by networks using relations between data. The data described by networks can be called networked data. Classification is one of the main tasks in analyzing networked data. Most of the previous methods find the class of the unlabeled node usi...

Bibliographic Details
Main Authors: Zhenwen Wang, Fengjing Yin, Wentang Tan, Weidong Xiao
Format: Article
Language: English
Published: Wiley 2013-01-01
Series: The Scientific World Journal
Online Access: http://dx.doi.org/10.1155/2013/236769
collection DOAJ
description Many real-world datasets can be represented as networks in which relations link the data items; such data are called networked data. Classification is one of the main tasks in analyzing networked data. Most previous methods infer the class of an unlabeled node from the classes of its neighbor nodes. However, in networks with heterophily, most connected nodes belong to different classes, so the classes of neighbor nodes are a poor guide to the correct class and previous methods perform poorly. This paper proposes a probabilistic method to address this problem. First, the class propagating distribution of a node is introduced to describe the probabilities that its neighbor nodes belong to each class. Next, the class propagating distributions of neighbor nodes are used to calculate the class of the unlabeled node. Finally, a classification algorithm based on the class propagating distribution is presented in the form of matrix operations. In an empirical study, the proposed algorithm is applied to real-world datasets and compared with several other algorithms. The experimental results show that the proposed algorithm performs better on networks with heterophily.
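The idea summarized in the description can be illustrated with a small sketch. This is not the authors' algorithm, which the record does not reproduce; it is a simplified one-step illustration under stated assumptions: the propagation distribution is estimated per class from edges whose endpoints are both labeled, and an unlabeled node is scored by summing the distributions of its labeled neighbors. The function names `estimate_class_propagation` and `classify` and the toy graph are hypothetical.

```python
def estimate_class_propagation(edges, labels, num_classes):
    """Row k approximates the class distribution of the neighbors of a class-k node.

    Assumption: estimated only from edges whose two endpoints are both labeled.
    """
    counts = [[0.0] * num_classes for _ in range(num_classes)]
    for u, v in edges:
        if u in labels and v in labels:
            counts[labels[u]][labels[v]] += 1
            counts[labels[v]][labels[u]] += 1
    rows = []
    for row in counts:
        total = sum(row)
        if total:
            rows.append([x / total for x in row])
        else:
            # No evidence for this class: fall back to a uniform distribution.
            rows.append([1.0 / num_classes] * num_classes)
    return rows


def classify(edges, labels, num_classes):
    """Predict a class for every unlabeled node that has at least one edge."""
    prop = estimate_class_propagation(edges, labels, num_classes)
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, []).append(v)
        neighbors.setdefault(v, []).append(u)
    predicted = {}
    for node, nbrs in neighbors.items():
        if node in labels:
            continue
        scores = [0.0] * num_classes
        for nb in nbrs:
            if nb in labels:
                # A labeled neighbor of class k votes with row k of the
                # propagation matrix, not with its own class.
                for c in range(num_classes):
                    scores[c] += prop[labels[nb]][c]
        predicted[node] = max(range(num_classes), key=scores.__getitem__)
    return predicted


# Toy heterophilous graph: every edge joins a class-0 node to a class-1 node.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (4, 1), (4, 3), (5, 0), (5, 2)]
labels = {0: 0, 1: 1, 2: 0, 3: 1}
predicted = classify(edges, labels, num_classes=2)
```

In this toy graph, node 4 is linked only to class-1 nodes, so a plain majority vote over neighbor classes would assign it class 1; scoring with the estimated propagation distributions assigns it class 0 instead, which matches the heterophilous edge pattern.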
format Article
id doaj-art-6d45170b65214f2e8299c8b6c1804fa2
institution Kabale University
issn 1537-744X
spelling doaj-art-6d45170b65214f2e8299c8b6c1804fa2; 2025-02-03T01:12:55Z; eng; Wiley; The Scientific World Journal; 1537-744X; 2013-01-01; 2013; 10.1155/2013/236769; 236769; Classification in Networked Data with Heterophily; Zhenwen Wang (0), Fengjing Yin (1), Wentang Tan (2), Weidong Xiao (3); affiliation of authors 0 to 3: College of Information System and Management, National University of Defense Technology, Changsha 410073, China; http://dx.doi.org/10.1155/2013/236769