Subspace Clustering of High-Dimensional Data: An Evolutionary Approach

Bibliographic Details
Main Authors:	Singh Vijendra, Sahoo Laxman
Format:	Article
Language:	English
Published:	Wiley 2013-01-01
Series:	Applied Computational Intelligence and Soft Computing
Online Access:	http://dx.doi.org/10.1155/2013/863146

_version_	1832562023207010304
author	Singh Vijendra Sahoo Laxman
author_facet	Singh Vijendra Sahoo Laxman
author_sort	Singh Vijendra
collection	DOAJ
description	Clustering high-dimensional data has been a major challenge due to the inherent sparsity of the points. Most existing clustering algorithms become substantially inefficient if the required similarity measure is computed between data points in the full-dimensional space. In this paper, we present a robust multi-objective subspace clustering (MOSCL) algorithm for the challenging problem of high-dimensional clustering. The first phase of MOSCL performs subspace relevance analysis by detecting dense and sparse regions and their locations in the data set. After detecting the dense regions, it eliminates outliers. MOSCL discovers subspaces in the dense regions of the data set and produces subspace clusters. In thorough experiments on synthetic and real-world data sets, we demonstrate that MOSCL is superior to the PROCLUS clustering algorithm for subspace clustering. Additionally, we investigate the effects of the first phase, which detects dense regions, on the results of subspace clustering. Our results indicate that removing outliers improves the accuracy of subspace clustering. The clustering results are validated by the clustering error (CE) distance on various data sets. MOSCL can discover high-quality clusters in all subspaces, and it is more efficient than PROCLUS.
format	Article
id	doaj-art-f7c2aae2e309402884caf4898141227e
institution	Kabale University
issn	1687-9724 1687-9732
language	English
publishDate	2013-01-01
publisher	Wiley
record_format	Article
series	Applied Computational Intelligence and Soft Computing
spelling	doaj-art-f7c2aae2e309402884caf4898141227e | 2025-02-03T01:23:40Z | eng | Wiley | Applied Computational Intelligence and Soft Computing | 1687-9724 | 1687-9732 | 2013-01-01 | 2013 | 10.1155/2013/863146 | 863146 | Subspace Clustering of High-Dimensional Data: An Evolutionary Approach | Singh Vijendra 0 | Sahoo Laxman 1 | Department of Computer Science and Engineering, Faculty of Engineering and Technology, Mody Institute of Technology and Science, Lakshmangarh, Rajasthan 332311, India | School of Computer Engineering, KIIT University, Bhubaneswar 751024, India | Clustering high-dimensional data has been a major challenge due to the inherent sparsity of the points. Most existing clustering algorithms become substantially inefficient if the required similarity measure is computed between data points in the full-dimensional space. In this paper, we present a robust multi-objective subspace clustering (MOSCL) algorithm for the challenging problem of high-dimensional clustering. The first phase of MOSCL performs subspace relevance analysis by detecting dense and sparse regions and their locations in the data set. After detecting the dense regions, it eliminates outliers. MOSCL discovers subspaces in the dense regions of the data set and produces subspace clusters. In thorough experiments on synthetic and real-world data sets, we demonstrate that MOSCL is superior to the PROCLUS clustering algorithm for subspace clustering. Additionally, we investigate the effects of the first phase, which detects dense regions, on the results of subspace clustering. Our results indicate that removing outliers improves the accuracy of subspace clustering. The clustering results are validated by the clustering error (CE) distance on various data sets. MOSCL can discover high-quality clusters in all subspaces, and it is more efficient than PROCLUS. | http://dx.doi.org/10.1155/2013/863146
spellingShingle	Singh Vijendra Sahoo Laxman Subspace Clustering of High-Dimensional Data: An Evolutionary Approach Applied Computational Intelligence and Soft Computing
title	Subspace Clustering of High-Dimensional Data: An Evolutionary Approach
title_full	Subspace Clustering of High-Dimensional Data: An Evolutionary Approach
title_fullStr	Subspace Clustering of High-Dimensional Data: An Evolutionary Approach
title_full_unstemmed	Subspace Clustering of High-Dimensional Data: An Evolutionary Approach
title_short	Subspace Clustering of High-Dimensional Data: An Evolutionary Approach
title_sort	subspace clustering of high dimensional data an evolutionary approach
url	http://dx.doi.org/10.1155/2013/863146
work_keys_str_mv	AT singhvijendra subspaceclusteringofhighdimensionaldataanevolutionaryapproach AT sahoolaxman subspaceclusteringofhighdimensionaldataanevolutionaryapproach

Similar Items