Clustering large datasets with kernel methods

Creators: Faußer, Stefan A. and Schwenker, Friedhelm
Title: Clustering large datasets with kernel methods
Item Type: Conference or Workshop Item
Event Title: (Proceedings of the) 21st International Conference on Pattern Recognition. (ICPR ’12) ; Vol. 1
Event Location: Tsukuba, Japan
Event Dates: November 11-15, 2012
Page Range: pp. 501-504
Date: 2012
Divisions: Informationsmanagement
Abstract (ENG): Real-life datasets are becoming larger and less linearly separable. Divisive clustering methods whose computation time is linear in the number of samples n can handle large datasets but mostly assume linear boundaries between the clusters in input space. Kernel-based clustering methods can detect nonlinear boundaries in feature space but have a quadratic computation time O(n^2). In this paper, we propose a meta-algorithm that distributes small subsets of the large dataset, clusters these subsets in parallel, and merges the resulting approximate pseudo-centres repeatedly until the whole dataset has been processed. The meta-algorithm can use a wide range of kernel-based clustering methods; here we integrate Kernel Fuzzy C-Means and Relational Neural Gas. We show analytically that the algorithm has a linear computation time O(n). In the experiments, we empirically evaluate the performance of the method on two real-life datasets.
Forthcoming: No
Language: English
Citation:

Faußer, Stefan A. and Schwenker, Friedhelm (2012) Clustering large datasets with kernel methods. In: (Proceedings of the) 21st International Conference on Pattern Recognition. (ICPR ’12) ; Vol. 1, November 11-15, 2012, Tsukuba, Japan, pp. 501-504. ISBN 9781467322164
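
Illustrative sketch (not taken from the paper): the abstract describes clustering small subsets with a kernel method and merging the resulting approximate pseudo-centres until the whole dataset has been processed. The Python code below is a minimal sketch of that general idea under several assumptions. It substitutes plain kernel k-means for Kernel Fuzzy C-Means and Relational Neural Gas, approximates each pseudo-centre by the subset member closest to it in feature space, treats carried-over representatives as ordinary unweighted samples, and processes the subsets one after another rather than in parallel. All names (rbf_kernel, kernel_kmeans, cluster_in_chunks, chunk_size, gamma) are hypothetical. With a fixed subset size m, each step costs O((m + k)^2) and there are about n/m steps, so the total cost grows linearly in n.

```python
import numpy as np


def rbf_kernel(a, b, gamma=1.0):
    # Gaussian kernel matrix between the rows of a and the rows of b.
    d2 = (a ** 2).sum(1)[:, None] + (b ** 2).sum(1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


def _feature_space_dists(K, labels, k):
    # Squared distance of every sample to every cluster mean in feature space:
    # ||phi(x_i) - mu_c||^2 = K_ii - 2 * mean_j K_ij + mean_{j,l} K_jl.
    # Columns of empty clusters stay at infinity.
    dist = np.full((K.shape[0], k), np.inf)
    for c in range(k):
        mask = labels == c
        if mask.any():
            dist[:, c] = (np.diag(K) - 2.0 * K[:, mask].mean(1)
                          + K[np.ix_(mask, mask)].mean())
    return dist


def kernel_kmeans(x, k, gamma=1.0, iters=20, seed=0):
    # Plain kernel k-means on one small subset (stand-in for the paper's methods).
    rng = np.random.default_rng(seed)
    K = rbf_kernel(x, x, gamma)
    labels = rng.integers(0, k, size=len(x))
    for _ in range(iters):
        new_labels = _feature_space_dists(K, labels, k).argmin(1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    # Approximate each pseudo-centre by the cluster member closest to it.
    dist = _feature_space_dists(K, labels, k)
    reps = []
    for c in range(k):
        idx = np.flatnonzero(labels == c)
        if idx.size:
            reps.append(x[idx[dist[idx, c].argmin()]])
    return np.array(reps)


def cluster_in_chunks(x, k, chunk_size=200, gamma=1.0, seed=0):
    # Visit the data in random order, one fixed-size subset at a time.
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))
    centres = np.empty((0, x.shape[1]))
    for start in range(0, len(x), chunk_size):
        chunk = x[order[start:start + chunk_size]]
        # Merge step: earlier representatives join the next subset.
        merged = np.vstack([centres, chunk])
        centres = kernel_kmeans(merged, min(k, len(merged)), gamma, seed=seed)
    return centres
```

Calling cluster_in_chunks(data, k=3, chunk_size=200) on an (n, d) array returns at most k representative samples drawn from the data. A closer rendering of the abstract would cluster the subsets concurrently and weight the merged representatives; both are omitted here for brevity.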
