Among the many ensemble learning techniques, boosting and bagging are the most popular sampling-based ensemble approaches for classification problems. Boosting is considered stronger than bagging on noise-free data sets with complex class structures, whereas bagging is more robust than boosting when noisy data are present. In this paper, we extend both ensemble approaches to clustering tasks and propose a novel hybrid sampling-based clustering ensemble that combines the strengths of boosting and bagging.
In our approach, the input partitions are generated iteratively via a hybrid process inspired by both boosting and bagging. A novel consensus function then encodes the local and global cluster structure of the input partitions into a single representation, and a single clustering algorithm is applied to this representation to obtain the consolidated consensus partition. We evaluate our approach on 2-D synthetic data, a collection of benchmarks, and real-world facial recognition data sets; the results show that the proposed technique outperforms the existing benchmarks on a variety of clustering tasks.
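To make the two-stage pipeline concrete, the following is a minimal sketch under stated assumptions, not the method evaluated in this paper. The abstract does not specify the sampling rule or the consensus function, so the sketch substitutes simple stand-ins: a sampling distribution that mixes a uniform bootstrap (bagging-like) with weights favoring poorly fitted points (boosting-like), a k-means base clusterer with farthest-first initialization, and a co-association matrix as the single representation. All names (`hybrid_partitions`, `consensus`, `mix`, `n_rounds`) are hypothetical.

```python
import numpy as np

def kmeans(X, k, rng, n_iter=50):
    """Lloyd's k-means with farthest-first initialization; returns centers."""
    centers = [X[rng.integers(len(X))]]
    for _ in range(1, k):
        # Next center: the point farthest from all centers chosen so far.
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        labels, _ = assign(X, centers)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster empties
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def assign(X, centers):
    """Nearest-center label and squared distance for every row of X."""
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1), d.min(axis=1)

def hybrid_partitions(X, k, n_rounds=10, mix=0.5, seed=0):
    """Generate input partitions from resampled data (assumed hybrid rule).

    mix=1 gives a uniform bootstrap (bagging-like); mix=0 samples purely
    from weights that favor poorly fitted points (boosting-like).
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(X)
    weights = np.full(n, 1.0 / n)
    partitions = []
    for _ in range(n_rounds):
        p = mix / n + (1.0 - mix) * weights
        idx = rng.choice(n, size=n, replace=True, p=p)
        centers = kmeans(X[idx], k, rng)
        labels, dist = assign(X, centers)  # label all points, not only the sample
        partitions.append(labels)
        weights = dist + 1e-12             # up-weight points far from any center
        weights /= weights.sum()
    return partitions

def consensus(partitions, k, seed=0):
    """Co-association stand-in: cluster the rows of the agreement matrix."""
    P = np.stack(partitions)                                 # (rounds, n)
    coassoc = (P[:, :, None] == P[:, None, :]).mean(axis=0)  # (n, n)
    centers = kmeans(coassoc, k, np.random.default_rng(seed))
    return assign(coassoc, centers)[0]

# Usage on two well-separated 2-D blobs of 30 points each.
data_rng = np.random.default_rng(42)
X = np.vstack([data_rng.normal(0.0, 0.3, size=(30, 2)),
               data_rng.normal(10.0, 0.3, size=(30, 2))])
labels = consensus(hybrid_partitions(X, k=2), k=2)
```

In this sketch the `mix` parameter is the only point where the two sampling ideas interact, and a single clustering call on the co-association rows plays the role of the consensus step. Any base clusterer or representation could be substituted without changing the overall structure.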