A Unified Framework for Data Visualization and Coclustering

We propose a new theoretical framework for data visualization. The framework is based on an iterative procedure that seeks an appropriate approximation of the data matrix $A$ using two stochastic similarity matrices, one defined on the set of rows and one on the set of columns. The process converges to a steady state in which the approximated data matrix $\hat{A}$ is composed of $g$ similar rows and $l$ similar columns.
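The iteration described above can be illustrated with a minimal sketch. The abstract does not say how the two similarity matrices are built or how they are applied, so everything specific here is an assumption: the hypothetical helpers `row_stochastic` and `smooth`, the choice of row-normalized $A A^T$ and $A^T A$ as similarities, the update $\hat{A} \leftarrow S_R \hat{A} S_C^T$, and a nonnegative $A$ with no all-zero row or column.

```python
import numpy as np

def row_stochastic(M):
    """Divide each row by its sum so that every row sums to 1."""
    return M / M.sum(axis=1, keepdims=True)

def smooth(A, n_iter=50):
    """Repeatedly average A with fixed row and column similarity matrices.

    Assumption (not taken from the abstract): the similarities are the
    row-normalized Gram matrices of the rows and of the columns of A.
    """
    A = np.asarray(A, dtype=float)
    S_R = row_stochastic(A @ A.T)   # assumed row similarity, n x n
    S_C = row_stochastic(A.T @ A)   # assumed column similarity, m x m
    A_hat = A.copy()
    for _ in range(n_iter):
        # Each row becomes a weighted average of similar rows,
        # each column a weighted average of similar columns.
        A_hat = S_R @ A_hat @ S_C.T
    return A_hat

# Toy data with two row groups and two column groups (g = l = 2).
A = np.array([
    [5, 4, 0, 0],
    [4, 6, 0, 0],
    [6, 5, 0, 0],
    [0, 0, 2, 1],
    [0, 0, 1, 3],
], dtype=float)

A_hat = smooth(A)
print(np.round(A_hat, 3))
```

On this toy matrix the rows within each group, and likewise the columns within each group, become numerically identical after a few iterations, which is one way to read the steady state with $g$ similar rows and $l$ similar columns.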

Reordering $A$ according to the first left and right singular vectors yields an optimal data reorganization that reveals homogeneous block clusters. Furthermore, we show that our approach is related to a Markov chain model, to double $k$-means with $g \times l$ block clusters, and to spectral coclustering. Numerical experiments on simulated and real data sets illustrate the usefulness of our approach.
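The reordering step can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the singular vectors are taken from a plain SVD of $A$ (the abstract does not say which matrix they come from), and the function name `reorder_by_first_singular_vectors` is hypothetical.

```python
import numpy as np

def reorder_by_first_singular_vectors(A):
    """Sort rows by the first left singular vector and columns by the
    first right singular vector of A (assumption: plain SVD of A)."""
    A = np.asarray(A, dtype=float)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    row_order = np.argsort(U[:, 0])   # first left singular vector
    col_order = np.argsort(Vt[0, :])  # first right singular vector
    return A[np.ix_(row_order, col_order)], row_order, col_order

# Block-structured toy matrix: rows 0-2 with columns 0-1, rows 3-4 with columns 2-3.
blocks = np.array([
    [5, 5, 0, 0],
    [5, 5, 0, 0],
    [5, 5, 0, 0],
    [0, 0, 2, 2],
    [0, 0, 2, 2],
], dtype=float)

# Hide the structure by shuffling rows and columns with fixed permutations.
row_perm = np.array([3, 0, 4, 1, 2])
col_perm = np.array([2, 0, 3, 1])
A = blocks[np.ix_(row_perm, col_perm)]

A_sorted, row_order, col_order = reorder_by_first_singular_vectors(A)
print(A_sorted)
```

In this two-block example the reordered matrix shows the two homogeneous blocks as contiguous rectangles. With a plain SVD the first singular vectors mainly separate the dominant block from the rest, so data with more than two blocks would likely need additional singular vectors or a normalized matrix.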
