Revisiting Spectral Graph Clustering with Generative Community Models
The methodology of community detection can be divided into two principles:
imposing a network model on a given graph, or optimizing a designed objective
function. The former provides guarantees on theoretical detectability but falls
short when the graph is inconsistent with the underlying model. The latter is
model-free but fails to provide quality assurance for the detected communities.
In this paper, we propose a novel unified framework to combine the advantages
of these two principles. The presented method, SGC-GEN, not only considers the
detection error caused by the corresponding model mismatch to a given graph,
but also yields a theoretical guarantee on community detectability by analyzing
Spectral Graph Clustering (SGC) under GENerative community models (GCMs).
SGC-GEN incorporates the predictability on correct community detection with a
measure of community fitness to GCMs. It resembles the formulation of
supervised learning problems by enabling various community detection loss
functions and model mismatch metrics. We further establish a theoretical
condition for correct community detection using the normalized graph Laplacian
matrix under a GCM, which provides a novel data-driven loss function for
SGC-GEN. In addition, we present an effective algorithm to implement SGC-GEN,
and show that the computational complexity of SGC-GEN is comparable to the
baseline methods. Our experiments on 18 real-world datasets demonstrate that
SGC-GEN possesses superior and robust performance compared to 6 baseline
methods under 7 representative clustering metrics.
Comment: Accepted by IEEE International Conference on Data Mining (ICDM) 2017
as a regular paper - full paper with supplementary material
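For orientation, the following is a minimal sketch of plain spectral bipartitioning with the normalized graph Laplacian, the building block the abstract refers to. It is not SGC-GEN and not the authors' algorithm. The function name `fiedler_bipartition`, the toy graph, and the use of deflated power iteration in place of a library eigensolver are all assumptions made for illustration.

```python
import math

def fiedler_bipartition(adj, iters=200):
    """Split a connected undirected graph into two groups by the sign of the
    eigenvector for the second-smallest eigenvalue of the normalized
    Laplacian L = I - D^{-1/2} A D^{-1/2}. Hypothetical helper."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    inv_sqrt = [1.0 / math.sqrt(d) for d in deg]

    # M = 2I - L = I + D^{-1/2} A D^{-1/2}. The eigenvalues of L lie in [0, 2],
    # so the smallest eigenvalues of L are the largest eigenvalues of M.
    def apply_m(x):
        return [
            x[i] + inv_sqrt[i] * sum(adj[i][j] * inv_sqrt[j] * x[j] for j in range(n))
            for i in range(n)
        ]

    # Eigenvector of L for eigenvalue 0 is D^{1/2} 1 (normalized here).
    norm0 = math.sqrt(sum(deg))
    v0 = [math.sqrt(d) / norm0 for d in deg]

    def deflate(x):
        # Remove the component along v0 so iteration finds the next eigenvector.
        dot = sum(a * b for a, b in zip(x, v0))
        return [a - dot * b for a, b in zip(x, v0)]

    x = [float(i + 1) for i in range(n)]  # deterministic start vector
    for _ in range(iters):
        x = apply_m(deflate(x))
        norm = math.sqrt(sum(a * a for a in x))
        x = [a / norm for a in x]
    x = deflate(x)

    labels = [0 if a < 0 else 1 for a in x]
    return labels, x

# Toy graph (an assumption): two triangles {0,1,2} and {3,4,5} joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for u, v in edges:
    adj[u][v] = adj[v][u] = 1

labels, vec = fiedler_bipartition(adj)
print(labels)
```

With more than two communities, the usual approach is to take the K smallest eigenvectors and cluster their rows (for example with k-means); the sign split above is only the two-community special case.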
Incremental eigenpair computation for graph Laplacian matrices: theory and applications
The smallest eigenvalues and the associated eigenvectors (i.e., eigenpairs) of a graph Laplacian matrix have been widely used for spectral clustering and community detection. However, in real-life applications, the number of clusters or communities (say, K) is generally unknown a priori. Consequently, most existing methods either choose K heuristically or repeat the clustering method with different choices of K and accept the best clustering result. The first option often yields a suboptimal result, while the second is computationally expensive. In this work, we propose an incremental method for constructing the eigenspectrum of the graph Laplacian matrix. This method leverages the eigenstructure of the graph Laplacian matrix to obtain the Kth smallest eigenpair of the Laplacian matrix given a collection of all previously compute
- …
