TY - GEN
T1 - A spectral method to separate disconnected and nearly-disconnected Web graph components
AU - Ding, Chris H.Q.
AU - He, Xiaofeng
AU - Zha, Hongyuan
PY - 2001
Y1 - 2001
N2 - Separation of connected components from a graph with disconnected graph components mostly use breadth-first search (BFS) or depth-first search (DFS) graph algorithms. Here we propose a new algebraic method to separate disconnected and nearly-disconnected components. This method is based on spectral graph partitioning, following a key observation that disconnected components will show up, after properly sorted, as step-function like curve in the lowest eigenvectors of the Laplacian matrix of the graph. Following an perturbative analysis framework, we systematically analyzed the graph structures, first on the disconnected subgraph ease, and second on the effects of adding edges sparsely connecting different subgraphs as a perturbation. Several new results are derived, providing insights to spectral methods and related clustering objective function. Examples are given illustrating the concepts and results our methods. Comparing to the standard graph algorithms, this method has the same O(|E| + |V|log(|V|)) complexity, but is easier to implement (using readily available eigensolvers). Further more the method can easily identify articulation points and bridges on nearly-disconnected graphs. Segmentation of a real example of Web graph for query amazon is given. We found that each disconnected or nearly-disconnected components forms a cluster on a clear topic.
AB - Separation of connected components from a graph with disconnected graph components mostly use breadth-first search (BFS) or depth-first search (DFS) graph algorithms. Here we propose a new algebraic method to separate disconnected and nearly-disconnected components. This method is based on spectral graph partitioning, following a key observation that disconnected components will show up, after properly sorted, as step-function like curve in the lowest eigenvectors of the Laplacian matrix of the graph. Following an perturbative analysis framework, we systematically analyzed the graph structures, first on the disconnected subgraph ease, and second on the effects of adding edges sparsely connecting different subgraphs as a perturbation. Several new results are derived, providing insights to spectral methods and related clustering objective function. Examples are given illustrating the concepts and results our methods. Comparing to the standard graph algorithms, this method has the same O(|E| + |V|log(|V|)) complexity, but is easier to implement (using readily available eigensolvers). Further more the method can easily identify articulation points and bridges on nearly-disconnected graphs. Segmentation of a real example of Web graph for query amazon is given. We found that each disconnected or nearly-disconnected components forms a cluster on a clear topic.
UR - https://www.scopus.com/pages/publications/0035788930
U2 - 10.1145/502512.502551
DO - 10.1145/502512.502551
M3 - 会议稿件
AN - SCOPUS:0035788930
SN - 158113391X
SN - 9781581133912
T3 - Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 275
EP - 280
BT - Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
A2 - Provost, F.
A2 - Srikant, R.
A2 - Schkolnick, M.
A2 - Lee, D.
PB - Association for Computing Machinery (ACM)
T2 - Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2001)
Y2 - 26 August 2001 through 29 August 2001
ER -