msmtools.estimation.largest_connected_set

msmtools.estimation.largest_connected_set(C, directed=True)

Largest connected component of the graph whose edge weights are given by the count matrix.

Parameters:
    C (scipy.sparse matrix) – Count matrix specifying edge weights.
    directed (bool, optional) – Whether to compute connected components for a directed or undirected graph. Default is True.

Returns:
    lcc (array of integers) – Indices of the nodes in the largest connected component of the graph.

Notes

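Viewing the count matrix as the weighted adjacency matrix of a graph, the largest connected set is the largest set of nodes in which every node can be reached from every other node. For a directed graph, reachability must follow the edge directions, so the result is the largest strongly connected component; for an undirected graph, edge directions are ignored. The connected components of a graph can be computed efficiently using Tarjan’s algorithm [1].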
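The difference between the directed and undirected cases can be illustrated with a minimal sketch built on scipy.sparse.csgraph.connected_components. The helper name largest_component_sketch is hypothetical and this is not the msmtools implementation; it only assumes that the directed case corresponds to strong connectivity, which matches the behaviour shown in the Examples section.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components


def largest_component_sketch(C, directed=True):
    """Hypothetical helper (not part of msmtools): indices of the largest component."""
    C = csr_matrix(C)
    # Assumption: directed=True means strongly connected components.
    # SciPy ignores `connection` when directed=False.
    connection = "strong" if directed else "weak"
    n_components, labels = connected_components(
        C, directed=directed, connection=connection
    )
    # Count the nodes carrying each component label and keep the biggest one.
    sizes = np.bincount(labels, minlength=n_components)
    largest = np.argmax(sizes)
    return np.flatnonzero(labels == largest)


C = np.array([[10, 1, 0], [2, 0, 3], [0, 0, 4]])
lcc_strong = largest_component_sketch(C, directed=True)
lcc_weak = largest_component_sketch(C, directed=False)
```

In this count matrix, state 2 can be reached from state 1 but has no transition back, so it is excluded from the directed result and included once edge directions are ignored.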

References

[1] Tarjan, R. E. 1972. Depth-first search and linear graph algorithms. SIAM Journal on Computing 1 (2): 146-160.

Examples

>>> import numpy as np
>>> from msmtools.estimation import largest_connected_set

>>> C = np.array([[10, 1, 0], [2, 0, 3], [0, 0, 4]])
>>> lcc_directed = largest_connected_set(C)
>>> lcc_directed
array([0, 1])

>>> lcc_undirected = largest_connected_set(C, directed=False)
>>> lcc_undirected
array([0, 1, 2])
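
A possible follow-up step, sketched here under the assumption that the count matrix is a dense NumPy array, is to restrict C to the states in the largest connected set. The index array lcc is hard-coded to the directed result shown above so that the snippet runs without msmtools; the variable name C_lcc is illustrative only.

```python
import numpy as np

C = np.array([[10, 1, 0], [2, 0, 3], [0, 0, 4]])

# Directed result from the example above, written out by hand.
lcc = np.array([0, 1])

# np.ix_ builds an open mesh so that both rows and columns are restricted to lcc.
C_lcc = C[np.ix_(lcc, lcc)]
```

The restricted matrix keeps only the counts between states 0 and 1; the counts into and within state 2 are dropped.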