msmtools.estimation.largest_connected_set

msmtools.estimation.largest_connected_set(C, directed=True)
Largest connected component for a directed graph with edge weights given by the count matrix.
Parameters:  C (scipy.sparse matrix) – Count matrix specifying edge weights.
 directed (bool, optional) – Whether to compute connected components for a directed or undirected graph. Default is True.
Returns: lcc – Indices of the nodes in the largest connected component of the graph.
Return type: array of integers
Notes
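Viewing the count matrix as the adjacency matrix of a (directed) graph, the largest connected set is the largest set of nodes in which every node can be reached from every other node. For a directed graph this is the largest strongly connected component; with directed=False, edge directions are ignored. The connected components of a directed graph can be computed efficiently, in time linear in the number of nodes and edges, using Tarjan’s algorithm [1].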
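The graph view described here can be illustrated with SciPy's connected_components routine. The following is a minimal sketch, not the msmtools implementation: the helper name largest_connected_set_sketch is hypothetical, and it relies on whichever component algorithm SciPy uses internally rather than on Tarjan's algorithm specifically.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components


def largest_connected_set_sketch(C, directed=True):
    """Hypothetical helper illustrating the idea; not the msmtools code."""
    C = csr_matrix(C)
    # Label every node with the index of its component. For directed=True,
    # connection='strong' requests strongly connected components; for
    # directed=False, SciPy ignores the connection argument.
    n_components, labels = connected_components(
        C, directed=directed, connection='strong'
    )
    # Count the nodes in each component and pick the most populated one.
    sizes = np.bincount(labels, minlength=n_components)
    largest = np.argmax(sizes)
    # Return the sorted node indices belonging to that component.
    return np.flatnonzero(labels == largest)


C = np.array([[10, 1, 0], [2, 0, 3], [0, 0, 4]])
lcc_directed = largest_connected_set_sketch(C)
lcc_undirected = largest_connected_set_sketch(C, directed=False)
```

In this count matrix node 2 can be reached from node 1 but has no transitions back, so it drops out of the directed result and is included only when edge directions are ignored.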
References
[1] Tarjan, R. E. 1972. Depth-first search and linear graph algorithms. SIAM Journal on Computing 1 (2): 146-160.
Examples
>>> import numpy as np
>>> from msmtools.estimation import largest_connected_set

>>> C = np.array([[10, 1, 0], [2, 0, 3], [0, 0, 4]])
>>> lcc_directed = largest_connected_set(C)
>>> lcc_directed
array([0, 1])

>>> lcc_undirected = largest_connected_set(C, directed=False)
>>> lcc_undirected
array([0, 1, 2])
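
A common follow-up step is to restrict the count matrix to the returned indices. The sketch below uses plain NumPy indexing and assumes a dense count matrix; the index array lcc is hard-coded to the directed result shown above so that the snippet runs without msmtools, and the name C_cc is illustrative only.

```python
import numpy as np

C = np.array([[10, 1, 0], [2, 0, 3], [0, 0, 4]])

# Assumed to be the output of largest_connected_set(C) for this matrix.
lcc = np.array([0, 1])

# np.ix_ builds an open mesh so that rows and columns are selected together,
# giving the square submatrix of counts between the connected states.
C_cc = C[np.ix_(lcc, lcc)]
```

For a scipy.sparse count matrix the same selection would typically be done in two steps, for example by slicing rows and then columns of a CSR matrix.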