# msmtools.estimation.largest_connected_submatrix¶

msmtools.estimation.largest_connected_submatrix(C, directed=True, lcc=None)

Compute the count matrix on the largest connected set.

Parameters: C (scipy.sparse matrix) – Count matrix specifying edge weights. directed (bool, optional) – Whether to compute connected components for a directed or undirected graph. Default is True lcc ((M,) ndarray, optional) – The largest connected set C_cc – Count matrix of largest completely connected set of vertices (states) scipy.sparse matrix

Notes

Viewing the count matrix as the adjacency matrix of a (directed) graph the larest connected submatrix is the adjacency matrix of the largest connected set of the corresponding graph. The largest connected submatrix can be efficiently computed using Tarjan’s algorithm.

References

 [1] Tarjan, R E. 1972. Depth-first search and linear graph algorithms. SIAM Journal on Computing 1 (2): 146-160.

Examples

>>> import numpy as np
>>> from msmtools.estimation import largest_connected_submatrix

>>> C = np.array([[10, 1, 0], [2, 0, 3], [0, 0, 4]])

>>> C_cc_directed = largest_connected_submatrix(C)
>>> C_cc_directed
array([[10,  1],
[ 2,  0]]...)

>>> C_cc_undirected = largest_connected_submatrix(C, directed=False)
>>> C_cc_undirected
array([[10,  1,  0],
[ 2,  0,  3],
[ 0,  0,  4]]...)