msmtools.estimation.largest_connected_submatrix

msmtools.estimation.largest_connected_submatrix(C, directed=True, lcc=None)

Compute the count matrix on the largest connected set.

Parameters:
  • C (scipy.sparse matrix) – Count matrix specifying edge weights.
  • directed (bool, optional) – Whether to compute connected components for a directed or undirected graph. Default is True
  • lcc ((M,) ndarray, optional) – The largest connected set
Returns:

C_cc – Count matrix of largest completely connected set of vertices (states)

Return type:

scipy.sparse matrix

Notes

Viewing the count matrix as the adjacency matrix of a (directed) graph the larest connected submatrix is the adjacency matrix of the largest connected set of the corresponding graph. The largest connected submatrix can be efficiently computed using Tarjan’s algorithm.

References

[1]Tarjan, R E. 1972. Depth-first search and linear graph algorithms. SIAM Journal on Computing 1 (2): 146-160.

Examples

>>> import numpy as np
>>> from msmtools.estimation import largest_connected_submatrix
>>> C = np.array([[10, 1, 0], [2, 0, 3], [0, 0, 4]])
>>> C_cc_directed = largest_connected_submatrix(C)
>>> C_cc_directed 
array([[10,  1],
       [ 2,  0]]...)
>>> C_cc_undirected = largest_connected_submatrix(C, directed=False)
>>> C_cc_undirected 
array([[10,  1,  0],
       [ 2,  0,  3],
       [ 0,  0,  4]]...)