msmtools.estimation.largest_connected_set

msmtools.estimation.largest_connected_set(C, directed=True)

Compute the largest connected component of a directed graph whose edge weights are given by the count matrix.

Parameters:
  • C (scipy.sparse matrix) – Count matrix specifying edge weights.
  • directed (bool, optional) – Whether to compute connected components for a directed or undirected graph. Default is True.
Returns:

lcc – Indices of the nodes in the largest connected component of the graph.

Return type:

array of integers

See also

connected_sets()

Notes

Viewing the count matrix as the adjacency matrix of a (directed) graph, the largest connected set is the largest set of nodes in which every node can be reached from every other node. For a directed graph this is the largest strongly connected component, which can be computed efficiently using Tarjan’s algorithm [1].
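
To make the idea concrete, the following is a minimal pure-Python sketch of Tarjan's algorithm applied to a small dense count matrix. It is not the msmtools implementation; the helper names strongly_connected_components and largest_component are hypothetical, an edge i -> j is assumed to exist wherever C[i][j] > 0, and the recursive formulation is only suitable for small graphs.

```python
def strongly_connected_components(C):
    """Tarjan's algorithm on a dense count matrix given as a list of rows.

    Assumption: an edge i -> j exists wherever C[i][j] > 0.
    Illustrative only; recursion depth limits it to small graphs.
    """
    n = len(C)
    index = [None] * n      # discovery order of each node
    lowlink = [0] * n       # smallest discovery index reachable from the node
    on_stack = [False] * n
    stack, components = [], []
    counter = 0

    def visit(v):
        nonlocal counter
        index[v] = lowlink[v] = counter
        counter += 1
        stack.append(v)
        on_stack[v] = True
        for w in range(n):
            if C[v][w] <= 0:
                continue  # no edge v -> w
            if index[w] is None:
                visit(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif on_stack[w]:
                lowlink[v] = min(lowlink[v], index[w])
        if lowlink[v] == index[v]:
            # v is the root of a component: pop its members off the stack
            component = []
            while True:
                w = stack.pop()
                on_stack[w] = False
                component.append(w)
                if w == v:
                    break
            components.append(sorted(component))

    for v in range(n):
        if index[v] is None:
            visit(v)
    return components


def largest_component(C):
    """Return the largest strongly connected component as a sorted list."""
    return max(strongly_connected_components(C), key=len)


C = [[10, 1, 0], [2, 0, 3], [0, 0, 4]]
print(largest_component(C))  # [0, 1]
```

Node 2 has no outgoing edge to nodes 0 or 1, so it forms a component of its own and the largest strongly connected set is [0, 1].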

References

[1] Tarjan, R. E. 1972. Depth-first search and linear graph algorithms. SIAM Journal on Computing 1 (2): 146-160.

Examples

>>> import numpy as np
>>> from msmtools.estimation import largest_connected_set
>>> C = np.array([[10, 1, 0], [2, 0, 3], [0, 0, 4]])
>>> lcc_directed = largest_connected_set(C)
>>> lcc_directed
array([0, 1])
>>> lcc_undirected = largest_connected_set(C, directed=False)
>>> lcc_undirected
array([0, 1, 2])
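
For comparison, the same two results can be reproduced directly with SciPy's graph routines on a sparse count matrix. This is a sketch under the assumption that SciPy is installed; largest_set is a hypothetical helper written for illustration, not part of msmtools, and it should not be read as a description of how the library is implemented.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

# Same count matrix as in the example above, stored in sparse CSR format.
C = csr_matrix(np.array([[10, 1, 0], [2, 0, 3], [0, 0, 4]]))


def largest_set(C, directed=True):
    """Hypothetical helper: node indices of the largest connected component."""
    # Strong connectivity respects edge direction; weak connectivity ignores it.
    connection = "strong" if directed else "weak"
    n_components, labels = connected_components(
        C, directed=directed, connection=connection
    )
    # Count the members of each component and keep the most populous one.
    sizes = np.bincount(labels, minlength=n_components)
    return np.flatnonzero(labels == np.argmax(sizes))


print(largest_set(C))                  # nodes 0 and 1
print(largest_set(C, directed=False))  # nodes 0, 1 and 2
```

If several components share the maximal size, np.argmax picks the one with the lowest label; the sketch makes no attempt to break such ties in any particular way.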