msmtools.estimation.count_matrix

msmtools.estimation.count_matrix(dtraj, lag, sliding=True, sparse_return=True, nstates=None)

Generate a count matrix from a given microstate trajectory.

Parameters:
  • dtraj (array_like or list of array_like) – Discretized trajectory or list of discretized trajectories
  • lag (int) – Lag time in trajectory steps
  • sliding (bool, optional) – If True, the sliding window approach is used for transition counting; otherwise transitions are counted at lag.
  • sparse_return (bool, optional) – If True, return a sparse matrix; otherwise return a dense matrix.
  • nstates (int, optional) – Enforce a count-matrix with shape=(nstates, nstates)
Returns:

C – The count matrix at the given lag, in coordinate list (COO) format.

Return type:

scipy.sparse.coo_matrix

Notes

Transition counts can be obtained from a microstate trajectory using two methods: counting at lag and sliding window counting.

Lag

This approach skips all points in the trajectory that are separated from the last counted point by less than the given lag time \(\tau\).

Transition counts \(c_{ij}(\tau)\) are generated according to

\[c_{ij}(\tau) = \sum_{k=0}^{\left \lfloor \frac{N-1}{\tau} \right \rfloor -1} \chi_{i}(X_{k\tau})\chi_{j}(X_{(k+1)\tau}).\]

Here \(N\) is the number of frames in the trajectory \(X_{0}, \dots, X_{N-1}\), and \(\chi_{i}(x)\) is the indicator function of \(i\), i.e. \(\chi_{i}(x)=1\) for \(x=i\) and \(\chi_{i}(x)=0\) for \(x \neq i\).
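
As a rough illustration, counting at lag can be sketched in plain NumPy. The helper name count_lag is hypothetical and is not part of msmtools; the sketch assumes a single trajectory of non-negative integer state labels and returns a dense array rather than a sparse matrix.

```python
import numpy as np

def count_lag(dtraj, tau, nstates=None):
    """Hypothetical helper: count transitions between frames 0, tau, 2*tau, ..."""
    dtraj = np.asarray(dtraj)
    # Assumption: states are labelled 0 .. n-1
    n = int(dtraj.max()) + 1 if nstates is None else nstates
    sub = dtraj[::tau]  # keep every tau-th frame, skip the rest
    C = np.zeros((n, n))
    # One count per consecutive pair of retained frames
    np.add.at(C, (sub[:-1], sub[1:]), 1.0)
    return C

dtraj = np.array([0, 0, 1, 0, 1, 1, 0])
C_lag = count_lag(dtraj, 2)
```

With seven frames and a lag of two, only frames 0, 2, 4 and 6 are retained, so three transitions are counted.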

Sliding

The sliding approach slides along the trajectory and counts all transitions separated by the lag time \(\tau\).

Transition counts \(c_{ij}(\tau)\) are generated according to

\[c_{ij}(\tau)=\sum_{k=0}^{N-\tau-1} \chi_{i}(X_{k}) \chi_{j}(X_{k+\tau}).\]
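
The sliding window sum can likewise be sketched in plain NumPy. The helper name count_sliding is hypothetical and is not part of msmtools; the sketch assumes a single trajectory of non-negative integer state labels with more than tau frames, and returns a dense array.

```python
import numpy as np

def count_sliding(dtraj, tau, nstates=None):
    """Hypothetical helper: count every pair (X_k, X_{k+tau}) along the trajectory."""
    dtraj = np.asarray(dtraj)
    # Assumption: states are labelled 0 .. n-1
    n = int(dtraj.max()) + 1 if nstates is None else nstates
    C = np.zeros((n, n))
    # dtraj[:-tau] holds X_k and dtraj[tau:] holds X_{k+tau} for k = 0 .. N-tau-1
    np.add.at(C, (dtraj[:-tau], dtraj[tau:]), 1.0)
    return C

dtraj = np.array([0, 0, 1, 0, 1, 1, 0])
C_sliding = count_sliding(dtraj, 2)
```

Every frame that has a partner tau steps later contributes one count, so a trajectory of N frames yields N - tau counts in total (here 7 - 2 = 5).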

References

[1] Prinz, J. H., H. Wu, M. Sarich, B. Keller, M. Senne, M. Held, J. D. Chodera, C. Schuette and F. Noe. 2011. Markov models of molecular kinetics: Generation and validation. J. Chem. Phys. 134: 174105.

Examples

>>> import numpy as np
>>> from msmtools.estimation import count_matrix
>>> dtraj = np.array([0, 0, 1, 0, 1, 1, 0])
>>> tau = 2

Use the sliding approach first:

>>> C_sliding = count_matrix(dtraj, tau)

The generated matrix is a SciPy sparse matrix. For convenient printing, we convert it to a dense ndarray.

>>> C_sliding.toarray()
array([[ 1.,  2.],
       [ 1.,  1.]])

Now compare this with the count matrix obtained using the lag approach:

>>> C_lag = count_matrix(dtraj, tau, sliding=False)
>>> C_lag.toarray()
array([[ 0.,  1.],
       [ 1.,  1.]])
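
The dtraj and nstates parameters can be illustrated with a small NumPy sketch. It assumes that counts from a list of trajectories are accumulated per trajectory (no transition is counted across a trajectory boundary) and that nstates only fixes the shape of the result; both are assumptions about the behaviour, and the helper name count_sliding_multi is hypothetical and not part of msmtools.

```python
import numpy as np

def count_sliding_multi(dtrajs, tau, nstates):
    """Hypothetical helper: sliding counts summed over several trajectories."""
    C = np.zeros((nstates, nstates))
    for dtraj in dtrajs:
        dtraj = np.asarray(dtraj)
        # Trajectories with tau frames or fewer contribute no transitions
        if len(dtraj) > tau:
            np.add.at(C, (dtraj[:-tau], dtraj[tau:]), 1.0)
    return C

dtrajs = [np.array([0, 0, 1, 0, 1, 1, 0]), np.array([1, 1, 0])]
# State 2 is never visited, but nstates=3 still yields a 3x3 matrix
C_multi = count_sliding_multi(dtrajs, 2, nstates=3)
```

Fixing the shape this way keeps count matrices from different data sets comparable even when some states are never visited.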