sequence-modelling
The HMM class
@author: nbhushan
- class hmm.StandardHMM(A, O)
The standard HMM object.
- A : ndarray
The transition distribution.
- O : object
The HMM emission model.
pi, the initial state distribution, is the last row of the transition matrix 'A', i.e. pi = A[-1]. Number of states: K = A.shape[1], so 'A' has shape [K+1, K].
>>> # import the only external dependency
>>> import numpy as np
>>> # import package modules
>>> from sequence_modelling.hmm import StandardHMM
>>> from sequence_modelling.emissions import Gaussian
>>> from sequence_modelling import hmmviz
>>> # Define the model parameters
>>> # the transition matrix A; the last row is the initial distribution pi
>>> A = np.array([[0.9, 0.1, 0.0],
...               [0.0, 0.9, 0.1],
...               [0.0, 0.0, 1.0],
...               [1.0, 0.0, 0.0]])
>>> # the emission object B
>>> B = Gaussian(mu=np.array([[0.0, 1.0, 2.0],
...                           [0.0, 1.0, 2.0]]),
...              covar=np.array([[0.1, 0.1, 0.1],
...                              [0.1, 0.1, 0.1]]))
>>> # Build the HMM model object
>>> hmm = StandardHMM(A, B)
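A hedged usage sketch continuing the example above (the exact return values of fit are documented under fit() below; the calls here are illustrative):

>>> # draw a synthetic sequence, then decode and refit
>>> obs, states = hmm.sample(dim=2, N=1000)
>>> path = hmm.viterbi(obs)
>>> # fit expects a list of sequences, one ndarray per sequence
>>> res = hmm.fit([obs], maxiter=50, epsilon=1e-4, debug=False)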
- alpha(logB)
Compute alpha (forward) distribution.
alpha[i, n] = joint probability of being in state 'i' after observing observations 1..n.
- logB : ndarray
The observation probability matrix in logarithmic space.
- logalpha : ndarray
The log scaled alpha distribution.
Refer to Tobias Mann’s paper [1]_ for the motivation behind the scaling factors used here (a sketch of the log-space recursion follows the reference below). Note that this scaling method is suitable when the dynamics of the system are not highly sparse. Adapting log-scaling to the QDHMM would require the user to construct a new sparse data structure.
- 1
Mann, T. P. Numerically Stable Hidden Markov Model Implementation, 2006.
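As an illustration of the log-space forward recursion, a minimal standalone NumPy sketch (not the class’s actual implementation; logA and logpi are the element-wise logs of the transition matrix and initial distribution, and logB[k, n] is the log emission probability of observation n under state k):

import numpy as np

def logsumexp(x, axis):
    # numerically stable log(sum(exp(x))) along the given axis
    m = np.max(x, axis=axis, keepdims=True)
    return np.squeeze(m, axis) + np.log(np.exp(x - m).sum(axis=axis))

def log_alpha(logA, logpi, logB):
    K, N = logB.shape
    la = np.empty((K, N))
    la[:, 0] = logpi + logB[:, 0]
    for n in range(1, N):
        # la[j, n] = logsum_i( la[i, n-1] + logA[i, j] ) + logB[j, n]
        la[:, n] = logsumexp(la[:, n - 1][:, None] + logA, axis=0) + logB[:, n]
    return la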
- beta(logB)
Compute beta (backward) distribution.
beta[i, n] = conditional probability of generating observations Y_{n+1}..Y_N, given Z_n.
- logB : ndarray
The observation probability matrix in logarithmic space.
- logbeta : ndarray
The log scaled beta distribution.
Refer to Tobias Mann’s paper [1]_ for the motivation behind the scaling factors used here (a sketch of the backward recursion follows the reference below). Note that this scaling method is suitable when the dynamics of the system are not highly sparse. Adapting log-scaling to the QDHMM would require the user to construct a new sparse data structure.
- 1
Mann, T. P. Numerically Stable Hidden Markov Model Implementation, 2006.
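The backward companion to the alpha sketch above (same assumptions, reusing its logsumexp helper; again illustrative, not the class’s code):

def log_beta(logA, logB):
    K, N = logB.shape
    lb = np.zeros((K, N))  # log beta at the last step is 0 for every state
    for n in range(N - 2, -1, -1):
        # lb[i, n] = logsum_j( logA[i, j] + logB[j, n+1] + lb[j, n+1] )
        lb[:, n] = logsumexp(logA + (logB[:, n + 1] + lb[:, n + 1])[None, :], axis=1)
    return lb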
- estimatepostduration(logalpha, logbeta, logB, rankn, g, llh)
Estimate state durations based on the posterior distribution.
Since the durations are truncated by the timeout parameter, we use a distribution-free method.
- logalpha : ndarray
Log scaled alpha distribution.
- logbeta : ndarray
Log scaled beta values.
- logB : ndarray
Observation probability distribution in log-space.
- rankn : ndarray
The top ranked 'n' for each state 'k', used to estimate state durations.
- g : ndarray
The log scaled posterior distribution ('logGamma').
- llh : float
The normalized log-likelihood.
- int
The estimated durations in each state.
- ndarray
The expected value of the state duration at the ‘rankn’.
The QDHMM EM algorithm requires good initial estimates of the model parameters in order to converge to a good solution. We propose a distribution-free method to find the expected value of state durations in a standard HMM, which is then used to initialize the QDHMM 'tau' parameters.
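For contrast, a purely parametric estimate would lean on the fact that standard-HMM durations are implicitly geometric, so the expected duration of state k is 1/(1 - A[k, k]). A minimal sketch, given a transition matrix 'A' with pi as its last row as above (illustrative only, and not the distribution-free method described here):

import numpy as np

# self-transition probabilities A[k, k] for the K transition rows
self_trans = np.diag(A)
# geometric mean duration per state; states with A[k, k] == 1 are
# absorbing and get an unbounded (inf) expected duration
with np.errstate(divide='ignore'):
    expected_dur = 1.0 / (1.0 - self_trans)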
- estimateviterbiduration(path)
Estimate the state durations based on the Viterbi decoded state sequence (one possible approach is sketched after the return values below).
- path : ndarray
The Viterbi decoded state sequence.
- int
Estimated state durations based on the Viterbi path.
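One simple way to obtain such duration samples is to run-length encode the decoded path. A sketch (not necessarily the class’s implementation):

import numpy as np

def durations_from_path(path):
    # split the path at every state change; each run is one visit
    starts = np.flatnonzero(np.diff(path)) + 1
    runs = np.split(path, starts)
    # one (state, duration) pair per visit
    return [(int(run[0]), len(run)) for run in runs]

For example, durations_from_path(np.array([0, 0, 1, 1, 1, 2])) gives [(0, 2), (1, 3), (2, 1)].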
- fit(obs, maxiter=50, epsilon=0.0001, debug=True)
Fit the standard HMM to the given data using the (adapted Baum-Welch) EM algorithm.
- obs : list
The list of observation sequences, where every sequence is an ndarray. The sequences can be of different lengths, but the dimension of the features must be identical.
- maxiter : int, optional
The maximum number of iterations of the EM algorithm. Default = 50.
- epsilon : float, optional
The minimum allowed threshold in the variation of the log-likelihood between successive iterations of the EM algorithm. Once the variation falls below 'epsilon' the algorithm is said to have converged. Default = 1e-4.
- debug : bool, optional
Display verbose output on/off.
- float
The normalized log-likelihood.
- list
The list of log-likelihoods for each iteration of the EM algorithm, useful for checking the monotonicity of the log-likelihoods.
- int
The duration estimates of each HMM state from the posterior distribution.
- ndarray
The top ranked ‘n’ which are used to estimate the state durations.
- ndarray
The expected value of the state durations obtained at the top ranked ‘n’.
- gammaKsi(logB)
Compute gamma (posterior distribution) and ksi (joint successive posterior distribution) values.
gamma[i, n] = conditional probability of the event state 'i' at time 'n', given the complete observation sequence.
ksi[n, i, j] = joint posterior probability of two successive hidden states 'i' and 'j' at time 'n'. Both quantities are written out in unscaled form after the return values below.
- logB : ndarray
The observation probability matrix in logarithmic space.
- llh : float
The normalized log-likelihood.
- logGamma : ndarray
The log posterior distribution.
- logKsi : ndarray
The log joint posterior probability distribution.
- logAlpha : ndarray
The log scaled alpha distribution.
- logBeta : ndarray
The log scaled beta distribution.
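In unscaled form (the implementation works with the log-scaled quantities above):

gamma[i, n] = alpha[i, n] * beta[i, n] / sum_j (alpha[j, n] * beta[j, n])
ksi[n, i, j] = alpha[i, n] * A[i, j] * B[j, n+1] * beta[j, n+1] / P(Y_1..Y_N)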
- rankn(ksi, rank=10)
Find the top ranked 'n's used to estimate state durations.
That is, for each state 'k', find the top ranked times 'n' at which the posterior probability of transitioning into state 'k', given that the chain was not in state 'k' at time 'n-1', is the highest (see the sketch after the parameter list below).
- ksi : ndarray
The joint successive posterior distribution in log-space.
- rank : int, optional
The number of ranked 'n's to use to estimate state durations.
- rankn : ndarray
The top ranked 'n's for each state, used to estimate state durations.
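With NumPy, the top-'rank' selection itself is a one-liner. A sketch (illustrative, not the class’s code; score[k, n] is assumed to hold the relevant posterior transition probability for state k at time n):

import numpy as np

# time indices of the 'rank' largest scores per state, best first
rankn = np.argsort(score, axis=1)[:, -rank:][:, ::-1]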
- sample(dim=1, N=1000)
Generates an observation sequence of length N.
- dim : int
The dimension of the data (univariate=1, etc.).
- N : int
The length of the observation sequence.
- obs : ndarray
An array of N observations.
- zes : ndarray
The state sequence that generated the data.
- viterbi(obs)
Compute the most probable state path using the Viterbi algorithm (a log-space sketch follows the references below).
Most probable state sequence = argmax_z P(Z|X)
- obs : array_like
Observation sequence.
- path : ndarray
The Viterbi decoded state sequence.
Refer to Rabiner’s paper [1]_ or the original Viterbi paper [2]_.
- 1
Rabiner, L. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 1989, 77, 257-286.
- 2
Viterbi, A. Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Transactions on Information Theory, 1967, 13, 260-269.
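A minimal log-space sketch of the recursion (standalone NumPy, not the class’s actual code; logA, logpi, and logB as in the alpha sketch earlier):

import numpy as np

def viterbi_path(logA, logpi, logB):
    # delta[j] = best log score of any path ending in state j at time n
    K, N = logB.shape
    delta = logpi + logB[:, 0]
    psi = np.zeros((N, K), dtype=int)        # back-pointers
    for n in range(1, N):
        scores = delta[:, None] + logA       # scores[i, j]: i -> j
        psi[n] = np.argmax(scores, axis=0)
        delta = scores[psi[n], np.arange(K)] + logB[:, n]
    path = np.empty(N, dtype=int)
    path[-1] = int(np.argmax(delta))
    for n in range(N - 1, 0, -1):            # backtrack along the pointers
        path[n - 1] = psi[n, path[n]]
    return path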
The QDHMM object
@author: nbhushan
- class qdhmm.QDHMM(p, zeta, eta, O)
The QDHMM object.
- p : float
The initial probability of the active state.
- zeta : float
The probability of self-transition in the active state.
- eta : float
The probability of self-transition in the inactive states.
- O : object
The QDHMM emission model.
The QDHMM is an extension of a standard HMM.
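A hedged construction sketch in the same style as the StandardHMM example (the module paths and parameter values are assumptions, with d = 1 and K = 3 here; see emissionplus.Gaussian below for the parameter shapes):

>>> import numpy as np
>>> from sequence_modelling.qdhmm import QDHMM
>>> from sequence_modelling.emissionplus import Gaussian
>>> # tau in cumulative form: timeouts 10 and 20 -> [0, 10, 30]
>>> B = Gaussian(mu=np.array([[0.0, 1.0, 2.0]]),
...              var=np.array([[[0.1]], [[0.1]], [[0.1]]]),
...              tau=np.array([0, 10, 30]))
>>> model = QDHMM(p=0.6, zeta=0.9, eta=0.8, O=B)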
- alpha(B)
Compute alpha (forward) values.
alpha[i, n] = joint probability of being in state 'i' after observing observations 1..n.
- B : ndarray
The observation probability matrix.
- alphahat : ndarray
The scaled alpha values.
- c : ndarray
The scaling factors.
Refer to Rabiner’s paper [1]_ for the scaling factors used here.
- 1
Rabiner, L. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 1989, 77, 257-286.
- beta(B, c)
Compute beta (backward) values.
beta[i, n] = conditional probability of generating observations Y_{n+1}..Y_N, given Z_n.
- B : ndarray
The observation probability matrix.
- c : ndarray
The scaling factors obtained from the alpha computation.
- betahat : ndarray
The scaled beta values.
DO NOT call the beta function before calling the alpha function. Refer to Rabiner’s paper [1]_ for the scaling factors used here.
- 1
Rabiner, L. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 1989, 77, 257-286.
- buildTransmat()
Builds the sparse transition matrix (an illustrative sketch follows the return value below).
- A : scipy.sparse.csr_matrix
The sparse transition matrix.
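The exact layout of the QDHMM transition matrix is not spelled out here. Purely as an illustration of the sparse representation, one plausible timeout-chain structure could be assembled like this (the structure is an assumption, not confirmed by this documentation):

import numpy as np
from scipy.sparse import csr_matrix

D = 5                       # hypothetical: 1 active + 4 inactive states
zeta, eta = 0.9, 0.8
rows, cols, vals = [0, 0], [0, 1], [zeta, 1.0 - zeta]  # active state 0
for d in range(1, D):
    nxt = min(d + 1, D - 1)             # last inactive state absorbs
    rows += [d, d]
    cols += [d, nxt]
    vals += [eta, 1.0 - eta]
A = csr_matrix((vals, (rows, cols)), shape=(D, D))  # duplicate entries are summed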
- fit(obs, maxiter=50, epsilon=1e-05, debug=False, metaheuristic='local')
Fit the QDHMM to the given data using the (adapted Baum-Welch) EM algorithm.
- obs : list
The list of observation sequences, where every sequence is an ndarray. The sequences can be of different lengths, but the dimension of the features must be identical.
- maxiter : int, optional
The maximum number of iterations of the EM algorithm. Default = 50.
- epsilon : float, optional
The minimum allowed threshold in the variation of the log-likelihood between successive iterations of the EM algorithm. Once the variation falls below 'epsilon' the algorithm is said to have converged. Default = 1e-5.
- debug : bool, optional
Display verbose output on/off.
- metaheuristic : {'local', 'sa', 'genetic'}, optional
The meta-heuristic used to solve the objective in the M-step. 'local' is simple local search, 'genetic' is a genetic algorithm, and 'sa' is simulated annealing.
- float
The normalized log-likelihood.
- list
The list of log-likelihoods for each iteration of the EM algorithm, useful for checking the monotonicity of the log-likelihoods.
- gammaKsi(B)
Compute gamma (posterior distribution) values.
gamma[i, n] = conditional probability of the event state 'i' at time 'n', given the complete observation sequence.
- B : ndarray
The observation probability matrix.
- llh : float
The normalized log-likelihood.
- gamma : ndarray
The posterior distribution.
- float
The number of transitions into the active state.
- float
The number of transitions into the inactive states.
Ksi is the joint successive posterior distribution: ksi[n, i, j] = joint posterior probability of two successive hidden states 'i' and 'j' at time 'n'. In the QDHMM, ksi is too large to fit in contiguous memory ([N, K, K]). Hence we estimate ksi at every time step 'n' and store only the quantities required for the computation of zeta and eta (the transition parameters).
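A sketch of that memory-saving idea (an illustrative fragment; logalpha, logbeta, logA, and logB are the log-scaled quantities described earlier, llh is log P(Y), and the active state is assumed to be state 0):

num_active, num_inactive = 0.0, 0.0
for n in range(N - 1):
    # one (K, K) slice of ksi at a time instead of the full [N, K, K] tensor
    ksi_n = np.exp(logalpha[:, n][:, None] + logA
                   + (logB[:, n + 1] + logbeta[:, n + 1])[None, :] - llh)
    num_active += ksi_n[:, 0].sum()      # expected transitions into the active state
    num_inactive += ksi_n[:, 1:].sum()   # expected transitions into inactive states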
- sample(dim=1, N=1000)
Generates an observation sequence of length N.
- dim : int
The dimension of the data (univariate=1, etc.).
- N : int
The length of the observation sequence.
- obs : ndarray
An array of N observations.
- zes : ndarray
The state sequence that generated the data.
- viterbi(obs)
Compute the most probable state path using the Viterbi algorithm.
Most probable state sequence = argmax_z P(Z|X)
- obs : array_like
Observation sequence.
- path : ndarray
The Viterbi decoded state sequence.
Refer to Rabiner’s paper [1]_ or the original Viterbi paper [2]_.
- 1
Rabiner, L. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 1989, 77, 257-286.
- 2
Viterbi, A. Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Transactions on Information Theory, 1967, 13, 260-269.
Emission distributions for QDHMM
@author: nbhushan
- class emissionplus.Gaussian(mu, var, tau)
The Gaussian emission model for a QDHMM.
- mu : ndarray
The mean 'mu' is an ndarray of shape [d, K], where d is the dimension of the features and K is the total number of states.
- var : ndarray
The variance 'var' is an ndarray of shape [K, d, d], where d is the dimension of the features and K is the total number of states. Note: borrowed from the standard HMM 'covar'.
- tau : ndarray
The time-out parameters. Note: if the individual timeouts are t1, t2, t3, ..., then tau = [0, t1, t1+t2, t1+t2+t3, ...].
D is the cumulative sum of the individual timeouts + 2.
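For example, with hypothetical individual timeouts 10, 20, and 30:

>>> import numpy as np
>>> tau = np.concatenate(([0], np.cumsum([10, 20, 30])))
>>> tau
array([ 0, 10, 30, 60])
>>> D = tau[-1] + 2   # D = 62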
- Fit(obs, weights, estimatetau, metaheuristic)
Fit a Gaussian to the state distributions after observing the data.
- obs : ndarray
Observation sequence.
- weights : ndarray
The weights attached to each state (the posterior distribution).
- estimatetau : bool
Whether to estimate the optimal 'tau' parameters.
- metaheuristic : {'local', 'sa', 'genetic'}, optional
The meta-heuristic used to solve the objective in the M-step. 'local' is simple local search, 'genetic' is a genetic algorithm, and 'sa' is simulated annealing.
Estimation of the tau parameters is done on every alternate iteration of the QDHMM EM algorithm. Call viz.plotcontour only if you wish to view the search propagation; this is recommended only for searches in a small space.
- Sample(zes)
Generates a Gaussian observation sequence from a given state sequence.
- zes : ndarray
The state sequence used to generate the observation sequence.
- ndarray
Observation sequence.
The observation sequence can only be univariate Gaussian in the QDHMM Emission model.
- likelihood(obs)
Compute the likelihood of drawing an observation 'x' from a given state: P(x | Z_t) = N(mu, var). A sketch of this computation follows the parameter list below.
- obs : ndarray
The observation sequence.
- logB : ndarray
The observation probability distribution in log-space.
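A minimal sketch of the underlying computation for the univariate case noted above (illustrative, not the class’s actual code):

import numpy as np

def gaussian_logB(obs, mu, var):
    # log N(obs[n] | mu[k], var[k]) for every state k and time n -> (K, N)
    y = np.ravel(obs)[None, :]     # (1, N)
    m = np.ravel(mu)[:, None]      # (K, 1)
    v = np.ravel(var)[:, None]     # (K, 1)
    return -0.5 * (np.log(2.0 * np.pi * v) + (y - m) ** 2 / v)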