sequence-modelling

The HMM class

@author: nbhushan

class hmm.StandardHMM(A, O)

The standard HMM object.

A : ndarray

The transition distribution.

O : object

The HMM emission model.

The initial state distribution pi is the last row of the transition matrix ‘A’, i.e. pi = A[-1]. The number of states is K = A.shape[1].

>>> # import the only external dependency
>>> import numpy as np
>>> #import package modules
>>> from sequence_modelling.hmm import StandardHMM
>>> from sequence_modelling.emissions import Gaussian
>>> from sequence_modelling import hmmviz
...
>>> # Define the model parameters
>>> # the transition matrix A
>>> A = np.array([[0.9, 0.1, 0.0],
...               [0.0, 0.9, 0.1],
...               [0.0, 0.0, 1.0],
...               [1.0, 0.0, 0.0]])  # last row is the initial distribution pi
>>> # the emission object B
>>> B = Gaussian(mu = np.array([[0.0, 1.0, 2.0],
...                             [0.0, 1.0, 2.0]]),
...              covar = np.array([[0.1, 0.1, 0.1],
...                                [0.1, 0.1, 0.1]]))
...
>>> # Build the HMM model object
>>> hmm = StandardHMM(A, B)
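
Continuing the example, synthetic data can then be drawn from the model. A minimal sketch, assuming sample() returns the observations and the generating state sequence as documented below:

>>> # draw a synthetic sequence of 1000 two-dimensional observations
>>> obs, zes = hmm.sample(dim=2, N=1000)
>>> # ‘obs’ holds the observations, ‘zes’ the hidden states that generated them
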
alpha(logB)

Compute alpha (forward) distribution.

alpha[i, n] = joint probability of being in state ‘i’ having observed the observations 1..n, i.e. P(Z_n = i, Y_1..Y_n).

logB : ndarray

The observation probability matrix in logarithmic space.

logalpha : ndarray

The log scaled alpha distribution.

Refer to Tobias Mann’s paper [1]_ for the motivation behind the scaling factors used here. Note that this scaling method is suitable when the dynamics of the system are not highly sparse. Adapting log-scaling to the QDHMM would require the user to construct a new sparse data structure.

[1] Mann, T. P. Numerically Stable Hidden Markov Model Implementation, 2006.
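
The recursion itself is the classic forward pass carried out in log space. A minimal illustrative sketch (not the package’s implementation), assuming logA is the log transition matrix, logpi the log initial distribution, and logB[i, n] the log emission probability of observation ‘n’ under state ‘i’; scipy is used here only for a stable logsumexp:

import numpy as np
from scipy.special import logsumexp

def log_forward(logA, logpi, logB):
    """Log-space forward pass: logalpha[i, n] = log P(Z_n = i, Y_1..Y_n)."""
    K, N = logB.shape
    logalpha = np.empty((K, N))
    # initialization: log pi_i + log b_i(y_1)
    logalpha[:, 0] = logpi + logB[:, 0]
    for n in range(1, N):
        # recursion: a logsumexp over predecessor states avoids underflow
        logalpha[:, n] = logsumexp(logalpha[:, n - 1][:, None] + logA, axis=0) + logB[:, n]
    return logalpha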

beta(logB)

Compute beta (backward) distribution.

beta[i, n] = conditional probability of generating observations Y_{n+1}..Y_N, given Z_n.

logB : ndarray

The observation probability matrix in logarithmic space.

logbeta : ndarray

The log scaled beta distribution.

Refer to Tobias Mann’s paper [1]_ for the motivation behind the scaling factors used here. Note that this scaling method is suitable when the dynamics of the system are not highly sparse. Adapting log-scaling to the QDHMM would require the user to construct a new sparse data structure.

[1] Mann, T. P. Numerically Stable Hidden Markov Model Implementation, 2006.
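
A matching log-space backward pass, again only an illustrative sketch under the same assumed logA / logB conventions as the forward example above:

def log_backward(logA, logB):
    """Log-space backward pass: logbeta[i, n] = log P(Y_{n+1}..Y_N | Z_n = i)."""
    K, N = logB.shape
    logbeta = np.zeros((K, N))  # logbeta[:, N-1] = log 1 = 0
    for n in range(N - 2, -1, -1):
        # sum over successor states j of a_ij * b_j(y_{n+1}) * beta_j(n+1)
        logbeta[:, n] = logsumexp(
            logA + (logB[:, n + 1] + logbeta[:, n + 1])[None, :], axis=1)
    return logbeta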

estimatepostduration(logalpha, logbeta, logB, rankn, g, llh)

Estimate state durations based on the posterior distribution.

Since the durations are truncated by the timeout parameter, we use a distribution-free method.

logalpha : ndarray

Log scaled alpha distribution.

logbeta : ndarray

Log scaled beta values.

logB : ndarray

Observation probability distribution in log-space.

rankn : ndarray

The top ranked ‘n’s for each state ‘k’, used to estimate state durations.

g : ndarray

The log scaled posterior distribution (‘logGamma’).

llh : float

The normalized log-likelihood.

int

The estimated durations in each state.

ndarray

The expected value of the state duration at the ‘rankn’.

The QDHMM EM algorithm requires good initial estimates of the model parameters in order to converge to a good solution. We propose a distribution-free method to find the expected value of state durations in a standard HMM, which is then used to initialize the QDHMM ‘tau’ parameters.
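
For intuition: a standard HMM implicitly assumes geometric state durations, so a crude baseline for the expected dwell time can be read directly off the transition matrix diagonal. A sketch of that baseline (this is not the distribution-free estimator described above):

def geometric_expected_durations(A):
    """Expected dwell time per state: self-transition a_kk implies a mean of 1 / (1 - a_kk).

    ‘A’ is assumed to carry pi as its last row, as in StandardHMM.
    """
    diag = np.diag(A[:-1])            # self-transition probabilities a_kk
    with np.errstate(divide='ignore'):
        return 1.0 / (1.0 - diag)     # inf for absorbing states (a_kk = 1)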

estimateviterbiduration(path)

Estimate the state durations based on the Viterbi decoded state sequence.

path : ndarray

The Viterbi decoded state sequence.

int

Estimated state durations based on the Viterbi path.
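
Durations along a decoded path are simply the run lengths of repeated states. A small numpy sketch of that idea (a hypothetical helper, not the package’s function):

def run_lengths(path):
    """Return (state, duration) pairs for each run in a decoded state path."""
    path = np.asarray(path)
    # indices where the state changes, bracketed by the sequence ends
    change = np.flatnonzero(np.diff(path)) + 1
    starts = np.concatenate(([0], change))
    ends = np.concatenate((change, [path.size]))
    return list(zip(path[starts], ends - starts))

# e.g. run_lengths([0, 0, 0, 1, 1, 2]) -> [(0, 3), (1, 2), (2, 1)]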

fit(obs, maxiter=50, epsilon=0.0001, debug=True)

Fit the standard HMM to the given data using the (adapted Baum-Welch) EM algorithm.

obs : list

The list of observation sequences, where every sequence is an ndarray. The sequences can be of different lengths, but the dimension of the features must be identical.

maxiter : int, optional

The maximum number of iterations of the EM algorithm. Default = 50.

epsilon : float, optional

The minimum allowed threshold in the variation of the log-likelihood between successive iterations of the EM algorithm. Once the variation falls below ‘epsilon’, the algorithm is said to have converged. Default = 1e-4.

debug : bool, optional

Toggle verbose output on/off.

float

The normalized log-likelihood.

list

The list of log-likelihoods for each iteration of the EM algorithm, useful for checking that the log-likelihoods increase monotonically.

int

The duration estimates of each HMM state from the posterior distribution.

ndarray

The top ranked ‘n’ which are used to estimate the state durations.

ndarray

The expected value of the state durations obtained at the top ranked ‘n’.
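
A usage sketch, continuing the example above and assuming the return values unpack in the documented order (the name ‘ranknmean’ is chosen here for illustration):

>>> llh, llhs, durations, rankn, ranknmean = hmm.fit([obs], maxiter=50, epsilon=1e-4)
>>> # the per-iteration log-likelihoods should be non-decreasing under EM
>>> all(b >= a for a, b in zip(llhs, llhs[1:]))
True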

gammaKsi(logB)

Compute gamma (the posterior distribution) and ksi (the joint successive posterior distribution).

gamma[i, n] = conditional probability of being in state ‘i’ at time ‘n’, given the complete observation sequence.

ksi[n, i, j] = joint posterior probability of two successive hidden states ‘i’ and ‘j’ at time ‘n’.

logB : ndarray

The observation probability matrix in logarithmic space.

llh : float

The normalized log-likelihood.

logGamma : ndarray

The log posterior distribution.

logKsi : ndarray

The log joint posterior probability distribution.

logAlpha : ndarray

The log scaled alpha distribution.

logBeta : ndarray

The log scaled beta distribution.
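
In log space both quantities follow directly from the forward and backward passes. A sketch continuing the log_forward / log_backward examples above, under the same assumed conventions:

logalpha = log_forward(logA, logpi, logB)
logbeta = log_backward(logA, logB)
loglik = logsumexp(logalpha[:, -1])       # total log-likelihood
loggamma = logalpha + logbeta - loglik    # loggamma[i, n] = log P(Z_n = i | Y)

# logksi[n, i, j] = log P(Z_n = i, Z_{n+1} = j | Y), for n = 0..N-2
logksi = (logalpha[:, :-1].T[:, :, None] + logA[None, :, :]
          + (logB + logbeta)[:, 1:].T[:, None, :] - loglik)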

rankn(ksi, rank=10)

Find the top ranked ‘n’s used to estimate state durations.

Find the top ranked ‘n’s for which the posterior probability of transitioning into state ‘k’, given that we were not in state ‘k’ at time ‘n-1’, is highest.

ksi : ndarray

The joint successive posterior distribution in log-space.

rank : int, optional

The number of top ranked ‘n’s used to estimate state durations. Default = 10.

rankn : ndarray

The top ranked ‘n’s for each state, used to estimate state durations.
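
The ranking itself is a top-k selection per state. A hypothetical numpy sketch, continuing the logksi array from the gammaKsi example above (shape [N-1, K, K] with entries [n, i, j]):

K = logksi.shape[1]
rank = 10                                                    # the default above
mask = ~np.eye(K, dtype=bool)                                # exclude self-transitions
score = logsumexp(np.where(mask, logksi, -np.inf), axis=1)   # (N-1, K): entering state k at n
rankn = np.argsort(score, axis=0)[-rank:]                    # top ‘rank’ time indices per state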

sample(dim=1, N=1000)

Generates an observation sequence of length N.

dim : int

The dimension of the data (univariate=1, etc.).

N : int

The length of the observation sequence.

obs : ndarray

An array of N observations.

zes : ndarray

The state sequence that generated the data.

viterbi(obs)

Compute the most probable state path using the Viterbi algorithm.

Most probable state sequence = argmax_z P(Z|X)

obs : array_like

Observation sequence.

path : ndarray

The Viterbi decoded state sequence.

Refer to Rabiner’s paper [1]_ or the original Viterbi paper [2]_.

[1] Rabiner, L. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 1989, 77, 257-286.

[2] Viterbi, A. Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Transactions on Information Theory, 1967, 13, 260-269.
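
For reference, a compact log-space Viterbi sketch under the same assumed logA / logpi / logB conventions as the earlier examples (not the package’s implementation):

def viterbi_path(logA, logpi, logB):
    """Most probable state sequence, argmax_z P(Z | Y), by dynamic programming."""
    K, N = logB.shape
    delta = logpi + logB[:, 0]          # best log-score of paths ending in each state
    psi = np.empty((N, K), dtype=int)   # backpointers
    for n in range(1, N):
        scores = delta[:, None] + logA  # scores[i, j]: come from i, move to j
        psi[n] = np.argmax(scores, axis=0)
        delta = scores[psi[n], np.arange(K)] + logB[:, n]
    # backtrack from the best final state
    path = np.empty(N, dtype=int)
    path[-1] = int(np.argmax(delta))
    for n in range(N - 1, 0, -1):
        path[n - 1] = psi[n, path[n]]
    return path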

The QDHMM object

@author: nbhushan

class qdhmm.QDHMM(p, zeta, eta, O)

The QDHMM object.

p : float

The initial probability of the active state.

zeta : float

The probability of self-transition in the active state.

eta : float

The probability of self-transition in the inactive states.

O : object

The QDHMM emission model.

The QDHMM is an extension of the standard HMM.
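
A construction sketch in the same style as the StandardHMM example. The module paths, parameter shapes, and numbers below are illustrative assumptions based on the emissionplus.Gaussian documentation further down:

>>> from sequence_modelling.qdhmm import QDHMM
>>> from sequence_modelling.emissionplus import Gaussian
>>> # tau holds cumulative timeouts: [0, t1, t1+t2, ...]
>>> B = Gaussian(mu=np.array([[0.0, 1.0, 2.0]]),
...              var=np.array([[[0.1]], [[0.1]], [[0.1]]]),
...              tau=np.array([0, 10, 25]))
>>> qdhmm = QDHMM(p=0.5, zeta=0.9, eta=0.8, O=B)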

alpha(B)

Compute alpha (forward) values.

alpha[i, n] = joint probability of being in state ‘i’ having observed the observations 1..n, i.e. P(Z_n = i, Y_1..Y_n).

B : ndarray

The observation probability matrix.

alphahat : ndarray

The scaled alpha values.

c : ndarray

The scaling factors.

Refer to Rabiner’s paper [1]_ for the scaling factors used here.

[1] Rabiner, L. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 1989, 77, 257-286.
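
Rabiner-style scaling keeps alpha in linear space by renormalising each column and remembering the scale factors. A minimal sketch for a dense K-state model (the QDHMM itself works on a sparse transition structure):

def scaled_forward(A, pi, B):
    """Scaled forward pass: each alphahat[:, n] sums to 1; c holds the scale factors."""
    K, N = B.shape
    alphahat = np.empty((K, N))
    c = np.empty(N)
    a = pi * B[:, 0]
    c[0] = a.sum()
    alphahat[:, 0] = a / c[0]
    for n in range(1, N):
        a = (alphahat[:, n - 1] @ A) * B[:, n]
        c[n] = a.sum()                 # P(y_n | y_1..y_{n-1})
        alphahat[:, n] = a / c[n]
    # the log-likelihood is recovered as np.log(c).sum()
    return alphahat, c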

beta(B, c)

Compute beta (backward) values.

beta[i, n] = conditional probability of generating observations Y_{n+1}..Y_N, given Z_n.

B : ndarray

The observation probability matrix.

c : ndarray

The scaling factors obtained from the alpha computation.

betahat : ndarray

The scaled beta values.

DO NOT call the beta function before calling the alpha function. Refer to Rabiner’s paper [1]_ for the scaling factors used here.

[1] Rabiner, L. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 1989, 77, 257-286.

buildTransmat()

Builds the sparse transition matrix.

A : scipy.sparse.csr_matrix

The sparse transition matrix.

fit(obs, maxiter=50, epsilon=1e-05, debug=False, metaheuristic='local')

Fit the QDHMM to the given data using the (adapted Baum-Welch) EM algorithm.

obs : list

The list of observation sequences, where every sequence is an ndarray. The sequences can be of different lengths, but the dimension of the features must be identical.

maxiter : int, optional

The maximum number of iterations of the EM algorithm. Default = 50.

epsilon : float, optional

The minimum allowed threshold in the variation of the log-likelihood between successive iterations of the EM algorithm. Once the variation falls below ‘epsilon’, the algorithm is said to have converged. Default = 1e-5.

debug : bool, optional

Toggle verbose output on/off.

metaheuristic : {‘local’, ‘sa’, ‘genetic’}, optional

The meta-heuristic used to solve the objective in the M-step. ‘local’ is simple local search, ‘genetic’ is a genetic algorithm, and ‘sa’ is simulated annealing.

float

The normalized log-likelihood.

list

The list of log-likelihoods for each iteration of the EM algorithm, useful for checking that the log-likelihoods increase monotonically.
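
A usage sketch mirroring the StandardHMM fit example, with the return order assumed from the list above:

>>> llh, llhs = qdhmm.fit([obs], maxiter=50, epsilon=1e-5, metaheuristic='sa')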

gammaKsi(B)

Compute gamma (posterior distribution) values.

gamma[i, n] = conditional probability of being in state ‘i’ at time ‘n’, given the complete observation sequence.

B : ndarray

The observation probability matrix.

llh : float

The normalized log-likelihood.

gamma : ndarray

The posterior distribution.

float

The number of transitions into the active state.

float

The number of transitions into the inactive states.

Ksi is the joint successive posterior distribution: ksi[n, i, j] = joint posterior probability of two successive hidden states ‘i’ and ‘j’ at time ‘n’. In the QDHMM, ksi is too large ([N, K, K]) to fit in contiguous memory. Hence we estimate ksi at every time step ‘n’ and store only the quantities required to compute zeta and eta (the transition parameters).
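
The memory trick is standard Baum-Welch streaming: rather than materialising the full [N, K, K] array, compute ksi for one time step at a time and accumulate only the sufficient statistics. A schematic sketch in the scaled alphahat/betahat notation above (names hypothetical; state 0 is assumed to be the active state):

num_active = 0.0      # expected self-transitions in the active state
num_inactive = 0.0    # expected self-transitions in the inactive states

for n in range(N - 1):
    # ksi for this single time step only: O(K^2) memory instead of O(N K^2)
    ksi_n = (alphahat[:, n][:, None] * A * B[:, n + 1][None, :]
             * betahat[:, n + 1][None, :])
    ksi_n /= ksi_n.sum()
    num_active += ksi_n[0, 0]
    num_inactive += np.trace(ksi_n[1:, 1:])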

sample(dim=1, N=1000)

Generates an observation sequence of length N.

dim : int

The dimension of the data (univariate=1, etc.).

N : int

The length of the observation sequence.

obs : ndarray

An array of N observations.

zes : ndarray

The state sequence that generated the data.

viterbi(obs)

Compute the most probable state path using the Viterbi algorithm.

Most probable state sequence = argmax_z P(Z|X)

obs : array_like

Observation sequence.

path : ndarray

The Viterbi decoded state sequence.

Refer to Rabiner’s paper [1]_ or the original Viterbi paper [2]_.

[1] Rabiner, L. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 1989, 77, 257-286.

[2] Viterbi, A. Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Transactions on Information Theory, 1967, 13, 260-269.

Emission distributions for QDHMM

@author: nbhushan

class emissionplus.Gaussian(mu, var, tau)

The Gaussian emission model for a QDHMM.

mu : ndarray

The mean; ‘mu’ is an ndarray of shape [d, K], where d is the dimension of the features and K is the total number of states.

var : ndarray

The variance; ‘var’ is an ndarray of shape [K, d, d], where d is the dimension of the features and K is the total number of states. Note: borrowed from the standard HMM ‘covar’.

tau : ndarray

The time-out parameters. Note: if the individual timeouts are t1, t2, t3, ... then tau = [0, t1, t1+t2, t1+t2+t3, ...].

D is the cumulative sum of the individual timeouts + 2.
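
For instance, with individual timeouts t1=5, t2=10, and t3=15, the cumulative tau vector can be built as follows (illustrative only):

>>> timeouts = [5, 10, 15]
>>> tau = np.concatenate(([0], np.cumsum(timeouts)))
>>> tau
array([ 0,  5, 15, 30])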

Fit(obs, weights, estimatetau, metaheuristic)

Fit a Gaussian to the state distributions after observing the data.

obs : ndarray

Observation sequence.

weights : ndarray

The weights attached to each state (the posterior distribution).

estimatetau : bool

Whether to estimate the optimal tau parameters.

metaheuristic : {‘local’, ‘sa’, ‘genetic’}, optional

The meta-heuristic used to solve the objective in the M-step. ‘local’ is simple local search, ‘genetic’ is a genetic algorithm, and ‘sa’ is simulated annealing.

Estimation of the tau parameters is done every alternate iteration of the QDHMM EM algorithm. Call viz.plotcontour only if you wish to view the search propagation; this is recommended only for searches in a small space.

Sample(zes)

Generates a Gaussian observation sequence from a given state sequence.

zes : ndarray

The state sequence used to generate the observation sequence.

ndarray

Observation sequence.

The observation sequence can only be univariate Gaussian in the QDHMM Emission model.

likelihood(obs)

Compute the likelihood of drawing an observation ‘y’ from a given state: P(y | Z_t) = N(mu, var).

obs : ndarray

The observation sequence.

logB : ndarray

The observation probability distribution in log-space.
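
For the univariate case the underlying computation is just the Gaussian log density of every observation under every state. A pure-numpy sketch (not the package’s code):

def gaussian_loglik(obs, mu, var):
    """logB[k, n] = log N(obs[n]; mu[k], var[k]) for univariate observations."""
    obs = np.asarray(obs).reshape(1, -1)   # shape (1, N)
    mu = np.asarray(mu).reshape(-1, 1)     # shape (K, 1)
    var = np.asarray(var).reshape(-1, 1)   # shape (K, 1)
    return -0.5 * (np.log(2 * np.pi * var) + (obs - mu) ** 2 / var)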