Hidden Markov models (HMMs) are statistical learning models that can be used for labeling problems. They describe the process in which a hidden Markov chain randomly generates a sequence of observations, and they belong to the class of generative models.

Definition of the hidden Markov model
A hidden Markov model is determined by an initial state probability distribution, a state transition probability distribution, and an observation probability distribution.
Let Q be the set of all possible states and V the set of all possible observations:

Q = {q_1, q_2, ..., q_N},  V = {v_1, v_2, ..., v_M}

where N is the number of possible states and M is the number of possible observations.

I is a state sequence of length T, and O is the corresponding observation sequence:

I = (i_1, i_2, ..., i_T),  O = (o_1, o_2, ..., o_T)
A is the state transition probability matrix:

A = [a_ij], an N × N matrix

where

a_ij = P(i_{t+1} = q_j | i_t = q_i),  i = 1, 2, ..., N; j = 1, 2, ..., N

is the probability of moving to state q_j at time t+1, given that the chain is in state q_i at time t.
B is the observation probability matrix:

B = [b_j(k)], an N × M matrix

where

b_j(k) = P(o_t = v_k | i_t = q_j),  j = 1, 2, ..., N; k = 1, 2, ..., M

is the probability of generating observation v_k at time t, given that the chain is in state q_j.
π is the initial state probability vector:

π = (π_i)

where

π_i = P(i_1 = q_i),  i = 1, 2, ..., N

is the probability of being in state q_i at time t = 1.
The hidden Markov model is determined by π, A, and B: π and A determine the state sequence, and B determines the observation sequence. The model can therefore be written as λ = (A, B, π), and A, B, and π are called the three elements of the hidden Markov model.
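To make the three elements concrete, here is a minimal NumPy sketch that builds a hypothetical λ = (A, B, π) (the numbers are illustrative, not taken from the text) and samples from it, showing how π and A drive the state sequence while B drives the observations:

```python
import numpy as np

# Hypothetical HMM with N = 3 states and M = 2 observation symbols.
# A[i, j] = P(i_{t+1} = q_j | i_t = q_i); each row is a distribution.
A = np.array([[0.5, 0.2, 0.3],
              [0.3, 0.5, 0.2],
              [0.2, 0.3, 0.5]])
# B[j, k] = P(o_t = v_k | i_t = q_j); each row is a distribution.
B = np.array([[0.5, 0.5],
              [0.4, 0.6],
              [0.7, 0.3]])
# pi[i] = P(i_1 = q_i).
pi = np.array([0.2, 0.4, 0.4])

def generate(A, B, pi, T, seed=0):
    """Sample a state sequence I and observation sequence O of length T."""
    rng = np.random.default_rng(seed)
    N, M = B.shape
    states, obs = [], []
    s = rng.choice(N, p=pi)                  # initial state drawn from pi
    for _ in range(T):
        states.append(int(s))
        obs.append(int(rng.choice(M, p=B[s])))  # emit a symbol using row s of B
        s = rng.choice(N, p=A[s])               # transition using row s of A
    return states, obs

I_seq, O_seq = generate(A, B, pi, T=5)
```

Note how each row of A and B, and π itself, must sum to 1, since each is a conditional probability distribution.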
Hidden Markov models rest on two basic assumptions:

(1) Homogeneous Markov assumption: the hidden state at any time t depends only on the state at the previous time t-1, not on earlier states or observations:

P(i_t | i_{t-1}, o_{t-1}, ..., i_1, o_1) = P(i_t | i_{t-1}),  t = 1, 2, ..., T

(2) Observation independence assumption: the observation at any time t depends only on the hidden state at time t, not on any other states or observations:

P(o_t | i_T, o_T, ..., i_{t+1}, o_{t+1}, i_t, i_{t-1}, o_{t-1}, ..., i_1, o_1) = P(o_t | i_t)
Three basic problems of hidden Markov models

Probability calculation problem

Given the model λ = (A, B, π) and an observation sequence O = (o_1, o_2, ..., o_T), compute the probability P(O | λ) that the observation sequence O appears under the model λ.
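P(O | λ) can be computed efficiently with the forward algorithm; a minimal sketch (the model numbers are chosen to match the well-known box-and-ball example in Li Hang's book, with O = (red, white, red) encoded as indices [0, 1, 0]):

```python
import numpy as np

def forward_probability(A, B, pi, O):
    """Forward algorithm: compute P(O | lambda) for lambda = (A, B, pi).
    O is a sequence of observation indices (0-based)."""
    # Initialization: alpha_1(i) = pi_i * b_i(o_1)
    alpha = pi * B[:, O[0]]
    # Recursion: alpha_{t+1}(j) = (sum_i alpha_t(i) * a_ij) * b_j(o_{t+1})
    for o in O[1:]:
        alpha = (alpha @ A) * B[:, o]
    # Termination: P(O | lambda) = sum_i alpha_T(i)
    return float(alpha.sum())

A = np.array([[0.5, 0.2, 0.3], [0.3, 0.5, 0.2], [0.2, 0.3, 0.5]])
B = np.array([[0.5, 0.5], [0.4, 0.6], [0.7, 0.3]])
pi = np.array([0.2, 0.4, 0.4])
print(forward_probability(A, B, pi, [0, 1, 0]))   # ≈ 0.130218
```

For long sequences the alpha values underflow; practical implementations rescale at each step or work in log space.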
This problem is solved by the forward algorithm and the backward algorithm.

Learning problem
Given an observation sequence O, estimate the parameters of the model λ = (A, B, π) so that the probability P(O | λ) of the observation sequence under the model is maximized.
When both the observation sequence and the corresponding state sequence are given, the parameters can be estimated directly by maximum likelihood estimation.
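In this fully labeled case the maximum likelihood estimates are simply normalized counts. A sketch under that assumption (the labeled sequences below are made up for illustration):

```python
import numpy as np

def mle_estimate(state_seqs, obs_seqs, N, M):
    """Supervised MLE for an HMM: transition, emission, and initial-state
    counts from labeled sequences, normalized into distributions."""
    A = np.zeros((N, N))
    B = np.zeros((N, M))
    pi = np.zeros(N)
    for I_seq, O_seq in zip(state_seqs, obs_seqs):
        pi[I_seq[0]] += 1                        # count initial states
        for s, o in zip(I_seq, O_seq):
            B[s, o] += 1                         # count emissions
        for s, s_next in zip(I_seq, I_seq[1:]):
            A[s, s_next] += 1                    # count transitions
    # Normalize each row of counts into a probability distribution.
    # (Assumes every state occurs; real code smooths zero counts.)
    A /= A.sum(axis=1, keepdims=True)
    B /= B.sum(axis=1, keepdims=True)
    pi /= pi.sum()
    return A, B, pi

# Made-up labeled data: two sequences over N = 2 states, M = 2 symbols.
states = [[0, 0, 1, 1], [0, 1, 1, 0]]
obs    = [[0, 1, 1, 0], [0, 0, 1, 1]]
A_hat, B_hat, pi_hat = mle_estimate(states, obs, N=2, M=2)
```

When the state sequences are not observed, the Baum-Welch algorithm replaces these hard counts with expected counts computed in the E-step.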
When only the observation sequence is given, without the corresponding state sequence, the parameters are estimated with the EM algorithm; this is the Baum-Welch algorithm.

Prediction problem
Also known as the decoding problem: given the model λ = (A, B, π) and an observation sequence O, find the state sequence I that maximizes the conditional probability P(I | O). That is, given the observation sequence, find the most probable corresponding state sequence.
The method for solving this problem is the Viterbi algorithm.
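A minimal sketch of the Viterbi dynamic program (again using the illustrative box-and-ball numbers from Li Hang's book, with O = [0, 1, 0]):

```python
import numpy as np

def viterbi(A, B, pi, O):
    """Viterbi algorithm: most probable state path for observations O,
    together with its probability max_I P(I, O | lambda)."""
    T, N = len(O), len(pi)
    delta = np.zeros((T, N))           # delta[t, j]: best path prob ending in j
    psi = np.zeros((T, N), dtype=int)  # psi[t, j]: best predecessor of j
    delta[0] = pi * B[:, O[0]]
    for t in range(1, T):
        trans = delta[t - 1][:, None] * A      # (N, N): score of moving i -> j
        psi[t] = trans.argmax(axis=0)
        delta[t] = trans.max(axis=0) * B[:, O[t]]
    # Backtrack from the best final state.
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t, path[-1]]))
    return path[::-1], float(delta[-1].max())

A = np.array([[0.5, 0.2, 0.3], [0.3, 0.5, 0.2], [0.2, 0.3, 0.5]])
B = np.array([[0.5, 0.5], [0.4, 0.6], [0.7, 0.3]])
pi = np.array([0.2, 0.4, 0.4])
path, prob = viterbi(A, B, pi, [0, 1, 0])   # path [2, 2, 2], prob ≈ 0.0147
```

Viterbi has the same recursion shape as the forward algorithm, but with the sum over predecessor states replaced by a max, plus backpointers to recover the path.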
References:
Hang Li, Statistical Learning Methods