The training of Markov models and hidden Markov models

The development of probability theory from random variables (relatively static) to stochastic processes (relatively dynamic) was a big leap forward!

A so-called stochastic process is a process whose state changes over time.

In other words, each moment corresponds to a state, and the progression from one state to the state at the next moment is the process.

"Stochastic" means that the state at a given moment is not known exactly (it is dynamic and random), hence the name.

A Markov chain is a stochastic process built on an assumption.

So what is that assumption?

In such a process, given the current state (the current knowledge or information), the past (the history of states leading up to now) is irrelevant for predicting the future (the states yet to come).
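To make the assumption concrete, here is a tiny sketch in Python (my own illustration, not from the original post; the two weather states and the transition numbers are invented): the next state is drawn using only the current state, never the earlier history.

```python
import random

# Hypothetical 2-state chain; the states and numbers are made up for illustration.
transition = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}

def next_state(current):
    # The Markov assumption: the distribution of the next state depends
    # only on `current`, not on how we arrived at it.
    options = list(transition[current])
    weights = [transition[current][s] for s in options]
    return random.choices(options, weights=weights)[0]

state = "sunny"
chain = [state]
for _ in range(10):
    state = next_state(state)
    chain.append(state)
print(chain)  # e.g. ['sunny', 'sunny', 'rainy', ...]
```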

Hidden Markov model (HMM)

Given the outputs m1, m2, m3, ..., find the most suitable s1, s2, s3, ... (this process is also known as decoding).

(s1, s2 stand for the state at moment 1, the state at moment 2, and so on; this is generic notation, and what actually shows up at a specific moment is one of m1, m2, m3, m4.)

For example: there are many bags, each bag contains balls of several different colors, and the color distribution of the balls differs from bag to bag.

Now pick a bag at random (this is the initial state probability), draw a ball from it, and record its color. The color is the visible (emitted) state, one of m1, m2, m3, m4; the chosen bag corresponds to s1 (more generally, si).

For example, say red is m1, yellow is m2, blue is m3, green is m4. If the first draw is red (m1), then P(m1|s1) is the emission probability (generation probability) p(mi|si).

Next, randomly choose another bag s2 and draw a ball m2 from it; P(s2|s1) is the state transition probability p(si|si-1).

These two distributions are the parameters of the (hidden) Markov model, and the process of estimating and computing them is called training the model. A very cute name ~
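As a rough sketch (again my own illustration, not the post's code), the bag-and-ball process just described can be written out like this; the bag names, colors, and probability numbers below are all invented:

```python
import random

states = ["s1", "s2", "s3"]            # the bags (hidden states)
colors = ["m1", "m2", "m3", "m4"]      # red, yellow, blue, green (visible states)

# C: initial state probability -- which bag is picked first.
init_prob = {"s1": 0.5, "s2": 0.3, "s3": 0.2}

# A: state transition probability p(si | si-1) -- which bag is picked next.
trans_prob = {
    "s1": {"s1": 0.6, "s2": 0.3, "s3": 0.1},
    "s2": {"s1": 0.2, "s2": 0.5, "s3": 0.3},
    "s3": {"s1": 0.3, "s2": 0.3, "s3": 0.4},
}

# B: emission probability p(mi | si) -- the color distribution inside each bag.
emit_prob = {
    "s1": {"m1": 0.7, "m2": 0.1, "m3": 0.1, "m4": 0.1},
    "s2": {"m1": 0.1, "m2": 0.6, "m3": 0.2, "m4": 0.1},
    "s3": {"m1": 0.1, "m2": 0.1, "m3": 0.3, "m4": 0.5},
}

def sample(dist):
    keys = list(dist)
    return random.choices(keys, weights=[dist[k] for k in keys])[0]

def generate(length):
    s = sample(init_prob)                      # pick the first bag
    bags, balls = [s], [sample(emit_prob[s])]  # draw a ball from it
    for _ in range(length - 1):
        s = sample(trans_prob[s])              # move to the next bag
        bags.append(s)
        balls.append(sample(emit_prob[s]))     # draw a ball from the new bag
    return bags, balls

bags, balls = generate(5)
print("bags (hidden):    ", bags)
print("colors (observed):", balls)
```

Only the `balls` list is what an observer of an HMM gets to see; the `bags` list stays hidden, which is exactly the point of the next few paragraphs.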

I also thought about the difference between an HMM and a (visible) Markov model ("visible" and "hidden" being opposites).

In a visible Markov model, the state transition process is observable: in the example above, it is known which bag you start from and which bag you choose next (more formally, the state sequence, i.e. the state transition process, is known).

In an HMM, on the other hand, you do not know which bag was chosen; all you know is the concrete colors m1, m2, m3, m4 that came out.

We can only guess s1, s2, s3, ..., i.e. which bags were selected, from the colors of the balls we see (decoding: running the generation process above in reverse).

In other words, given only the colors of the balls, what we can estimate are: (A) the state transition probability P(si|si-1), (B) the emission probability (generation probability) p(mi|si), and (C) the initial state probability.

To recap: A and B are called the parameters of the HMM, and the process of estimating and computing these parameters is called training the model.

First of all, there are 3 basic problems around the HMM:

1. The observation sequence: given a model (A, B, C), compute the probability of a specific output sequence (m1, m2, ...)

2. Given (A, B, C) and one observation sequence, find s1, s2, s3, ...

3. Estimating the HMM parameters (training the model)

And the corresponding answers:

1. Problem 1 can be solved with the Forward-Backward algorithm (a small sketch follows this list)

2. Problem 2 can be solved with the Viterbi algorithm (also sketched below)

3. Problem 3 uses the Baum-Welch algorithm
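Here is a rough sketch of problem 1 (my own illustration, reusing the toy init_prob, trans_prob and emit_prob tables from the bag-and-ball code above; forward_prob is just a name I chose). The forward pass of Forward-Backward sums over all hidden paths incrementally instead of enumerating them:

```python
def forward_prob(obs, init_p, trans_p, emit_p):
    """P(obs | model): sum over all hidden state paths, computed step by step."""
    # alpha[s] = P(obs[0..t] seen so far, state at time t is s)
    alpha = {s: init_p[s] * emit_p[s][obs[0]] for s in init_p}
    for o in obs[1:]:
        new_alpha = {}
        for s in init_p:
            new_alpha[s] = sum(alpha[prev] * trans_p[prev][s] for prev in alpha) * emit_p[s][o]
        alpha = new_alpha
    return sum(alpha.values())

print(forward_prob(["m1", "m2", "m4"], init_prob, trans_prob, emit_prob))
```

And a rough sketch of problem 2, the Viterbi idea (which the post defers to a later write-up): keep, for each state, only the single most probable path so far instead of the sum.

```python
def viterbi(obs, init_p, trans_p, emit_p):
    """Most probable hidden state sequence for obs (ties broken arbitrarily)."""
    # best[s] = (probability, path) of the best path ending in state s
    best = {s: (init_p[s] * emit_p[s][obs[0]], [s]) for s in init_p}
    for o in obs[1:]:
        new_best = {}
        for s in init_p:
            prob, prev = max((best[p][0] * trans_p[p][s], p) for p in best)
            new_best[s] = (prob * emit_p[s][o], best[prev][1] + [s])
        best = new_best
    return max(best.values())  # (probability, state path)

print(viterbi(["m1", "m2", "m4"], init_prob, trans_prob, emit_prob))
```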

Here we discuss problem 3, which uses the Baum-Welch algorithm.

The Baum-Welch algorithm goes roughly like this:

First, we pick some model parameters, giving a model we call M0, that can generate the output sequence O. Such a model obviously exists: if the transition probabilities and the output probabilities are all uniformly distributed, the model can produce any output O.
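For concreteness, a uniform M0 over the toy bag/color sets could look like this (my own sketch, reusing `states` and `colors` from the bag-and-ball code above):

```python
# A uniform starting model (playing the role of M0) over the same toy
# `states` and `colors` defined in the bag-and-ball sketch above.
# Every row is an even split, so this model gives every output sequence
# a non-zero probability.
uniform_init = {s: 1 / len(states) for s in states}
uniform_trans = {s: {s2: 1 / len(states) for s2 in states} for s in states}
uniform_emit = {s: {m: 1 / len(colors) for m in colors} for s in states}
```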

Then we look for a better model M1. Assume for the moment that we can already solve:

1. The observation sequence problem: given a model (A, B, C), compute the probability of a specific output sequence (m1, m2, ...)

2. Given (A, B, C) and one observation sequence, find s1, s2, s3, ...

Then not only can we compute the probability P(O|M0) that this model produces O, we can also enumerate all the possible paths that generate O under this model (recording how many times each state si is visited, which states si+1 it moves to, and which symbols o it emits), together with the probabilities of these paths. From those counts we get new emission probabilities and new transition probabilities, and these two sets of parameters make up a new model called M1.

It can be proved that P(O|M1) > P(O|M0).

Then, starting from M1, we find an even better model M2, and keep going until the quality of the model no longer improves. At every iteration the algorithm re-estimates the parameters so that the output probability keeps increasing; this process is called Expectation Maximization (EM).
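Here is a brute-force sketch of one such re-estimation step (my own illustration, not the post's code, with `path_prob` and `reestimate` as names I chose). It literally enumerates every hidden path for a short observation sequence and accumulates probability-weighted counts, which is the counting idea described above; the real Baum-Welch algorithm obtains the same expected counts efficiently with the forward-backward recursions instead of enumeration.

```python
from itertools import product

def path_prob(path, obs, init_p, trans_p, emit_p):
    # Probability that this particular sequence of bags produces the observed colors.
    p = init_p[path[0]] * emit_p[path[0]][obs[0]]
    for t in range(1, len(obs)):
        p *= trans_p[path[t - 1]][path[t]] * emit_p[path[t]][obs[t]]
    return p

def reestimate(obs, init_p, trans_p, emit_p):
    """One EM step: returns (new_init, new_trans, new_emit, P(obs | old model))."""
    states = list(init_p)
    init_c = {s: 0.0 for s in states}
    trans_c = {s: {s2: 0.0 for s2 in states} for s in states}
    emit_c = {s: {m: 0.0 for m in emit_p[s]} for s in states}
    total = 0.0
    # Enumerate every possible hidden path and weight its counts by its probability.
    for path in product(states, repeat=len(obs)):
        w = path_prob(path, obs, init_p, trans_p, emit_p)
        if w == 0.0:
            continue
        total += w
        init_c[path[0]] += w
        for t, s in enumerate(path):
            emit_c[s][obs[t]] += w
            if t > 0:
                trans_c[path[t - 1]][s] += w
    # Normalise the expected counts into the probabilities of the new model.
    new_init = {s: init_c[s] / total for s in states}
    new_trans, new_emit = {}, {}
    for s in states:
        d = sum(trans_c[s].values())
        new_trans[s] = {s2: (trans_c[s][s2] / d if d else trans_p[s][s2]) for s2 in states}
        d = sum(emit_c[s].values())
        new_emit[s] = {m: (emit_c[s][m] / d if d else emit_p[s][m]) for m in emit_c[s]}
    return new_init, new_trans, new_emit, total

obs = ["m1", "m2", "m4", "m1"]
model = (init_prob, trans_prob, emit_prob)  # the toy tables above play the role of M0;
                                            # the uniform model from the previous sketch also works
for i in range(5):
    new_init, new_trans, new_emit, p_obs = reestimate(obs, *model)
    print(f"iteration {i}: P(obs | current model) = {p_obs:.6f}")  # should never decrease
    model = (new_init, new_trans, new_emit)
```

The printed P(obs | current model) does not decrease from one iteration to the next, which is the P(O|M1) > P(O|M0) claim above (strictly speaking it is "greater than or equal", with equality once a local maximum is reached).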

Whew... that's a lot for now; the rest (EM, Viterbi, and Forward-Backward) will come next time.

This is all my own thinking, and I'm not sure it's all correct; I hope readers will point out any mistakes.
