One-Day Algorithm (2): Hidden Markov Models (2)


This is the most upvoted answer on Hidden Markov Models, and I found it very interesting, so I am reproducing it here for easy review.

Author: Nong Bloody
Link: http://www.zhihu.com/question/20962240/answer/33561657
Source: Zhihu

1. The Casino (Background)

Recently, a casino boss found that business was not going well, so he sent his men into the casino to look around. The scout returned and reported: there is an uncle in the casino who always wins money; he plays dice extremely well, almost invincibly. And every time he plays, several bodyguards stand around him, so nobody can see clearly; all anyone sees is that, at every round, the dice fly out and land steadily. Based on years of experience, the boss speculated that this lucky guest was using the "dice-switching trick" (editor's note: secretly replacing the table's dice with loaded dice from his pocket). The boss is a calm man: he could see this uncle was not to be trifled with, and he neither wanted to offend him rashly nor let him keep ruining the house. Just as he was worrying, a handsome man named HMM walked in and told the boss he had a good solution.

No need to get close to the man: just set up a camera at a distance and record the points the dice show in every game.

Then the handsome HMM would use his powerful mathematical internal force to deduce from these data:

1. Is the uncle cheating?

2. If he is cheating, how many loaded dice has he used? Is he using a loaded die right now?

3. What is the probability of each face coming up on these loaded dice?

My God, thought the boss: this so-called HMM can tell whether someone is cheating without even getting near him, and can even work out what the cheat's loaded dice look like. Then, as soon as the uncle cheats, the boss can send someone to round him up and verify the dice on the spot, leaving him speechless.

2. Who is HMM?

Before putting HMM to work, the casino owner also did some background research on him.

HMM (Hidden Markov Model) is a probabilistic model that describes the transitions between a system's hidden states and the emission probabilities with which each hidden state produces observations.

A hidden state of the system is a state that the outside world cannot (or can only inconveniently) observe. In this example, the hidden state is which die the uncle is currently using, i.e.

{normal die, loaded die 1, loaded die 2, ...}

The manifestation of a hidden state is the externally observable feature that the hidden state produces. Here it is the face shown by a thrown die:

{1, 2, 3, 4, 5, 6}

The HMM model describes the transition probabilities between the system's hidden states, that is, the probabilities with which the uncle switches dice. For example, the uncle's dice-switching probabilities can be drawn vividly as a state-transition diagram.

Fortunately, such a complex probability-transition diagram can be expressed as a simple matrix $A$, in which $a_{ij}$ is the probability of transitioning from hidden state $i$ to hidden state $j$.

Of course, there are also emission probabilities for the hidden states: the probability distribution over the faces each die shows (for example, loaded die 1 might have a 90% chance of throwing a six, and loaded die 2 an 85% chance of throwing "small", i.e. 1-3).

The emission distributions of the hidden states can likewise be written as a matrix $B$, where $b_j(k)$ is the probability that hidden state $j$ produces observation $k$.
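To make the two matrices concrete, here is a minimal Python sketch. Only the 90% and 85% figures echo the example above; every other number is an assumption invented for illustration, not a value from the original post.

```python
import numpy as np

# Hidden states: 0 = normal die, 1 = loaded die 1, 2 = loaded die 2
# A[i, j] = probability of switching from die i to die j between throws.
# These numbers are illustrative assumptions, not values from the post.
A = np.array([
    [0.8, 0.1, 0.1],   # with the normal die, the uncle mostly keeps it
    [0.2, 0.7, 0.1],
    [0.2, 0.1, 0.7],
])

# Observations: columns 0..5 stand for faces 1..6.
# B[i, k] = probability that die i shows face k + 1.
B = np.array([
    [1/6, 1/6, 1/6, 1/6, 1/6, 1/6],           # normal die: uniform
    [0.02, 0.02, 0.02, 0.02, 0.02, 0.90],     # loaded die 1: 90% sixes
    [0.283, 0.283, 0.284, 0.05, 0.05, 0.05],  # loaded die 2: 85% "small" (1-3)
])

# Every row is a probability distribution, so each must sum to 1.
assert np.allclose(A.sum(axis=1), 1.0)
assert np.allclose(B.sum(axis=1), 1.0)
```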

Putting these two matrices together gives the whole HMM model.

This model describes the transition probabilities between hidden states as well as the probability distribution of each state's outward manifestations. In short, the HMM model can describe how often the uncle swaps dice (the switching probabilities) and the face distribution of every die he uses. With the uncle's HMM model in hand, you can see right through him and drag his tricks into broad daylight.

3. What can HMM do?

To sum up, HMM can handle three problems:

3.1 Decoding

Decoding means working out, from a sequence of throws, which throws were made with loaded dice and which with the normal die.

For example, given a sequence of throws (3, 6, 1, 2, ...) and the uncle's HMM model, we want to infer which die (hidden state) most likely produced each observed face (the hidden state's manifestation).

3.2 Learning

Learning means estimating, from a sequence of throws, the uncle's dice-switching probabilities together with the face distribution of each die. This is HMM's most formidable and most complicated trick!

3.3 Evaluation

Evaluation means that, once we know the uncle's HMM model, we can compute the probability of any given sequence of throws. For example, knowing his model, we can directly compute the probability that he throws ten sixes in a row, or eight ones.

4. How does HMM do it?
4.1 Evaluation

Evaluation is the easiest of the three: when we know the uncle's HMM model completely, the probability is straightforward to compute.

Suppose we now have the uncle's complete model: the state-transition probability matrix $A$ and the emission matrix $B$. For example, what is the probability that the uncle throws ten sixes in a row in the next game? In symbols:

$$P(v_{1:10} = 6, 6, \ldots, 6 \mid w_0 = 1)$$

This expression reads: given that the initial hidden state $w_0$ is 1 (that is, the uncle starts out holding the normal die), the probability that he throws ten sixes in a row.

Now the problem becomes harder: we know the HMM's transition probabilities and the observed sequence $v_{1:t}$, but we do not know how the hidden state actually changed.

Fine, we don't know how the hidden state changed, so let us first assume a hidden-state sequence: say the uncle used the normal die for the first 5 throws and loaded die 1 for the last 5.

Then we can compute the probability of throwing ten sixes under this assumed hidden sequence:

$$P(v_{1:10} \mid w_{1:10}) = \prod_{t=1}^{10} b_{w_t}(v_t)$$

This probability is simply the product of the emission probabilities $B$ along the assumed hidden-state sequence.

But a problem arises again: the hidden-state sequence I assumed is not necessarily the actual one, so what now? Fine, just try every possible combination of hidden sequences:

$$P(v_{1:T}) = \sum_{w_{1:T} \in R} P(v_{1:T} \mid w_{1:T})\, P(w_{1:T})$$

where $R$ is the set of all possible hidden-state sequences and $P(w_{1:T}) = \prod_{t=1}^{T} a_{w_{t-1} w_t}$ is the probability of a hidden path under the transition matrix. Now the problem seems solved: by trying all the combinations, we can compute the total probability of the observations from the $A$ and $B$ matrices. A concrete sketch of this exhaustive sum follows.
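Here is that brute-force sum in code, a minimal sketch reusing the illustrative matrices defined earlier plus an assumed start distribution `pi` (none of these numbers come from the original post):

```python
import numpy as np
from itertools import product

# Illustrative model from the earlier sketch (all numbers assumed).
A = np.array([[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.2, 0.1, 0.7]])
B = np.array([[1/6] * 6,
              [0.02] * 5 + [0.90],
              [0.283, 0.283, 0.284, 0.05, 0.05, 0.05]])
pi = np.array([1.0, 0.0, 0.0])  # assume he starts with the normal die

def brute_force_prob(obs, A, B, pi):
    """P(v_{1:T}): sum the joint probability over all C**T hidden paths."""
    C, T = len(pi), len(obs)
    total = 0.0
    for w in product(range(C), repeat=T):       # every hidden sequence
        p = pi[w[0]] * B[w[0], obs[0]]          # start state + first emission
        for t in range(1, T):
            p *= A[w[t - 1], w[t]] * B[w[t], obs[t]]  # transition + emission
        total += p
    return total

# Ten sixes in a row (faces are encoded 0..5, so a six is index 5):
# with 3 dice and 10 throws this already sums over 3**10 = 59049 paths.
print(brute_force_prob([5] * 10, A, B, pi))
```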
But yet another problem appears: the set of possibilities is far too large. With three dice and 10 throws, there are already $3^{10}$ combinations; this $O(C^T)$ magnitude explodes as soon as the problem grows, and the combinations quickly become impossible to enumerate. So we need a more efficient way to compute $P(v_{1:t})$.
The forward algorithm does exactly that, reducing the cost of computing $P(v_{1:t})$ to $O(C^2 T)$, where $C$ is the number of hidden states and $T$ the sequence length. It uses the recursion

$$\alpha_t(j) = P(v_{1:t}, w_t = j) = \Big[\sum_{i=1}^{C} \alpha_{t-1}(i)\, a_{ij}\Big]\, b_j(v_t), \qquad P(v_{1:T}) = \sum_{j=1}^{C} \alpha_T(j).$$

With this recursion we can start from $t = 0$ and iterate forward until we obtain $P(v_{1:t})$. Let's compute the probability that the uncle throws the dice sequence 3, 2, 1 (assuming the initial state is 1, that is, the uncle starts out holding the normal die).
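Here is a minimal sketch of that computation, again under the illustrative matrices assumed earlier:

```python
import numpy as np

# Illustrative model from the earlier sketches (all numbers assumed).
A = np.array([[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.2, 0.1, 0.7]])
B = np.array([[1/6] * 6,
              [0.02] * 5 + [0.90],
              [0.283, 0.283, 0.284, 0.05, 0.05, 0.05]])
pi = np.array([1.0, 0.0, 0.0])  # start with the normal die, as assumed above

def forward_prob(obs, A, B, pi):
    """P(v_{1:T}) by the forward recursion: O(C^2 * T) instead of O(C^T)."""
    alpha = pi * B[:, obs[0]]          # alpha_1(j) = pi_j * b_j(v_1)
    for v in obs[1:]:
        # alpha_t(j) = (sum_i alpha_{t-1}(i) * a_ij) * b_j(v_t)
        alpha = (alpha @ A) * B[:, v]
    return alpha.sum()                 # P(v_{1:T}) = sum_j alpha_T(j)

# The sequence 3, 2, 1 -- faces are encoded 0..5, so these are indices 2, 1, 0.
print(forward_prob([2, 1, 0], A, B, pi))
# Sanity check: this matches the brute-force sum from the previous sketch.
```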
4.2 Decoding

Decoding means finding the most likely hidden-state sequence, given an observation sequence and a known HMM model.

In mathematical form (where $V$ is the visible observation sequence, $W$ the hidden-state sequence, and $A$, $B$ the HMM's transition and emission probability matrices), decoding seeks

$$w_{1:T}^{*} = \arg\max_{w_{1:T}} P(w_{1:T} \mid v_{1:T}) = \arg\max_{w_{1:T}} P(w_{1:T}, v_{1:T}),$$

where the second equality holds because $P(v_{1:T})$ does not depend on the hidden sequence.

(There are too many formulas to include here; for the full derivation, see the post "Machine Learning --- 4. The Secret Spy HMM (Hidden Markov) Rounds Up the Casino Cheat" on my blog.)

The maximum of $P(w_{1:t}, v_{1:t})$ can then be computed with the same forward-style recursion as in evaluation (4.1), except that the sum over predecessor states is replaced by a max.

After this forward pass completes, a backtracking step recovers the hidden sequence that makes $P(w_{1:t}, v_{1:t})$ largest. The whole procedure is called the Viterbi algorithm.
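Below is a minimal Viterbi sketch under the same illustrative model. The max-then-backtrack structure is the standard algorithm; the matrices and the test sequence are assumptions for illustration.

```python
import numpy as np

# Illustrative model from the earlier sketches (all numbers assumed).
A = np.array([[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.2, 0.1, 0.7]])
B = np.array([[1/6] * 6,
              [0.02] * 5 + [0.90],
              [0.283, 0.283, 0.284, 0.05, 0.05, 0.05]])
pi = np.array([1.0, 0.0, 0.0])

def viterbi(obs, A, B, pi):
    """Most likely hidden path: forward pass with max, then backtrack."""
    C, T = len(pi), len(obs)
    delta = pi * B[:, obs[0]]              # best-path prob ending in each state
    back = np.zeros((T, C), dtype=int)     # argmax pointers for backtracking
    for t in range(1, T):
        scores = delta[:, None] * A        # scores[i, j] = delta_i * a_ij
        back[t] = scores.argmax(axis=0)    # best predecessor of each state j
        delta = scores.max(axis=0) * B[:, obs[t]]
    path = [int(delta.argmax())]           # best final state
    for t in range(T - 1, 0, -1):          # walk the pointers backwards
        path.append(int(back[t, path[-1]]))
    return path[::-1], float(delta.max())

# Six sixes followed by three small numbers: which die was in play when?
path, p = viterbi([5, 5, 5, 5, 5, 5, 0, 1, 2], A, B, pi)
print(path)  # 0 = normal die, 1 = loaded die 1, 2 = loaded die 2
```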

4.3 Learning

Learning means: given only the structure of the HMM (for example, assuming the uncle has 3 dice, each with 6 faces), estimate the most likely model parameters from the observed throws.

(There are too many formulas to include here; for the full derivation, see the post "Machine Learning --- 4. The Secret Spy HMM (Hidden Markov) Rounds Up the Casino Cheat" on my blog.)
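The formulas are indeed too long to reproduce, but as a hedged sketch of what learning looks like in practice: the third-party hmmlearn library (my assumption; the original post does not use it) can fit the parameters by EM (Baum-Welch) given only the assumed structure. In recent hmmlearn versions the class is CategoricalHMM; older versions called it MultinomialHMM.

```python
import numpy as np
from hmmlearn import hmm  # third-party: pip install hmmlearn

# Toy data standing in for the camera's records: 500 throws, faces coded 0..5.
rng = np.random.default_rng(0)
throws = rng.integers(0, 6, size=(500, 1))

# We assume only the structure -- 3 dice, 6 faces -- and let EM (Baum-Welch)
# estimate the switching matrix A and the face distributions B.
model = hmm.CategoricalHMM(n_components=3, n_iter=100, random_state=0)
model.fit(throws)

print(model.transmat_)      # learned A: dice-switching probabilities
print(model.emissionprob_)  # learned B: per-die face probabilities
```

On uniform random toy data like this, the learned dice will look much alike; with real recorded throws, EM would separate the normal die from the loaded ones.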
5. Applications of HMM

The example above used an HMM to model and analyze dice throwing. HMM also has many classic applications; depending on what an application needs, different problems can be modeled with it.

However, a problem modeled with an HMM must satisfy the following conditions:

1. The hidden-state transitions must satisfy the Markov property (a state depends only on the immediately preceding state).

2. The hidden states must be roughly estimable from what can be observed.

When these conditions are satisfied, we then determine what the hidden states are in the problem and what their outward manifestations are.

The kind of problem HMM handles is one where the true (hidden) state is hard to estimate directly, and where successive states are related to each other.

5.1 Speech recognition

Speech recognition is the process of converting a speech signal into a text sequence. In this problem:

the hidden state is the word sequence corresponding to the speech signal;

the observable state is the speech signal itself.

HMM model learning: learning a speech-recognition model differs from the dice example above, where the most probable model was learned from the observed throw sequence alone. Learning the HMM model for speech recognition takes two steps:

1. Estimate the pronunciation probabilities of each word statistically, building the emission probability matrix B.

2. Estimate the transition probabilities between words (this step does not need the audio at all; the word-to-word transition probabilities can be counted directly from text).

Speech model evaluation: compute the probability of each candidate transcription, e.g. the near-homophones "是十四" ("is 14") and "四十四" ("44"), and choose the most likely text sequence.

5.2 Handwriting recognition

Handwriting recognition is similar to speech recognition, except that the observable sequence is the image of the handwritten word.

5.3 Chinese word segmentation

It is well known that in Chinese there is no delimiter between words (in English, words are separated by spaces, a natural word boundary), and Chinese words themselves lack obvious morphological markers, so a problem peculiar to Chinese processing is how to split a string of characters into a reasonable sequence of words. For example, the English sentence "You should go to kindergarten now" arrives already divided into words by spaces, and only the preposition "to" needs removing; the Chinese sentence expressing the same meaning, "你现在应该去幼儿园了", has no obvious delimiters, and the goal of Chinese word segmentation is to obtain "你/现在/应该/去/幼儿园/了" ("you / now / should / go / kindergarten / le").

So how is segmentation done? There are three main families of methods. The first is rule-based methods built on linguistic knowledge, such as maximum matching in its various forms and minimum-cut segmentation. The second is machine-learning methods based on large-scale corpora, currently the most widespread and effective approach; the statistical models used include n-gram language models, noisy-channel models, expectation maximization (EM), HMM, and so on. The third, used in practical segmentation systems, combines rules, statistics, and other methods. [1] uses HMM for Chinese word segmentation; a toy sketch of the usual tagging setup follows.
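Concretely, the usual HMM setup for segmentation treats each character as an observation and its position within a word as the hidden state, using the standard B/M/E/S tag scheme (a convention I am assuming here; the tag string below is a hypothetical Viterbi output, not something from the original post):

```python
# Hidden states per character: B(egin), M(iddle), E(nd), S(ingle-char word).
# The tag string below is a hypothetical Viterbi output, shown for illustration.
chars = list("你现在应该去幼儿园了")
tags = list("SBEBESBMES")

def tags_to_words(chars, tags):
    """Cut the character stream after every word-final tag (E or S)."""
    words, current = [], ""
    for char, tag in zip(chars, tags):
        current += char
        if tag in "ES":
            words.append(current)
            current = ""
    return words

print("/".join(tags_to_words(chars, tags)))  # 你/现在/应该/去/幼儿园/了
```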

5.4 Implementing a Pinyin input method with HMM

A Pinyin input method estimates the text the user wants to input (the hidden states) from the pinyin letters typed (the observations), for example inferring the intended characters from a typed string such as "pingyin". A toy sketch follows the link below.

Using HMM to implement a simple Pinyin input method
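As a toy sketch of that setup (every character, probability, and table below is invented for illustration): hidden states are candidate characters, observations are the typed syllables, and decoding picks the most likely character sequence. With only two syllables we can enumerate all paths directly; a real input method would run the Viterbi algorithm from section 4.2 over a large character vocabulary.

```python
# Hypothetical toy model -- every character and probability below is invented.
# Hidden states: candidate characters; observations: typed pinyin syllables.
states = ["拼", "品", "音", "因"]
pi = {"拼": 0.3, "品": 0.3, "音": 0.2, "因": 0.2}     # character frequencies
A = {("拼", "音"): 0.9, ("拼", "因"): 0.1,            # character-to-character
     ("品", "音"): 0.3, ("品", "因"): 0.7}            # transitions (from text)
B = {("拼", "pin"): 1.0, ("品", "pin"): 1.0,          # how each character is
     ("音", "yin"): 1.0, ("因", "yin"): 1.0}          # typed as a syllable

obs = ["pin", "yin"]  # what the user typed

# Two syllables: enumerate both positions directly instead of running Viterbi.
best = max(
    ((s1, s2) for s1 in states for s2 in states),
    key=lambda w: pi[w[0]] * B.get((w[0], obs[0]), 0.0)
                  * A.get(w, 0.0) * B.get((w[1], obs[1]), 0.0),
)
print("".join(best))  # 拼音 -- the most likely character pair
```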
