Hidden Markov Model HMM (2)


HMM Definition

Translated by Cui Xiaoyuan

An HMM is a triple (π, A, B):

π : the vector of initial state probabilities;

A : the state transition matrix;

B : the confusion (emission) matrix.

All of the state transition probabilities and confusion (emission) probabilities are assumed to remain constant for the life of the system. This is also the least realistic assumption made by HMMs.
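As a concrete sketch, the triple (π, A, B) for the weather example might be represented as follows. The probability values here are invented for illustration and are not taken from the article:

```python
import numpy as np

# Illustrative weather HMM (pi, A, B); the numbers are made up for this sketch.
# Hidden states: 0 = sunny, 1 = cloudy, 2 = rainy
# Observations: 0 = dry,   1 = damp,   2 = soggy

pi = np.array([0.5, 0.3, 0.2])      # initial state probabilities

A = np.array([[0.6, 0.3, 0.1],      # A[i, j] = Pr(state j at t+1 | state i at t)
              [0.3, 0.4, 0.3],
              [0.2, 0.3, 0.5]])

B = np.array([[0.6, 0.3, 0.1],      # B[i, k] = Pr(observation k | state i)
              [0.3, 0.4, 0.3],
              [0.1, 0.3, 0.6]])

# Each row of A and B is a probability distribution, so each must sum to 1.
assert np.isclose(pi.sum(), 1.0)
assert np.allclose(A.sum(axis=1), 1.0)
assert np.allclose(B.sum(axis=1), 1.0)
```

The "impractical assumption" in the text corresponds to A and B being fixed arrays: the same matrices are reused at every time step.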

HMM Applications

There are three main applications: the first two are pattern recognition problems, and the last is parameter estimation.

(1) Evaluation

Given a known HMM, find the probability of an observed sequence.

In this type of problem we assume we have several HMMs describing different systems (for example, one model of summer weather and one of winter weather), and we want to know which model most probably generated a given sequence of observed states. Put another way: applying the weather models of the different seasons to a given sequence of observed states tells us which model has the highest probability of producing it, and therefore which season is most likely. (That is, it lets us judge the season from the sequence of observed states.) The same idea applies to speech recognition.

We will use the forward algorithm to obtain the probability that a given HMM generated the observed state sequence.

(2) Decoding

Find the most likely hidden state sequence given the observed sequence.

Looking back at the algae and the weather: a blind hermit can only feel the state of the algae, so it is especially important for him to be able to judge the weather from it. We use the Viterbi algorithm to solve such problems.

The Viterbi algorithm is also widely used in natural language processing, for example in part-of-speech tagging. The literal words are the observed states and the parts of speech are the hidden states; with an HMM we can find the most likely sequence of parts of speech for a sentence in context.
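As a sketch of decoding, here is a minimal Viterbi implementation for the weather example. The values in pi, A, and B are assumptions made for this illustration, not values from the article:

```python
import numpy as np

# Illustrative weather HMM: states 0=sunny, 1=cloudy, 2=rainy;
# observations 0=dry, 1=damp, 2=soggy. The numbers are invented.
pi = np.array([0.5, 0.3, 0.2])
A = np.array([[0.6, 0.3, 0.1],
              [0.3, 0.4, 0.3],
              [0.2, 0.3, 0.5]])
B = np.array([[0.6, 0.3, 0.1],
              [0.3, 0.4, 0.3],
              [0.1, 0.3, 0.6]])

def viterbi(obs, pi, A, B):
    """Return the most likely hidden state sequence for `obs`."""
    n_states, T = A.shape[0], len(obs)
    delta = np.zeros((T, n_states))           # best path probability ending in each state
    psi = np.zeros((T, n_states), dtype=int)  # back-pointers to the best predecessor
    delta[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        for j in range(n_states):
            scores = delta[t - 1] * A[:, j]
            psi[t, j] = np.argmax(scores)
            delta[t, j] = scores[psi[t, j]] * B[j, obs[t]]
    # Backtrack from the most probable final state.
    path = [int(np.argmax(delta[-1]))]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t, path[-1]]))
    return path[::-1]

print(viterbi([0, 1, 2], pi, A, B))   # dry, damp, soggy -> [0, 1, 2] (sunny, cloudy, rainy)
```

For part-of-speech tagging the same function applies unchanged; only the meaning of the states (tags) and observations (words) differs.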

(3) Learning

Generate an HMM from an observed sequence.

This is the most difficult HMM problem: estimate a triple (π, A, B) from the observed sequence so that the triple best describes the phenomenon we see.

We use the forward-backward algorithm to handle the common real-world situation in which the transition and confusion matrices cannot be measured directly.
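The article does not give the re-estimation formulas, so the following is a compact sketch of one Baum-Welch (forward-backward/EM) re-estimation step for a discrete HMM, with invented parameters. A production implementation would work in log space or rescale the probabilities to avoid underflow:

```python
import numpy as np

def forward(obs, pi, A, B):
    """Partial probabilities alpha[t, j] = Pr(obs up to t, state j at t)."""
    T, N = len(obs), A.shape[0]
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    return alpha

def backward(obs, A, B):
    """beta[t, i] = Pr(obs after t | state i at t)."""
    T, N = len(obs), A.shape[0]
    beta = np.ones((T, N))
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    return beta

def baum_welch_step(obs, pi, A, B):
    """One EM re-estimation step; returns updated (pi, A, B)."""
    T, N = len(obs), A.shape[0]
    alpha, beta = forward(obs, pi, A, B), backward(obs, A, B)
    likelihood = alpha[-1].sum()
    gamma = alpha * beta / likelihood          # gamma[t, i] = Pr(state i at t | obs)
    xi = np.zeros((T - 1, N, N))               # xi[t, i, j] = Pr(i at t, j at t+1 | obs)
    for t in range(T - 1):
        xi[t] = alpha[t][:, None] * A * (B[:, obs[t + 1]] * beta[t + 1])[None, :] / likelihood
    new_pi = gamma[0]
    new_A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    new_B = np.zeros_like(B)
    for k in range(B.shape[1]):
        new_B[:, k] = gamma[np.array(obs) == k].sum(axis=0) / gamma.sum(axis=0)
    return new_pi, new_A, new_B

# Invented starting parameters and observations for illustration.
pi0 = np.array([0.5, 0.5])
A0 = np.array([[0.7, 0.3], [0.4, 0.6]])
B0 = np.array([[0.8, 0.2], [0.3, 0.7]])
obs = [0, 0, 1, 0, 1, 1]
pi1, A1, B1 = baum_welch_step(obs, pi0, A0, B0)
# Each EM step can only increase the likelihood of the observations.
print(forward(obs, pi0, A0, B0)[-1].sum(), forward(obs, pi1, A1, B1)[-1].sum())
```

In practice the step is iterated until the likelihood stops improving; EM guarantees the likelihood never decreases from one step to the next.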

Summary: the three problems an HMM can solve

  1. Matching the most likely system to a sequence of observations (evaluation), solved using the forward algorithm;

  2. Determining the hidden sequence most likely to have generated a sequence of observations (decoding), solved using the Viterbi algorithm;

  3. Determining the model parameters most likely to have generated a sequence of observations (learning), solved using the forward-backward algorithm.
Hidden Markov Model HMM (4-1) Forward Algorithm

Finding the probability of an observed sequence


1. Exhaustive search

For the relationship between the algae and the weather, we can use exhaustive search over the following state transition diagram (trellis):

In the figure, the lines between states in adjacent columns carry the state transition probabilities, while the probability of each observed state given the hidden state in a column comes from the confusion matrix. To compute the probability of a sequence of observed states exhaustively, we must sum the probabilities over all possible sequences of weather states; with 3 hidden states and 3 time steps there are 3³ = 27 possible sequences in this trellis.

Pr(dry, damp, soggy | HMM) = Pr(dry, damp, soggy | sunny, sunny, sunny) + Pr(dry, damp, soggy | sunny, sunny, cloudy) + Pr(dry, damp, soggy | sunny, sunny, rainy) + ... + Pr(dry, damp, soggy | rainy, rainy, rainy)
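The exhaustive sum above can be written directly in a few lines. The parameter values here are invented for illustration, not taken from the article:

```python
import numpy as np
from itertools import product

# Brute-force evaluation of Pr(observations | HMM): sum the joint
# probability over all 3^3 = 27 hidden weather sequences.
# The numbers in pi, A, B are illustrative.
pi = np.array([0.5, 0.3, 0.2])                # sunny, cloudy, rainy
A = np.array([[0.6, 0.3, 0.1],
              [0.3, 0.4, 0.3],
              [0.2, 0.3, 0.5]])
B = np.array([[0.6, 0.3, 0.1],                # columns: dry, damp, soggy
              [0.3, 0.4, 0.3],
              [0.1, 0.3, 0.6]])

obs = [0, 1, 2]                               # dry, damp, soggy
total = 0.0
for hidden in product(range(3), repeat=len(obs)):  # all 27 weather sequences
    p = pi[hidden[0]] * B[hidden[0], obs[0]]
    for t in range(1, len(obs)):
        p *= A[hidden[t - 1], hidden[t]] * B[hidden[t], obs[t]]
    total += p

print(total)   # ~ 0.038958
```

The loop visits 3^T hidden sequences, which is exactly the exponential blow-up the forward algorithm avoids.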

Clearly the amount of computation is very large, especially when the state space is big and the observation sequence is long. We can exploit the time invariance of the probabilities to reduce the complexity.

2. Reducing the complexity with recursion

We calculate the probability of the observed sequence recursively. First we define the partial probability as the probability of reaching an intermediate state in the trellis. Below, the observed state sequence of length T is written Y(1), Y(2), ..., Y(T).

2a. Partial probabilities (α's)

When calculating the probability of an intermediate state, a partial probability sums over all paths in the trellis that can reach that state. For example, the partial probability of the state cloudy at t = 2 sums the probabilities of the paths arriving there from each of the states at t = 1.

We write α_t(j) for the partial probability of state j at time t. It is calculated as

α_t(j) = Pr(observation at time t | hidden state is j) × Pr(all paths to state j at time t)

The partial probabilities at the final time T account for every path through the trellis ending in the corresponding state, so the sum of these final partial probabilities is the sum over all possible paths through the trellis, that is, the probability of the observed sequence under the current HMM. Section 2c describes how to calculate these probabilities dynamically.

2b. Partial probabilities at the initial time

The formula for a partial probability is

α_t(j) = Pr(observation at time t | hidden state is j) × Pr(all paths to state j at time t)

but at the initial time no paths lead into the states. Instead we multiply the initial state probability by the associated observation probability:

α_1(j) = π(j) × b_j(Y(1))

In this way, the partial probability of a state at the initial moment depends only on that state's own initial probability and the probability of the observation at that moment.

Hidden Markov Model HMM (4-2) Forward Algorithm


The previous part described how to calculate the partial probabilities of the initial states in the forward algorithm. We now continue from there.

2c. Calculating partial probabilities at times t > 1

Recall how a partial probability is calculated:

α_t(j) = Pr(observation at time t | hidden state is j) × Pr(all paths to state j at time t)

The first factor of the product is available from the confusion matrix. So how do we obtain Pr(all paths to state j at time t)?

The probability of reaching a state along all paths is the sum of the probabilities of every path that arrives at that state:

As the sequence grows, the number of paths to be calculated increases exponentially. But at time t we have already computed the partial probabilities of every state, so the partial probability of a state at time t + 1 depends only on the quantities at time t:

α_{t+1}(j) = b_j(Y(t+1)) × Σ_i α_t(i) · a_{ij}

In words: the appropriate observation probability (the probability of the observation seen at time t + 1 given state j) is multiplied by the sum of the probabilities of reaching that state (each state's partial probability at the previous moment times the corresponding transition probability). So computing the probabilities at time t + 1 uses only the probabilities at time t, and in this way we can work through the entire observed sequence.

2d. Complexity comparison

For an observation sequence of length T, the exhaustive method has complexity exponential in T, while the forward algorithm is linear in T.

Complete definition of the forward algorithm

We use the forward algorithm to calculate the probability of an observation sequence of length T,

Y(1), Y(2), ..., Y(T),

where each Y(k) is a member of the observable set. The intermediate (partial) probabilities α_t(j) are calculated recursively, first for all states at t = 1:

α_1(j) = π(j) × b_j(Y(1))

Then for each time step t = 2, ..., T, the partial probability is calculated for each state:

α_t(j) = b_j(Y(t)) × Σ_i α_{t-1}(i) · a_{ij}

that is, the product of the appropriate observation probability and the sum over all possible routes to that state, exploiting recursion by knowing these values already for the previous time step. Finally, the sum of all partial probabilities gives the probability of the observation, given the HMM λ:

Pr(Y | λ) = Σ_j α_T(j)
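The complete definition translates almost line by line into code. As before, the parameter values are invented for illustration, not values from the article:

```python
import numpy as np

# Illustrative weather HMM: states 0=sunny, 1=cloudy, 2=rainy;
# observations 0=dry, 1=damp, 2=soggy. The numbers are made up.
pi = np.array([0.5, 0.3, 0.2])
A = np.array([[0.6, 0.3, 0.1],
              [0.3, 0.4, 0.3],
              [0.2, 0.3, 0.5]])
B = np.array([[0.6, 0.3, 0.1],
              [0.3, 0.4, 0.3],
              [0.1, 0.3, 0.6]])

def forward(obs, pi, A, B):
    """Return Pr(obs | HMM) using the partial probabilities alpha_t(j)."""
    alpha = pi * B[:, obs[0]]                 # alpha_1(j) = pi(j) * b_j(Y(1))
    for t in range(1, len(obs)):
        # alpha_t(j) = b_j(Y(t)) * sum_i alpha_{t-1}(i) * a_ij
        alpha = (alpha @ A) * B[:, obs[t]]
    return alpha.sum()                        # Pr(obs | HMM) = sum_j alpha_T(j)

print(forward([0, 1, 2], pi, A, B))   # dry, damp, soggy -> ~ 0.038958
```

Each time step costs O(N²) for N hidden states, so the total work is linear in the sequence length T, in contrast to the 3^T terms of the exhaustive sum.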

Using the weather example again, we can trace the calculation of the partial probability of the state cloudy at t = 2: it is the probability of the observation at t = 2 given cloudy, multiplied by the sum, over the three states at t = 1, of each initial partial probability times its transition probability into cloudy.

Seen step by step, the calculation is quite clear. To summarize: the forward algorithm lets us select, from several HMMs, the model that best reflects a given sequence of observed states (the one under which the sequence has the highest probability). Forward algorithm: done.
