1. Definitions of the Three Typical HMM Problems
All three problems start from a partially known model; they differ in what is given and what must be computed.
Evaluation: the model parameters are known; compute the probability of a specific output sequence. The forward algorithm is generally used.
Decoding: the model parameters are known; find the hidden state sequence most likely to have generated a specific output sequence. The Viterbi algorithm is generally used.
Learning: the output sequence is known; find the most likely state transition and emission probabilities. The Baum-Welch algorithm and the Reversed Viterbi algorithm are generally used.
2. Part-of-Speech Tagging and the Three Typical HMM Problems
In part-of-speech tagging, the task is: given a word sequence, produce the corresponding part-of-speech sequence. The words are what we can observe, so words serve as the observed states and parts of speech as the hidden states.
Evaluation: compute the probability of a given word sequence under a given model. This has no real use in part-of-speech tagging.
Decoding: given the model and a word sequence, find the most probable part-of-speech sequence. In part-of-speech tagging, this is the prediction stage.
Learning: find the most likely HMM parameters (A, B) from word sequences and their part-of-speech sequences. In part-of-speech tagging, this is the learning (training) stage.
3. Evaluation - Forward Algorithm - Example
Assume there are M kinds of weather (hidden states) and N kinds of activities (observations). The HMM parameters are the transition matrix A[M][M], the emission matrix B[M][N], and the initial probabilities P[M]. The observation sequence is int O[LEN];
3.1 Exhaustive Enumeration
For each possible hidden sequence, compute the probability that it generates the specified observed sequence, then sum these values. There are M ^ LEN different hidden sequences, so this is prohibitively time-consuming.
3.2 Forward Algorithm
In fact, the exhaustive method repeats many computations. The forward algorithm reuses intermediate results to eliminate this repetition.
Definition: F[LEN][M], where F[i][j] is the sum, over all hidden sequences for days 1 through i whose day-i hidden state is Sj, of the probability of generating the first i observations. (The table is named F here so it does not clash with the initial-probability array P[M].) The final result is F[LEN-1][0] + ... + F[LEN-1][M-1], that is, the total probability of the observed sequence over all possible last-day hidden states.
F[0][j] = P[j] * B[j][O[0]];
// j = 0, 1, ..., M-1. First-day initialization: the initial probability of state j, multiplied by the probability that state j emits observation O[0].
F[i][j] = (F[i-1][0] * A[0][j] + F[i-1][1] * A[1][j] + ... + F[i-1][M-1] * A[M-1][j]) * B[j][O[i]];
// i >= 1, j = 0, 1, ..., M-1. For later days: sum, over every state k of the previous day, the probability of reaching state j from state k, then multiply by the probability that state j emits observation O[i].
Complexity: the first day costs M multiplications; each subsequent day costs on the order of M ^ 2 multiplications, so T = O(LEN * M ^ 2).
4. Decoding - Viterbi Algorithm - Example
Assume there are M kinds of weather and N kinds of activities. The HMM parameters are A[M][M], B[M][N], and P[M]. The observation sequence is int O[LEN];
4.1 Exhaustive Enumeration
Recall that in the evaluation problem above, we computed the sum of the probabilities with which the M ^ LEN different hidden sequences generate the observed sequence. Here we must find, among those M ^ LEN hidden sequences, the one that generates the observed sequence with maximum probability. Similarity: the factors being multiplied are the same. Difference: evaluation computes the sum, while decoding computes the maximum.
4.2 Viterbi Algorithm
Definition: Max[LEN][M], where Max[i][j] is the maximum, over all hidden sequences for days 1 through i whose day-i hidden state is Sj, of the probability of generating the first i observations. Path[LEN][M], where Path[i][j] = prev records the previous day's state that achieves this maximum, i.e., Max[i][j] = Max[i-1][prev] * A[prev][j] * B[j][O[i]];
Max[0][j] = P[j] * B[j][O[0]];
// j = 0, 1, ..., M-1. First-day initialization: the initial probability of state j, multiplied by the probability that state j emits observation O[0].
Path[0][j] = -1;
// The first day has no predecessor, represented by -1.
Max[i][j] = max over k of { Max[i-1][k] * A[k][j] } * B[j][O[i]];
// i >= 1, j = 0, 1, ..., M-1, k = 0, 1, ..., M-1. For later days: take the maximum, over every state k of the previous day, of the probability of reaching state j from state k, then multiply by the probability that state j emits observation O[i].
Path[i][j] = prev;
// prev is the k value that attains max over k of { Max[i-1][k] * A[k][j] }.
Time complexity: the same as the forward algorithm, with the sum replaced by a maximum; maintaining the Path array adds only constant work per cell, so it is also O(LEN * M ^ 2).
5. Learning
Learning is the process of estimating the HMM parameters from a number of observed sequences and their corresponding hidden sequences. The common algorithm is the Baum-Welch algorithm, also known as the forward-backward algorithm, which is a special case of the EM algorithm, so the EM algorithm must be understood first. Since I do not yet understand the EM algorithm well, I will stop here for now and cover the EM algorithm and the forward-backward algorithm in separate posts later.
6. Highlights
Forward algorithm: given a model, compute the probability of a specified observed sequence. This requires summing the probabilities over all hidden sequences.
Viterbi algorithm: given a model, find the hidden sequence that generates the specified observed sequence with maximum probability. This requires picking one sequence out of all hidden sequences.
The computation processes are the same, except that one computes a sum while the other computes a maximum plus a path.
7. References
Forward Algorithm in HMM
http://www.suzker.cn/computervision/forward-agorithm-for-hmm.html
Viterbi Algorithm in HMM
http://www.suzker.cn/computervision/viterbi-algorithm-for-hmm.html
Hidden Markov Model posts on I Love Natural Language Processing (52nlp)
http://www.52nlp.cn/category/hidden-markov-model