Hidden Markov model: the basic model and three basic problems

Source: Internet
Author: User


This article introduces the hidden Markov model, a particularly common model that is also used very widely in natural language processing.

Hidden Markov models can be applied to sequence-labeling problems such as word segmentation, POS tagging, and named entity recognition.

Below, I explain the basic model of the HMM and its three basic problems through examples, based on my own understanding. I hope it helps you understand~

1

Definition of the hidden Markov model

The hidden Markov model is a probabilistic model of sequences. It describes a process in which a hidden Markov chain randomly generates an unobservable sequence of states, and each state then generates an observation, producing a random sequence of observations.

The sequence of states randomly generated by the hidden Markov chain is called the state sequence; each state generates an observation, and the resulting random sequence of observations is called the observation sequence.

Each position in the sequence can be viewed as a moment in time.

Here we introduce some symbols to represent these definitions:

Let Q be the set of all possible states and V the set of all possible observations:

Q = {q1, q2, ..., qN},  V = {v1, v2, ..., vM}

where N is the number of possible states and M is the number of possible observations.

The states in Q are invisible (hidden), while the observations in V are visible.

Applied to part-of-speech tagging, V stands for the words, which can be observed, and Q stands for the parts of speech we want to predict (one word may correspond to several parts of speech); these are the hidden states.

Applied to word segmentation, V stands for the characters, which can be observed, and Q stands for our segmentation tags (tags such as B and E, which mark the beginning of a word, the middle, and so on).

Applied to named entity recognition, V stands for the words, which can be observed, and Q stands for our labels (labels such as place name or time expression).

Readers interested in these applications can look up further material on them.

I = (i1, i2, ..., iT) is a state sequence of length T, and O = (o1, o2, ..., oT) is the corresponding observation sequence.

For POS tagging, the training set can be seen as word sequences (O) plus part-of-speech sequences (I); for segmentation, word sequences (O) plus segmentation tags (I), and so on. Once we have training data, adding a training algorithm lets us solve many problems. The problems are coming up shortly~

We next define A as the state transition probability matrix:

A = [a_ij] (an N x N matrix)

where

a_ij = P(i_{t+1} = q_j | i_t = q_i),  i = 1, 2, ..., N;  j = 1, 2, ..., N

is the probability of moving to state q_j at time t+1, given that the chain is in state q_i at time t.

B is the observation probability matrix:

B = [b_j(k)] (an N x M matrix)

where

b_j(k) = P(o_t = v_k | i_t = q_j),  k = 1, 2, ..., M;  j = 1, 2, ..., N

is the probability of generating observation v_k at time t, given that the chain is in state q_j (the so-called "emission probability").

So when other materials speak of a "generation probability" and an "emission probability", these are actually the same concept.

π is the initial state probability vector:

π = (π_i), where π_i = P(i_1 = q_i),  i = 1, 2, ..., N

is the probability of being in state q_i at time t = 1.

A hidden Markov model is determined by the initial state probability vector π, the state transition probability matrix A, and the observation probability matrix B. π and A determine the state sequence, while B determines the observation sequence. A hidden Markov model can therefore be written in the ternary notation λ = (A, B, π).

A, B, and π are called the three elements of the hidden Markov model.

If the concrete state set Q and observation set V are added as well, they form the five-tuple (Q, V, A, B, π) of the HMM; together, these are all the components of a hidden Markov model.

2

The following is a brief introduction to the three basic problems of hidden Markov models:

Probability calculation problem: given a model λ = (A, B, π) and an observation sequence O = (o1, o2, ..., oT), compute the probability P(O | λ) that this observation sequence is produced by the model.

Let's illustrate with an example (from Wikipedia):

Consider a village where every villager is either healthy or has a fever, and only the village doctor can determine who has a fever. The doctor diagnoses by asking patients how they feel; villagers can only answer that they feel normal, dizzy, or cold. (Normal, dizzy, and cold form the observation sequence we mentioned earlier.)

The doctor models his patients' health as a discrete Markov chain. There are two states, "Healthy" and "Fever", but the doctor cannot observe them directly; they are hidden (these are the hidden states we mentioned earlier). Each day, depending on his health condition, the patient tells the doctor whether he feels "normal", "cold", or "dizzy".

The observations (normal, cold, dizzy) and hidden states (Healthy, Fever) form a hidden Markov model (HMM), which can be expressed in the Python programming language as follows:
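The original code snippet did not survive in this copy; the following is a reconstruction of the standard Wikipedia example, whose assumed numbers match the percentages discussed below (0.6/0.4 start, 30% chance Healthy to Fever, 50% normal when healthy, 60% dizzy when feverish):

```python
# Reconstruction of the standard Wikipedia HMM example (assumed values).
states = ('Healthy', 'Fever')               # hidden states Q
observations = ('normal', 'cold', 'dizzy')  # possible observations V

start_probability = {'Healthy': 0.6, 'Fever': 0.4}  # pi

transition_probability = {                  # A
    'Healthy': {'Healthy': 0.7, 'Fever': 0.3},
    'Fever':   {'Healthy': 0.4, 'Fever': 0.6},
}

emission_probability = {                    # B
    'Healthy': {'normal': 0.5, 'cold': 0.4, 'dizzy': 0.1},
    'Fever':   {'normal': 0.1, 'cold': 0.3, 'dizzy': 0.6},
}
```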


In this code, start_probability represents the doctor's belief about the state of the HMM when the patient first visits (all he knows is that the patient tends to be healthy).

The particular distribution used here is not the equilibrium one, which (given the transition probabilities) is approximately {'Healthy': 0.57, 'Fever': 0.43}. (This start_probability is the initial state probability vector π we mentioned earlier.)

transition_probability represents the change of health condition in the underlying Markov chain. In this example, if the patient is healthy today, there is only a 30% chance he will have a fever tomorrow. emission_probability represents how likely the patient is to feel a particular way each day: if he is healthy, there is a 50% chance he feels normal; if he has a fever, there is a 60% chance he feels dizzy.

A state-transition diagram of this example (figure omitted in this copy) shows the two hidden states, the transition probabilities between them, and the emission probabilities from each state to each observation.

OK, we're done with the example; now let's state the first problem in its terms.

The first problem: given the model, compute the probability that a given observation sequence occurs.

For example, given known HMM model parameters, what is the probability that the observations over three days are (dizzy, cold, normal)?

"HMM model parameters known" means that the A and B matrices and the vector π are already known.
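This problem is normally solved with the forward algorithm, which avoids enumerating every state sequence. A minimal sketch, restating the assumed example model for self-containment (the numbers are the Wikipedia-example values, not from the original article):

```python
# Forward algorithm: computes P(O | lambda) by dynamic programming.
states = ('Healthy', 'Fever')
start_p = {'Healthy': 0.6, 'Fever': 0.4}
trans_p = {'Healthy': {'Healthy': 0.7, 'Fever': 0.3},
           'Fever':   {'Healthy': 0.4, 'Fever': 0.6}}
emit_p = {'Healthy': {'normal': 0.5, 'cold': 0.4, 'dizzy': 0.1},
          'Fever':   {'normal': 0.1, 'cold': 0.3, 'dizzy': 0.6}}

def forward(obs):
    # alpha[s] = P(o_1, ..., o_t, i_t = s) after processing t observations
    alpha = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {s: sum(alpha[r] * trans_p[r][s] for r in states) * emit_p[s][o]
                 for s in states}
    return sum(alpha.values())  # P(O | lambda)

print(forward(('dizzy', 'cold', 'normal')))
```

Each step costs O(N^2), so the total cost is O(N^2 T), versus O(T N^T) for brute-force enumeration of all state sequences.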

Learning problem

In terms of the example above, the second problem is: we already know the observation sequence (dizzy, cold, normal), and we need to find the HMM parameters that make this observation sequence most probable, that is, the three parameters A, B, and π.
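When the training data contain both observation and state sequences (the word + tag pairs mentioned earlier), the parameters can be estimated by maximum likelihood, i.e. simple counting and normalization; when only observations are available, the Baum-Welch (EM) algorithm is used instead. A minimal sketch of the supervised case, with made-up toy data:

```python
from collections import Counter, defaultdict

# Toy labeled training data: (observation sequence, state sequence) pairs.
# These sequences are invented for illustration only.
training = [
    (('dizzy', 'cold', 'normal'), ('Fever', 'Fever', 'Healthy')),
    (('normal', 'normal', 'dizzy'), ('Healthy', 'Healthy', 'Fever')),
]

start, trans, emit = Counter(), defaultdict(Counter), defaultdict(Counter)
for obs, sts in training:
    start[sts[0]] += 1                      # count initial states -> pi
    for o, s in zip(obs, sts):
        emit[s][o] += 1                     # count (state, observation) -> B
    for prev, cur in zip(sts, sts[1:]):
        trans[prev][cur] += 1               # count state transitions -> A

def normalize(counter):
    total = sum(counter.values())
    return {k: v / total for k, v in counter.items()}

pi = normalize(start)
A = {s: normalize(c) for s, c in trans.items()}
B = {s: normalize(c) for s, c in emit.items()}
```

In practice one would also apply smoothing so that unseen transitions and emissions do not get zero probability.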

Prediction problem

In terms of the example above, the third problem is: we know the observation sequence (dizzy, cold, normal) and the HMM parameters, and we want to find the state sequence most likely to correspond to this observation sequence, for example (Healthy, Healthy, Fever) or (Healthy, Healthy, Healthy). With 2 states and 3 days there are 2^3 = 8 possible state sequences (N^T in general)~
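The standard solution to the prediction problem is the Viterbi algorithm, which finds the most probable state path by dynamic programming instead of scoring all N^T sequences. A minimal sketch, again restating the assumed example model for self-containment:

```python
# Viterbi algorithm: most probable hidden state path for an observation sequence.
states = ('Healthy', 'Fever')
start_p = {'Healthy': 0.6, 'Fever': 0.4}
trans_p = {'Healthy': {'Healthy': 0.7, 'Fever': 0.3},
           'Fever':   {'Healthy': 0.4, 'Fever': 0.6}}
emit_p = {'Healthy': {'normal': 0.5, 'cold': 0.4, 'dizzy': 0.1},
          'Fever':   {'normal': 0.1, 'cold': 0.3, 'dizzy': 0.6}}

def viterbi(obs):
    # delta[s] = (probability of the best path ending in state s, that path)
    delta = {s: (start_p[s] * emit_p[s][obs[0]], [s]) for s in states}
    for o in obs[1:]:
        new_delta = {}
        for s in states:
            # best predecessor state for s at this step
            prob, path = max(((delta[r][0] * trans_p[r][s], delta[r][1])
                              for r in states), key=lambda t: t[0])
            new_delta[s] = (prob * emit_p[s][o], path + [s])
        delta = new_delta
    return max(delta.values(), key=lambda t: t[0])

best_prob, best_path = viterbi(('dizzy', 'cold', 'normal'))
```

Under these assumed numbers, the most likely path for (dizzy, cold, normal) comes out as ['Fever', 'Healthy', 'Healthy'] with probability 0.01344.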

Thanks:

Hao Yu, Tokugawa, Hao, Shi
