Introduction
Translated by Tri Xiaoyuan
We often want to find the pattern by which something changes over time. Such patterns arise in many areas: the sequence of instructions in a computer program, the sequence of words in a sentence, the sequence of words in spoken speech, and so on. One of the most familiar examples is weather forecasting.
First, this article introduces systems that generate probabilistic patterns, such as a system used to predict changes in the weather.
Then we analyze systems in which the states we want to predict are hidden beneath the surface and are not the phenomena we observe directly. For example, we will predict the state of the weather from the observed state of a piece of seaweed.
Finally, we will use the resulting model to solve some practical problems, such as working out the weather over the past few days from a record of seaweed observations.
Generating Patterns
There are two modes of generation: deterministic and non-deterministic.
Deterministic patterns: consider the traffic lights we see every day. Each light follows a fixed pattern of changes, so we can easily determine the next state from the current state of the lights.
Non-deterministic patterns: consider the weather, which may be sunny, cloudy, or rainy. Unlike the traffic lights, we cannot be certain what the weather will be at the next moment, but we would still like to build a model that captures the pattern behind the weather. A simple assumption is that the current weather depends only on the previous weather; this is known as the Markov assumption. It is only an approximation and some information is lost, but the method is well suited to analysis.
A Markov process is one in which the current state depends only on the previous n states; this is called an n-th order Markov model. The simplest model is the first-order model (n = 1), in which the current state depends only on the immediately preceding state. (Note the difference from the deterministic pattern: here we obtain a probabilistic model.) In the weather example, every state (sunny, cloudy, rainy) can transition to every other state, including itself.
For a first-order Markov model with M states, there are M*M possible state transitions. Each transition has a certain probability, called the transition probability, and all the transition probabilities can be collected into a matrix. Throughout the modeling process we assume that this transition matrix does not change over time.
The meaning of the matrix is: if yesterday was sunny, then today is sunny with probability 0.5, cloudy with probability 0.25, and rainy with probability 0.25. Note that the probabilities in each row sum to 1.
In addition, at the start of the system we need to know the initial probability of each state, given by the initial probability vector.
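As a concrete illustration, here is a minimal sketch in Python of the weather example. Only the sunny row of the transition matrix is given in the text; the other rows and the initial probabilities are assumed values for illustration.

```python
import numpy as np

# First-order Markov model of the weather example.
states = ["sunny", "cloudy", "rainy"]

# Initial probability vector (assumed values).
pi = np.array([0.6, 0.2, 0.2])

# Transition matrix A: A[i, j] = P(tomorrow = states[j] | today = states[i]).
# The sunny row uses the probabilities from the text; the other rows are assumed.
A = np.array([
    [0.50, 0.25, 0.25],   # from sunny
    [0.25, 0.50, 0.25],   # from cloudy (assumed)
    [0.25, 0.25, 0.50],   # from rainy  (assumed)
])
assert np.allclose(A.sum(axis=1), 1.0)   # each row sums to 1

def sample_weather(days, rng=np.random.default_rng(0)):
    """Sample a weather sequence from the first-order Markov model."""
    seq = [rng.choice(len(states), p=pi)]
    for _ in range(days - 1):
        seq.append(rng.choice(len(states), p=A[seq[-1]]))
    return [states[i] for i in seq]

print(sample_weather(7))
```

Each step simply draws tomorrow's weather from the row of A that belongs to today's weather, which is all the first-order assumption allows it to look at.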
So far, we have defined a first-order Markov model, which includes the following concepts:
States: sunny, cloudy, rainy
State transition probabilities
Initial probabilities
However, this Markov model still needs to be extended!
Consider a hermit who cannot observe the weather directly, but who has a piece of seaweed. Folklore tells us that the state of the seaweed is probabilistically related to the weather; in other words, the state of the seaweed and the weather are closely linked. At this point we have two sets of states: the observed states (the state of the seaweed) and the hidden states (the state of the weather). We would therefore like an algorithm that lets the hermit infer the changes in the weather from the seaweed and a Markov process, without observing the weather directly.
An application that may be easier to grasp is speech recognition, where the problem is to recover the original text from a given speech signal. Here the speech signal is the observed state and the recognized text is the hidden state.
It is important to note that in any application the number of observed states and the number of hidden states may differ. We use a Hidden Markov Model (HMM) to solve this kind of problem.
HMM
The transition diagram for the weather example contains both kinds of states. We assume that the hidden states are described by a first-order Markov process, so each hidden state is connected to every other hidden state.
The connection between a hidden state and an observed state is the probability of seeing that particular observed state given that the Markov process is in that hidden state. These probabilities can also be collected into a matrix; note that each row (the probabilities of all observed states given one hidden state) sums to 1. With this we have all the elements of an HMM: two kinds of states and three sets of probabilities. The two kinds of states are the observed states and the hidden states; the three sets of probabilities are the initial probabilities, the state transition probabilities, and the probabilities linking the two kinds of states (the confusion matrix).
HMM definition
An HMM is a triple (π, A, B).
π: the vector of initial state probabilities;
A: the state transition matrix;
B: the confusion matrix;
All of the state transition probabilities and confusion probabilities are assumed to stay constant throughout the whole system. This is also the most unrealistic assumption in an HMM.
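To make the triple concrete, here is a minimal sketch of the weather/seaweed HMM in Python; the seaweed observation states and every probability value are assumed for illustration, since the text does not list them.

```python
import numpy as np

hidden_states = ["sunny", "cloudy", "rainy"]
observed_states = ["dry", "damp", "soggy"]   # assumed seaweed states

# pi: initial hidden-state probabilities (assumed).
pi = np.array([0.6, 0.2, 0.2])

# A: state transition matrix, A[i, j] = P(next hidden = j | current hidden = i).
A = np.array([[0.50, 0.25, 0.25],
              [0.25, 0.50, 0.25],
              [0.25, 0.25, 0.50]])

# B: confusion matrix, B[i, k] = P(observe observed_states[k] | hidden_states[i]).
B = np.array([[0.6, 0.3, 0.1],    # sunny  -> dry / damp / soggy
              [0.3, 0.4, 0.3],    # cloudy
              [0.1, 0.3, 0.6]])   # rainy

# Every row of A and B sums to 1, as required.
assert np.allclose(A.sum(axis=1), 1.0) and np.allclose(B.sum(axis=1), 1.0)
```

The later sketches of the forward, Viterbi, and forward-backward algorithms reuse these assumed values.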
Applications of HMM
There are three main applications: the first two are pattern recognition problems and the last one is parameter estimation.
(1) Evaluation
Find the probability of an observed sequence given a known HMM.
The problem is this: we have a number of HMMs that describe different systems (for example, a model of summer weather and a model of winter weather) and a sequence of observations, and we want to know which system most probably generated the observed sequence. Put another way, we apply the weather models of the different seasons to the given observation sequence, and the season whose model gives the greatest probability is the most likely one (that is, we judge the season from the observed sequence). The same idea is used in speech recognition.
We will use the forward algorithm to obtain the probability of the observed sequence under each HMM.
(2) Decoding
Find the most probable sequence of hidden states based on the observed sequence
Recall the seaweed and weather example: this problem is particularly important for a blind hermit who can judge the weather only by feeling the state of the seaweed. We use the Viterbi algorithm to solve this kind of problem.
The Viterbi algorithm is also widely used in natural language processing, for example in part-of-speech (POS) tagging. The words are the observed states and the parts of speech are the hidden states. With an HMM we can find the most likely sequence of POS tags for a sentence in its context.
(3) Learning
Derive an HMM from an observation sequence.
This is the most difficult HMM application: generating an HMM triple (π, A, B) from an observed sequence and the set of hidden states it represents, so that the triple best describes the pattern of the phenomenon we see.
We use the forward-backward algorithm to handle the situation that often arises in practice, where the transition matrix and the confusion matrix cannot be obtained directly.
To summarize, the three kinds of problems an HMM can solve are:
- Matching the most likely system to a sequence of observations - evaluation, solved using the forward algorithm;
- Determining the hidden sequence most likely to have generated a sequence of observations - decoding, solved using the Viterbi algorithm;
- Determining the model parameters most likely to have generated a sequence of observations - learning, solved using the forward-backward algorithm.
Four: Hidden Markov Models
1. Definitions (definition of a hidden Markov model)
A hidden Markov model is a triple (π, A, B).
π: the initial probability vector;
A: the state transition matrix;
B: the confusion matrix;
Every probability in the state transition matrix and the confusion matrix is time-independent; that is, the matrices do not change over time as the system evolves. In practice this is one of the most unrealistic assumptions that Markov models make about the real world.
2. Application (Uses associated with HMMs)
Once a system can be described as an HMM, it can be used to solve three basic problems. The first two are pattern recognition problems: finding the probability of an observed sequence given an HMM (evaluation), and finding the sequence of hidden states that most probably generated an observed sequence (decoding). The third problem is generating an HMM from a given observation sequence (learning).
a) Evaluation
Consider the problem in which we have a number of hidden Markov models describing different systems (that is, a set of (π, A, B) triples) and a sequence of observations. We want to know which HMM most probably generated the given observation sequence. For example, for the seaweed we might have a "summer" model and a "winter" model, since conditions differ between the seasons; we might then want to determine the current season from a sequence of observations of seaweed dampness.
We use the forward algorithm to calculate the probability of an observation sequence given a particular hidden Markov model (HMM), and hence choose the most suitable HMM.
This type of problem occurs in speech recognition, where a large number of Markov models are used, each one modeling a particular word. An observation sequence is formed from a spoken word, and the word is then recognized by finding the HMM most likely to have generated that observation sequence.
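As a concrete illustration, here is a minimal sketch of the forward algorithm in Python. The model parameters are the assumed weather/seaweed values from the earlier sketch, and the observation sequence is made up for the example.

```python
import numpy as np

def forward(obs, pi, A, B):
    """Forward algorithm: probability of the observation sequence given the HMM.

    obs : list of observation indices
    pi  : (N,)   initial hidden-state probabilities
    A   : (N, N) state transition matrix
    B   : (N, K) confusion matrix, B[i, k] = P(observe k | hidden state i)
    """
    alpha = pi * B[:, obs[0]]              # partial probabilities at t = 0
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]      # induction step
    return alpha.sum()                     # sum over the final hidden states

# Assumed toy parameters (same weather/seaweed model as above).
pi = np.array([0.6, 0.2, 0.2])
A = np.array([[0.50, 0.25, 0.25],
              [0.25, 0.50, 0.25],
              [0.25, 0.25, 0.50]])
B = np.array([[0.6, 0.3, 0.1],
              [0.3, 0.4, 0.3],
              [0.1, 0.3, 0.6]])

# Observation indices: dry = 0, damp = 1, soggy = 2 (made-up sequence).
print(forward([0, 1, 2], pi, A, B))
```

Evaluating this probability under a "summer" triple and a "winter" triple and picking the larger value is exactly the model selection described above.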
b) Decoding
Search for the most probable sequence of hidden states given an observation sequence.
Another related problem, and often the more interesting one, is to find the sequence of hidden states that generated the observed output sequence. In many cases we care more about the hidden states of the model, because they represent something valuable that cannot normally be observed directly.
Consider the seaweed and the weather again: a blind hermit can only feel the state of the seaweed, but what he really wants to know is the weather, which in this case is hidden.
We use the Viterbi algorithm to determine (search for) the most likely sequence of hidden states given the observation sequence and the HMM.
Another widespread application of the Viterbi algorithm is part-of-speech tagging in natural language processing. In POS tagging, the words of a sentence are the observed states and the parts of speech (grammatical categories) are the hidden states (note that many words, such as wind and fish, have more than one part of speech). By searching for the most likely sequence of hidden states for the words of a sentence, we can find the most probable part-of-speech tag for each word in its context.
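Here is a minimal sketch of the Viterbi algorithm in Python, using the same assumed toy parameters; it returns the single most probable hidden-state path for a made-up observation sequence.

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """Viterbi algorithm: most probable hidden-state path for the observations."""
    N, T = len(pi), len(obs)
    delta = np.zeros((T, N))                 # best path probability ending in each state
    psi = np.zeros((T, N), dtype=int)        # back-pointers to the best predecessor
    delta[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] * A   # scores[i, j]: best path into i, then i -> j
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) * B[:, obs[t]]
    # Backtrack from the best final state.
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t, path[-1]]))
    return path[::-1], delta[-1].max()

# Assumed toy parameters (same as the earlier sketches).
pi = np.array([0.6, 0.2, 0.2])
A = np.array([[0.50, 0.25, 0.25],
              [0.25, 0.50, 0.25],
              [0.25, 0.25, 0.50]])
B = np.array([[0.6, 0.3, 0.1],
              [0.3, 0.4, 0.3],
              [0.1, 0.3, 0.6]])

hidden_states = ["sunny", "cloudy", "rainy"]
path, prob = viterbi([0, 0, 2, 1], pi, A, B)    # dry, dry, soggy, damp
print([hidden_states[i] for i in path], prob)
```

For POS tagging the same code applies unchanged: the observation indices become word identifiers and the hidden states become the tag set.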
c) Learning
Generate a hidden Markov model from a given observation sequence.
The third problem, which is also the hardest of the HMM problems, is to estimate the most suitable hidden Markov model, that is, the (π, A, B) triple that best describes a known observation sequence (drawn from a known observation set) and the hidden state set associated with it.
When the matrices A and B cannot be measured (or estimated) directly, the forward-backward algorithm is used for learning (parameter estimation); this is also the common case in practical applications.
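For completeness, here is a minimal sketch of one re-estimation step of the forward-backward (Baum-Welch) algorithm in Python. It omits the numerical scaling a real implementation needs for long sequences, and the starting parameters and observation sequence are assumed for illustration.

```python
import numpy as np

def baum_welch_step(obs, pi, A, B):
    """One forward-backward (Baum-Welch) re-estimation step for a discrete HMM."""
    obs = np.asarray(obs)
    T, N = len(obs), len(pi)
    # Forward pass.
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    # Backward pass.
    beta = np.ones((T, N))
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    # Posterior state probabilities (gamma) and transition probabilities (xi).
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)
    xi = np.zeros((T - 1, N, N))
    for t in range(T - 1):
        xi[t] = alpha[t][:, None] * A * (B[:, obs[t + 1]] * beta[t + 1])[None, :]
        xi[t] /= xi[t].sum()
    # Re-estimate pi, A, B from the posteriors.
    new_pi = gamma[0]
    new_A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    new_B = np.vstack([gamma[obs == k].sum(axis=0) for k in range(B.shape[1])]).T
    new_B /= gamma.sum(axis=0)[:, None]
    return new_pi, new_A, new_B

# Assumed starting guesses; iterate until the estimates stop changing.
pi = np.array([0.6, 0.2, 0.2])
A = np.array([[0.50, 0.25, 0.25],
              [0.25, 0.50, 0.25],
              [0.25, 0.25, 0.50]])
B = np.array([[0.6, 0.3, 0.1],
              [0.3, 0.4, 0.3],
              [0.1, 0.3, 0.6]])
obs = [0, 1, 2, 1, 0, 2]    # made-up dry/damp/soggy observations
for _ in range(20):
    pi, A, B = baum_welch_step(obs, pi, A, B)
```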
3. Summary
A hidden Markov model, described by a vector and two matrices (π, A, B), is of great value for modeling real systems; although it is usually only an approximation, such models are amenable to analysis. The problems commonly solved with hidden Markov models are:
1. Matching the most probable system to a sequence of observations - evaluation, solved using the forward algorithm;
2. Determining the hidden-state sequence most likely to have generated a sequence of observations - decoding, solved using the Viterbi algorithm;
3. Determining the model parameters most likely to have generated a sequence of observations - learning, solved using the forward-backward algorithm.
To be continued: Forward Algorithm 1
This article translated from: http://www.comp.leeds.ac.uk/roger/HiddenMarkovModels/html_dev/main.html
Partial translation reference: Hidden Markov model hmm self-study
When reprinting, please cite the source "I Love Natural Language Processing": www.52nlp.cn
This article link address: http://www.52nlp.cn/hmm-learn-best-practices-four-hidden-markov-models