This is an original article; when reprinting, please indicate the source: http://www.cnblogs.com/ycwang16/p/5999034.html
The previous article described the Bayes filter; next we discuss the Kalman filter in detail. Although the Kalman filter is widely used and many tutorials exist, understanding it within the Bayes filter framework is helpful: it clarifies the design of the Kalman filter, the use of a Gaussian model to approximate the state distribution, and extensions such as the Gaussian multi-hypothesis filter.
I. Background Knowledge Review
1.1 Bayes filter
First, let us review the Bayes filter. Bayes filtering consists of two steps: 1. state prediction; and 2. state update.
1. State prediction, based on the state transition model:
$\overline{bel}({x_t}) = \int {p({x_t} \mid {u_t},{x_{t-1}})} \; bel({x_{t-1}}) \; d{x_{t-1}}$
2. State update, based on the new observation:
$bel({x_t}) = \eta \, p({z_t} \mid {x_t}) \, \overline{bel}({x_t})$
We can see that our aim is to compute the posterior probability of $x_t$. If $bel({x_t})$ is an arbitrary distribution, we need to compute the probability at every possible value of $x_t$, which is hard to achieve computationally. This computational problem can be handled approximately by a number of methods, such as sampling-based approximations, which lead to the particle filter and the unscented Kalman filter.
The approximation used in this article is to assume that $bel({x_t})$ follows a Gaussian distribution. Then the mean and variance of the distribution describe $bel({x_t})$ completely, without probability calculations at every possible point of $x_t$. This is the benefit of approximating $bel({x_t})$ with a Gaussian: at each moment we only need to compute two quantities, the mean $\mu_t$ and the variance $\Sigma_t$, for $bel({x_t})$ to be fully described. We can therefore derive recursive formulas for these two quantities, and those recursions give the state estimate at every moment. This is the basic idea of the Kalman filter.
1.2 Normal distribution (Gaussian distribution)
Next, we review the basics of the normal distribution. The normal distribution is a special probability distribution whose shape is completely determined by its first two moments. The univariate Gaussian distribution is expressed as follows:
$$\begin{array}{l}
p(x) \sim N(\mu, \sigma^2): \\
p(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \, e^{-\frac{1}{2}\frac{(x-\mu)^2}{\sigma^2}}
\end{array}$$
Here the first-order moment is the mean $\mu$, which expresses the expectation, and the second-order (central) moment is the variance $\sigma^2$, which indicates the degree of uncertainty of the distribution.
The expression of the multivariate Gaussian distribution is:
$$\begin{array}{l}
p(\mathbf{x}) \sim N(\boldsymbol{\mu}, \Sigma): \\
p(\mathbf{x}) = \frac{1}{(2\pi)^{d/2}\,\left|\Sigma\right|^{1/2}} \, e^{-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^T \Sigma^{-1} (\mathbf{x}-\boldsymbol{\mu})}
\end{array}$$
Similarly, the first-order moment is the mean vector $\boldsymbol{\mu}$, which represents the expectation of each component, and the second-order moment is the covariance matrix $\Sigma$, which represents the degree of uncertainty of each component.
1.3 Characteristics of the normal distribution
Under linear transformations: once Gaussian, always Gaussian.
First, the Gaussian variable is still Gaussian after the linear transformation, and the mean and variance are as follows:
$\left. {\begin{array}{*{20}{l}}
{X \sim N(\mu, \Sigma)}\\
{Y = AX + b}
\end{array}} \right\} \quad \Rightarrow \quad Y \sim N(A\mu + b,\; A \Sigma A^T)$
Next, the sum of two jointly Gaussian variables is still Gaussian, with the following mean and variance:
$\left. {\begin{array}{*{20}{c}}
{X_1 \sim N(\mu_1, \sigma_1^2)}\\
{X_2 \sim N(\mu_2, \sigma_2^2)}
\end{array}} \right\} \Rightarrow p(X_1 + X_2) \sim N\left( \mu_1 + \mu_2,\; \sigma_1^2 + \sigma_2^2 + 2\rho\sigma_1\sigma_2 \right)$
Finally, the product of the densities of two independent Gaussian variables is still Gaussian (up to normalization), with the following mean and variance:
$\left. {\begin{array}{*{20}{c}}
{X_1 \sim N(\mu_1, \sigma_1^2)}\\
{X_2 \sim N(\mu_2, \sigma_2^2)}
\end{array}} \right\} \Rightarrow p(X_1) \cdot p(X_2) \sim N\left( \frac{\sigma_2^2}{\sigma_1^2 + \sigma_2^2}\mu_1 + \frac{\sigma_1^2}{\sigma_1^2 + \sigma_2^2}\mu_2,\; \frac{\sigma_1^2 \sigma_2^2}{\sigma_1^2 + \sigma_2^2} \right)$
Because the Gaussian distribution has these properties, the additions and multiplications of random variables in the Bayes filter formulas can be computed analytically in terms of means and variances. This makes the whole Bayes-filter computation very simple, and that iterative process is exactly the Kalman filter.
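Since all three properties reduce to simple formulas for means and variances, they can be checked numerically. The following Python sketch is a minimal illustration (the numbers $\mu_1, \sigma_1, \mu_2, \sigma_2$, $a$, $b$ are arbitrary choices, and the variables are independent, so $\rho = 0$) comparing Monte-Carlo estimates against the analytic formulas above:

```python
import numpy as np

# Minimal numeric check of the three Gaussian properties (made-up numbers).
rng = np.random.default_rng(0)
mu1, s1 = 2.0, 1.5   # X1 ~ N(2, 1.5^2)
mu2, s2 = -1.0, 0.5  # X2 ~ N(-1, 0.5^2), independent of X1

x1 = rng.normal(mu1, s1, 1_000_000)
x2 = rng.normal(mu2, s2, 1_000_000)

# 1) Linear transform Y = a*X1 + b  ->  N(a*mu1 + b, a^2 * s1^2)
a, b = 3.0, 4.0
y = a * x1 + b
print(y.mean(), a * mu1 + b)      # both close to 10
print(y.var(), a**2 * s1**2)      # both close to 20.25

# 2) Sum of independent Gaussians (rho = 0)
s_sum = x1 + x2
print(s_sum.mean(), mu1 + mu2)    # both close to 1
print(s_sum.var(), s1**2 + s2**2) # both close to 2.5

# 3) Product of the two densities (after renormalizing) is Gaussian with:
mu_prod = (s2**2 * mu1 + s1**2 * mu2) / (s1**2 + s2**2)
var_prod = (s1**2 * s2**2) / (s1**2 + s2**2)
print(mu_prod, var_prod)          # -0.7 and 0.225
```

The product-of-densities formulas in particular are the ones the Kalman update step will reuse later.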
II. Kalman Filter
2.1 Kalman filter model assumptions
The Kalman filter solves the problem of state tracking for a dynamic system. Its basic model assumptions are: 1) the system's state equation is linear; 2) the observation equation is linear; 3) the process noise follows a zero-mean Gaussian distribution; and 4) the observation noise follows a zero-mean Gaussian distribution. Under these assumptions, the Gaussian distribution is operated on in a linearly varying space, so the probability density of the state remains Gaussian.
- State equation
${x_t} = {A_t}{x_{t-1}} + {B_t}{u_t} + {\varepsilon_t}$
- Observation equation
${z_t} = {H_t}{x_t} + {\delta_t}$
The process noise ${\varepsilon_t}$ is assumed to follow a zero-mean Gaussian distribution, and the observation noise ${\delta_t}$ is likewise assumed to follow a zero-mean Gaussian distribution. For the above model, the whole problem can be described with the following parameters:
2.2 The Kalman filter model
- $x_t$: an $n$-dimensional vector, the mean of the state estimate at time $t$.
- $P_t$: an $n \times n$ covariance matrix, the covariance of the $n$ states at time $t$.
- $u_t$: an $l$-dimensional vector, the control input at time $t$.
- $z_t$: an $m$-dimensional vector, the observation at time $t$.
- $A_t$: an $n \times n$ matrix describing how the state transitions from $t-1$ to $t$ when there is no input.
- $B_t$: an $n \times l$ matrix describing how $u_t$ affects $x_t$.
- $H_t$: an $m \times n$ matrix describing how the state $x_t$ maps to the observation $z_t$.
- $R_t$: an $n \times n$ matrix, the covariance of the process noise ${\varepsilon_t}$.
- $Q_t$: an $m \times m$ matrix, the covariance of the observation noise ${\delta_t}$.
Figure 1. How the system state transfers from time $t-1$ to $t$ in the absence of observations
Figure 1 shows how the mean and variance of the state variable transfer from $t-1$ to $t$ when there is no observation and only the input $u_t$. As can be seen, both the mean and the variance are computed using the linear-transformation property of the Gaussian distribution.
Figure 2. The problem the Kalman filter solves: updating the state $x_t$ when the input $u_t$ and the observation $z_t$ are received at time $t$
Figure 2 shows the problem solved by the Kalman filter: how to update the mean and variance of $x_t$ given the input and observation at time $t$. Of course, $u_t$ and $z_t$ need not arrive at the same time; just as in Bayes filtering, one can make a state prediction when $u_t$ arrives and a state update when $z_t$ arrives.
2.3 Kalman Filtering algorithm
The overall Kalman filter algorithm is as follows:
Kalman_Filter($x_{t-1}$, $P_{t-1}$, $u_t$, $z_t$):
- Prediction
  - ${\bar{x}_t} = A_t x_{t-1} + B_t u_t$
  - $\bar{P}_t = A_t P_{t-1} A_t^T + R_t$
- Correction
  - $K_t = \bar{P}_t H_t^T (H_t \bar{P}_t H_t^T + Q_t)^{-1}$
  - $x_t = \bar{x}_t + K_t (z_t - H_t \bar{x}_t)$
  - $P_t = (I - K_t H_t) \bar{P}_t$
- The first line predicts the state at time $t$ from the transition matrix and the control input.
- The second line computes the predicted covariance matrix.
- The third line computes the Kalman gain $K_t$.
- The fourth line updates the state based on the observation innovation.
- The fifth line computes the covariance matrix of the updated state.
As can be seen, all the subtleties of the algorithm lie in the third and fourth lines. They can be understood as follows:
- $(H_t \bar{P}_t H_t^T + Q_t)$ represents the degree of uncertainty in observing the state. It enters the Kalman gain $K_t$ inverted, so the larger the expected observation noise, the smaller the Kalman gain $K_t$.
- Looking at the fourth line, the update of $x_t$ adds to $\bar{x}_t$ the term $K_t$ multiplied by $(z_t - H_t \bar{x}_t)$. The innovation $(z_t - H_t \bar{x}_t)$ is the difference between the observation and the prediction: it is relatively small when both are close to the true value, and relatively large when either the observation or the prediction is inaccurate. Since the factor $K_t$ in front is small when the observation noise is large, the whole term $K_t (z_t - H_t \bar{x}_t)$ is a correction: the amount by which the observation adjusts the predicted result.
- When the observation noise is small and the prediction error is large, the correction is large.
- When the prediction error is small, or the observation noise is large, the correction is small, which smooths the estimate.
- Thus accurate observations correct the prediction by a large amount, while inaccurate observations contribute only a small correction: the filter corrects quickly when the error is large and converges gradually when the error is small.
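These trade-offs are easiest to see in the scalar case. Below is a minimal Python sketch, assuming $H_t = 1$ and made-up numbers, of how the gain scales the correction:

```python
# Scalar sketch (H = 1): how the Kalman gain trades off prediction vs
# observation.  P_bar is the predicted variance, Q the observation noise
# variance; all numbers are illustrative.
def kalman_gain(P_bar, Q):
    return P_bar / (P_bar + Q)   # scalar form of P H^T (H P H^T + Q)^-1

def correct(x_bar, P_bar, z, Q):
    K = kalman_gain(P_bar, Q)
    return x_bar + K * (z - x_bar), (1 - K) * P_bar

# Trustworthy sensor (small Q): the estimate moves most of the way to z.
print(correct(x_bar=0.0, P_bar=1.0, z=10.0, Q=0.1))   # x ≈ 9.09
# Noisy sensor (large Q): the correction is small, smoothing the estimate.
print(correct(x_bar=0.0, P_bar=1.0, z=10.0, Q=10.0))  # x ≈ 0.91
```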
2.4 Derivation of the Kalman filtering algorithm
Here we use the Bayes formula to show how the Kalman filter is derived.
1. The initial state of the system is:
$bel({x_0}) = N\left( \mu_0, P_0 \right)$
2. Derivation of the predictive process
The state transition model is a linear function
${x_t} = {A_t}{x_{t-1}} + {B_t}{u_t} + {\varepsilon_t}$
Therefore, the conditional probability of the state transition from $x_{t-1}$ to $x_t$ is:
$p({x_t} \mid {u_t}, {x_{t-1}}) = N\left( A_t x_{t-1} + B_t u_t,\; R_t \right)$
Recalling the Bayes formula for the distribution of the predicted state, all possible $x_{t-1}$ need to be considered:
$\overline{bel}({x_t}) = \int p({x_t} \mid {u_t}, {x_{t-1}}) \; bel({x_{t-1}}) \; d{x_{t-1}}$
This is the convolution of two Gaussian distributions; see reference [2]:
$\begin{array}{l}
\overline{bel}({x_t}) = \displaystyle\int \underbrace{p({x_t} \mid {u_t}, {x_{t-1}})}_{\sim N(A_t x_{t-1} + B_t u_t,\; R_t)} \;\; \underbrace{bel({x_{t-1}})}_{\sim N(\mu_{t-1},\; P_{t-1})} \; d{x_{t-1}} \\
\overline{bel}({x_t}) = \eta \displaystyle\int \exp\left\{ -\tfrac{1}{2}(x_t - A_t x_{t-1} - B_t u_t)^T R_t^{-1} (x_t - A_t x_{t-1} - B_t u_t) \right\} \\
\qquad\qquad\quad \cdot \exp\left\{ -\tfrac{1}{2}(x_{t-1} - \mu_{t-1})^T P_{t-1}^{-1} (x_{t-1} - \mu_{t-1}) \right\} \; d{x_{t-1}} \\
\overline{bel}({x_t}) = \left\{ \begin{array}{l}
\bar{\mu}_t = A_t \mu_{t-1} + B_t u_t \\
\bar{P}_t = A_t P_{t-1} A_t^T + R_t
\end{array} \right.
\end{array}$
So the prediction step of the Kalman filter is simply the analytic expression for the convolution of two Gaussian distributions.
3. Derivation of the observation update process
The observation equation is also linear, and the noise is Gaussian:
${z_t} = {H_t}{x_t} + {\delta_t}$
So the conditional probability $p({z_t} \mid {x_t})$ follows from the linear-transformation property of the Gaussian distribution:
$p({z_t} \mid {x_t}) = N\left( H_t x_t,\; Q_t \right)$
Reconsider the state-update step of the Bayes formula:
$bel({x_t}) = \eta \, p({z_t} \mid {x_t}) \, \overline{bel}({x_t})$
This is the product of two Gaussian distributions; see reference [2]:
$\begin{array}{l}
bel({x_t}) = \eta \; \underbrace{p({z_t} \mid {x_t})}_{\sim N(z_t;\; H_t x_t,\; Q_t)} \;\; \underbrace{\overline{bel}({x_t})}_{\sim N(x_t;\; \bar{\mu}_t,\; \bar{P}_t)} \\
bel({x_t}) = \eta \, \exp\left\{ -\tfrac{1}{2}(z_t - H_t x_t)^T Q_t^{-1} (z_t - H_t x_t) \right\} \exp\left\{ -\tfrac{1}{2}(x_t - \bar{\mu}_t)^T \bar{P}_t^{-1} (x_t - \bar{\mu}_t) \right\}
\end{array}$
Therefore, using the formula for the product of Gaussian densities, the result can be shown to be Gaussian, represented by its first- and second-order moments:
$bel({x_t}) = \left\{ \begin{array}{l}
\mu_t = \bar{\mu}_t + K_t (z_t - H_t \bar{\mu}_t) \\
P_t = (I - K_t H_t) \bar{P}_t
\end{array} \right. \quad \text{with} \quad K_t = \bar{P}_t H_t^T (H_t \bar{P}_t H_t^T + Q_t)^{-1}$
Thus the Kalman gain, as well as the mean and covariance update formulas of the state update, are all derived.
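As a sanity check, in the scalar case (taking $H_t = 1$) the update above must coincide with the product-of-Gaussians formula from section 1.3. A small Python sketch with made-up numbers:

```python
# Scalar cross-check (H = 1): the product-of-Gaussians formula from
# section 1.3 gives the same posterior as the Kalman update equations.
mu_bar, P_bar = 2.0, 4.0   # predicted state  N(2, 4)  (illustrative)
z, Q = 3.0, 1.0            # observation      N(3, 1)  (illustrative)

# Product of the two Gaussian densities (section 1.3):
mu_prod = (Q * mu_bar + P_bar * z) / (P_bar + Q)
var_prod = P_bar * Q / (P_bar + Q)

# Kalman update (section 2.3):
K = P_bar / (P_bar + Q)
mu_kf = mu_bar + K * (z - mu_bar)
var_kf = (1 - K) * P_bar

print(mu_prod, mu_kf)    # both 2.8
print(var_prod, var_kf)  # both 0.8
```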
2.5 An example of the Kalman filtering algorithm
Figure 3 and Figure 4 show how the probability density distributions of state variables change during the prediction and update process through an example of a Gaussian distribution.
Figure 3. Example of the prediction process: the blue curve is the PDF of $x_{t-1}$, and the purple curve is the PDF of $\bar{x}_t$.
Figure 4. Example of the update process: purple is the predicted PDF, cyan is the observation, and yellow is the updated PDF.
It can be seen from this example that in the prediction step the convolution of Gaussians generally increases the variance of the state estimate, while in the update step the product of Gaussians generally narrows the estimated variance.
2.6 Code implementation of the Kalman filter
The Kalman filter algorithm can be implemented easily with matrix computations; the MATLAB implementation of its iterative update takes only a few lines:
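As a sketch of that iterative update (the matrices $A, B, H, R, Q$ follow the notation of section 2.2; the toy constant-velocity model and its numbers below are illustrative assumptions, not from the original post), a Python/NumPy version is:

```python
import numpy as np

# One prediction+correction step of the Kalman filter (section 2.3).
def kalman_step(x, P, u, z, A, B, H, R, Q):
    # Prediction
    x_bar = A @ x + B @ u
    P_bar = A @ P @ A.T + R
    # Correction
    K = P_bar @ H.T @ np.linalg.inv(H @ P_bar @ H.T + Q)
    x_new = x_bar + K @ (z - H @ x_bar)
    P_new = (np.eye(len(x)) - K @ H) @ P_bar
    return x_new, P_new

# Toy 1-D constant-velocity example (state = [position, velocity]):
dt = 1.0
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.zeros((2, 1))                # no control input in this toy model
H = np.array([[1.0, 0.0]])          # we only observe the position
R = 0.01 * np.eye(2)                # process noise covariance
Q = np.array([[1.0]])               # observation noise covariance

x, P = np.zeros(2), np.eye(2)
for z in [1.0, 2.1, 2.9, 4.2, 5.0]:
    x, P = kalman_step(x, P, np.zeros(1), np.array([z]), A, B, H, R, Q)
print(x)  # estimate approaches position ≈ 5 and velocity ≈ 1
```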
2.7 An example of the effect of the Kalman filter
By implementing a simple Kalman filter, we can visually look at the effect of improving the tracking accuracy of the Kalman filter.
Figure 5. Experimental example of the Kalman filter: the red solid line is the ground truth, the blue points are observations, the green line is the moving-average result, and the purple curve is the Kalman filter result.
Figure 6. Comparison of the tracking results of the Kalman filter and the moving average
Figure 6 compares the least-squares error between each tracking result and the ground truth; it can be seen that the Kalman filtering algorithm provides higher tracking accuracy than the moving average.
2.8 Characteristics of the Kalman filter algorithm
- The Kalman filter is computationally fast; its complexity is $O(m^{2.376} + n^2)$, where $m$ is the dimension of the observation and $n$ is the dimension of the state.
- For linear systems with zero-mean Gaussian noise, the Kalman filter is theoretically the unbiased, optimal filter.
- In practical use, attention should be paid to tuning the parameters $R$ and $Q$. The two are in fact relative: they express whether the observation or the prediction is trusted more. Concretely, $R$ can be determined from the magnitude of the process noise, and $Q$ then set relative to $R$: make $Q$ smaller to trust the observation more, and larger to trust it less.
- The larger $Q$ is, the less the observations are trusted: the system state converges more easily, but it responds more slowly to changes in the observations. The smaller $Q$ is, the more the observations are trusted: the response to changes is faster, but the estimate converges less easily.
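The effect of this trade-off can be sketched in a small scalar experiment (the static model, the numbers, and the helper `scalar_kf` below are illustrative assumptions, not from the original post):

```python
# Scalar sketch: tracking a constant signal (true value 5.0) with a static
# model x_t = x_{t-1}; only the ratio of R to Q matters.
def scalar_kf(zs, R, Q, x0=0.0, P0=1.0):
    x, P = x0, P0
    for z in zs:
        P += R                      # prediction (A = 1, no input)
        K = P / (P + Q)             # Kalman gain
        x += K * (z - x)            # correction
        P *= (1 - K)
    return x

zs = [5.3, 4.8, 5.1, 4.9, 5.2, 5.0]  # noisy readings of the constant 5.0
print(scalar_kf(zs, R=0.01, Q=10.0))  # large Q: heavy smoothing, slow to reach 5
print(scalar_kf(zs, R=0.01, Q=0.01))  # small Q: follows the observations closely
```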
Reference documents
[1]. Sebastian Thrun, Wolfram Burgard, Dieter Fox, Probabilistic Robotics, MIT Press, 2005.
[2]. P.A. Bromiley, Products and Convolutions of Gaussian Probability Density Functions, University of Manchester.