Basic Idea of Kalman Filter Principles


I. The Problem the Kalman Filter Solves

First, let's discuss what kind of problem the Kalman filter solves and how such systems are modeled. We consider here the linear Kalman filter, which applies to linear, discrete-time dynamic systems. Such a system can be described by the following two equations:

\[\begin{array}{l}
x(n+1) = F(n+1, n)\, x(n) + v_1(n)\\
y(n) = C(n)\, x(n) + v_2(n)
\end{array}\]

Where:

x(n) is the system state.

F(n+1, n) is the state transition matrix; it describes how the state evolves over time, i.e., how the current state relates to the next state.

C(n) is the measurement matrix, which relates the observations to the state.

y(n) is the observed value.

v1(n) is the process noise.

v2(n) is the measurement noise introduced during observation.


In the above pair, the first is the process equation, which describes how the system state x(n) evolves over time. The second is the measurement equation, which relates the state x(n) to the measurement y(n). Let us first explain the concepts appearing in these two equations.

First, the concept of state: a state is an abstraction of the system's features. It consists of the minimum set of data needed to predict the system's future behavior, and it is determined by the system's past behavior.

This concept can be hard to grasp, so consider an example. Many readers have seen the Kalman filter explained on the Internet using room-temperature measurement. There, the true temperature in the room is the state; the state may consist of one parameter or several. The reading from the thermometer is the observation y(n). As another example, if we want to track a moving object, its displacement and velocity together capture the main characteristics of the system, so the state can be expressed as a vector containing displacement and velocity. The key point is this: the state represents the true, objectively existing characteristics of the system.

Now consider C(n), which links the system state to its observations. In the room-temperature example, the objective, true indoor temperature (an unknown quantity) is the state, and the thermometer measures it; the reading is our observation y(n). In general, the observation need not equal the state plus a simple additive measurement noise; the relationship between state and observation is modeled by C(n).

The two equations above constitute the linear Kalman filter's model of the system to be solved. More generally, the two equations represent the following:

  Process equation: it describes how the system evolves from one state to another over time.

  Measurement equation: it describes the relationship between the current state and the current measurement.

In other words, for a system of interest, we first identify one or more of its principal features (the state), then observe those features to obtain a sequence of measurements. Through repeated observation we identify the system, using only the observed data to find the optimal solution of the process and measurement equations. That is the problem the Kalman filter solves. For ease of understanding, the discussion below considers a system with a single feature state.
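As a concrete illustration of this model, here is a short Python sketch that simulates the process and measurement equations for a hypothetical 1-D constant-velocity target (state = displacement and velocity, as in the tracking example above). The time step and noise variances are illustrative values I chose, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

dt = 1.0
F = np.array([[1.0, dt],
              [0.0, 1.0]])      # state transition matrix F(n+1, n)
C = np.array([[1.0, 0.0]])      # we observe position only: y(n) = C x(n) + v2(n)
q1, q2 = 0.01, 0.25             # process / measurement noise variances (assumed)

x = np.array([0.0, 1.0])        # initial state: position 0, velocity 1
states, observations = [], []
for n in range(50):
    # process equation: x(n+1) = F x(n) + v1(n)
    x = F @ x + rng.normal(0.0, np.sqrt(q1), size=2)
    # measurement equation: y(n) = C x(n) + v2(n)
    y = C @ x + rng.normal(0.0, np.sqrt(q2), size=1)
    states.append(x.copy())
    observations.append(y.copy())

states = np.array(states)
observations = np.array(observations)
print(states.shape, observations.shape)  # (50, 2) (50, 1)
```

Running this produces a hidden state trajectory and a noisy observation sequence; estimating the former from the latter is exactly the problem posed above.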

 

II. The Projection Theorem and the Innovation Process

Now we know what problem the Kalman filter solves and how to model it. Next, let's look at how the process and measurement equations are used to obtain an optimal estimate. There are two ways to derive the Kalman filter:

(1) the orthogonal projection theorem (sometimes simply called the projection theorem)

(2) the innovation process

Kalman originally derived his filter using the orthogonal projection theorem; later, Kailath proposed a derivation based on the innovation process. We introduce both here. As mentioned above, observing the system yields a sequence of measurements y(n). Consider first an estimation problem: how to predict the state x(n) using only the observed data. From regression analysis, the linear least-squares (linear minimum-variance) estimate of a random variable x given the observation y can be expressed as:

\[\hat{x} = E[x] + \mathrm{Cov}(x, y)\, D(y)^{-1} \{y - E[y]\}\]

With this estimate, the error between the random variable x and its estimate is orthogonal to (uncorrelated with) the observation y. We call this estimate the projection of x onto y, written:

\[\hat{x} = \mathrm{proj}[x \mid y] = \mathrm{proj}[x \mid y(1), y(2), \ldots, y(k)]\]

Geometrically, the estimate is the orthogonal projection of x onto the linear space spanned by the observations.
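The linear least-squares formula above can be checked numerically. In this sketch, the two jointly distributed scalars and their noise levels are illustrative assumptions of mine; the code computes the estimate from sample moments and verifies that the estimation error is orthogonal to the observation.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

x = rng.normal(2.0, 1.0, n)            # hidden quantity (assumed distribution)
y = 0.5 * x + rng.normal(0.0, 0.3, n)  # observation correlated with x

# \hat x = E[x] + Cov(x, y) D(y)^{-1} (y - E[y]), from sample moments
cov = np.cov(x, y)
x_hat = x.mean() + cov[0, 1] / cov[1, 1] * (y - y.mean())

# orthogonality: the estimation error is uncorrelated with the observation
err = x - x_hat
print(abs(np.mean(err * (y - y.mean()))) < 1e-6)  # True
```

The residual correlation vanishes by construction, which is precisely the orthogonality property stated above.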

Now consider another question. The true value of the random variable x is what we ultimately want, but it cannot be obtained directly. Suppose we change our mindset and use the first n-1 observations to predict the observation at the current time n; what does the result look like? Here we need to define a new concept, the innovation: it is the error between the current observation and its estimated value. It can be expressed as follows:

\[\alpha(n) = y(n) - \mathrm{proj}[y(n) \mid y(1), y(2), \ldots, y(n-1)] = y(n) - \hat{y}(n \mid Y_{n-1})\]

Geometrically, the innovation is the component of y(n) orthogonal to the space spanned by the previous observations.

Consider how innovations are generated: each observation yields a corresponding error at the same time, and vice versa. That is, there is a one-to-one correspondence between the observations and the innovations. So if we want to estimate the system state, we may equivalently use the innovations in place of the observations. That is:

\[\{y(1), y(2), \ldots, y(n)\} \leftrightarrow \{\alpha(1), \alpha(2), \ldots, \alpha(n)\}\]

Therefore

\[\mathrm{proj}[x \mid y(1), y(2), \ldots, y(n)] = \mathrm{proj}[x \mid \alpha(1), \alpha(2), \ldots, \alpha(n)]\]

Expanding this, we have:

\[\begin{array}{l}
\mathrm{proj}[x \mid y(1), y(2), \ldots, y(n)] = \mathrm{proj}[x \mid \alpha(1), \alpha(2), \ldots, \alpha(n)]\\
= \mathrm{proj}[x \mid \alpha] = E[x] + \sum\limits_{i=1}^{n-1} E\{x \alpha^T(i)\}\, E\{\alpha(i) \alpha^T(i)\}^{-1} \alpha(i) + E\{x \alpha^T(n)\}\, E\{\alpha(n) \alpha^T(n)\}^{-1} \alpha(n)\\
= \mathrm{proj}[x \mid \alpha(1), \alpha(2), \ldots, \alpha(n-1)] + E\{x \alpha^T(n)\}\, E\{\alpha(n) \alpha^T(n)\}^{-1} \alpha(n)\\
= \mathrm{proj}[x \mid y(1), y(2), \ldots, y(n-1)] + E\{x \alpha^T(n)\}\, E\{\alpha(n) \alpha^T(n)\}^{-1} \alpha(n)
\end{array}\]

Using the projection idea, the state prediction can thus be expressed recursively. Geometrically, the Kalman filter can be seen as the projection of the state onto the linear space spanned by the observations. But if we stopped the analysis here, many readers would still not have a real grasp of the Kalman filter, so let's analyze how the system state is obtained through the innovation process.
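The claimed equivalence between projecting onto the observations and projecting onto the innovations can be verified numerically. In the sketch below, the zero-mean variables and their noise levels are my own illustrative choices, and the `proj` helper is a hypothetical function of mine; the innovations are built by a Gram-Schmidt step (subtracting the projection onto earlier observations), and the two projections of x coincide.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

x = rng.normal(0.0, 1.0, n)
y1 = x + rng.normal(0.0, 0.5, n)
y2 = x + rng.normal(0.0, 0.5, n)

def proj(target, basis):
    """Linear least-squares projection of `target` onto zero-mean `basis`."""
    B = np.vstack(basis)        # each row is one basis variable
    G = B @ B.T / n             # Gram matrix of sample second moments
    c = B @ target / n          # cross-moments with the target
    return np.linalg.solve(G, c) @ B

# innovations: a1 = y1, a2 = y2 - proj[y2 | y1] (orthogonal by construction)
a1 = y1
a2 = y2 - proj(y2, [y1])

est_obs = proj(x, [y1, y2])     # projection on raw observations
est_inn = proj(x, [a1, a2])     # projection on innovations

print(np.max(np.abs(est_obs - est_inn)) < 1e-8)  # True
```

The two estimates agree because the innovations span exactly the same linear space as the observations, which is the one-to-one correspondence used above.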

 

On closer inspection, there are two errors in the whole process. One is the prediction error in the observation process, i.e., the innovation; the other is the prediction error in the state update (hidden inside the projection steps above and easily overlooked). Let's look at the relationship between them.

From all the observations y(1), ..., y(n-1) and the system's measurement equation, the minimum mean-square estimate of the observation y(n) is:

\[\hat{y}(n \mid Y_{n-1}) = C(n)\, \hat{x}(n \mid Y_{n-1})\]

Therefore, the innovation process can also be expressed as:

\[\begin{array}{l}
\alpha(n) = y(n) - \hat{y}(n \mid Y_{n-1})\\
= y(n) - C(n)\, \hat{x}(n \mid Y_{n-1})\\
= C(n)\, \varepsilon(n, n-1) + v_2(n)
\end{array}\]

Here, ε(n, n-1) is the difference between the state x(n) and its one-step prediction:

\[\varepsilon(n, n-1) = x(n) - \hat{x}(n \mid Y_{n-1})\]

If we define

\[\begin{array}{l}
R(n) = E[\alpha(n) \alpha^H(n)]\\
K(n, n-1) = E[\varepsilon(n, n-1)\, \varepsilon^H(n, n-1)]\\
Q_2(n) = E[v_2(n)\, v_2^H(n)]
\end{array}\]

Here, R(n) is the correlation matrix of the innovation (the quality of the observed information), K(n, n-1) is the correlation matrix of the state prediction error (the quality of the time update), and Q2(n) is the correlation matrix of the measurement noise vector. Taking correlations of the innovation expression above gives:

\[R(n) = C(n)\, K(n, n-1)\, C^H(n) + Q_2(n)\]

This is the second-order statistical relationship between the state prediction error and the innovation. So as long as we improve the quality of the time update and of the observed information, we can estimate the desired system state. Now turn the problem around, back to the question posed at the beginning: given a set of observations and the corresponding innovation sequence, how do we best use them to estimate the system state? The analysis proceeds as follows:
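The relation R(n) = C(n) K(n, n-1) C^H(n) + Q2(n) is easy to check by Monte Carlo in the scalar case. The values of C, K, and Q2 below are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1_000_000

C = 2.0
K = 0.4          # variance of the one-step state prediction error eps(n, n-1)
Q2 = 0.09        # measurement noise variance

eps = rng.normal(0.0, np.sqrt(K), n)   # state prediction error samples
v2 = rng.normal(0.0, np.sqrt(Q2), n)   # measurement noise, independent of eps

alpha = C * eps + v2                   # innovation: alpha(n) = C eps + v2
R_empirical = np.mean(alpha ** 2)      # sample estimate of E[alpha alpha^H]
R_theory = C * K * C + Q2              # C K C^H + Q2 = 1.69

print(abs(R_empirical - R_theory) < 0.02)  # True
```

The empirical innovation variance matches the theoretical value because the prediction error and the measurement noise are uncorrelated, so their variances add after scaling by C.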

Given the one-to-one correspondence between the observations and the innovations, the minimum mean-square estimate of the state can be expressed as a linear combination of the innovation sequence:

\[\hat{x}(i \mid Y_n) = \sum\limits_{k=1}^{n} B_i(k)\, \alpha(k)\]

Here, B_i(k) are undetermined coefficient matrices. By the orthogonality principle of linear filtering, the state prediction error vector is orthogonal to the innovation process:

\[E[\varepsilon(i, n)\, \alpha^H(m)] = E\{[x(i) - \hat{x}(i \mid Y_n)]\, \alpha^H(m)\} = 0\]

Substituting the innovation-based minimum mean-square estimate of the state x(i) into this condition yields:

\[E[x(i)\, \alpha^H(m)] = B_i(m)\, E[\alpha(m) \alpha^H(m)] = B_i(m)\, R(m)\]

Here R(m) is the correlation matrix of the innovation process. Multiplying both sides on the right by the inverse of R(m) gives the undetermined coefficient matrix:

\[B_i(m) = E[x(i)\, \alpha^H(m)]\, R^{-1}(m)\]

Substituting B_i(m) back into the linear combination gives the minimum mean-square estimate of the state x(i):

\[\hat{x}(i \mid Y_n) = \sum\limits_{k=1}^{n} E[x(i)\, \alpha^H(k)]\, R^{-1}(k)\, \alpha(k) = \sum\limits_{k=1}^{n-1} E[x(i)\, \alpha^H(k)]\, R^{-1}(k)\, \alpha(k) + E[x(i)\, \alpha^H(n)]\, R^{-1}(n)\, \alpha(n)\]

For i = n + 1:

\[\hat{x}(n+1 \mid Y_n) = \sum\limits_{k=1}^{n-1} E[x(n+1)\, \alpha^H(k)]\, R^{-1}(k)\, \alpha(k) + E[x(n+1)\, \alpha^H(n)]\, R^{-1}(n)\, \alpha(n)\]

This result has the same meaning as the one derived from the projection theorem; only the tools and principles used differ.
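To tie the pieces together, here is a minimal scalar one-step Kalman predictor driven by the innovation α(n), following the structure derived above. The system constants are example values of my own, and the gain and Riccati update follow the standard one-step-predictor form rather than anything stated explicitly in the text.

```python
import numpy as np

rng = np.random.default_rng(4)

F, C = 0.95, 1.0
Q1, Q2 = 0.1, 1.0       # process / measurement noise variances (assumed)
N = 5000

# simulate x(n+1) = F x(n) + v1(n), y(n) = C x(n) + v2(n)
x = np.zeros(N)
y = np.zeros(N)
for n_ in range(1, N):
    x[n_] = F * x[n_ - 1] + rng.normal(0.0, np.sqrt(Q1))
for n_ in range(N):
    y[n_] = C * x[n_] + rng.normal(0.0, np.sqrt(Q2))

x_pred = 0.0            # \hat x(n | Y_{n-1})
K = 1.0                 # K(n, n-1): prediction-error variance (initial guess)
preds = np.zeros(N)
for n_ in range(N):
    preds[n_] = x_pred
    alpha = y[n_] - C * x_pred        # innovation alpha(n)
    R = C * K * C + Q2                # innovation variance R(n)
    G = F * K * C / R                 # predictor gain
    x_pred = F * x_pred + G * alpha   # \hat x(n+1 | Y_n)
    K = F * K * F + Q1 - G * R * G    # Riccati update of K(n+1, n)

rmse_pred = np.sqrt(np.mean((x - preds) ** 2))
rmse_obs = np.sqrt(np.mean((x - y) ** 2))
print(rmse_pred < rmse_obs)  # True
```

The predictor's RMS error is well below that of the raw observations, illustrating the point above: the filter makes full use of the observations, via the innovations, with a small recursive computation.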

 

Finally, look back at the relationships among the observed data, the innovations, and the state, and at the results derived from those relationships. It is not hard to conclude that the starting point of the Kalman filter is to make full use of the observations and estimate the system state with as little computation as possible. Does it feel dry? Math games can actually be quite fun. That's it for today's analysis; its results will be used later to describe the Kalman filter and its usage step by step. The analysis here comes from my own study of the Kalman filter (see the books: Adaptive Filter Principle, Kalman Filter Principle and Application). My level is limited; if there are errors, please kindly point them out.
