Is the Kalman Filter Optimal?

This document is split into three sections: Defining the Problem, Finding the Constants, and Reconciling with Previous Document.


Defining the Problem

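In the previous document we assumed that the best linear estimate for the state, $x_j$, was given by

$$\hat{x}_j = \hat{x}_j^- + k_j\left(z_j - h\,\hat{x}_j^-\right)$$

where

$$\hat{x}_j^- = a\,\hat{x}_{j-1} + b\,u_j$$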

The question to be answered is: Can we prove that this statement is true?

If we want to estimate the state we can use only the three quantities that we know: the previous estimate, the current input and the current measured output. We use these three variables to form a linear estimate of the state:

$$\hat{x}_j = \alpha_j\,\hat{x}_{j-1} + \beta_j\,u_j + \gamma_j\,z_j$$

where $\alpha_j$, $\beta_j$ and $\gamma_j$ are chosen to minimize the squared error between the value of the state, $x_j$, and its estimate, $\hat{x}_j$. In other words, we want to minimize the expected value of the squared error, $E[e_j^2]$ with $e_j = x_j - \hat{x}_j$, with respect to the variables $\alpha_j$, $\beta_j$ and $\gamma_j$.

 

Finding the Constants

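To do the minimization with respect to each variable we simply differentiate and set the result to zero.

$$\begin{aligned}
\frac{\partial E[e_j^2]}{\partial \alpha_j} &= -2\,E\left[\left(x_j - \alpha_j\hat{x}_{j-1} - \beta_j u_j - \gamma_j z_j\right)\hat{x}_{j-1}\right] = 0 \\
\frac{\partial E[e_j^2]}{\partial \beta_j} &= -2\,E\left[\left(x_j - \alpha_j\hat{x}_{j-1} - \beta_j u_j - \gamma_j z_j\right)u_j\right] = 0 \\
\frac{\partial E[e_j^2]}{\partial \gamma_j} &= -2\,E\left[\left(x_j - \alpha_j\hat{x}_{j-1} - \beta_j u_j - \gamma_j z_j\right)z_j\right] = 0
\end{aligned}$$

which can be rewritten

$$E\left[e_j\,\hat{x}_{j-1}\right] = 0, \qquad E\left[e_j\,u_j\right] = 0, \qquad E\left[e_j\,z_j\right] = 0$$

These last three expressions are often referred to as the orthogonality conditions; i.e., the error is orthogonal to the previous estimated state, the current input and the current value of the measured output.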
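The orthogonality conditions are the normal equations of a least-squares fit, so they can be checked numerically. The following is a minimal sketch, not part of the original derivation. It assumes the scalar model $x_j = a x_{j-1} + b u_j + w_j$, $z_j = h x_j + v_j$ from the previous document, and every numerical value in it (the parameters, the noise levels, the Gaussian distributions, the function names) is a hypothetical choice made only for illustration.

```python
import random

def solve_linear_system(A, y):
    """Solve A x = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_and_check(a=0.9, b=0.5, h=2.0, n_samples=50_000, seed=1):
    """Fit alpha, beta, gamma by least squares on simulated data.

    Returns the fitted coefficients and the sample versions of
    E[e x_hat_prev], E[e u] and E[e z].
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_samples):
        xhat_prev = rng.gauss(1.0, 1.0)             # assumed previous estimate
        x_prev = xhat_prev + rng.gauss(0.0, 0.55)   # previous error, independent of the estimate
        u = rng.gauss(0.5, 1.0)                     # assumed random input
        x = a * x_prev + b * u + rng.gauss(0.0, 0.3)  # state with process noise
        z = h * x + rng.gauss(0.0, 0.6)               # measurement with noise
        rows.append((xhat_prev, u, z, x))
    n = float(n_samples)
    # Sample moments of the regressors (G) and of regressors times state (r).
    G = [[sum(row[i] * row[k] for row in rows) / n for k in range(3)]
         for i in range(3)]
    r = [sum(row[i] * row[3] for row in rows) / n for i in range(3)]
    alpha, beta, gamma = solve_linear_system(G, r)
    # Sample moments of the error against each regressor.
    moments = [
        sum((row[3] - alpha * row[0] - beta * row[1] - gamma * row[2]) * row[i]
            for row in rows) / n
        for i in range(3)
    ]
    return (alpha, beta, gamma), moments

(alpha, beta, gamma), moments = fit_and_check()
print("alpha, beta, gamma:", alpha, beta, gamma)
print("error moments:", moments)
```

The three error moments come out at the level of floating-point rounding, which is the sample version of the orthogonality conditions. Under the assumed model the fitted $\alpha_j$ and $\beta_j$ also land close to $a(1-\gamma_j h)$ and $b(1-\gamma_j h)$, up to sampling noise.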

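Let's use the first condition to find an expression for $\alpha_j$ that minimizes the expected value of the squared error. If we add and subtract $\alpha_j x_{j-1}$ inside the expectation (why we do this will become clear shortly), we get:

$$E\left[\left(x_j - \alpha_j x_{j-1} + \alpha_j\left(x_{j-1} - \hat{x}_{j-1}\right) - \beta_j u_j - \gamma_j z_j\right)\hat{x}_{j-1}\right] = 0$$

Now we can use the facts (from the scalar model of the previous document) that

$$x_j = a\,x_{j-1} + b\,u_j + w_j, \qquad z_j = h\,x_j + v_j, \qquad e_{j-1} = x_{j-1} - \hat{x}_{j-1}$$

to write

$$0 = \alpha_j E\left[e_{j-1}\hat{x}_{j-1}\right] + E\left[\left(\left(a(1-\gamma_j h) - \alpha_j\right)x_{j-1} + \left(b(1-\gamma_j h) - \beta_j\right)u_j + (1-\gamma_j h)\,w_j - \gamma_j v_j\right)\hat{x}_{j-1}\right]$$

Note that because of the orthogonality relationships (applied at step $j-1$) the first term on the right can be rewritten as

$$\alpha_j E\left[e_{j-1}\hat{x}_{j-1}\right] = \alpha_j E\left[e_{j-1}\left(\alpha_{j-1}\hat{x}_{j-2} + \beta_{j-1}u_{j-1} + \gamma_{j-1}z_{j-1}\right)\right] = 0$$

We also know that the previous estimate is uncorrelated with the current value of the measurement noise (and, likewise, with the current process noise $w_j$):

$$E\left[v_j\,\hat{x}_{j-1}\right] = 0, \qquad E\left[w_j\,\hat{x}_{j-1}\right] = 0$$

So we can simplify the equation to the following

$$E\left[\left(\left(a(1-\gamma_j h) - \alpha_j\right)x_{j-1} + \left(b(1-\gamma_j h) - \beta_j\right)u_j\right)\hat{x}_{j-1}\right] = 0$$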

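This is a complicated expression that we can use a bit later, but first we need to derive one more expression. By following the same sequence of steps as above (but starting with the second equation in which we set the derivative to zero), it is easily shown that

$$E\left[\left(\left(a(1-\gamma_j h) - \alpha_j\right)x_{j-1} + \left(b(1-\gamma_j h) - \beta_j\right)u_j\right)u_j\right] = 0$$

We can rewrite the last two equations

$$\begin{aligned}
\left(a(1-\gamma_j h) - \alpha_j\right)E\left[x_{j-1}\hat{x}_{j-1}\right] + \left(b(1-\gamma_j h) - \beta_j\right)E\left[u_j\hat{x}_{j-1}\right] &= 0 \\
\left(a(1-\gamma_j h) - \alpha_j\right)E\left[x_{j-1}u_j\right] + \left(b(1-\gamma_j h) - \beta_j\right)E\left[u_j^2\right] &= 0
\end{aligned}$$

or, in matrix form

$$\begin{bmatrix} E\left[x_{j-1}\hat{x}_{j-1}\right] & E\left[u_j\hat{x}_{j-1}\right] \\ E\left[x_{j-1}u_j\right] & E\left[u_j^2\right] \end{bmatrix} \begin{bmatrix} a(1-\gamma_j h) - \alpha_j \\ b(1-\gamma_j h) - \beta_j \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$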

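For a matrix equation

$$\mathbf{A}\,\mathbf{y} = \mathbf{0}$$

we know that either

$$\mathbf{y} = \mathbf{0} \qquad \text{or} \qquad \det(\mathbf{A}) = 0$$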

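So for the matrix equation above, either

$$\begin{bmatrix} a(1-\gamma_j h) - \alpha_j \\ b(1-\gamma_j h) - \beta_j \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \qquad \text{or} \qquad E\left[x_{j-1}\hat{x}_{j-1}\right]E\left[u_j^2\right] - E\left[u_j\hat{x}_{j-1}\right]E\left[x_{j-1}u_j\right] = 0$$

The second equation can be written as

$$E\left[x_{j-1}\hat{x}_{j-1}\right]E\left[u_j^2\right] = E\left[u_j\hat{x}_{j-1}\right]E\left[x_{j-1}u_j\right]$$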

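If the last equation is true, it should be true for any input. If the input is a nonzero constant such that $u_j = c$, then

$$c^2\,E\left[x_{j-1}\hat{x}_{j-1}\right] = c^2\,E\left[\hat{x}_{j-1}\right]E\left[x_{j-1}\right] \quad\Longrightarrow\quad E\left[x_{j-1}\hat{x}_{j-1}\right] = E\left[x_{j-1}\right]E\left[\hat{x}_{j-1}\right]$$

However

$$E\left[MN\right] = E\left[M\right]E\left[N\right]$$

is only true if M and N are uncorrelated (independence would be sufficient), and the value of the state and its estimate are certainly correlated, since the estimate is built to track the state, so the first condition must be true. In other words,

$$a(1-\gamma_j h) - \alpha_j = 0, \qquad b(1-\gamma_j h) - \beta_j = 0$$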

(This last argument seems weak to me, but I haven't worked out the details.  If you have a more rigorous argument, please email me.)
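The determinant condition can at least be probed numerically. The sketch below is not a proof and is not taken from the original derivation. It assumes a hypothetical Gaussian previous estimate with an independent previous error, holds the input at a constant `c`, and estimates the determinant of the 2x2 matrix of second moments (rows $[E[x_{j-1}\hat{x}_{j-1}],\ E[u_j\hat{x}_{j-1}]]$ and $[E[x_{j-1}u_j],\ E[u_j^2]]$) from samples. The function name and all numerical values are illustrative assumptions.

```python
import random

def determinant_for_constant_input(c, n_samples=100_000, seed=0):
    """Estimate the determinant of the 2x2 moment matrix when u_j = c.

    Returns (det, cov), where cov is the sample covariance between the
    previous state and the previous estimate.
    """
    rng = random.Random(seed)
    sum_x_xhat = sum_x = sum_xhat = 0.0
    for _ in range(n_samples):
        xhat_prev = rng.gauss(1.0, 1.0)   # assumed previous estimate
        e_prev = rng.gauss(0.0, 0.5)      # assumed previous error, independent of the estimate
        x_prev = xhat_prev + e_prev
        sum_x_xhat += x_prev * xhat_prev
        sum_x += x_prev
        sum_xhat += xhat_prev
    m_x_xhat = sum_x_xhat / n_samples
    m_x = sum_x / n_samples
    m_xhat = sum_xhat / n_samples
    # With u_j = c the matrix is [[E[x xhat], c E[xhat]], [c E[x], c^2]].
    det = m_x_xhat * c * c - (c * m_xhat) * (c * m_x)
    cov = m_x_xhat - m_x * m_xhat
    return det, cov

det, cov = determinant_for_constant_input(c=2.0)
print("determinant:", det, "covariance:", cov)
```

For a constant input the determinant reduces to $c^2$ times the covariance between the state and its estimate, so it is nonzero whenever the two are correlated and $c \neq 0$. For $c = 0$ the determinant vanishes regardless, which is one reason the argument needs a nonzero input.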

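So

$$\alpha_j = a\left(1 - \gamma_j h\right), \qquad \beta_j = b\left(1 - \gamma_j h\right)$$

Substituting this into our original equation for the estimate of $x_j$, we get

$$\hat{x}_j = a\left(1 - \gamma_j h\right)\hat{x}_{j-1} + b\left(1 - \gamma_j h\right)u_j + \gamma_j z_j$$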


Reconciling with Previous Document

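Now recall Equation 6 from the previous document

$$\hat{x}_j = \hat{x}_j^- + k_j\left(z_j - h\,\hat{x}_j^-\right) \qquad \text{(Equation 6)}$$

From the previous document we know that the a priori estimate of the state is given by

$$\hat{x}_j^- = a\,\hat{x}_{j-1} + b\,u_j$$

and if we let

$$k_j = \gamma_j$$

we can rewrite our last equation (at the end of the previous section)

$$\hat{x}_j = \left(1 - \gamma_j h\right)\left(a\,\hat{x}_{j-1} + b\,u_j\right) + \gamma_j z_j = \hat{x}_j^- + k_j\left(z_j - h\,\hat{x}_j^-\right)$$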

which matches Equation 6.
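The algebra in this reconciliation can be spot-checked with a few lines of code. This is a minimal sketch that assumes the scalar forms $\alpha_j = a(1-\gamma_j h)$, $\beta_j = b(1-\gamma_j h)$ and $\hat{x}_j^- = a\hat{x}_{j-1} + b u_j$. The function names are hypothetical, and the random test values carry no physical meaning.

```python
import math
import random

def linear_form(a, b, h, gamma, xhat_prev, u, z):
    """Estimate written as alpha*xhat_prev + beta*u + gamma*z."""
    alpha = a * (1 - gamma * h)
    beta = b * (1 - gamma * h)
    return alpha * xhat_prev + beta * u + gamma * z

def kalman_form(a, b, h, k, xhat_prev, u, z):
    """Estimate written as a priori estimate plus gain times residual."""
    x_prior = a * xhat_prev + b * u
    return x_prior + k * (z - h * x_prior)

rng = random.Random(0)
for _ in range(5):
    # Arbitrary values for a, b, h, gamma (= k), xhat_prev, u, z.
    args = [rng.uniform(-2.0, 2.0) for _ in range(7)]
    assert math.isclose(linear_form(*args), kalman_form(*args), abs_tol=1e-9)
print("both forms agree on all sampled values")
```

The two functions agree for any gain, not only the optimal one, because the reconciliation is a pure regrouping of terms. Choosing the best $k_j$ is a separate step.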

We have shown that the Kalman filter represents the optimal linear filter. The other document goes on to derive the optimal value for $k_j$.