Monday, July 02, 2007

Linear least-squares error estimation

Let z be some unknown function of x and y. Assume the function is close to linear, so we want a function
    f(x, y) = a x + b y + c
that approximates z by minimizing the total square error over a collection of N data points (xi, yi, zi). We will need to accumulate the following sums. This can be done incrementally in real time if the data arrive that way, and we can use exponentially weighted sums to give more importance to recent data points, if that makes sense.
Sx  = ∑i xi
Sy  = ∑i yi
Sz  = ∑i zi
Sxx = ∑i xi²
Syy = ∑i yi²
Sxy = ∑i xi yi
Sxz = ∑i xi zi
Syz = ∑i yi zi
N   = ∑i 1 (or the sum of the weights, with exponential weighting if desired)
Then the total square error is
    E = ∑i (zi − a xi − b yi − c)²
and we minimize it by choosing (a, b, c) where all three partial derivatives vanish:
    ∂E/∂a = ∂E/∂b = ∂E/∂c = 0

For example, ∂E/∂a = −2 ∑i xi(zi − a xi − b yi − c); setting each derivative to zero and dividing by −2 gives three linear equations in (a, b, c):
0 = Sxz − a Sxx − b Sxy − c Sx
0 = Syz − a Sxy − b Syy − c Sy
0 = Sz − a Sx − b Sy − c N
Then we can obtain (a, b, c) from linear algebra by solving this symmetric 3×3 system:
[ a ]   [ Sxx Sxy Sx ]⁻¹ [ Sxz ]
[ b ] = [ Sxy Syy Sy ]   [ Syz ]
[ c ]   [ Sx  Sy  N  ]   [ Sz  ]
Based on all this, we can write a linear least-squares estimator class in Python.
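Here is one way such a class might look. This is a minimal sketch, not a definitive implementation: the class name `PlaneFitter` and the `decay` parameter are assumptions of the sketch, with `decay` in (0, 1] providing the exponential weighting mentioned above (decay = 1.0 means plain unweighted sums). It solves the 3×3 system by Cramer's rule to avoid any dependencies.

```python
class PlaneFitter:
    """Incrementally fits f(x, y) = a*x + b*y + c by least squares.

    Name and decay parameter are illustrative assumptions; decay in
    (0, 1] exponentially down-weights older points (1.0 = no weighting).
    """

    def __init__(self, decay=1.0):
        self.decay = decay
        self.sx = self.sy = self.sz = 0.0
        self.sxx = self.syy = 0.0
        self.sxy = self.sxz = self.syz = 0.0
        self.n = 0.0

    def add(self, x, y, z):
        """Fold one data point into the running sums."""
        d = self.decay
        self.sx = d * self.sx + x
        self.sy = d * self.sy + y
        self.sz = d * self.sz + z
        self.sxx = d * self.sxx + x * x
        self.syy = d * self.syy + y * y
        self.sxy = d * self.sxy + x * y
        self.sxz = d * self.sxz + x * z
        self.syz = d * self.syz + y * z
        self.n = d * self.n + 1.0

    def fit(self):
        """Solve the 3x3 normal equations for (a, b, c) by Cramer's rule."""
        m = [[self.sxx, self.sxy, self.sx],
             [self.sxy, self.syy, self.sy],
             [self.sx,  self.sy,  self.n]]
        r = [self.sxz, self.syz, self.sz]

        def det3(t):
            # Determinant of a 3x3 matrix by cofactor expansion.
            return (t[0][0] * (t[1][1] * t[2][2] - t[1][2] * t[2][1])
                    - t[0][1] * (t[1][0] * t[2][2] - t[1][2] * t[2][0])
                    + t[0][2] * (t[1][0] * t[2][1] - t[1][1] * t[2][0]))

        d = det3(m)
        if d == 0.0:
            raise ValueError("degenerate data: need 3+ non-collinear (x, y) points")
        coeffs = []
        for k in range(3):
            # Replace column k with the right-hand side, per Cramer's rule.
            mk = [row[:] for row in m]
            for i in range(3):
                mk[i][k] = r[i]
            coeffs.append(det3(mk) / d)
        return tuple(coeffs)  # (a, b, c)
```

As a sanity check, feeding in points that lie exactly on z = 2x + 3y + 1 should recover (a, b, c) = (2, 3, 1).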
