## Freesteel Blog » Enjoying the 2D covariant matrix


Wednesday, November 6th, 2013 at 7:25 pm

Let’s make a set of 50 random points, distributed like so:

```python
import random

N = 50
xs = [random.gauss(0, 25) for i in range(N)]
ys = [random.gauss(0, 11.9) for i in range(N)]
ys = [y + x*0.3 for x, y in zip(xs, ys)]
xm, ym = sum(xs)/N, sum(ys)/N
pts = [(x - xm, y - ym) for x, y in zip(xs, ys)]
```

The displacement by *(xm, ym)* centres the average on the origin, to make things simpler. Here's the picture:

To fit the line *y = mx + c* by least squares, we minimize the error:

*E = Sum_i (m x_i + c - y_i)^2*

At the minimal value, *dE/dc = dE/dm = 0* -- ie the rate of change in the error term for a change in the parameters *m, c* is zero; otherwise we could go in the direction of the downwards slope to get a smaller value.

0 = dE/dc = Sum_i 2(m x_i + c - y_i) = 2m Sum_i x_i + 2Nc - 2 Sum_i y_i = 2Nc

So *c = 0*, as it should be, because we distributed the points centred on the origin.

0 = dE/dm = Sum_i 2(m x_i - y_i) x_i = 2m Sum_i x_i^2 - 2 Sum_i x_i y_i

In Python this line gets expressed as:

```python
m = sum(x*y for x, y in pts) / sum(x*x for x, y in pts)
plotline((-100, -100*m), (100, 100*m))
```
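As a quick sanity check (my sketch, not from the post; the seed is only for reproducibility), this slope really is a local minimum of the error:

```python
import random

random.seed(1)  # reproducible sample; any seed works
N = 50
xs = [random.gauss(0, 25) for i in range(N)]
ys = [random.gauss(0, 11.9) for i in range(N)]
ys = [y + x*0.3 for x, y in zip(xs, ys)]
xm, ym = sum(xs)/N, sum(ys)/N
pts = [(x - xm, y - ym) for x, y in zip(xs, ys)]

def E(m):
    # vertical least-squares error for the line y = m*x (c = 0 after centring)
    return sum((m*x - y)**2 for x, y in pts)

m = sum(x*y for x, y in pts) / sum(x*x for x, y in pts)
# nudging the slope either way can only increase the error
assert E(m) <= E(m + 1e-4) and E(m) <= E(m - 1e-4)
```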

The equation is not symmetrical in x and y, because it squares the differences from the line in the y-axis direction only. For symmetry, what we want is the distance in the perpendicular direction, *E = Sum_i (u x_i + v y_i - c)^2*, where the line is defined by *u x + v y - c = 0* with *u^2 + v^2 = 1*. Again, we can prove *c = 0*:

0 = dE/dc = Sum_i 2(u x_i + v y_i - c) = 2u Sum_i x_i + 2v Sum_i y_i - 2Nc = -2Nc

Differentiating *u^2 + v^2 = 1* with respect to *u* gives *2u + 2v dv/du = 0*, or *dv/du = -u/v*, so:

0 = dE/du = Sum_i 2(u x_i + v y_i)(x_i - (u/v) y_i)

0 = Sum_i (u x_i + v y_i)(v x_i - u y_i) = uv Sum_i (x_i^2 - y_i^2) + (v^2 - u^2) Sum_i x_i y_i = uv Sum_i (x_i^2 - y_i^2) + (1 - 2u^2) Sum_i x_i y_i

In Python we calculate the value *M* (equal to *uv/(1 - 2u^2)* up to sign; only *M^2* matters below) like so:

```python
M = sum(x*y for x, y in pts) / sum(x*x - y*y for x, y in pts)
```

This leaves solving the quadratic equation in *u^2*:

u^4 (4M^2 + 1) - u^2 (4M^2 + 1) + M^2 = 0

so we can draw a second fitting line (ignoring the maximal and transposed solutions), like so:

This is a better symmetrical fit.
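Here's a sketch of that solve (my code, not the post's): rebuild a cloud in the same way, form *M*, take the two roots of the quadratic in *u^2*, and keep the unit vector *(u, v)* with the smallest perpendicular error.

```python
import math
import random

random.seed(1)  # a cloud built the same way as earlier in the post
N = 50
xs = [random.gauss(0, 25) for i in range(N)]
ys = [random.gauss(0, 11.9) for i in range(N)]
ys = [y + x*0.3 for x, y in zip(xs, ys)]
xm, ym = sum(xs)/N, sum(ys)/N
pts = [(x - xm, y - ym) for x, y in zip(xs, ys)]

def Eperp(u, v):
    # perpendicular least-squares error for the line u*x + v*y = 0
    return sum((u*x + v*y)**2 for x, y in pts)

M = sum(x*y for x, y in pts) / sum(x*x - y*y for x, y in pts)

# u^4 (4M^2+1) - u^2 (4M^2+1) + M^2 = 0, a quadratic in u^2
A = 4*M*M + 1
d = math.sqrt(A*A - 4*A*M*M)
roots = [(A - d)/(2*A), (A + d)/(2*A)]

# each root gives u up to the sign of v; pick the candidate with least error
cands = [(math.sqrt(r), s*math.sqrt(1 - r)) for r in roots for s in (1, -1)]
u, v = min(cands, key=lambda c: Eperp(*c))
```

The minimising *(u, v)* is the normal of the fitted line; the line itself runs along *(v, -u)* through the origin.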

What if we now plot the value *N^{-2} Sum_i (u x_i + v y_i)^2* in polar coordinates, where *u = cos theta, v = sin theta*?

What’s this weird bow-tie affair doing, when people normally expect this plot to be an ellipse?

There are two solutions of *dE/du = 0* above, for the maximal and minimal values. These are perpendicular to one another by virtue of the equation *uv/(1 - 2u^2) = M* remaining true at the solution under the substitution *(u, v) -> (-v, u)*. (It doesn't matter whether you differentiate with respect to *u*, *v*, or even *theta* for this.)
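We can check that perpendicularity numerically (my sketch, not the post's): sample *E(theta)* around the half-circle and compare the angles of the minimum and the maximum.

```python
import math
import random

random.seed(1)  # same style of centred cloud as earlier
N = 50
xs = [random.gauss(0, 25) for i in range(N)]
ys = [random.gauss(0, 11.9) for i in range(N)]
ys = [y + x*0.3 for x, y in zip(xs, ys)]
xm, ym = sum(xs)/N, sum(ys)/N
pts = [(x - xm, y - ym) for x, y in zip(xs, ys)]

def E(theta):
    # the polar plot value: u = cos(theta), v = sin(theta)
    u, v = math.cos(theta), math.sin(theta)
    return sum((u*x + v*y)**2 for x, y in pts) / N**2

thetas = [i*math.pi/1800 for i in range(1800)]   # 0.1 degree steps over a half turn
tmin = min(thetas, key=E)
tmax = max(thetas, key=E)
# the minimal and maximal directions are a quarter turn apart
assert abs(abs(tmin - tmax) - math.pi/2) < 2.1*math.pi/1800
```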

**Let’s try this again, but using matrix technology.**

```
ux + vy = (u v) ( x )  =  (x y) ( u )
                ( y )           ( v )

(ux + vy)^2 = (u v) ( x ) (x y) ( u )  =  (u v) ( x^2  xy  ) ( u )
                    ( y )       ( v )           ( xy   y^2 ) ( v )

E = Sum_i (u x_i + v y_i)^2 = (u v) ( Sum_i x_i^2    Sum_i x_i y_i ) ( u )
                                    ( Sum_i x_i y_i  Sum_i y_i^2   ) ( v )
```

That symmetric matrix *C* between the two *(u v)* vectors is the covariance matrix.
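To make that concrete, here's a sketch (mine, not the post's) that builds *C* for the same sort of cloud and checks that the quadratic form *(u v) C (u v)^T* reproduces *Sum_i (u x_i + v y_i)^2* directly:

```python
import math
import random

random.seed(1)  # centred cloud built as earlier
N = 50
xs = [random.gauss(0, 25) for i in range(N)]
ys = [random.gauss(0, 11.9) for i in range(N)]
ys = [y + x*0.3 for x, y in zip(xs, ys)]
xm, ym = sum(xs)/N, sum(ys)/N
pts = [(x - xm, y - ym) for x, y in zip(xs, ys)]

# entries of the 2x2 covariance matrix of the point cloud
Cxx = sum(x*x for x, y in pts)
Cxy = sum(x*y for x, y in pts)
Cyy = sum(y*y for x, y in pts)

u, v = math.cos(0.7), math.sin(0.7)   # any unit vector will do
quadratic_form = u*(Cxx*u + Cxy*v) + v*(Cxy*u + Cyy*v)
direct = sum((u*x + v*y)**2 for x, y in pts)
assert abs(quadratic_form - direct) < 1e-9 * direct
```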

Because it is symmetric, it will have two perpendicular eigenvectors *e_1, e_2* with eigenvalues *f_1, f_2*, so that *C e_1 = f_1 e_1* and *C e_2 = f_2 e_2*. If *a = (u, v).e_1* and *b = (u, v).e_2*, then *(u, v) = a e_1 + b e_2* and *a^2 + b^2 = 1*, so we can express:

E = (u v) C (u v)^T = (a e_1 + b e_2)^T C (a e_1 + b e_2) = (a e_1 + b e_2) . (a f_1 e_1 + b f_2 e_2) = a^2 f_1 + b^2 f_2

This looks like an equation for an ellipse, but it is not (though you would get an ellipse in *(a, b)* if you set *E = 1*). The bow-tie graph comes from the fact that you are plotting the unit vector *(a, b)* scaled by *E*. If *f_1 = f_2* then *E* would be constant and we would graph a circle; but the moment the values are different, it starts to look like a corpuscle.

By inspection, the minimal and maximal values of *E* occur when *(a, b) = (1, 0)* or *(0, 1)* respectively, given *f_1 <= f_2*.

This tells us we need to find the eigenvectors of *C*.

Now the eigenvalue *f* of *C* is going to satisfy:

```
0 = det(C - fI) = det ( Cxx - f   Cxy     )
                      ( Cxy       Cyy - f )
  = (Cxx - f)(Cyy - f) - Cxy^2 = f^2 - f(Cxx + Cyy) + Cxx Cyy - Cxy^2
```

And then you have to work out the corresponding eigenvector for it to discover your minima and maxima error directions in the point cloud.
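As a sketch of that step (my code, not the post's): solve the characteristic quadratic for the two eigenvalues and recover an eigenvector by hand from the first matrix row.

```python
import math
import random

random.seed(1)  # centred cloud built as earlier
N = 50
xs = [random.gauss(0, 25) for i in range(N)]
ys = [random.gauss(0, 11.9) for i in range(N)]
ys = [y + x*0.3 for x, y in zip(xs, ys)]
xm, ym = sum(xs)/N, sum(ys)/N
pts = [(x - xm, y - ym) for x, y in zip(xs, ys)]

Cxx = sum(x*x for x, y in pts)
Cxy = sum(x*y for x, y in pts)
Cyy = sum(y*y for x, y in pts)

# roots of f^2 - f(Cxx + Cyy) + (Cxx*Cyy - Cxy^2) = 0
tr, det = Cxx + Cyy, Cxx*Cyy - Cxy*Cxy
d = math.sqrt(tr*tr - 4*det)
f1, f2 = (tr - d)/2, (tr + d)/2   # f1 <= f2

# eigenvector for f1 from (Cxx - f1)*ex + Cxy*ey = 0
# (assumes Cxy != 0, true for this correlated cloud)
ex, ey = Cxy, f1 - Cxx
n = math.hypot(ex, ey)
ex, ey = ex/n, ey/n

# check C e1 = f1 e1; (ex, ey) is the normal of the best perpendicular fit
assert abs(Cxx*ex + Cxy*ey - f1*ex) < 1e-6*abs(f1)
assert abs(Cxy*ex + Cyy*ey - f1*ey) < 1e-6*abs(f1)
```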

A lot of algebra I don’t have time for should show that this is equivalent to the solution I did earlier without using matrices.

Next I’m going to extend this to 3 dimensions to calculate the best fitting plane under least squares for a cloud of points. This is a highly useful function that is necessary for my mobile laser scanning tool.

In three dimensions the brute force method done without matrices runs into problems, because it requires solving a cubic equation. The matrix/eigenvector solution, however, allows us to skip over those issues and get to the actual geometry of the situation.
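As a sketch of where that's going (my code, using `numpy.linalg.eigh`; the post gives no 3D code): build a noisy cloud near a known plane, form the 3x3 covariance matrix, and take the eigenvector of the smallest eigenvalue as the plane normal.

```python
import random
import numpy as np

random.seed(1)
# synthetic cloud scattered about the plane z = 0.2*x - 0.1*y (made-up demo plane)
pts = []
for i in range(200):
    x, y = random.gauss(0, 10), random.gauss(0, 10)
    pts.append((x, y, 0.2*x - 0.1*y + random.gauss(0, 0.5)))

P = np.array(pts)
P -= P.mean(axis=0)          # centre the cloud, exactly as in 2D
C = P.T @ P                  # 3x3 covariance matrix
f, e = np.linalg.eigh(C)     # eigenvalues in ascending order
normal = e[:, 0]             # smallest eigenvalue -> best-fit plane normal

true_normal = np.array([0.2, -0.1, -1.0])
true_normal /= np.linalg.norm(true_normal)
assert abs(normal @ true_normal) > 0.95
```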
