Freesteel Blog » Enjoying the 2D covariant matrix


Wednesday, November 6th, 2013 at 7:25 pm

Let’s make a set of 50 random points, distributed like so:

import random

N = 50
xs = [ random.gauss(0, 25)  for i in range(N) ]
ys = [ random.gauss(0, 11.9)  for i in range(N) ]
ys = [ y + x*0.3  for x, y in zip(xs, ys) ]    # shear upwards so the cloud has a slope
xm, ym = sum(xs)/N, sum(ys)/N
pts = [ (x - xm, y - ym)  for x, y in zip(xs, ys) ]    # centre the cloud on the origin

The displacement by (xm, ym) is so that the average is centred on the origin, to make things simpler. Here's the picture:

To fit a line y = mx + c by least squares, we should minimise the value of the formula:
E = Sum_i (m x_i + c - y_i)^2.
At the minimum, dE/dc = dE/dm = 0; ie the rate of change of the error term for a change in the parameters m and c is zero, otherwise we could move in the direction of the downward slope to get a smaller value.

0 = dE/dc
  = Sum_i 2(m x_i + c - y_i)
  = 2m Sum_i x_i + 2Nc - 2 Sum_i y_i
  = 2Nc

So c = 0, as it should be, because we centred the distribution of points on the origin.

0 = dE/dm
  = Sum_i 2(m x_i + c - y_i) x_i
  = 2m Sum_i x_i^2 - 2 Sum_i x_i y_i      (using c = 0)

In Python this solution is expressed as:

m = sum(x*y  for x, y in pts) / sum(x*x  for x, y in pts)
plotline((-100, -100*m), (100, 100*m))    # plotline is the blog's own plotting helper
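As a quick sanity check (a sketch of mine, not part of the post; the cloud is regenerated with a fixed seed so the numbers are repeatable), the computed m really does zero the derivative dE/dm:

```python
import random

# Rebuild the centred point cloud from the top of the post,
# with a fixed seed so the check is repeatable.
random.seed(1)
N = 50
xs = [random.gauss(0, 25) for i in range(N)]
ys = [random.gauss(0, 11.9) for i in range(N)]
ys = [y + x * 0.3 for x, y in zip(xs, ys)]
xm, ym = sum(xs) / N, sum(ys) / N
pts = [(x - xm, y - ym) for x, y in zip(xs, ys)]

m = sum(x * y for x, y in pts) / sum(x * x for x, y in pts)

# dE/dm = 2m Sum_i x_i^2 - 2 Sum_i x_i y_i should vanish at the optimum
dEdm = 2 * m * sum(x * x for x, y in pts) - 2 * sum(x * y for x, y in pts)
assert abs(dEdm) < 1e-6

# and the error should go up if we nudge m either way
def E(m):
    return sum((m * x - y) ** 2 for x, y in pts)
assert E(m) < E(m + 0.01) and E(m) < E(m - 0.01)
```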

The equation is not symmetrical in x and y, because it squares the differences from the line in the y-axis direction only. For symmetry, what we want is the distance in the perpendicular direction: E = Sum_i (u x_i + v y_i - c)^2, where the line is defined by u x + v y - c = 0 with u^2 + v^2 = 1.

Again, we can prove c = 0:

0 = dE/dc
  = Sum_i -2(u x_i + v y_i - c)
  = -2u Sum_i x_i - 2v Sum_i y_i + 2Nc
  = 2Nc

Differentiating u^2 + v^2 = 1 with respect to u gives 2u + 2v dv/du = 0, or dv/du = -u/v, so:

0 = dE/du
  = Sum_i 2(u x_i + v y_i)(x_i - (u/v) y_i)
0 = Sum_i (u x_i + v y_i)(v x_i - u y_i)
  = uv Sum_i (x_i^2 - y_i^2) + (v^2 - u^2) Sum_i x_i y_i
  = uv Sum_i (x_i^2 - y_i^2) + (1 - 2u^2) Sum_i x_i y_i

In Python we calculate the ratio M = uv/(1 - 2u^2) = Sum_i x_i y_i / Sum_i (y_i^2 - x_i^2) like so:

M = sum(x*y  for x, y in pts) / sum(y*y - x*x  for x, y in pts)

This leaves solving a quadratic equation in u^2 (substitute v^2 = 1 - u^2 into M(1 - 2u^2) = uv and square both sides):

u^4 (4M^2 + 1) - u^2 (4M^2 + 1) + M^2 = 0

so we can draw a second fitting line (ignoring the maximal and transposed solutions), like so:

This is a better symmetrical fit.
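That quadratic step can be sketched in code (all the variable names here are mine, not from the post): solve for s = u^2 with the quadratic formula, recover v = ±sqrt(1 - s), and keep whichever candidate direction minimises E.

```python
import math
import random

# Rebuild the centred point cloud (fixed seed for repeatability).
random.seed(1)
N = 50
xs = [random.gauss(0, 25) for i in range(N)]
ys = [random.gauss(0, 11.9) for i in range(N)]
ys = [y + x * 0.3 for x, y in zip(xs, ys)]
xm, ym = sum(xs) / N, sum(ys) / N
pts = [(x - xm, y - ym) for x, y in zip(xs, ys)]

def E(u, v):
    return sum((u * x + v * y) ** 2 for x, y in pts)

# M = uv/(1 - 2u^2); only M^2 enters the quadratic, so its sign is immaterial
M = sum(x * y for x, y in pts) / sum(y * y - x * x for x, y in pts)

# s = u^2 solves s^2 (4M^2 + 1) - s (4M^2 + 1) + M^2 = 0
A = 4 * M * M + 1
disc = math.sqrt(A * A - 4 * A * M * M)

# Each root gives u = sqrt(s), v = +/-sqrt(1 - s) (negating both u and v
# gives the same line); keep the direction with the smallest error.
candidates = []
for s in ((A + disc) / (2 * A), (A - disc) / (2 * A)):
    u, w = math.sqrt(s), math.sqrt(1 - s)
    candidates += [(u, w), (u, -w)]
u, v = min(candidates, key=lambda c: E(*c))

# (u, v) is a unit vector and satisfies the stationarity condition above
assert abs(u * u + v * v - 1) < 1e-9
resid = u * v * sum(x * x - y * y for x, y in pts) + (1 - 2 * u * u) * sum(x * y for x, y in pts)
assert abs(resid) < 1e-6 * sum(x * x + y * y for x, y in pts)
```

The best-fit line is then u x + v y = 0, with slope -u/v, which for this data should come out close to the 0.3 shear applied when generating the points.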

What if we now plot the value N^-2 Sum_i (u x_i + v y_i)^2 as the radius in polar coordinates, where u = cos theta, v = sin theta?

What’s this weird bow-tie affair doing, when people normally expect this plot to be an ellipse?

There are two solutions of dE/du = 0 above, one for the maximum and one for the minimum. These are perpendicular to one another, because the equation uv/(1 - 2u^2) = M that holds at a solution is preserved under the transform (u, v) -> (-v, u), which rotates the direction by 90 degrees: uv changes sign, and so does 1 - 2u^2 = v^2 - u^2. (It doesn't matter whether you differentiate with respect to u, v or even theta for this.)

Let’s try this again, but using matrix technology.

          (u  v) ( x )   (x  y) ( u )
ux + vy =        ( y ) =        ( v )

              (u  v) ( x ) (x  y) ( u )   (u  v) ( x^2  xy  ) ( u )
(ux + vy)^2 =        ( y )        ( v ) =        ( xy   y^2 ) ( v )

                              (u  v) ( Sum_i x_i^2    Sum_i x_i y_i ) ( u )
E = Sum_i (u x_i + v y_i)^2 =        ( Sum_i x_i y_i  Sum_i y_i^2   ) ( v )

That symmetric matrix C between the two (u v) vectors is the covariance matrix.

Because it is symmetric, it has two perpendicular eigenvectors e1, e2 with eigenvalues f1, f2, so that C e1 = f1 e1 and C e2 = f2 e2.

If a = (u, v).e1 and b = (u, v).e2, then (u, v) = a e1 + b e2 and a^2 + b^2 = 1 (since e1, e2 are perpendicular unit vectors and (u, v) is a unit vector), so we can express:

    (u  v) (     ) ( u )
E =        (  C  ) ( v )
  = (a e1 + b e2)^T C (a e1 + b e2)
  = (a e1 + b e2) . (a f1 e1 + b f2 e2)
  = a^2 f1 + b^2 f2      (using e1.e1 = e2.e2 = 1 and e1.e2 = 0)

This looks like an equation for an ellipse, but it is not (though you would get an ellipse in (a, b) if you set E = 1). The bow-tie graph comes from plotting the unit vector (a, b) scaled by E: if f1 = f2 then E would be constant and we would graph a circle, but the moment the values differ it starts to look like a corpuscle.

By inspection, the minimal and maximal values of E are attained at (a, b) = (1, 0) and (0, 1) respectively, when f1 <= f2.
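We can check that reading numerically (a sketch of mine; f1 and f2 come from the standard 2x2 symmetric eigenvalue formula, and E is sampled all round the circle):

```python
import math
import random

# Rebuild the centred point cloud (fixed seed for repeatability).
random.seed(1)
N = 50
xs = [random.gauss(0, 25) for i in range(N)]
ys = [random.gauss(0, 11.9) for i in range(N)]
ys = [y + x * 0.3 for x, y in zip(xs, ys)]
xm, ym = sum(xs) / N, sum(ys) / N
pts = [(x - xm, y - ym) for x, y in zip(xs, ys)]

# Entries of the covariance matrix C
Cxx = sum(x * x for x, y in pts)
Cxy = sum(x * y for x, y in pts)
Cyy = sum(y * y for x, y in pts)

# Eigenvalues of a symmetric 2x2 matrix: f = tr/2 +/- sqrt(tr^2/4 - det)
tr, det = Cxx + Cyy, Cxx * Cyy - Cxy * Cxy
r = math.sqrt(tr * tr / 4 - det)
f1, f2 = tr / 2 - r, tr / 2 + r   # f1 <= f2

# Radius of the polar plot at angle theta
def E(t):
    u, v = math.cos(t), math.sin(t)
    return Cxx * u * u + 2 * Cxy * u * v + Cyy * v * v

samples = [E(i * math.pi / 1800) for i in range(3600)]
# E swings between the two eigenvalues as theta goes round
assert f1 - 1e-6 <= min(samples) <= f1 + 1e-3 * (f2 - f1)
assert f2 - 1e-3 * (f2 - f1) <= max(samples) <= f2 + 1e-6
```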

This tells us we need to find the eigenvectors of C.

Now the eigenvalue f of C is going to satisfy:

0 = det(C - fI)

  = det ( Cxx - f    Cxy    )
        (  Cxy     Cyy - f  )

  = (Cxx - f)(Cyy - f) - Cxy^2
  = f^2 - f(Cxx + Cyy) + Cxx Cyy - Cxy^2

And then you have to work out the corresponding eigenvector for each eigenvalue to discover the minimum and maximum error directions in the point cloud.
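For a 2x2 symmetric matrix that last step is short: for eigenvalue f, the first row of (C - fI)(u, v) = 0 gives (u, v) proportional to (Cxy, f - Cxx). A sketch (helper names are my own):

```python
import math
import random

# Rebuild the centred point cloud (fixed seed for repeatability).
random.seed(1)
N = 50
xs = [random.gauss(0, 25) for i in range(N)]
ys = [random.gauss(0, 11.9) for i in range(N)]
ys = [y + x * 0.3 for x, y in zip(xs, ys)]
xm, ym = sum(xs) / N, sum(ys) / N
pts = [(x - xm, y - ym) for x, y in zip(xs, ys)]

Cxx = sum(x * x for x, y in pts)
Cxy = sum(x * y for x, y in pts)
Cyy = sum(y * y for x, y in pts)
tr, det = Cxx + Cyy, Cxx * Cyy - Cxy * Cxy
r = math.sqrt(tr * tr / 4 - det)
f1, f2 = tr / 2 - r, tr / 2 + r   # f1 <= f2

def eigvec(f):
    # (Cxx - f) u + Cxy v = 0  =>  (u, v) ~ (Cxy, f - Cxx), normalised
    n = math.hypot(Cxy, f - Cxx)
    return (Cxy / n, (f - Cxx) / n)

e1, e2 = eigvec(f1), eigvec(f2)

# C e = f e for both, and the two eigenvectors are perpendicular
# (loose tolerances to allow for floating-point cancellation)
for f, (u, v) in ((f1, e1), (f2, e2)):
    assert abs(Cxx * u + Cxy * v - f * u) < 1e-3
    assert abs(Cxy * u + Cyy * v - f * v) < 1e-3
assert abs(e1[0] * e2[0] + e1[1] * e2[1]) < 1e-9
```

Here e1, the small-eigenvalue eigenvector, is the minimum error direction: the normal of the best-fit line through the cloud.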

A lot of algebra I don’t have time for should show that this is equivalent to the solution I did earlier without using matrices.
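Short of doing that algebra, the equivalence can at least be confirmed numerically: the eigenvector (u, v) for the smaller eigenvalue satisfies the stationarity condition uv Sum_i(x_i^2 - y_i^2) + (1 - 2u^2) Sum_i x_i y_i = 0 from the non-matrix derivation. A sketch:

```python
import math
import random

# Rebuild the centred point cloud (fixed seed for repeatability).
random.seed(1)
N = 50
xs = [random.gauss(0, 25) for i in range(N)]
ys = [random.gauss(0, 11.9) for i in range(N)]
ys = [y + x * 0.3 for x, y in zip(xs, ys)]
xm, ym = sum(xs) / N, sum(ys) / N
pts = [(x - xm, y - ym) for x, y in zip(xs, ys)]

Cxx = sum(x * x for x, y in pts)
Cxy = sum(x * y for x, y in pts)
Cyy = sum(y * y for x, y in pts)
tr, det = Cxx + Cyy, Cxx * Cyy - Cxy * Cxy
f1 = tr / 2 - math.sqrt(tr * tr / 4 - det)   # the smaller eigenvalue

# Its eigenvector, normalised: (u, v) ~ (Cxy, f1 - Cxx)
n = math.hypot(Cxy, f1 - Cxx)
u, v = Cxy / n, (f1 - Cxx) / n

# Plug into the stationarity condition from the non-matrix derivation;
# note Sum(x^2 - y^2) = Cxx - Cyy and Sum(x*y) = Cxy
resid = u * v * (Cxx - Cyy) + (1 - 2 * u * u) * Cxy
assert abs(resid) < 1e-6 * tr
```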

Next I’m going to extend this to 3 dimensions to calculate the best fitting plane under least squares for a cloud of points. This is a highly useful function that is necessary for my mobile laser scanning tool.

The brute-force method done without matrices runs into problems in three dimensions because it requires solving a cubic equation. The matrix/eigenvector solution, however, lets us skip over those issues and get to the actual geometry of the situation.
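As a preview, here is a stdlib-only sketch of the 3D case (the test data and all names are mine): build the 3x3 covariance matrix of a centred cloud and take the eigenvector of its smallest eigenvalue as the plane normal, found here by power iteration on t*I - C, a shift that turns C's smallest eigenvalue into the largest one (valid because C's eigenvalues are non-negative and less than its trace t).

```python
import math
import random

# Synthetic cloud near the plane z = 0.2x - 0.5y, with a little noise
random.seed(1)
pts = []
for i in range(200):
    x, y = random.gauss(0, 10), random.gauss(0, 10)
    pts.append((x, y, 0.2 * x - 0.5 * y + random.gauss(0, 0.1)))

# Centre the cloud on the origin, as in the 2D case
n = len(pts)
c = [sum(p[j] for p in pts) / n for j in range(3)]
pts = [tuple(p[j] - c[j] for j in range(3)) for p in pts]

# 3x3 covariance matrix C[j][k] = Sum_i p_ij p_ik
C = [[sum(p[j] * p[k] for p in pts) for k in range(3)] for j in range(3)]

# Power-iterate S = t*I - C; its dominant eigenvector is C's
# smallest-eigenvalue eigenvector, ie the plane normal
t = C[0][0] + C[1][1] + C[2][2]
S = [[(t if j == k else 0.0) - C[j][k] for k in range(3)] for j in range(3)]
v = [1.0, 1.0, 1.0]
for _ in range(500):
    w = [sum(S[j][k] * v[k] for k in range(3)) for j in range(3)]
    m = math.sqrt(sum(wj * wj for wj in w))
    v = [wj / m for wj in w]

# The normal should line up with (0.2, -0.5, -1), from 0.2x - 0.5y - z = 0
true_n = (0.2, -0.5, -1.0)
cosang = abs(sum(a * b for a, b in zip(v, true_n))) / math.sqrt(sum(a * a for a in true_n))
assert cosang > 0.99
```

Power iteration is one stdlib-friendly way to get the eigenvector; in practice a library eigensolver for symmetric matrices would do the same job in one call.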
