Freesteel Blog
Enjoying the 2D covariant matrix
Wednesday, November 6th, 2013 at 7:25 pm
Let’s make a set of 50 random points, distributed like so:
import random

N = 50
xs = [random.gauss(0, 25) for i in range(N)]
ys = [random.gauss(0, 11.9) for i in range(N)]
ys = [y + x*0.3 for x, y in zip(xs, ys)]
xm, ym = sum(xs)/N, sum(ys)/N
pts = [(x - xm, y - ym) for x, y in zip(xs, ys)]
The displacement by (xm, ym) centres the average on the origin, which makes things simpler. Here's the picture:
To fit a line y = mx + c by least squares, we minimize the value of the formula:
E = Σᵢ (m·xᵢ + c − yᵢ)²
At the minimum, dE/dc = dE/dm = 0; that is, the rate of change of the error term with respect to the parameters m and c is zero, otherwise we could move in the direction of the downward slope to get a smaller value.
0 = dE/dc = Σᵢ 2(m·xᵢ + c − yᵢ) = 2m Σᵢxᵢ + 2Nc − 2 Σᵢyᵢ = 2Nc
So, c=0 as it should be because we distributed the points centred on the origin.
0 = dE/dm = Σᵢ 2(m·xᵢ − yᵢ)·xᵢ = 2m Σᵢxᵢ² − 2 Σᵢxᵢyᵢ
In Python this line gets expressed as:
m = sum(x*y for x, y in pts) / sum(x*x for x, y in pts)
plotline((-100, -100*m), (100, 100*m))
The equation is not symmetrical in x and y, because it measures the square of the differences from the line in the y-axis direction only. For symmetry, what we want is the distance in the perpendicular direction: E = Σᵢ (u·xᵢ + v·yᵢ − c)², where the line is defined by u·x + v·y − c = 0 with u² + v² = 1.
Again, we can prove c = 0:
0 = dE/dc = Σᵢ 2(u·xᵢ + v·yᵢ − c) = 2u Σᵢxᵢ + 2v Σᵢyᵢ − 2Nc = −2Nc
Differentiating u² + v² = 1 with respect to u gives 2u + 2v·dv/du = 0, or dv/du = −u/v, so:
0 = dE/du = Σᵢ 2(u·xᵢ + v·yᵢ)(xᵢ − (u/v)·yᵢ)

Multiplying through by v/2:

0 = Σᵢ (u·xᵢ + v·yᵢ)(v·xᵢ − u·yᵢ) = uv Σᵢ(xᵢ² − yᵢ²) + (v² − u²) Σᵢxᵢyᵢ = uv Σᵢ(xᵢ² − yᵢ²) + (1 − 2u²) Σᵢxᵢyᵢ
In Python we calculate the value M = Σᵢxᵢyᵢ / Σᵢ(xᵢ² − yᵢ²), so that uv = −M·(1 − 2u²), like so:

M = sum(x*y for x, y in pts) / sum(x*x - y*y for x, y in pts)
This leaves solving the quadratic equation in u²:

u⁴·(4M² + 1) − u²·(4M² + 1) + M² = 0
so we can draw a second fitting line (ignoring the maximal and transposed solutions), like so:
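As a sketch of that step (pts comes from the code above; fit_perpendicular is a hypothetical helper name, and picking the root by directly comparing the error E sidesteps the sign bookkeeping):

```python
import math

def fit_perpendicular(pts):
    # pts are assumed centred on the origin (as constructed above)
    A = sum(x*x for x, y in pts)
    B = sum(y*y for x, y in pts)
    C = sum(x*y for x, y in pts)
    M = C / (A - B)            # assumes A != B; degenerate otherwise
    k = 4*M*M + 1
    disc = math.sqrt(k*k - 4*k*M*M)
    best = None
    for s in ((k + disc) / (2*k), (k - disc) / (2*k)):  # roots for s = u^2
        u = math.sqrt(s)
        for v in (math.sqrt(1 - s), -math.sqrt(1 - s)):
            E = sum((u*x + v*y)**2 for x, y in pts)
            if best is None or E < best[0]:
                best = (E, u, v)
    E, u, v = best
    return u, v   # unit normal of the best-fit line u*x + v*y = 0
```

The line direction is (−v, u), so the equivalent of the plotline call above would be plotline((100*v, -100*u), (-100*v, 100*u)).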
This is a better symmetrical fit.
What if we now plot the value N⁻² Σᵢ(u·xᵢ + v·yᵢ)² in polar coordinates, where u = cos θ, v = sin θ?
What’s this weird bow-tie affair doing, when people normally expect this plot to be an ellipse?
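A sketch of how that curve can be sampled (radial_error and the small point cloud here are illustrative, not from the original post; feeding rs and thetas to matplotlib's polar axes would draw the actual picture):

```python
import math

# a small illustrative point cloud, centred on the origin
pts = [(-2.0, -0.9), (-1.0, -0.2), (0.0, 0.1), (1.0, 0.2), (2.0, 0.8)]
N = len(pts)

def radial_error(theta):
    # r(theta) = N^-2 * Sum_i (u*x_i + v*y_i)^2 with (u, v) = (cos theta, sin theta)
    u, v = math.cos(theta), math.sin(theta)
    return sum((u*x + v*y)**2 for x, y in pts) / N**2

thetas = [2*math.pi*i/360 for i in range(361)]
rs = [radial_error(t) for t in thetas]
# plotting rs against thetas on polar axes traces the bow-tie:
# r repeats with period pi and pinches in near the best-fit direction
```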
There are two solutions of dE/du = 0 above, one for the maximal and one for the minimal value. These are perpendicular to one another because the equation uv/(1 − 2u²) = −M, which holds at the solutions, is preserved under the transform (u, v) → (−v, u). (It doesn't matter whether you differentiate with respect to u, v or even θ for this.)
Let’s try this again, but using matrix technology.
u·x + v·y = (u v) ( x )  =  (x y) ( u )
                  ( y )           ( v )

(u·x + v·y)² = (u v) ( x²   x·y ) ( u )
                     ( x·y  y²  ) ( v )

E = Σᵢ(u·xᵢ + v·yᵢ)² = (u v) ( Σᵢxᵢ²    Σᵢxᵢyᵢ ) ( u )
                             ( Σᵢxᵢyᵢ   Σᵢyᵢ²  ) ( v )
That symmetric matrix C between the two (u v) vectors is the covariance matrix.
Because it is symmetric, it will have two perpendicular eigenvectors e1, e2 with eigenvalues f1, f2 so that C e1 = f1 e1.
If a = (u, v)·e1 and b = (u, v)·e2, then (u, v) = a·e1 + b·e2 and a² + b² = 1, so we can express:
E = (a·e1 + b·e2)ᵀ C (a·e1 + b·e2) = (a·e1 + b·e2)·(a·f1·e1 + b·f2·e2) = a²·f1 + b²·f2

(using e1·e1 = e2·e2 = 1 and e1·e2 = 0)
This looks like an equation for an ellipse, but it is not (though you would get an ellipse in (a, b) if you set E = 1). The bow-tie graph comes from the fact that you are plotting the unit vector (a, b) but scaling it by E, and if f1 = f2 then E would be constant and we would graph a circle. But the moment the values are different, it starts to look like a corpuscle.
By inspection, with f1 ≤ f2 the minimal and maximal values of E occur at (a, b) = (1, 0) and (0, 1) respectively.
This tells us we need to find the eigenvectors of C.
Now the eigenvalue f of C is going to satisfy:
0 = det(C − f·I) = det ( Cxx − f    Cxy     )
                       ( Cxy        Cyy − f )
  = (Cxx − f)(Cyy − f) − Cxy² = f² − f·(Cxx + Cyy) + Cxx·Cyy − Cxy²
And then you work out the corresponding eigenvector to discover the minimum and maximum error directions in the point cloud.
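Putting the last two steps together, a sketch of solving that quadratic for the smaller eigenvalue and recovering its eigenvector (smallest_eigenpair is a hypothetical name; centred points assumed):

```python
import math

def smallest_eigenpair(pts):
    # entries of the covariance matrix C of origin-centred points
    Cxx = sum(x*x for x, y in pts)
    Cyy = sum(y*y for x, y in pts)
    Cxy = sum(x*y for x, y in pts)
    # roots of f^2 - f*(Cxx + Cyy) + Cxx*Cyy - Cxy^2 = 0
    tr = Cxx + Cyy
    det = Cxx*Cyy - Cxy*Cxy
    f1 = (tr - math.sqrt(tr*tr - 4*det)) / 2   # smaller eigenvalue
    # (C - f1*I) e = 0 gives e proportional to (Cxy, f1 - Cxx),
    # valid when Cxy != 0 (otherwise C is already diagonal)
    u, v = Cxy, f1 - Cxx
    n = math.hypot(u, v)
    return f1, (u/n, v/n)
```

The eigenvector for f1 is the normal of the symmetric best-fit line, matching the (u, v) found from the quadratic in u² earlier.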
A lot of algebra I don’t have time for should show that this is equivalent to the solution I did earlier without using matrices.
Next I’m going to extend this to 3 dimensions to calculate the best fitting plane under least squares for a cloud of points. This is a highly useful function that is necessary for my mobile laser scanning tool.
The brute force method done without matrices runs into problems because it requires solving a cubic equation. However, the matrix/eigenvector solution allows us to skip over those issues and get to the actual geometry of the situation.
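A sketch of that 3D extension (assuming numpy is available; np.linalg.eigh handles the cubic characteristic equation internally, which is exactly the issue the brute-force route runs into):

```python
import numpy as np

def fit_plane(points):
    # points: an (n, 3) sequence; the best-fit plane passes through the
    # centroid, and its normal is the eigenvector of the 3x3 covariance
    # matrix with the smallest eigenvalue (direction of least squared spread)
    P = np.asarray(points, dtype=float)
    centroid = P.mean(axis=0)
    Q = P - centroid
    C = Q.T @ Q                       # 3x3 covariance-style matrix
    vals, vecs = np.linalg.eigh(C)    # eigenvalues in ascending order
    normal = vecs[:, 0]               # eigenvector of the smallest eigenvalue
    return centroid, normal
```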