## Sideways projected tapered tool into edge bug

Monday, May 28th, 2012 at 7:38 pm

Another interesting calculation bug, to go with the previous one, that has been in my code for decades, occurred as a result of not properly factoring down a quadratic equation.

The danger area is when such an equation approaches a perfect square, and the calculation isn’t at the margin.

A very simple example of a safe situation is the line y = m x + c intersecting a circle radius r centred on the origin, so that:
r^2 = y^2 + x^2 = (m x + c)^2 + x^2
0 = x^2 (m^2 + 1) + 2 x (m c) + (c^2 – r^2) = x^2 a + 2 x b2 + c
Solving this quadratic equation using the standard formula gives:
x = -b2 / a +- sqrt(b2^2 – a c) / a
= -m c / (m^2 + 1) +- sqrt(m^2 c^2 – (m^2 + 1) (c^2 – r^2)) / (m^2 + 1)

The value inside the square root expands to -c^2 + m^2 r^2 + r^2, and you’re going to get no solution when this is negative, corresponding to no intersection with the circle. (As a sanity check, set r = 0 and no line that is not through the origin hits it.)

The margin, when this value is 0, corresponds to a double contact point (a tangency) between the line and the circle, and it really doesn’t matter if you get it completely accurate; you’re barely intersecting the geometry at all.

In the following case I found such a use of the quadratic equation where the margin was not at all marginal, and the presence of a collision mattered.

.

.

A tapered tool is a section of a cone from centre point a radius ra to point b radius rb along the x axis.

The (red) line segment (which is normally the edge of a triangle, say) runs from
a = (ax, ay, az) to b = (bx, by, bz) where ax < bx.

The task is to lower the cone along the z axis (with the cone’s axis in the y=0 plane) until it makes contact with the (red) line segment. In other words, we want to find the length dz of the orange vertical line — which is the standard calculation to find the position when cutter touches the geometry.

To reduce the number of variables we have trimmed the line segment against the range of tapered tool we are interested in, and also trimmed the extents of the taper against the x range of the line segment. Obviously if there is no overlap then there can be no point of contact as the cone will drop right past one or other tip of the line segment.

Let m be the proportion of the line from a to the contact point p towards b, so it must be between zero and one.

Let v = b – a, so p = a + m v

Then the radius of the cone in this plane x = px is r = ra + m rv where rv = rb – ra.

Therefore |(py, pz + dz)| = r.

This vector, which is a vector in the x=0 plane away from the axis of the cone, represents the y and z components of the normal vector n from the point on the cone surface. The x component of this normal vector is
nx = -r slope where slope = rv / vx.

As a function of m, the vector n can be written as (-r slope, ay + vy m, sqrt(r^2 – (ay + vy m)^2).

When we have tangential contact, n . v = 0, the line and the normal vector are perpendicular.

In other words we must solve Equation (1):
[-(ra + m rv) vx slope + (ay + vy m) vy]^2 = [(ra + m rv)^2 – (ay + vy m)^2] vz^2

Easy, I thought. This is a quadratic equation in m. All I do is multiply it out into the form
m^2 qa + 2 m qb2 + qc = 0 and then solve it using the quadratic formula
-qb2 / qa +- sqrt(qb2^2 – qa qc) / qa, and assume there is no solution when qb2^2 – qa qc < 0.

The problem is that sometimes qb2^2 – qa qc computes to a negative number when it should be positive, due to floating point error. To see when, let’s take the special case when az = bz, or vz = 0.

Then the right hand side of Equation (1) above becomes zero, so that it becomes a perfect square:
[-(ra + m rv) vx slope + (ay + vy m) vy]^2 = 0.

And with a perfect square, if you independently compute the qa, qb2 and qc values, then
qb2^2 – qa qc = 0 should be true — subject to a rounding error exposed by the arithmetic difference. That is, sometimes the calculation will be negative and the buggy function will return false having found no point of contact between the line segment and the cone, even though it could not be bigger.

To put this into context, in most geometric computations, the time when a quadratic equation becomes a square is when there is a double point of contact, or when the two pieces of geometry are passing one another just grazing their surfaces at the extreme — as in the safe example given at the top. It does not much matter whether you find a collision point or not, because it is marginal.

In this case, the double point corresponds to the two points on a horizontal vz = 0 line as the conical surface drops through it, and they will be hitting from below and on the top in the y = 0 plane, according to the symmetry. It is critical to get this answer right.

And I did get it right by considering the special case of vz = 0 forcing there to be a solution.

But the bug I encountered occurred when vz = 1e-14, and so was outside the special case, and close enough to zero to cause the floating point imprecision to kick in and produce the slightly negative value of -1e-28 when it should have been positive.

Initially I thought I could force there to always be a solution setting qb2^2 – qa qc to zero whenever it came out as non-negative, because I thought that there should always be a solution. Could you not always slide the cone along the z-axis to make contact with any line you chose — assuming the geometry extends to infinity?

And that worked until I encountered numbers that were very negative, which made me realize that sometimes there is no point of tangential contact, because the line always intersects the cone for all values of dz. For example, take a line parallel to the x-axis when vy = vz = 0.

In the end, I got to the fix by factoring the Equation (1) down properly so that I could ensure that the sign of qb2^2 – qa qc was always as it should be. This required me to substitute out the value of slope, which was a precalculated value (constant for the cutter), which I had not originally considered doing. This gives:
[m (-rv^2 + vy^2) + (-ra rv + ay vy) ]^2
= [m^2 (rv^2 – vy^2) + 2 m (ra rv – ay vy) + ra^2 – ay^2] vz^2

Suddenly you can see there are previously hidden shared components on both sides of the equation, which is always bad news when it comes to exposing miscalculations.

Let s1 = -rv^2 + vy^2, s2 = -ra rv + ay vy, s2 = ra^2 – ay^2, so that:
[m s1 + s2 ]^2 = [-m^2 s1 – 2 m s2 + s3] vz^2.

And now our quadratic equation qa m^2 + 2 qb2 m + qc = 0 resolves to:
qa = s1^2 + s1 vz^2, qb2 = s1 s2 + s2 vz^2, qc = s2^2 – s3 vz^2.

And the crucial value:
qb2^2 – qa qc
= s2^2 (s1 + vz^2)^2 – s1 (s1 + vz^2) (s2^2 – s3 vz^2)
= (s1 + vz^2) (s2^2 (s1 + vz^2) – s1 (s2^2 – s3 vz^2))
= (s1 + vz^2) vz^2 (s2^2 + s1 s3)

is now going to be stable even as vz approaches zero. It will not accidentally change sign due to the difference between two almost equal floating point values.

This looks like a pattern that could happen in other places, because quadratic equations are quite common, though I haven’t found them yet.

What it really demonstrates is that putting the quadratic solution formula into its own function is a Very Bad Idea. This is common practice with such mathematical applications: to encapsulate something as ugly and obviously self-contained as this quadratic solution equation -qb2 / qa +- sqrt(qb2^2 – qa qc) / qa into its own special function called QuadSolve(), which just does it, so that you never need to remember or look at this horror ever again. Why should you “reinvent the wheel” every time for such a tricky calculation? And surely if you include the inline directive on the function defintion, the optimizer will factor out all the redundant variables, keeping your code nice and straightforward and easy to read.

Well, that’s how programmer group-think works. The counter-argument, as I have shown here, is rather sophisticated, to the extent that almost no entire group is ever likely to trust or care enough to put in the effort to understand and accept it.

What actually happens in the real world is the bug I have encountered would be fixed in the QuadSolve() function by simply seeing if qb2^2 – qa qc is greater than smallepsilon = -1e-10, at which point it would be set to zero and accepted. This would fix probably 99.999% of all occurrences, and bugger up all uses of the quadratic solver function with this imprecision, slightly fattening every single glancing intersection by an unpredictable degree. And then the rot will have really set in.

In conclusion, what I would look for as a software archaeologist when all the CADCAM code in the world finally becomes available in the same place so that it can be cross-checked, are functions like so:

```bool QuadSolve(double& lo, double& hi, double a, double b, double c)
{
double q = b*b - 4*a*c;
if (q < -1e-10)    // made up value which seems to fix it
return false;
else if (q < 0)
q = 0;
double sq = sqrt(q);
lo = (-b - sq) / (2 * a);
hi = (-b + sq) / (2 * a);
return true;
}
```

Sadly, because all the software is closed source, we have no chance to inspect and point these specific observations out to one other, and users have to live with unnecessary defects caused by these errors. But as long as there is absolutely no way for them to find out what the root cause of any problem is, everyone is happy.