## Bisecting a toolpath to tolerance

Tuesday, January 27th, 2009 at 8:48 pm

Suppose I have a function f(t) which returns a pair of 3D points given the scalar t. In mathematics-speak it is a mapping from R to R^6, where R is the set of real numbers. But from our point of view the mapping is to a pair of points

(fp(t), fn(t))

because fp(t) is the point that is a distance t along a toolpath, and fn(t) can either be the contact normal of the tool, or the tool axis direction for a 5-axis toolpath, depending on what you are doing.

The task is very simple. I need to sample this function (or toolpath) at a sufficient rate to satisfy a tolerance given by a set of parameters.

In this example, the return value will be a sequence of t values, (t0, t1, ..., tn) which are bounded by the conditions:

1. ti+1 - ti < maxSampleRate
2. Angle(fn(ti+1), fn(ti)) < maxAngleChange
3. If ti+1 - ti < minSampleRate, then no other conditions apply.

for all i < n.

If the values fn(t) are unit vectors (because they represent tool axes), then Condition 2 can be tested by taking the dot product of the two vectors and checking whether the result is greater than cos(maxAngleChange), so you shouldn't need to perform a trigonometric calculation at every step to determine the actual angle between the two vectors. Often when talking about computer algorithms the implementation is different from the way it is described.
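The dot-product trick looks like this in code (a sketch in Python; the function name and tuple-based vectors are my own, not anything from the production system):

```python
import math

def normals_within_angle(n1, n2, max_angle_change):
    """Condition 2 for unit vectors n1, n2, without calling acos.

    dot(n1, n2) equals the cosine of the angle between them, and cos is
    decreasing on [0, pi], so angle < max_angle_change exactly when
    dot > cos(max_angle_change).
    """
    dot = sum(a * b for a, b in zip(n1, n2))
    return dot > math.cos(max_angle_change)
```

In a real sampling loop you would of course compute cos(maxAngleChange) once up front rather than on every call.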

Condition 3 provides the lower bound for subdivision. Usually it’s set to 2 microns, the motion tolerance of the machine tool.

If maxSampleRate > minSampleRate, then Condition 1 can’t conflict with Condition 3, but if there is a discontinuity in fn(t) (for example, the contact angle suddenly changes due to the toolpath crossing an inside corner), then it becomes important.

Oh, and the last input parameter is T, where t ranges from 0 to T.

The simplest answer that solves all the conditions (by satisfying Condition 3) is to set

n = int(T / minSampleRate)+2, and ti = i T / (n-1)

but if the functions f are expensive (slow) then it will also be the slowest.
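As a sketch (Python, with names invented for illustration), the brute-force version is just:

```python
def uniform_samples(T, min_sample_rate):
    # Enough evenly spaced points that every gap is below minSampleRate,
    # which satisfies Condition 3 and therefore all the other conditions.
    n = int(T / min_sample_rate) + 2
    return [i * T / (n - 1) for i in range(n)]
```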

My usual algorithm is to do it in two parts. First, define the function:

```
bool ShouldSplit(ta, tb)
{
    if (tb - ta < minSampleRate)
        return false;
    if (tb - ta > maxSampleRate)
        return true;
    double d = Dot(fn(ta), fn(tb));
    if (d < cos(maxAngleChange))
        return true;
    return false;
}
```
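A concrete, runnable version of the same predicate might look like this (Python; the factory closure and the circular-arc fn in my example are my own illustration, not the production code):

```python
import math

def make_should_split(fn, min_rate, max_rate, max_angle_change):
    cos_max = math.cos(max_angle_change)  # computed once, not per call
    def should_split(ta, tb):
        if tb - ta < min_rate:
            return False                  # Condition 3 overrides everything
        if tb - ta > max_rate:
            return True                   # Condition 1: gap too wide
        na, nb = fn(ta), fn(tb)
        dot = sum(a * b for a, b in zip(na, nb))
        return dot < cos_max              # Condition 2: normal swings too far
    return should_split
```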

As usual, the implementation is quite different from the description, where we would complicate matters by carrying around structures based on the tuple

(t, fp(t), fn(t))

so that these values are never unnecessarily recalculated, but that would make the description much harder to follow.
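Something like this record is what I mean (a sketch; the names are invented for illustration):

```python
from collections import namedtuple

# One evaluation of the (expensive) toolpath function, cached alongside its t.
Sample = namedtuple("Sample", ["t", "p", "n"])  # (t, fp(t), fn(t))

def evaluate(f, t):
    p, n = f(t)  # f returns the pair (fp(t), fn(t)) in one call
    return Sample(t, p, n)
```

The subdivision loop then passes Samples around instead of raw t values, so f is evaluated exactly once per sample point.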

In real code there are many conditions tested in this ShouldSplit() function. Usually t does not correspond to the distance along the curve, and Condition 1 becomes

|fp(ti+1) - fp(ti)| < maxSampleRate

and so on.

The second part of my usual algorithm works by subdividing the pair (0, T) until ShouldSplit() returns false for every adjacent pair, like so:

```
result = array(0, T);
i = 0;
while (i < count(result) - 1)
{
    if (ShouldSplit(result[i], result[i + 1]))
    {
        t = (result[i] + result[i + 1]) / 2;
        result.insertBefore(i + 1, t);
    }
    else
        i = i + 1;
}
```
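In runnable form (Python, using a plain list for brevity; insert is O(n) here, where real code would use a linked list):

```python
def subdivide(T, should_split):
    result = [0.0, T]
    i = 0
    while i < len(result) - 1:
        if should_split(result[i], result[i + 1]):
            # split the offending gap at its midpoint and re-test both halves
            mid = (result[i] + result[i + 1]) / 2.0
            result.insert(i + 1, mid)
        else:
            i += 1
    return result
```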

This was my state of the art for a good few years after I started programming this sort of thing. ShouldSplit() can get as complicated as it likes, and the outer function can maintain the results in a linked list and becomes parallelizable when it is written in a different form, like:

```
result = list(0.0, T); // indexed by iterators
stack = Set();
stack.add(result.iteratorFront);  // points to the 0.0

BEGIN_CRITICAL_SECTION
if (!stack.empty())
    i = stack.pop();
END_CRITICAL_SECTION
if (ShouldSplit(result[i], result[i.next]))
{
    t = (result[i] + result[i.next]) / 2;
    result.insertBefore(i.next, t);
    BEGIN_CRITICAL_SECTION
    stack.add(i);       // re-check the left half
    stack.add(i.next);  // and the newly inserted right half
    END_CRITICAL_SECTION
}
```

... and it goes on to get even more complicated when it is done properly. Obviously, there has to be a delay before sending in the second and third threads, or the first one won't have had time to create some space for them to work on. The key is that we've separated the problem into two self-contained halves that can be developed independently.
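As a rough illustration of the shape this takes (Python threads and a lock standing in for the real machinery; the pending counter, which lets idle threads spin and retry instead of exiting before any work exists for them, is my own way of handling the start-up problem just mentioned):

```python
import threading

class Node:
    __slots__ = ("t", "next")
    def __init__(self, t, nxt=None):
        self.t = t
        self.next = nxt

def parallel_subdivide(T, should_split, nthreads=4):
    head = Node(0.0, Node(T))
    lock = threading.Lock()
    stack = [head]   # nodes whose interval [node.t, node.next.t] is unchecked
    pending = [1]    # unchecked intervals, whether on the stack or in flight

    def worker():
        while True:
            with lock:
                node = stack.pop() if stack else None
                if node is None and pending[0] == 0:
                    return               # every interval has been checked
            if node is None:
                continue                 # others still splitting: spin and retry
            if should_split(node.t, node.next.t):
                # this thread owns the popped interval, so the splice is safe
                mid = Node((node.t + node.next.t) / 2.0, node.next)
                node.next = mid
                with lock:
                    stack.append(node)   # left half, to be re-checked
                    stack.append(mid)    # right half, to be re-checked
                    pending[0] += 1      # one interval became two
            else:
                with lock:
                    pending[0] -= 1      # interval accepted as-is

    threads = [threading.Thread(target=worker) for _ in range(nthreads)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()

    out, node = [], head
    while node is not None:
        out.append(node.t)
        node = node.next
    return out
```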

Unfortunately, there is an inefficiency here. What happens if, for the sake of simplicity, we ignore Conditions 2 and 3 and set T=64.0 and maxSampleRate=0.999? Then this algorithm is going to return a list of 129 evenly sampled points spaced by the distance 0.5, because we would have subdivided the single gap of width 64.0 into two gaps of width 32.0, and so on, until there are gaps of 1.0, which is only just wider than 0.999, causing the next subdivision to take it down to 0.5; whereas the most efficient sampling would be to set n = int(64.0/0.999)+2 = 66, resulting in a sample spacing of 0.984.

This is a factor-of-2 inefficiency. Admittedly it's a worst-case example, but on average it's going to produce about 50% too many points, depending on what you take for an average pair of numbers T and maxSampleRate.
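The arithmetic is easy to check. Ignoring Conditions 2 and 3, pure midpoint bisection halves the single gap of width T until it fits under maxSampleRate, so the point count is forced to a power of two plus one:

```python
import math

def bisection_point_count(T, max_rate):
    # midpoint bisection leaves every gap at T / 2^k,
    # with k the smallest integer such that T / 2^k <= max_rate
    k = math.ceil(math.log2(T / max_rate))
    return 2 ** k + 1

def optimal_point_count(T, max_rate):
    # the even-spacing answer from earlier in the post
    return int(T / max_rate) + 2
```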

The obvious way to fix it is to begin the algorithm with a loop which samples all the points at the even spacing driven by maxSampleRate, and then start the subdivisions between the pairs of points until Conditions 2 and 3 are satisfied. But this is inelegant and obstructs all that interesting, hard and rewarding parallel processing that could be developed into the outer function, as described above.

There is, however, an easy tweak. And it goes like this:
Replace the line which bisects the two sample points:

t = (result[i] + result[i + 1]) / 2;

with:

```
d = result[i + 1] - result[i];
m = (int)(d / maxSampleRate);
if ((m > 1) and (m is even))
{
    lam = m / (2*m + 2.0);
    t = result[i] + lam * (result[i + 1] - result[i]);
}
else
    t = (result[i + 1] + result[i]) / 2;
```

In other words, instead of bisecting in the middle of the interval all the time, we divide it to one side of the halfway point in certain cases, and these end up aligning with the optimal solution. In the case above, the first point between 0 and 64.0 will be 64/130.0*64=31.50769 which is an exact multiple of 64/65.0=0.98461.
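Dropping the tweak into the simplified, Condition-1-only subdivider shows it landing exactly on the optimal sampling for the example above (a Python sketch; the names are mine):

```python
def biased_split(ta, tb, max_rate):
    # the tweak: when the gap wants m+1 sub-gaps and m is even,
    # split m/2 optimal gaps in from the left instead of at the midpoint
    d = tb - ta
    m = int(d / max_rate)
    if m > 1 and m % 2 == 0:
        lam = m / (2 * m + 2.0)
        return ta + lam * d
    return (ta + tb) / 2.0

def subdivide_biased(T, max_rate):
    result = [0.0, T]
    i = 0
    while i < len(result) - 1:
        if result[i + 1] - result[i] > max_rate:
            result.insert(i + 1, biased_split(result[i], result[i + 1], max_rate))
        else:
            i += 1
    return result
```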

For even more performance and self-containment, allow the ShouldSplit() function to return the point of bisection when it returns true. You could also squeeze out some further optimization by carefully examining the directions of the contact normals at ti and ti+1 and predicting an approximate point where the discontinuity in fn is likely to be.

However, this is complicated, because the result demands a sample on either side of the discontinuity, separated by a distance less than minSampleRate, and if you consistently hit the wrong side of the discontinuity you'll produce a sequence of convergent samples running towards the value, possibly in an infinite loop. (I think I've had that bug once.) So it's often better to alternate subdivisions between this predicted discontinuity and the halfway point, or pick an average of the predicted value and the halfway point, or all kinds of variations of this that don't seem to make much improvement, partly because discontinuities are rare (though important) compared to the overwhelming majority of other sample cases.

When I started this work, the guys working on the old system ran a sampling function which calculated the point fn(t) between fn(ti) and fn(ti+1), tested whether it was collinear to within a certain tolerance, and discarded it if it was. This meant that twice as many points were calculated during toolpath creation as showed up in the result. After exploring the idea of looking beyond just pairs of points in the sequence to test collinearity without throwing away valuable calculation results, I settled on the above way of doing it, once I observed that the geometric distance sample rate defines the machining tolerance. This gave my code an immediate two-fold advantage, though no one in the office seemed particularly interested in this discovery at the time, or at any time later.

It takes dozens and dozens of variations and tricks of this kind to make one CAM system perform better than another. You've got to be pretty dedicated to carry on doing it year after year, even though no one gives a damn. Out in the academic world, each one of these advances could result in a published paper and another credit to your name. But in the commercial programming business, in the small niche that CAM is, these things just get lost, forgotten, unnamed, unrecognized, not passed on and not shared. And in the end the people who are able to do them get discouraged, lose interest and stop caring, even when they keep coming to work every morning just to pay off the mortgage. What a waste of a life. Without any conferences of programmers at this level of technicality -- totally at odds with the misguided "commercially confidential" culture within almost all of the CAM businesses these days -- the software becomes an exercise in stagnation.