## The PX4 Differential Barometer and its 13.415bit precision problem

Wednesday, March 1st, 2017 at 7:55 pm

Lately I have become interested in noisy sensors; all sensors are noisy, so this is an applicable subject.

The interest comes from my attempts to read up on Kalman filters where the variance of the sensor measurements is a crucial input into calculation of the weight to apply to that measurement as well as the subsequent variance of the output.

I was wanting to do something more scientific with this noisy CO2 sensor, other than simply filtering it and saying to myself: “That looks like a nice clean line”. At the same time, a minor brush with the concepts of statistical seasonal adjustment lead eventually to the autoregressive model and a way to test the quality of the noise — eg whether it is pure noise or the result of noise that has been filtered by a smoothing process.

For example, if you take pure white noise function X[*] with a standard deviation of 1 and pass it through the usual exponential decay filter:

Y[n] = Y[n-1]*f + X[n]*(1-f)

the standard deviation of Y[*] is

(1-f)^2/(1-f^2)

and the covariance between Y[*] and Y[*-1] (formed by averaging the product of the sequence with an offset of itself) is f times this value (whereas the covariance between the purely random sequence X[*] and X[*-1] is obviously zero).

In the absence of attendance at a university lecture course taught by a professor who has structured a procession of scientific concepts into a coherent order in which to learn over the course of a semester, I am happy to follow this journey where it takes me.

My glider data logger has two wind sensors now which, as with most sensor problems, disagree unpredictably. Airspeed is a tricky thing, due to local turbulence and large scale vortices which exist owing to the existence of a levitating wing.

One of the sensors is a pitot tube connected to a MS4525DO differential pressure device purporting to measure to 14 bits of precision as often as every 0.5milliseconds.

I’ve set up the pitot to read as fast as possible, about every millisecond, like so:

Wire.beginTransmission(0x28);
Wire.endTransmission();
delayMicroseconds(500); // needs a delay of 500microseconds or the reading is marked stale

Wire.requestFrom(0x28, 4);
uint16_t stat = (pmsb & 0xC0) >> 6; // 0 good, 2 stale, 3 fault
uint16_t rawpressure = ((pmsb & 0x3F) << 8) | (plsb);

uint16_t rawtemperature = (tmsb << 3) | (tlsb >> 5);

Now immediately you’re going to have a problem because the measurements aren’t continuous; the values are quantized. You can draw a histogram to see that the “noise” (or whatever it is) looks like this (with a gaussian normal curve drawn over it).

# d is pandas series of measurements
d.hist(bins=10)
x = np.linspace(min(d), max(d), 100)
plt.plot(x, mlab.normpdf(x, d.mean(), d.std())*len(d)) That doesn’t look right. Let’s try it with some fake data

s = np.array([random.gauss(3, 2) for i in range(1000000)])
x = np.linspace(min(s), max(s), 100)
pd.DataFrame(s).hist(bins=10)
plt.plot(x, mlab.normpdf(x, s.mean(), s.std())*len(s)) But with 20 histogram bins, the fake data, based on a computer generated random number, it looks right: What happens with 20 bins on the real data: Well it fits, but what’s with all these gaps??

It turns out the sensor is missing out every 3rd reading. It will only give numbers of the form (3i) and (3i+1).

This might explain why my attempts to gain greater precision (a more continuous in value measurement) by taking averages of 16 samples in series have proved such a failure, resulting in distributions that look like this:

n = 16 # successive groups of 16 averaged
dN = d.groupby(d.index//n).sum()[1:-1]/n # discard first and last as they are partial sums
dN.hist(bins=50) Let’s do some experiments to see how ratty this needs to be, continuing this work on Jupyter and using the ability to do instant calculations on streams of numbers to check verify the statistical invariants. (I am no expert in statistics, I tell you.)

Suppose I take the same mean and standard deviation as before and make some fake continuous data with it:

mn, sd = 8239.4345769998472, 2.1556257422301854
s = pd.Series([(random.gauss(mn, sd)) for i in range(32000)])
s.hist(bins=20)
x = np.linspace(min(s), max(s), 100)
plt.plot(x, mlab.normpdf(x, mn, sd)*len(s)) We can as before group this data into sets of n=16 like so:

s = pd.Series([(random.gauss(mn, sd)) for i in range(32000)])
sN = s.groupby(s.index//n).sum()[1:-1]/n # the std will reduce by a factor of sqrt(n)
x = np.linspace(min(sN), max(sN), 100)
plt.plot(x, mlab.normpdf(x, mn, sd/math.sqrt(n))*len(s)/(n*math.sqrt(n)))
sN.hist(bins=20) Now apply the quantization onto the measure by applying the function int() around the random number

s = pd.Series([int(random.gauss(mn, sd)) for i in range(320000)]) That’s not right. We need to make that “int(X+0.5)” so that the precision truncation is balanced, and then it will remain symmetric on the original mean. And then what happens when we do that n=16 grouping again: You get all these spikes.

Hmm. On second thoughts, I think that’s to do with the sampling on the terrible histogram function merging columns when the occasional bin doesn’t quite align.

I’ve got a replacement for that crappy histogram function. It’s:

sN.value_counts().sort_index().plot() This is loads better, because it measures the quantity of each discrete integer in the sum.

Sorry to have wasted your time with the bad histogram work. I should delete all those sections of this post if I had the time.

I had the notion that if I factored out the 1/3 and 2/3 step and made it more even, then averaging over 16 rapid readings would produce a more balanced noise (with a lower standard deviation) than if I left in this actual 13.415bit precision (instead of the full 14) we appear to have here.

This literally must be what’s happening, because when I do:

(d%3).value_counts()

The distribution is even between 0 and 1 (no 2s), so my reading code should scale it down to get to the real raw reading where all the numbers are represented.

So, getting back to where we all started, we can factor out the lack of representation of the (i*3+2) numbers and even it out with the code:

dL = (((d//3*2) + (d%3)))*3/2
plt.plot(d.value_counts(sort=False).sort_index(), label=”original”)
plt.plot(dL.value_counts(sort=False).sort_index(), label=”2/3even”)

x = np.linspace(min(d), max(d), 100)
plt.plot(x, mlab.normpdf(x, dL.mean(), dL.std())*len(dL)*1.5, label=”normal”) Whatever else are its properties, this minor transform does remove the slight skew caused by these readings. I don’t see why they did this, when they could have simply left the sensor’s scale as being between 0 and 10922 instead of scaling it up by 3/2 to 0 and 16383.

And I’ve not even got back to dealing with the other wind sensor. That’s got a lot of looking to right now, for sure.

Update
I’ve just run the covariance test on the differential barometric readings (factored by this 3/2 problem) using the following code:

dLm = dL.mean()
ss = [((dL – dLm)*(dL.shift(i) – dLm)).mean() for i in range(400)]
plt.plot(ss) I don’t exactly know what this means, but that’s some very regular long-term cycle in the system.

Here’s what happens with a delay(2) in the loop so that the readings are 3ms apart: And this is with 10ms readings: And this is with 20ms frequent readings: So it seems there is a fairly steady oscillation that is not to do with the rate of reading.

I’ll bet if I look further into it, this is a recurrence of my electrical noise in the temperature sensor problem caused by running too many sensors on the same loop…

But hang on, they’re all off. We have the situation that either it’s making this signal itself on a roughly 100ms cycle, or it’s coming from some other power circuit in the system.

All this does mean that if I am sampling at a rate which is aliasing with this slow signal, I’ll get an even longer wave variation that will effect the agreement between this and the other wind sensors. And could explain why sometimes they agree and sometimes they don’t. It’s a curious prospect.