Friday, March 24th, 2017 at 12:19 pm - - Flightlogger

We move on to the temperature sensor work, and the controversial concept that the rising air in a thermal is hotter than the non-upwardly-mobile surrounding environmental atmosphere.

I say it’s controversial because the meaning of “the temperature” in relation to a mobile and turbulent airmass, whose structure spans hundreds of metres in the vertical dimension and thousands of Pascals in relative pressure, is undefined. Remember that the adiabatic temperature differential is about 0.7 degrees per 100 m change in altitude.

We do, however, have a single point temperature sensor on a mobile glider which is also progressing up and down the air column (depending on the wind currents and skill of the pilot). The location of the glider with the single temperature sensor is imperfectly known (due to bad GPS (see previous post) and inexplicable barometric behavior (see next post)), and the sensor itself has a thermal mass which means its readings have a delay half-life of about 9 seconds in flowing air (see this other post).
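Given that half-life, one could in principle undo some of the lag. Here is a rough sketch of my own (not code from the logger) of first-order lag compensation, assuming the sensor relaxes exponentially towards the ambient temperature with time constant 9/ln(2), about 13 seconds:

```python
import math

# A first-order sensor with a 9 second half-life has time constant
# tau = 9/ln(2) ~ 13 s.  If the reading Ts relaxes towards ambient Ta as
# dTs/dt = (Ta - Ts)/tau, then Ta can be estimated from the readings as
# Ta ~ Ts + tau*dTs/dt (the derivative would need smoothing on real data).
tau = 9.0/math.log(2)

def compensate(ts, dt):
    """ts: sensor readings at a fixed spacing of dt seconds."""
    return [ts[i] + tau*(ts[i] - ts[i-1])/dt  for i in range(1, len(ts))]

# synthetic check: a sensor at 20C suddenly moved into 10C air
dt, ta = 0.1, 10.0
ts = [20.0]
for i in range(600):
    ts.append(ts[-1] + (ta - ts[-1])*dt/tau)
est = compensate(ts, dt)   # recovers ~10C long before the sensor settles
```

On real flight data the differencing would amplify noise horribly, so the derivative term has to be filtered first, which re-introduces some delay; there is no free lunch.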

I have taken the precaution of jettisoning my slow, accurate, low-resolution Dallas temperature sensor for two humidity/temperature sensor combos and an infrared thermometer that has quite a good ambient temperature sensor within its metal can.

These sensors tracked one another pretty well, except for one small problem when I spotted a series of spikes in one and then the other humidity/temperature sensor.


What is going on?


Monday, March 20th, 2017 at 1:45 pm - - Flightlogger

Last week I finally had my first flight of the year with my newly built flight data logger. I can’t believe the number of issues it’s already thrown up.

At least I may be making quick enough progress to get past the issues (rather than being swamped by them) using this exceptionally efficient Jupyter/Pandas technology.

For example, my code for parsing and loading the IGC file is 15 lines long.

The code for loading in my flight logger data into a timeseries is just as brief, if you consider each data type individually (there are more than 13 of them from humidity sensors to an orientation meter).

The GPS time series from my flight logger (at 10Hz) can be crudely converted to XY coordinates in metres, like so:

# pQ is the GPS position pandas.DataFrame
import math
import matplotlib.pyplot as plt

earthrad = 6378137
lng0, lat0 = pQ.iloc[0].lng, pQ.iloc[0].lat
nyfac = 2*math.pi*earthrad/360              # metres per degree of latitude
exfac = nyfac*math.cos(math.radians(lat0))  # metres per degree of longitude
pQ["x"] = (pQ.lng - lng0)*exfac
pQ["y"] = (pQ.lat - lat0)*nyfac
plt.plot(pQ.x, pQ.y)


Note the suspicious sharp turn near (-1000, -400). Here’s another sharp turn somewhere else in the sequence covering a 1 minute 5 second period using time slicing technology:

t0, t1 = pandas.Timestamp("2017-03-09 15:42:55"), pandas.Timestamp("2017-03-09 15:44:00")
q = fd.pQ[t0:t1]
plt.plot(q.x, q.y)


The dot is at the start point, time=t0.


Friday, March 3rd, 2017 at 11:04 am - - Flightlogger, Hang-glide, Uncategorized

To be clear, I haven’t got mathematical proofs here (I don’t have the time), but the experimental evidence is quick to get.

Take the differential barometer sensor (used to measure airspeed) of the hang-glider flight logger. The Arduino code which updates the reading every 200ms looks like this:

long lastpx4timestamp; 
void Flylogger::FetchPX4pitot()
{
    long mstamp = millis(); 
    if (mstamp >= lastpx4timestamp + 200) {
        sdlogger->logpitot(px4timestampset, px4pitot->rawpressure, px4pitot->rawtemp); 
        lastpx4timestamp = mstamp; 
    }
}
Why did I choose 200 milliseconds? It sounded like a good number to read it at, and adding a fixed interval to the last reading time is the quickest way to program a regular reading.
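In fact that quick scheme slowly drifts. A throwaway simulation (invented numbers, not the Arduino code itself): if loop() only gets around to polling every 7 ms, each reading fires slightly late and the lateness accumulates.

```python
# Scheme: fire when now >= last + 200, then anchor "last" to the actual
# (late) fire time.  With a 7 ms poll the effective period becomes 203 ms
# (the next multiple of 7 past 200), so 10 seconds yields 49 readings
# instead of the 50 you wanted.
poll = 7                         # ms between loop() polls (made up)
fires, last = [], 0
for t in range(0, 10000, poll):
    if t >= last + 200:
        fires.append(t)
        last = t                 # the drift: anchored to the late fire time

period = fires[1] - fires[0]     # 203, not 200
```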

A better way is to synchronize with the divided-down clock, rather than simply adding 200ms to the previous reading time, like so:

int mstampdivider = 20; 
int prevmstampdivided = 0; 
void loop()
{
    long mstampdivided = millis()/mstampdivider; 
    if (mstampdivided != prevmstampdivided) {
        prevmstampdivided = mstampdivided; 
        P(micros());  P(" ");  P(singlereading());  P("\n"); 
    }
}

Now that code reads every 20ms rather than every 200ms, but it prints a load of output which I can cut and paste into a file and read into pandas, like so:

rows = [ (int(s[0]), int(s[1]))  for s in (ln.split()  for ln in open("../logfiles/dmprapidtest.txt").readlines())  if len(s) == 2]
k = pandas.DataFrame.from_records(rows, columns=["t", "d"])
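For what it’s worth, pandas’ own parser can do the same load in one call; a sketch with inlined data standing in for the log file:

```python
import io
import pandas

# hypothetical two-column data standing in for dmprapidtest.txt
text = "100 512\n200 515\n300 519\n"
kdemo = pandas.read_csv(io.StringIO(text), sep=r"\s+", names=["t", "d"])
```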

And then we can plot the autocovariance (the covariance of the series with a time-shifted copy of itself), like so:

d = k.d   # just the measurement Series
dm = d.mean()
ss = [((d - dm)*(d.shift(i) - dm)).mean()  for i in range(400)]


Let’s zoom in on the first 50 covariances:

Wednesday, March 1st, 2017 at 7:55 pm - - Flightlogger

Lately I have become interested in noisy sensors; all sensors are noisy, so this is a widely applicable subject.

The interest comes from my attempts to read up on Kalman filters where the variance of the sensor measurements is a crucial input into calculation of the weight to apply to that measurement as well as the subsequent variance of the output.

I wanted to do something more scientific with this noisy CO2 sensor, other than simply filtering it and saying to myself: “That looks like a nice clean line”.


At the same time, a minor brush with the concepts of statistical seasonal adjustment led eventually to the autoregressive model and a way to test the quality of the noise — eg whether it is pure noise or the result of noise that has been filtered by a smoothing process.

For example, if you take pure white noise function X[*] with a standard deviation of 1 and pass it through the usual exponential decay filter:

Y[n] = Y[n-1]*f + X[n]*(1-f)

the standard deviation of Y[*] is

sqrt((1 − f)/(1 + f))
and the covariance between Y[*] and Y[*-1] (formed by averaging the product of the sequence with an offset of itself) is f times this value (whereas the covariance between the purely random sequence X[*] and X[*-1] is obviously zero).
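Both claims are easy to check numerically (a simulation, not a proof):

```python
import random

# Pass unit-variance white noise through Y[n] = Y[n-1]*f + X[n]*(1-f) and
# compare the sample variance against (1-f)/(1+f), and the lag-1
# autocovariance against f times that variance.
random.seed(1)
f, n = 0.8, 200000
y, ys = 0.0, []
for _ in range(n):
    y = y*f + random.gauss(0, 1)*(1 - f)
    ys.append(y)

mean = sum(ys)/n
var = sum((v - mean)**2 for v in ys)/n
cov1 = sum((ys[i] - mean)*(ys[i-1] - mean) for i in range(1, n))/(n - 1)
# var ~ (1-f)/(1+f) = 0.111..., and cov1/var ~ f = 0.8
```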

In the absence of attendance at a university lecture course taught by a professor who has structured a procession of scientific concepts into a coherent order in which to learn over the course of a semester, I am happy to follow this journey where it takes me.

My glider data logger has two wind sensors now which, as with most sensor problems, disagree unpredictably. Airspeed is a tricky thing, due to local turbulence and large scale vortices which exist owing to the existence of a levitating wing.

One of the sensors is a pitot tube connected to a MS4525DO differential pressure device purporting to measure to 14 bits of precision as often as every 0.5 milliseconds.

I’ve set up the pitot to read as fast as possible, about every millisecond, like so:

delayMicroseconds(500); // needs a delay of 500 microseconds or the reading is marked stale

Wire.requestFrom(0x28, 4);
uint16_t pmsb =; // this reading takes 400 microseconds whatever happens
uint16_t stat = (pmsb & 0xC0) >> 6; // 0 good, 2 stale, 3 fault
uint16_t plsb =;
uint16_t rawpressure = ((pmsb & 0x3F) << 8) | (plsb);

uint16_t tmsb =;
uint16_t tlsb =;
uint16_t rawtemperature = (tmsb << 3) | (tlsb >> 5);

Now immediately you’re going to have a problem because the measurements aren’t continuous; the values are quantized.
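The textbook model for this (my assumption, not anything from the datasheet) treats quantization with step size delta as added uniform noise of variance delta²/12, which is easy to confirm:

```python
import random

# Quantize a smooth signal with step size delta; the quantization error is
# close to uniform on [-delta/2, delta/2], whose variance is delta**2/12.
random.seed(2)
delta = 1.0
xs = [random.uniform(0, 1000) for _ in range(200000)]
qerr = [round(x/delta)*delta - x for x in xs]

m = sum(qerr)/len(qerr)
var = sum((e - m)**2 for e in qerr)/len(qerr)   # ~ 1/12 = 0.0833
```

That variance can then go straight into a Kalman filter as the measurement noise floor.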



Monday, February 27th, 2017 at 7:13 pm - - Machining, Whipping

For a number of years I have been familiar with the observation that the sophistication of, in particular, time series data analysis is adversely impacted by the use of the Excel spreadsheet program. More recently I have discovered exactly how it is an irreparably deficient application, and I am convinced that its use should be abolished everywhere except small-business accounting (ie everywhere except what it was originally intended for).

Hitherto I did not attach much importance to this view, owing to the fact that it is considered an anti-Microsoft bias as well as a “lost cause” because “everyone uses it”. However, on learning the existence of a large body of signal processing theory which is all but inaccessible to users of Excel due to its fundamental nature, I submit my observations for consideration below.

My first remark is that if data scientists don’t know about the benefits and substantial applications of multi-source data combinations, Kalman filters and seasonal adjustments reliant on the autoregressive-moving-average model, they are missing an important part of their job and are deferring the implementations of these concepts to mere “estimation by eye” from the graphs.

My second remark is that when external software exists that can be used to, say, calculate and compensate for the seasonal adjustment, it generally requires the data to be submitted in a time series format, and this requires a great deal of preparation of the spreadsheet data. Thus the appearance of being able to open up and immediately (and supposedly productively) work with the spreadsheet data within seconds is deceptive, because there is now a longer route to move the data back out in a form that can be processed and re-imported into the spreadsheet for familiar viewing.

Let us consider a couple of time-series data sets. For example, the monthly national GDP and the employment statistics, or imagine one minute intervals of temperature and electricity use in a building.

What elements of the data are required to perform any useful processes on it, beyond simply the production of visual graphs?

For time series data (which a great proportion of data can be described as being), the existence of a reliable datetime value is paramount. Excel may in theory have a datetime cell type, but it is not visibly distinguishable from an unstructured string type, with its ambiguous American and English date orderings. As such, it cannot be used consistently, because improper use does not generate an unignorable error (eg “anything in column A must be in this datetime form or you can’t save the file”).

Furthermore, just the datetime is not enough, because there are time series intervals (for example, monthly or quarterly data) and these cannot always be approximated by a single time point. By convention quarterly intervals can either be represented by the end of the quarter (eg sales results) or the start of the quarter (eg sales targets) but both need to be boxed into the same entity in any subsequent regression model.

Finally, when you have different data series from different data sources they usually work to different sample rates, so you cannot represent them adequately as a single row per time sample. This would apply to the power use for the heating system which is provided every minute, when the average outdoor temperature is recorded daily.
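This multi-rate alignment is exactly what a proper time-series tool does in a couple of lines. A sketch with invented numbers, resampling per-minute power down to the daily rate of the temperature series:

```python
import pandas

# per-minute power readings and a daily average temperature (made-up data)
power = pandas.Series(
    [1.2, 1.3, 1.1, 0.9],
    index=pandas.to_datetime(["2017-01-01 00:00", "2017-01-01 00:01",
                              "2017-01-02 00:00", "2017-01-02 00:01"]))
temp = pandas.Series(
    [4.5, 6.0], index=pandas.to_datetime(["2017-01-01", "2017-01-02"]))

daily_power = power.resample("1D").mean()   # down-sample to the slower rate
combined = pandas.DataFrame({"power": daily_power, "temp": temp})
```

The two series keep their own sample rates until the moment you choose how to reconcile them; there is no shoehorning into one row per time point.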

Accordingly, the primary dimension of the data points, the datetimes, is problematical. But what of the data point values, the measured quantities? If they are each recorded into a single spreadsheet cell we will invariably be lacking an associated standard deviation/confidence interval for them. The standard deviation is a crucial input to the Kalman filter for the determination of the weight applied to each individual measure.

Take the example of the monthly political polling data prior to an election. These are often supplied by different agencies and almost always come with a confidence interval that depends on the size of the sample, so we know to take less notice of a poll which defies the steady trend when it has a wide margin of error. But if more polls with the same wide margin of error are also in line with that new trend, the best guess of the trend will be pulled in the new direction as much as it would have been by one very accurate poll with a narrow margin of error. This balancing of the estimations, aggregating the location and accuracy of the measures, is optimized by the Kalman filter; it should not be done by eye from the charts merely because the maths can’t easily be applied in Excel and we’re too lazy to convert our working platform to something where it could have been.
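In the simplest scalar case that balancing reduces to inverse-variance weighting. A sketch with invented poll numbers:

```python
# Combine two measurements of the same quantity, weighting each by the
# inverse of its variance; this is what a scalar, static Kalman update does.
def combine(mean1, var1, mean2, var2):
    w1, w2 = 1.0/var1, 1.0/var2
    mean = (mean1*w1 + mean2*w2)/(w1 + w2)
    var = 1.0/(w1 + w2)        # always smaller than either input variance
    return mean, var

# an accurate poll at 42 (+/-1) versus a wide-margin poll at 48 (+/-3):
m, v = combine(42.0, 1.0**2, 48.0, 3.0**2)
# the wide-margin poll pulls the estimate only slightly, to 42.6
```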

And this brings me to the final point about Excel, which apparently can do anything because it can run programmed macros. Really? Who can honestly think, if they have ever stopped to consider it, that it is a good idea to co-mingle software with data? You might as well nail your vinyl record onto the record player and then parcel-tape it into a heavy cardboard box to prevent interchangeability.

The co-mingling of data and code with no reliable means of disconnection leads to dangerous and ridiculous practices, such as copying the data into and out of the spreadsheet by hand just to access the service of the macros.

Come on folks. If you’re going to call yourself data scientists, you cannot rely on a single tool that prevents you from applying the last fifty years of data science know-how optimally and productively — and then rely on its inadequacy as an excuse to not challenge yourself to learn and master the amazing mathematical technology that would have been at your disposal had you not chosen ignorance over education.

We have got to get beyond the pointless issue of formatting data into “machine readable form” for the mere purpose of making graphs for visual titillation, and get to grips with actual intelligence and effective control theory.

There is, for example, nothing smart about having to control a car with a human looking out through a window for white lines on the tarmac and the traffic lights on the verge in order to move the steering wheel and pedals in response. Smart is getting the data to feed-back directly into a computer that controls these motions optimally while you sit back in awe having merely specified the destination. But if someone out there building the tech has dared to embed a copy of Excel within the process chain between the sensor acquisition and the motor control actuators, then we are doomed.

Think of the famous “Go To Statement Considered Harmful” letter of 1968. There was a heated debate about it at the time, and yet 30 years later it was unconscionable that a programming language could even be conceived of that included a goto statement. Kids these days probably don’t even know what one is.

But just think about all that programming wasted, and how much further on we could have been without the inclusion of that single statement which caused so much unnecessary expense and buggy code throughout the years. Then imagine how much damage is being caused by inappropriate use of this inadequate data non-analysis tool, up to now and for the next 20 years before it too finally gets buried in the ash-can of history and people don’t even remember that we ever used it in the first place.

This has been the moment of truth.

Good day.

Friday, February 10th, 2017 at 1:08 pm - - Machining

Here’s some bad out-of-focus video from the latest steel milling.

The milling was bodged because somebody used the wrong size collet which didn’t grip properly and so the cutter slowly slid inside, leaving steps on the job.

I’ve lost patience for blogging, or filming or documenting things properly. It’s all been out of sorts. Maybe I’m low on Vitamin D.

I did feel content while doing a day’s work yanking out lots of small trees by their roots on Saturday.

Progress has been on-going with a number of datalogging and ESP8266 projects, such as this Sonoff S20 smart socket hack where I got help installing MicroPython to turn a vacuum cleaner on and off.

(I am trying to do the same with my heating system boiler, except the ESP8266 I’m using keeps dropping out for no reason.)

I have the view that the “Internet” part of the “Internet of Things” is the problem, with all these servers, gateways, analytics, and remote controls from anywhere in the world. None of it does anything useful, because the value of data and control of a thing is inversely proportional to your remoteness from it in time and space.

The answer is therefore going to be nearby. It’s not ever going to be found from pushing it out to some faraway server in China.

What we really want are “Things with Internet Technology”. That is stuff with Wifi, Webpages and Wisdom. No nonsense, fun, and easily reprogrammable on an escalator of betterness to a higher plane than the bare ground we start with.

So, in addition to the cheap and multiples-of cave logger work, I’m proposing a mains cut-off switch that does not contain its own microcontroller — let alone its own pre-installed software. Those are added as extras onto the minimally necessary hardware.

All you get from it are four pins on the junction box: Ground, 5V, GPIO-output for on-off, Analog-in for the current draw.

There are about a billion pis, arduinos, microbits and esps out in the world that people know about and are waiting for something like this to plug into, so there is no reason to force your own choice onto them just because you’re familiar with it. That’s what I think.

Friday, December 30th, 2016 at 9:44 pm - - Flightlogger 1 Comment »

Not been taking time to blog much. I have been lost in a vast pool of non-working projects, and trying to learn scipy and pandas and jupyter, which is an effective replacement for twistcodewiki, which itself was derived from Scraperwiki. The in-browser interactive programming environment Jupyter is everything that Scraperwiki was supposed to have become (eg see data journalism example). But that didn’t happen because every other software engineer I knew assured me in No Uncertain Terms that nobody was ever going to suffer the sheer excruciating pain of coding in the browser — when real programmers all want to use vi and git.

But I digress: these days practically all of my coding is in the browser, and I’m using tools like matplotlib at a cycle time of around 10 seconds, which is about a quarter of the time it takes to lose any image file you have generated somewhere on your goddamn disk!

Today I put my new hang-glider data logger (which is held together with lots of glue) onto the floor of the elevator and recorded the Z-axis accelerometer sampling at 100Hz.


With my new powers of filtering

b, a = scipy.signal.butter(3, 0.005, 'low')
pZ["fz"] = scipy.signal.lfilter(b, a, pZ.z)

we can get this:


The art of using sensors is to use a set of them in order to access the underlying reality.

Here is the raw barometer reading as the elevator goes up and down several floors.

Once again it’s a good idea to filter it, only this time we set the frequency of the Butterworth filter to 0.01 instead of 0.005, because the barometer readings are at 50Hz and we need the time delay from the filters to be comparable.


Because it’s nice and smooth after filtering, we can differentiate it by differencing, using pF["df"] = pF.f.diff()*50/10. The factor of 50 is because it’s sampled at 50Hz, and the divide by 10 converts from barometric readings to metres (approximately). The scales are metres per second, and metres per second squared. The elevator travels at about 1.5m/s. The section on the right with the green line lifted for 12 seconds was travelling down about 3 floors at once, so the air pressure was constantly rising.
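On a synthetic trace the scaling works out as described (a sketch; the numbers here are invented, with the pressure reading falling 10 units per second as if climbing steadily):

```python
import pandas

# fake barometer sampled at 50 Hz, falling 10 units (~1 m) per second
pF = pandas.DataFrame({"f": [101325 - 10*i/50.0 for i in range(500)]})
pF["df"] = pF.f.diff()*50/10   # *50: per-sample -> per-second, /10: -> metres
# df comes out at a steady -1.0: pressure falling by 1 m-equivalent per second
```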


Acceleration is the differential of the velocity, so setting ddf = df.diff()*50 gets us the following plot:

What we now see are just horrendous oscillations in the barometer. Every time you differentiate you magnify the noise.

But we can reduce the lowpass frequency of this filter by a factor of 5, so it’s 0.001 for the accelerometer and 0.002 for the barometer, and get a curve that agrees (where all the scaling seems to work out without any fudging).


The problem now is you can’t see the elevator moving between the different floors at all. What are those barometric oscillations from? Was I breathing too hard?

Let’s plot the barometer differences against the filtered barometer differences.

There’s a little too much noise, so filter the barometer difference by 0.1 instead and plot it against the 0.01 filter.


No, I can’t see those oscillations in the underlying data either.

To my eyes they only start to spike up with a filter low pass frequency of 0.03.


So I’m pretty stumped. But now I’ve got my eye on it, I can see a slight waviness at about this frequency in the original barometric reading in the third picture from the top, so it’s not an illusion or a creation of this filter.

If I hadn’t just glued my entire unit together I might have separated just this device out to see if it’s caused by yet another oscillation in the circuitry, like the kind I’d seen on a temperature sensor.

Maybe I’ll have to rerun this elevator experiment with all the other sensors turned off as much as possible to see if I can get to the bottom of this mysterious 4 second oscillation in the barometer.

Then there’s the temperature decay curves to sort out (from the use of a hairdryer this morning), which are not as good as I had hoped.

I am pleased to have finally implemented an AB Butterworth filter on my Arduino.

It’s not a big deal. The code is roughly as follows, and you can get the a[] and b[] arrays from SciPy by executing the function scipy.signal.butter(3, freq, 'low'):

class ABFilter
{
    const float *a, *b; 
    float* xybuff; 
    int n, xybuffpos; 
public:
    ABFilter(const float* la, const float* lb, int ln); 
    float filt(float x); 
};

// C++ has no len(), so the common length of a[] and b[] is passed in as ln
ABFilter::ABFilter(const float* la, const float* lb, int ln) :
    a(la), b(lb), n(ln), xybuffpos(0)
{  xybuff = (float*)calloc(2*n, sizeof(float));  }  // zeroed initial state

float ABFilter::filt(float x)
{
    xybuff[xybuffpos] = x; 
    int j = xybuffpos; 
    float y = 0; 
    for (int i = 0; i < n; i++) {
        y += xybuff[j]*b[i]; 
        if (i != 0)
            y -= xybuff[j+n]*a[i]; 
        if (j == 0)
            j = n; 
        j -= 1; 
    }
    if (a[0] != 1)
        y /= a[0]; 
    xybuff[xybuffpos+n] = y; 
    xybuffpos += 1;
    if (xybuffpos == n)
        xybuffpos = 0; 
    return y; 
}
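To convince myself the ring-buffer logic is right, here is a line-for-line Python transliteration of that filter checked against scipy.signal.lfilter (assuming len(a) == len(b), as in the C++ version):

```python
import scipy.signal

class ABFilter:
    def __init__(self, a, b):
        self.a, self.b, self.n = a, b, len(a)
        self.xybuff = [0.0]*(2*self.n)   # x history, then y history
        self.xybuffpos = 0

    def filt(self, x):
        n = self.n
        self.xybuff[self.xybuffpos] = x
        j, y = self.xybuffpos, 0.0
        for i in range(n):
            y += self.xybuff[j]*self.b[i]
            if i != 0:
                y -= self.xybuff[j + n]*self.a[i]
            if j == 0:
                j = n
            j -= 1
        if self.a[0] != 1:
            y /= self.a[0]
        self.xybuff[self.xybuffpos + n] = y
        self.xybuffpos += 1
        if self.xybuffpos == n:
            self.xybuffpos = 0
        return y

b, a = scipy.signal.butter(3, 0.1, 'low')
abf = ABFilter(list(a), list(b))
xs = [1.0]*50                             # step input
ys = [abf.filt(x) for x in xs]
ref = scipy.signal.lfilter(b, a, xs)      # should agree to rounding error
```

It is direct form I with zero initial conditions, which matches lfilter’s default.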

Friday, November 18th, 2016 at 11:23 pm - - Science Fiction, Whipping

Well, it happened. Donald Trump got elected. And into the hard vacuum of his political philosophy, he’s sucked in all kinds of plague carrying rats.

Rats like Myron Ebell.

People ask me if I am going to restart my proxy blog for Ebell, which I wrote from 2004 to 2011, called The Myron Ebell Climate chronicling his part in the suicide of the human species.

But there’s no point.

Not only is it too late, but everyone else is covering this walking cancer now.

Where were you folks when it mattered? Back in 2004 I said we should run a proxy blog for each one of these think-tank bastards and form a shadow network for this corporate-funded disinformation infrastructure. It was needed to correct the reputations of these monsters who were constantly popping up on the TV and in the newspapers peddling lies and damaging all of society. It would have been easy. They’re slippery pricks, but one could systematically keep their record alive to make sure some of it sticks.

Saturday, October 15th, 2016 at 2:20 pm - - Cave, Hang-glide, Kayak Dive

Maybe I’ve got writer’s block. I’ve not even filled these into my logbook. I call it a hat-trick if I do a cave trip, a hang-glider flight and a dive in the same week. This is the fourth time I’ve done it. Generally speaking, the individual events are not all the greatest: the dive was pretty murky, the cave was gritty, and the flight was ridgy. Can’t complain.

The wreck of the Azmund is in Holyhead harbour about a mile of paddling out from the beach.

It was dark and murky and we didn’t find the way back to the boilers after starting on it. The wreck is huge though. Part of the metal juts out of the water at low tide.

On the way back we discovered why the beach we launched from is not popular with kayakers — it dries out to about 500m. We couldn’t see our boats after the first time we walked back with a load to the car.

That was Saturday. There was a pleasant day out at Moelfre, with some people being terrified of the currents, but it was the wash from the joy-riding lifeboats that nearly sank us. The image of the almost-breaking 4 metre high wall of water that came upon us while we were anchored in the shallows of Rat Island, a few minutes after they zoomed through the channel, is going to live long in my memory. The second dive worked out well when we found the remains of the Royal Charter in the sand, after grovelling in the shallows among the kelp where it was supposed to be until giving up.

Then there was a cave trip to the far end of Ingleborough Show Cave (the only photo of which I have is a line of cavers getting changed on the footpath), followed by a quick escape home ostensibly to start clearing out the house, but which was in fact an excuse to be in North Wales for a flight off Penmaenbach.

I landed on the dwindling beach at high tide after an hour of very smooth sailing in the sea air.


A concrete breaker was hired to smash up and take down the floor. We filled a skip with the crap a day later with some help from friends.

Now we live in a building site. Again. And it’s mid-way through October.

Saturday, September 24th, 2016 at 1:51 pm - - Flightlogger

I’ve been flying around with my data logger for two years now and I have only this week developed a theory of how to process the temperature data. I might have got to it sooner had I known enough physics or bothered to test the actual response time of my Dallas temperature sensor, which was so abysmal I had to start doing something about it. To be fair, it did take me over a year to even shelter it from the sunlight.

Part 1: The responsiveness of the temperature sensor is 0.09

The apparatus involved running a vacuum cleaner against the sensor housing to create a fast airflow while drawing the air either from the room or down a tube containing a bag of ice.


This was the graph it made, holding the intake near the ice and away from the ice for about 40 seconds at a time.

The curves you see are exponential decay curves, as the temperature sensor cools to the cold air temperature, then warms to the room air temperature, quickly at first and then slowly as it converges.
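Assuming the 0.09 in the Part 1 heading is the decay rate in 1/s, it can be recovered from such a curve by a straight-line fit to the log of the temperature difference. A sketch on synthetic data standing in for the ice-bag run:

```python
import math

# T(t) = Ta + (T0 - Ta)*exp(-k*t); a least-squares line through the points
# (t, log(T - Ta)) has slope -k.
Ta, T0, k = 2.0, 20.0, 0.09      # invented ice/room temperatures, k in 1/s
ts = [i*0.5 for i in range(80)]
temps = [Ta + (T0 - Ta)*math.exp(-k*t) for t in ts]

logs = [math.log(T - Ta) for T in temps]
tm = sum(ts)/len(ts)
lm = sum(logs)/len(logs)
slope = sum((t - tm)*(l - lm) for t, l in zip(ts, logs)) \
        / sum((t - tm)**2 for t in ts)     # recovers -0.09
```

On real data Ta is not known exactly, so it would have to be fitted too, or taken from the settled tail of the curve.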