Here’s a zoomed-in section of my fridge temperature as it rises by about 0.75 degrees an hour, or 12 units of 1/16th of a degree which my Dallas OneWire DS18B20 digital temperature sensor reads at its maximum 12 bits of resolution.

The readings don’t jump between more than two levels when the temperature is stable. You’d expect some more random hopping from signal noise.

Indeed, applying a crude Gaussian filter doesn’t seem to do much good. This (in green) is the best I got by convolving it with a kernel 32 readings wide (equating to about 25 seconds):

```python
import math

filteredcont = []   # cont = [ (time, value) ]
k = [math.exp(-n*n/150) for n in range(-16, 17)]
sk = sum(k)         # (the 1/150 constant chosen for small tail beyond 16 units)
k = [x/sk for x in k]
for i in range(16, len(cont) - 16):
    xk = sum(x[1]*kx for x, kx in zip(cont[i-16:i+17], k))
    filteredcont.append((cont[i][0], xk))
```

The filtered version still has steps, but with a rough slope at the level changes. This filter is very expensive, and no better than the trivially implementable alpha-beta filter, which smoothed it like so:

```python
a, b = 0.04, 0.00005   # values picked by experimentation
dt = 0.5
vk, dvk = cont[0][1], 0
cont3 = []
for t, v in cont:
    vk += dvk * dt           # add on velocity
    verr = v - vk            # error measurement
    vk += a * verr           # pull value to measured value
    dvk += (b * verr) / dt   # pull velocity in direction of difference
    cont3.append((t, vk))
```

With such a slowly varying signal, the noise seems to encounter different zones of variability.

Take three temperatures 1/16 of a degree apart: 3.4375, 3.5 and 3.5625. When the fridge temperature is anywhere between 3.48 and 3.52 degrees it will read a rock steady constant 3.5 degrees, but when it’s in the boundary zone, say between 3.52 and 3.5425 degrees, the sensor will choose at random either 3.5 or 3.5625 (with a probability weighted by which end of the interval it is closer to), before locking on to a steady 3.5625 as the temperature rises into that stable zone.

*[I am now doubtful that this observation of stickiness is true. It is, however, an emergent anomaly that makes it impossible to filter these sequences into a straight line when the underlying temperature change is shallow and exactly linear.]*

As mentioned, the 1/16th of a degree Celsius readings correspond to 12 bits of resolution. This requires 750ms for a reading, according to the documentation. If you want it to 1/8 degree precision (11 bits) you can get the answer in 375ms. 10 bits of precision takes 187.5ms, and 9 bits (half a degree) takes 93.75ms. Each extra bit of resolution doubles the conversion time.
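That relationship is simple enough to state in a couple of lines (the 750ms figure is from the datasheet; the helper name is mine):

```python
def conversion_time_ms(bits):
    # DS18B20 datasheet: 750 ms at 12 bits, halving for each bit dropped
    return 750.0 / (2 ** (12 - bits))

# conversion_time_ms(12) -> 750.0, conversion_time_ms(9) -> 93.75
```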

Maxim (Dallas) Application Note 208 states:

Maxim Integrated uses its unique manufacturing capabilities to provide factory-calibrated digital temperature sensors with accuracy as high as ±0.5°C…

[You can successfully compensate for the error offset of the device which is a simple second order curve and] of a repeatable nature for bandgap-based sensors.

This will not work for dual-oscillator-based thermal measurement circuits.

The silicon bandgap temperature sensor is a peculiar circuit that produces an almost exactly linear voltage relative to the temperature. The proposed compensation above requires you to measure and model the second order response curve of the device, as they couldn’t be bothered to calibrate and implement it in the factory.

To convert the voltage into a number, you need an analog to digital converter, of which there are at least 10 implementations.

Because the time duration required to read the temperature doubles for each extra bit of precision, I don’t think it’s using the successive approximation method, which should take a constant time for each extra bit. Instead, it must be a Wilkinson ADC, where the voltage is converted to an oscillation and the device counts the number of oscillations within a fixed time window. It’s like a series of sparrows hitting your front door at exactly one second intervals. If you open your door for exactly 15.5 seconds, then you would expect half the time to get 15 sparrows in your living room, and the other half of the time 16 sparrows. You will never get 14 or 17 sparrows if the sparrows are evenly spaced. More consistent measures could be made if you opened your door immediately after the last sparrow had hit, instead of simply at random.
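The sparrow picture is easy to simulate. A hypothetical sketch (function name mine): open the door for 15.5 seconds at a random phase, with sparrows arriving at exact one-second intervals:

```python
import math
import random

def count_arrivals(window=15.5):
    s = random.random()   # door opens at a random phase within a second
    # sparrows hit at integer seconds; count the integers in [s, s + window)
    return math.floor(s + window) - math.ceil(s) + 1

counts = {count_arrivals() for _ in range(1000)}
# counts only ever contains 15 and 16 -- never 14 or 17 -- because
# the arrivals are evenly spaced
```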

Let’s take a break from all this hard theory, and look at the readings I got from the DoESLiverpool fridge:

That’s a beautiful 30 minute cycle between 5 and 9 degrees, exponential curves with about half the time spent running and burning electricity. (Vertical lines are hourly intervals, and horizontal lines are degree intervals from zero.)

We have various fridge opening events, usually in the form of double spikes in the light measurement, because you open it to take the milk out for your tea, and then open it again to put the milk back in.

The bimodal hopping at 1/16ths of a degree occurs all the way along.

I decided to port my damn-efficient convex-hull-based run-time line fitting arduino code into Python so I could plot the results.

Below is the fitting of a curve with 67265 sample points to a series of 123 straight line intervals at a precision of 0.3 degrees, which is deliberately loose to show how it works.

This is the same again, but to a precision of 0.09 degrees, resulting in 525 subsamples (less than 1% sampling) which represent the curve.

This is about the threshold. If you go down to 0.06 degrees it’s 10756 points, because it starts following all the up and down little steps.

Line fitting of this kind could be applicable to all time-series data, so you aren’t tempted to undersample to reduce the data at the cost of losing temporal resolution. If you think about it, your standard time series data

[ (t0, v0), (t1, v1), ... (tn, vn) ]

is recorded with the time intervals (t_{i} – t_{i-1}) virtually the same, which means that these values contain no information (subject to the odd glitch).

The only unsatisfying thing about it is that it can only refer to points that are in the data. If you had something simple like a staircase curve:

[(0,0), (0,1), (1,1), (1,2), (2,2), (2,3), ...]

there isn’t any subsampling of this set that will fit quite as perfectly as the line *y = x + 0.5*.

It might be possible to get a better line fit by considering the simple linear regression within the section windows, and simply trigger a new point whenever it goes out of tolerance.

The computational advantage here is that we can find the best fit curve from the cumulative sums of t, t^{2}, t*v, v and v^{2}, which means it could be realistic to encode into the arduino.
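To illustrate the point, here is a hypothetical sketch (class and method names are mine) of keeping those cumulative sums as samples arrive, so each new reading updates the fit in constant time:

```python
class RunningFit:
    """Accumulate the sums needed for a least squares line fit."""
    def __init__(self):
        self.n = self.st = self.st2 = self.stv = self.sv = self.sv2 = 0.0

    def add(self, t, v):
        self.n += 1
        self.st += t
        self.st2 += t*t
        self.stv += t*v
        self.sv += v
        self.sv2 += v*v

    def line(self):
        # slope and intercept of the best fit line through the samples so far
        m = (self.stv*self.n - self.st*self.sv) / (self.st2*self.n - self.st*self.st)
        c = (self.sv - m*self.st) / self.n
        return m, c
```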

Here’s how we can plot the regression across several segments:

```python
def lf(cont):
    n = len(cont)
    st  = sum(t   for t, v in cont)
    st2 = sum(t*t for t, v in cont)
    stv = sum(t*v for t, v in cont)
    sv  = sum(v   for t, v in cont)
    sv2 = sum(v*v for t, v in cont)
    m = (stv*n - st*sv)/(st2*n - st*st)
    c = (sv - m*st)/n
    return [(cont[0][0], cont[0][0]*m + c), (cont[-1][0], cont[-1][0]*m + c)]

for i in range(0, len(cont)-500, 500):
    sendactivity("contours", contours=[lf(cont[i:i+500])])
```

I made a slight tweak to the algorithm so it fits these least squares lines as it goes along, but my implementation is subject to over-shooting.

So that’s a bust.

The next plan is to look one node back and see if it can be adjusted for a better fit of the lines on either side of the sequence.

Before that, I wired up a TMP36 sensor to a proper ADS1115 analog to digital converter, and got this trace in yellow above the Dallas sensor readings (scaled and translated to fit the picture):

Applying the maximum reading filter on a sample window of data (where the samples can be so much more frequent than the Dallas readings), produces this exceptionally clean curve:

This leads me to conclude that these fancy Dallas sensors are a waste of time. If I am trying to do some basic research I should start with as clean a set of data as possible to make my algorithms work. Then, when the theory is functioning, we can talk about making it work on crappier instruments where the signal is harder to pick out.

**Update:** Taking the max over 2-second windows of 10 samples taken at 1/5 second intervals is giving superlative results around the fridge opening events.
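The windowed max is about the simplest filter there is; a sketch (function name mine) of the max over each consecutive block of 10 samples, which is 2 seconds at 5 samples per second:

```python
def max_filter(samples, width=10):
    # maximum of each consecutive, non-overlapping block of `width` samples
    return [max(samples[i:i+width])
            for i in range(0, len(samples) - width + 1, width)]
```

This presumably works so well here because the noise mostly pulls readings downwards; with symmetric noise a median would be the usual choice.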

I’m going to put fridge monitoring aside and play with my other toys for a bit.

I have a few surpluses by now. I got a realtime clock which is 5V, and a microSD card reader which is 3V3; the Jeenodes run on 3.3V and the normal arduinos are 5V, so I can’t easily use either as the controller for the datalogger. Some of the more idiot-proof breakout boards have converters on them, so they are safe for either voltage. Adrian has warned me to prepare for the coming of the 1.8V standard everywhere soon. I bought a combined ArduLog-RTC Data Logger, which for the moment is not playing ball.

Meanwhile, I’ve made a rule for the data logging of sensor data. Don’t do it. It’s not an end in itself. Too often people take on projects to collect sensor data and upload it to the internet (it’s Tuesday, so the site must be called Xively) with the idea that anyone else in the world could download it and [**rolls eyes**] *“Do whatever they want with it.”*

*“Like what?”*

*“Whatever they want!”*

If **you** can’t think of a single interesting application for your data, why do you think anyone else in the world will be able to? And even if there was anyone in the world who could do something with it, they’re probably the sort of person who’d have their own data, which is guaranteed to be a lot more interesting to them than yours. There’s a reason we don’t have a CCTV channel of someone else’s back door at night on cable TV.

I’ve formulated a stronger principle:

The value of sensor data is inversely proportional to the product of the time that has elapsed since it was collected and the distance you are from the subject of the data.

Let’s take a simple case.

Patrick made CoffeeBot by putting the coffee machine onto an electronic weighing scale connected to an Arduino with an ethernet shield that talks to the internet.

He writes:

The optimum condition for using the CoffeeMonitor [a second arduino device that shows an arrow to the level of the coffee] would be out-of-sight of it, but still in the same building.

Clearly, the data about the level of the coffee in the pot at some moment 30 days ago is of no value.

Similarly, the data about the level of the coffee right now is of no value to someone in Manchester.

The blurb concludes:

While collection of the level of the coffee in the machine is an interesting exercise and it’s nice to know if there’s coffee available before getting up, there are some more practical uses once the data has been collected. Averaging the number of cups consumed during each day might give guidance on when to stop making coffee in the afternoon, and how many cups to make in the last pot. Tracking total cups made would mean that it’s possible to determine how regularly coffee beans need to be purchased so as to keep them available, but still fresh.

All very worthy, if over-optimized suggestions. So, how come when I get to the hackspace before everyone else in the morning and put the coffee on, the people who normally do it have to come in and ask me if I’ve put the coffee on just to make sure that they haven’t just poured themselves a cup that’s been baking on the hotplate overnight?

It seems that nobody has got round to writing the simple code that works out if the coffee has been on there for more than six hours and lights up a poop-coloured LED.

Data logging should therefore be banned as a distraction.

*My god, is there really that much noise in the coffee scale signal?*

Data streaming, however, is essential for the development of algorithms that can turn raw sensor data into actually useful information.

So I deleted my sophisticated hour-by-hour-file, single-day-directory CSV data logging system, whose files could be loaded into Excel and badly graphed to make you go “Aw, that’s interesting”, and replaced it with a brutally simple sequential hex dump from the sensors to the SD card that is intended only to be parsed and replayed back into the system, either from the SD card or down the serial port — once I have worked out how to do either.

This will make it possible to systematically develop, debug and test how I am going to handle noise and detect signals relative to the various scatterbrained projects I have already engaged myself in.

These are:

**FridgePhoton** (aka Fridgahedron) – Detect the parameters of the cooling and warming cycle of your fridge, and see how it varies throughout the day and night, and what are the effects on it from the outside temperature, the fullness of the fridge, and the cleaning of detritus from the rubbery seal that’s letting the cold air leak out, the marked effect of which you should be informed of in less than an hour so that you know that you earned that beer in the fridge that you’re going to have. My scientific hypothesis is that a full fridge will have a longer cycle because of the thermal mass. The market for this product is as an upgrade of the Fridgeezoo. Additionally, I contend that you’re not going to get retrofitted **smart demand response fridges** until the plug-in device that turns it on and off (probably by responding to signals transmitted over the mains A/C frequency) knows exactly what’s going on in the fridge so it doesn’t switch it off at the worst possible time and melts your freezer. It would help to have this intelligent toy in the fridge first, which is then able to communicate with and over-ride the dumb wall socket switching unit so it cuts the power only when it doesn’t matter to you but reduces a spike on the grid in an arbitrage deal. Smart arduino-based fridge controllers that are able to learn the thermal overshoot degree have been developed for the purpose of brewing beer.

**Methane detector** – This is a prototype for CO2 sensing which can control ventilation in Andrew’s air-tight house. I had hoped that I could track the transfer of air around my not-at-all air-tight house, but early experiments show that you have to hold it quite close to the gas cooker to get any reading. I do have a more sensitive analogue-to-digital converter than the arduino analog pin, so I am not done yet. I will not be satisfied until I have measured a positive concentration in at least one fart.

**Total light switch duration** – A device you can stash above the strip lights in Becka’s office building that logs their on-off periods over a week and prove to the facilities manager which ones have their motion sensors buggered because they’re always on even when no one is there.

**PALvario** – Following the discovery that gliding variometers (which detect the rate of ascent or descent) can be made with a very accurate altimeter chip (one of which I now have), I decided I would like to have a *Peace-At-Last vario* that does not bleep at you all the time while you are flying. My haptic vario is going to be connected to a vibration motor against my skin, so I can feel the information as to whether I am going up or down, leaving me to enjoy the sensation of the clear air rushing past my head.

**Wing nerves** – This is the daddy. A hang-glider is the ultimate in wearable tech. Birds have an advantage because they can feel the changing pressure on their living wings as they soar through the air. Can I install air pressure sensors along the top surface of my wings which will be able to detect changes in its flight characteristics at high speed or close to stall? Most electronic barometers are connected to the I2C bus, which has a range of about a metre. We need an oscilloscope to check if we have the right shielded cabling to make it go longer. There’s the issue of finding a way to attach it onto the surface of the wing, probably taping a wire back over the trailing edge and along to a box on the keel. The barometric device is said to be sensitive to sunlight. My first flight will merely be data logging to see if anything can be found in the data. Relating the data stream to events on the flight (turn left, turn right, go slow) will require synchronizing it with the video.

The flight video, of course, is the ultimate in data logging, and I will continue to do this, so the title of the post is all wrong. I am going through rapid iterations and changing my mind in ways that would not be possible if I was instructing someone else to do the work for me. It would drive them nuts. It must be done myself.

Imagine a larger organization where the inertia is fully formalized, and engineers (who are never allowed to hold any authority) have to tediously argue the case to a middle manager or other responsible adult to release the funds and employee time to work on a new idea. Then on day one it doesn’t look like it’s going to work so well, and you’re forced to either flog this dead horse for the duration, or attempt to undo all that work you just did with the middle managers building up the idea, and persuade them that you want to do something else.

There’s no way you’re going to be bothered. They need to get this inactive talking shop idea clearing process out of the picture and instruct engineers to do different things all the time. The manager should say, Do whatever you want, but I’m going to come in once a week, and I want to see you doing something different each time I see you. Not the same project day after day. That’s the only way you are going to absorb the breadth of experience to start making stuff that puts together something new.

Meanwhile, last night’s home fridge trace: horizontal lines are degrees centigrade from zero at the bottom, vertical lines every hour, and the red spikes are the fridge light when I put the device inside in the evening and took it out in the morning. It’s working to a four hour cycle, and there’s a mysterious two slope step during the cooling section that lasts half an hour. There’s a large freezer compartment on the bottom for keeping a year’s supply of Becka’s breakfast blackcurrants.

I was going to spend some time running it through my peak detection algorithms, but now I’ve just wasted the whole afternoon on a blog.

Short ints are only 2 bytes, with a maximum value of 32767, so you’re always overflowing them and it’s not worth the hassle. Therefore you have to use long ints, which are 4 bytes, the same as a float, so saving precious memory is not a factor in this decision.

Anyways, I woke up this morning and decided I needed some benchmarking.

After a number of false starts, I got as far as making the following arduino code:

```c
#define P(X) Serial.print(X)

float x1 = 11, x2 = 2;
long n = 100000L;

void setup()
{
    Serial.begin(9600);
    long t = millis();
    float x;
    for (long i = n; i > 0; i--) {
        x = ((x1*x2*13*5 - 16)*3 - 1700)*3;
    }
    t = millis() - t;
    P("Force x to exist by printing it: ");  P(x);  P("\n");
    P("floatmult ");  P(t);  P(" ");
    P((float)t*1000/n);  P(" microseconds");  P("\n");
}

void loop() {}
```

One of the problems is that the compiler is sh*t-hot, and will make the variable **x** disappear along with all the associated calculations if you don’t print it out, leading to a computation time of zero microseconds for n calculations for all values of n.

By defining the x1 and x2 variables as floats, all five multiplications in “((x1*x2*13*5-16)*3-1700)*3” are cast to floats, producing a time estimate of about 59 microseconds for that formula. Tagging another “*x1” onto it gets 66 microseconds, so it’s a rough guess of about 10 microseconds per floating point multiplication.

If I make the second line above read “int x1 = 11, x2 = 2;”, we’re down to 3.5 microseconds, or probably 0.5 microseconds per integer multiplication. That’s a factor of 20 difference in speed. (The Mega Arduino and the cheap Jeenode (Uno) Arduino give exactly the same timings.)

I am not an expert, but this may be indicative that the floating point implementation is being done by the compiler in software.

Meanwhile, a press release from Atmel about a completely different device states:

MUNICH, Nov. 10, 2010: — Today at Electronica 2010, Atmel announced the first 32-bit AVR microcontrollers with a floating point unit (FPU). Targeting industrial control applications, the new Atmel AVR UC3 C MCU series offers a unique mix of high-processing power, true 5V operation, high-speed communication and advanced safety and reliability features packaged in a range of small and miniature packages.

The IEEE 754-1985-compatible FPU increases the performance, precision and dynamic range of calculations offered by the Atmel AVR UC3 CPU. The native support for the floating point arithmetic allows design engineers to use a full-featured toolbox for designing sensor and control applications. In addition, the advanced math can be applied to enhance signal processing, filtering, and noise suppression in a wide range of applications including motor control, robotics and audio.

Now, here’s the problem.

Although it’s not in hardware, 10 microseconds per floating point calculation is not that bad, if there are few enough of them.

And this press release indicates that the more advanced microcontrollers will have good floating point performance, and that people might find it useful for signal processing. As I believe there is no technical reason you can’t put a good FPU onto a cheap chip, it’s probably going to be there eventually if there is a reason for it.

On the other hand, I’ll be able to process many more signals with one little arduino if I stick to integer arithmetic. But integer arithmetic is less reliable and harder to debug and develop with if I am attempting to do something interesting — such as fitting exponential curves to data, or some such thing.

It’s a hard choice.

I think I’m going to rewrite my code and go with floating point. Doing things with integer arithmetic is only going to get in the way and slow me down. And anyway, if I do process many signals at once, the I/O is more likely to choke before the arithmetic processing.

I’m wondering if I can get it both ways by using a #define SCALAR to “float” or “long” and assume I can work with fixed point arithmetic once the floating point version is working.

It’s not that straightforward, as I’d need a second value, say, #define FIXEDMULT 1000, which I’d need to divide by whenever I multiplied two fixed point values. Or set FIXEDMULT to a power of 2, like 1024, so you’d write (a*b)>>10.
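A minimal sketch of that fixed-point arithmetic (the names are mine, not part of any library):

```python
FIXEDMULT = 1024   # a power of two, so rescaling is a shift

def to_fixed(x):
    return int(round(x * FIXEDMULT))

def from_fixed(a):
    return a / FIXEDMULT

def fx_mul(a, b):
    # the product of two fixed-point values carries the scale factor
    # twice, so shift right by 10 to rescale back down
    return (a * b) >> 10
```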

I’ll stick with pure floats in the initial library then, and not get distracted.

You win some, you lose some.

At the UnlockDemocracy AGM on Saturday I got a motion accepted by the members that electionleaflets.org be done properly. This has the potential to get some professionalism on the situation in time for the next election.

Then I spent 36 hours working and sleeping in the basement of the National Audit Office at the Accountability Hack 2014 on **my project**, with Becka getting predictably very bored at times.

The purpose of the project was to learn how to use PDF.js, which Francis told me about the day before.

I thought I had a good chance with it (being as it is completely practical and could be implemented by the Public Accounts Committee right away), but it did not even get an honourable mention. That honour went to Richard whose Parliamentary Bill analyser disclosed how many goats would need to be skinned to print out the Act, among other things. For more details, see my blogpost from six years ago: The vellum has got to go.

We met Rob for dinner who had a brain machine on the bookshelf, which Becka was very taken with. I can tell you that someone will be learning how to solder in the next couple of weeks, because that is the only way they are going to get one of their own.

The sun was low and blinding, and I couldn’t see anything through my dark glasses when I went into shade, and had to carry them the whole time in my mouth. This is my concerned face about to make a turn in close to the trees.

I briefly got to the top of the stack in very weak thermic activity, and then it got too crowded. Shapes would fly out of the sun and startle you with your complete inability to judge depth on a silhouette.

Second flight (cheap orange glasses instead of cheap dark glasses) had another hang-glider in the air. He went and landed on the hill at this point.

I hung around and spent ten uncomfortable minutes below take-off level until I was able to work my way back up to a sensible altitude to come in for my favourite kind of landing: an uneventful one.

I’ve gotten familiar with the wind-gradient here, so it doesn’t come as a surprise.

Well that’s one way to cheer myself up after those US mid-term elections. It’s such a shame we have to share the same atmosphere and planet with them. If Americans want to teach their children that the world is 6000 years old, hand-guns make you safe, and two weeks annual holiday from a full-time job is a perfectly adequate way to live, then that’s their own problem. But this stuff is global.

Future historians are going to have their work cut out trying to understand where we went wrong, and why the very rich thought it would be such a good idea to mold human civilization in this particular direction, and why so many smart people took the money and helped them do it. It will take a lot of empathy from people a hundred years from now to really get under their skin in terms of understanding what was on our minds. But they might not care to do so, considering the irreparable damage that was knowingly caused by our generation for no gains. But it will be very important that they do so, because our mental and political flaws will be their mental and political flaws. The capability for some fiercesomely smart and effective organization for the implementation of total stupidity is not going to be bred out of us anytime soon.

Oh yes, he says, obviously VCC is standard code for “power in” for that red square in the centre-left of the picture that contains a microSD card and requires 3.3V of power — even though this is nowhere stated and all the other circuits in this kit use 5V.

Whatever.

It’s not much of a standard when this is immediately contradicted by the thin thing on the bottom left of the picture (called a Jeenode) which labels its corresponding power pin “**PWR**“, and the low-power bluetooth blue board on the middle of the white panel which calls its power pin “VIN” for “voltage in”, and the red “real-time clock” thing above it which labels its power pin “**5V**“, which is so much better because: (a) it is immediately understandable by the man in the street, (b) it conveys the crucial information about the level of voltage required, and (c) it uses one fewer character when the labels are already too small to read without a magnifying glass which I do not have but should get.

So WhyTF do they use any of those other codes?

Ah, you might say, wouldn’t your logic require sometimes writing “**3.3V**“, which is four characters?

Well, no, actually, because the thing in the middle with the USB plug has two power pins on it, one called “**5V**” and the other called “**3V3**“, so they were forced to be sensible.

Of course, I’ll be proved wrong when I find a peripheral that contains both “**VCC**” and “**VIN**” pins.

Don’t get me started on all the other pin names, especially on the different arduino boards on which they’ve failed to mark out these all-important SPI pins that are either pins 11, 12 and 13, or pins 4, 1 and 3, or pins 51, 50 and 52, or you have to look it up on this handy diagram if you have a Jeenode.

I think electronics got off to a bad start from the very beginning when they decided that *current* flows in the opposite direction to the electrons. From then on it’s been seven human generations of miscodings and mistakes that have been adopted as conventions, resulting in something not unlike spelling in the English language — i.e. you can’t see the problem once you have gotten used to it.

One of the steps in the process is to displace all the inertial unit measurements by a small error term in order to minimize the error in the correspondences.

More simply, we have a matrix **A** of height m and width n (m>n), a column vector **b** of height m, and we want to fill in the column vector **x** of height n such that **A x = b** is almost true.

There is no exact solution, so we look for a least squares answer, where **|A x – b|^2** is minimal.

Luckily, there is a function scipy.linalg.lstsq() which does the job.

Let’s consider a simple example of a 3×2 matrix:

```python
import scipy.optimize
import numpy as np

A = np.array([[2, 3], [1, 2], [-9, 8]])
b = np.array([7, 8, 9])
x = np.linalg.lstsq(A, b)[0]
print(x)                  # [ 0.9631829   2.21615202]
print(np.dot(A, x) - b)   # [ 1.57482185 -2.60451306  0.06057007]
```

In theory, any other value of **x** will give a larger residual.

The problem is I am frequently working with a 2200×2000 matrix, which takes up to ten seconds to solve.

I am not a numerical analyst, but I have been told that it’s worth looking at the Limited Memory Broyden–Fletcher–Goldfarb–Shanno algorithm. Luckily, there is one implemented in the scipy.optimize.minimize(), so I don’t need to understand anything.

```python
def func(x):
    bd = np.dot(A, x) - b
    return np.dot(bd, bd)

x0 = [0.0, -1.0]   # pick any starting point
res = scipy.optimize.minimize(fun=func, x0=x0, method="L-BFGS-B")
print(res)
# nfev: 9
# x: array([ 0.9631829 ,  2.21615202])
```

This looks pretty close to the answer from lstsq(), after what it claims are 9 function evaluations.

However, when I put a counter into func(x) I got 36 evaluations, many of which are small increments around a single value as though it is numerically computing the partial derivatives:

```
funceval 13 [-0.40910492  1.00305985]
funceval 14 [-0.40910492  1.00305985]
funceval 15 [-0.40910491  1.00305985]
funceval 16 [-0.40910492  1.00305986]
funceval 17 [ 0.64061277  2.10014887]
funceval 18 [ 0.64061277  2.10014887]
funceval 19 [ 0.64061278  2.10014887]
funceval 20 [ 0.64061277  2.10014888]
```

The solution to this is to include a gradient function, which is a vector of partial derivatives in each of the coordinates of **x**:

```python
def grad(x):
    bd = np.dot(A, x) - b
    return 2*np.dot(bd, A)

res = scipy.optimize.minimize(fun=func, x0=x0, jac=grad, method="L-BFGS-B")
```

Now I get one function call to each per iteration, like so:

```
funceval 5 [ 0.64061285  2.10014896]
gradeval 5 [ 0.64061285  2.10014896]
funceval 6 [ 0.96463734  2.21446478]
gradeval 6 [ 0.96463734  2.21446478]
```

How do we know that I’ve programmed the grad() function right? Well, for a start, its evaluation at (0.9631829, 2.21615202) is nearly zero, as you would expect at a minimum. And secondly, I can check it with the handy scipy.optimize.check_grad() function, which automatically compares its value to a numerical approximation.
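On the toy example that check looks something like this (restating A, b, func and grad so the snippet stands alone):

```python
import numpy as np
import scipy.optimize

A = np.array([[2., 3.], [1., 2.], [-9., 8.]])
b = np.array([7., 8., 9.])

def func(x):
    bd = np.dot(A, x) - b
    return np.dot(bd, bd)

def grad(x):
    bd = np.dot(A, x) - b
    return 2*np.dot(bd, A)

# check_grad returns the norm of the difference between grad() and a
# finite-difference approximation; near zero means grad() is consistent
err = scipy.optimize.check_grad(func, grad, [0.0, -1.0])
```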

So, how does this work with my real world 1266×1201 size matrices?

I have a case where func([0]*1201) evaluates to 97.6421, and the lstsq() function finds a value of **x** in 2.77 seconds where func(x) evaluates to 7.8202.

So, how does the L-BFGS-B algorithm perform on this when I put it in?

Horribly.

Starting from x0=[0]*1201 it gets to an answer that evaluates to 92.6386 after 1000 iterations in 14 seconds.

I tried the BFGS implementation itself, which doesn’t allow you to limit the number of iterations, and it came back after 35 seconds with an evaluation of 96.8386.

Then I discovered scipy.optimize.leastsq() within this huge sprawling interface and noted that it’s not to be confused with scipy.linalg.lstsq().

Back to the toy example, where instead we need to return the whole vector before we’ve squared it:

```python
def func(x):
    return np.dot(A, x) - b

res = scipy.optimize.leastsq(func=func, x0=x0)
print(res)   # (array([ 0.9631829 ,  2.21615202]), 3)
```

which is a good answer.

Now back to the real world case of my 1266×1201 matrix, where it produces a residual of 7.8202 (same as the linalg.lstsq()) after 29 seconds and 8417 evaluations of func(x).

But, it’s possible to give it the Jacobian (the matrix of partial derivatives), which in this case is just the **A** matrix, like so:

def Dfun(x):
    return A

res = scipy.optimize.leastsq(func=func, x0=x0, Dfun=Dfun)

Now it gets to that answer in 12 seconds after 10 calls to func(x) and 8 to Dfun().

According to the documentation, this uses MINPACK, a set of FORTRAN libraries written in 1980, to implement the Levenberg–Marquardt algorithm.

So the upshot is that nothing comes close to the performance of the linalg.lstsq() function in this case, which appears to be sourced from another worldwide public FORTRAN library, LAPACK, developed since 1992; as disclosed in the scipy source code, it wraps the gelss routine.

So, that’s what I have found out today. I am now slightly familiar with how to call into this giant set of fundamental best-in-the-world numerical solvers.

It remains to explain why the BFGS functions performed so badly on my real-world matrices. I was kind of counting on them performing a bit better than they did.

Things to do tomorrow: Get familiar with using kd-trees, and more hacking on the SLAM thing, if I can cope with it. Or I could do something else, like work on a GPL implementation of an STL slicer, which is at least something I know I can do.

Anyways, while doing my apt-cache searching for stuff, I noticed stimfit – *Program for viewing and analyzing electrophysiological data* – show up in the search for scipy.

It appears to take datasets of electro-potential readings from a single neuron at every tenth of a millisecond and then fit exponential decay curves *[the thick grey line]* to selected sections from the (negative) peak to the baseline.

A bit like a temperature sequence, eh?

Oh, and it has a funky Python shell built into it to help you automate the analysis functions. What’s not to like?

Using the power of open source, I can find the code that does the exponential curve fitting. It looks pretty familiar in the way it hackily matches a sequence of values to a positive or negative exponential decay curve: guessing the “floor” (limit) value, subtracting it, flipping the readings if necessary, and applying a **log** function before doing a simple least squares linear regression.

void stf::fexp_init(const Vector_double& data, double base, double peak,
                    double RTLoHi, double HalfWidth, double dt, Vector_double& pInit)
{
    bool increasing = data[0] < data[data.size()-1];
    Vector_double::const_iterator max_el = std::max_element(data.begin(), data.end());
    Vector_double::const_iterator min_el = std::min_element(data.begin(), data.end());
    double floor = (increasing ? (*max_el+1.0e-9) : (*min_el-1.0e-9));
    Vector_double peeled( stfio::vec_scal_minus(data, floor));
    if (increasing)
        peeled = stfio::vec_scal_mul(peeled, -1.0);
    std::transform(peeled.begin(), peeled.end(), peeled.begin(), log);
    Vector_double x(data.size());
    for (std::size_t n_x = 0; n_x < x.size(); ++n_x)
        x[n_x] = (double)n_x * dt;
    double m = 0, c = 0;
    stf::linFit(x, peeled, m, c);
    ...
}
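The same trick can be sketched in a few lines of Python (my own rough translation with made-up names, not the stimfit code): guess a floor just beyond the extreme reading, peel it off, flip if the data is increasing, then fit a line to the logs.

```python
import math

def fexp_init_sketch(data, dt):
    # guess the limit ("floor") just beyond the extreme value
    increasing = data[0] < data[-1]
    floor = max(data) + 1e-9 if increasing else min(data) - 1e-9
    # peel off the floor so the remainder is a pure (positive) exponential
    peeled = [v - floor for v in data]
    if increasing:
        peeled = [-v for v in peeled]
    logged = [math.log(v) for v in peeled]
    xs = [n*dt for n in range(len(data))]
    # simple linear regression of logged values against time
    n = len(xs)
    sx, sy = sum(xs), sum(logged)
    sxy = sum(x*y for x, y in zip(xs, logged))
    sx2 = sum(x*x for x in xs)
    m = (n*sxy - sx*sy) / (n*sx2 - sx*sx)
    c = (sy - m*sx) / n
    return m, c   # data is approximately floor +/- exp(m*t + c)
```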

Shame this is unnecessarily in C++, which makes it hard to hack. I suspect this is for historical reasons, and left as-is because it's mathematical stuff which people like to treat as a black box once it works.

Meanwhile, on another idea that I have been sitting on for years, I've found a complete simulator for spiking neural networks written entirely in Python.

I have to get used to all of my ideas being out there already, done for years. I was going to do my spiking simulation idea in Javascript so that I could drag the neurons around interactively on the screen. However, it's good to see the confirmation bias kicking in here if I ever consider writing some CAM functions again. I will not be using C++, which is an out of date dead language, like FORTRAN, as far as I am concerned. Nothing new should be started in it.

Brian is a simulator for spiking neural networks available on almost all platforms. The motivation for this project is that a simulator should not only save the time of processors, but also the time of scientists.

Brian is easy to learn and use, highly flexible and easily extensible. The Brian package itself and simulations using it are all written in the Python programming language, which is an easy, concise and highly developed language with many advanced features and development tools, excellent documentation and a large community of users providing support and extension packages.

The efficiency of Brian relies on vectorised computations (using NumPy), so that the code above is only about 25% slower than C.

What's the trade-off between searching for stuff to reuse and writing it from scratch?

What's it going to be like in a hundred years time, when whenever you try to write some new experimental code, someone next to you can always find something that does it better already and will be trying out your idea before you've even settled down?

They're like DJ-mixing a thousand sound-tracks simultaneously from the last 50 years when you're trying to put down a five note riff on your guitar. Nearly all time spent coding will be wasted. But then again, nearly all time spent writing poetry is wasted, because someone else has already written a poem expressing that thought and emotion a million times better.

It must be so hard to do schoolwork these days when you can look up every answer and see it expressed a million times better than you're going to put in your homework essay. It's all so clearly futile.

There was an Ideas-Suggestions website inside of Autodesk. My all-time favourite idea that I submitted was for all source code committed by employees to go through an automatic plagiarism detector to make sure you are never cutting and pasting a perfect implementation of a function from elsewhere, and that you are only implementing your own crappy versions that you wrote yourself on the day.

Proprietary code is like school homework essays. It has to be your own work or it's not right. On the other hand, open source code is like Wikipedia entries -- almost always of better quality than the unpublished work from any random student in school.

And of course the school-work is crap. You don't know how to do it and you're trying to learn. The question is, are those programmers making proprietary code for a corporation trying to learn? Or is the point to make code that is owned, because that's actually more important to the boss than code which is any good?


I just got a “Hello world” program working out of a pair of Jeenodes kicking around in the cardboard box left over from the Housahedron project before they migrated to Berlin. Of course, there was no documentation for how to plug in the interface into the Jeenode, and I had to get Adrian’s help.

Then I had to return my Kobo to the shop as it was broken already. And I’d only read one book on it — one of the few Barrington Bayley novels I was missing. I’m hoping to get back into reading books as it’s been a long time. Apparently in this web era all our attention spans have gone to pot. The killer feature about the eReader is that it’s not on the web, so I have to read the books it’s got.

Also, this eReader is known to be hackable into a gliding computer. Luckily I hadn’t taken it apart and voided the warranty, so I got a replacement.

Meanwhile, I also needed to return the karabiner that came with the hang-gliding harness that I’d been using for the past year due to a product recall. Probably best not to think too hard about that.

New hotbin compost in the garden. My sister’s got one and it works for her. This one isn’t getting very hot yet, what with the lid blowing open in the winds. But all the worms are climbing out, which is a good sign.

Last night we took a delivery from Mapo Man who cooked us a tofu dinner with Japanese rice in our house. It’s art.

On Sunday we went surfing at Crosby with the Liverpool Canoe Club from a beach made of eroded bricks.

Too busy getting thrown in to get any photos while paddling. It was very knackering. I haven’t fully recovered from the virus infection I had in Berlin. Becka has a new wetsuit which is not allowed to be taken caving.

And finally, I passed no fewer than five billboards in Liverpool in the last week displaying Heathrow airport expansion disinformation and lies. I was sensitised to the issue because of where I stayed in Berlin, under the flight path, with an airplane screaming overhead every three to four minutes from 6am to 11pm.

If the stock market even remotely worked, there would be no way this would be on the table, because it should have factored in that (a) this will never be allowed in circumstances where we address climate change, (b) all the private wealth in the world won’t be worth very much if we don’t address climate change, and (c) addressing climate change will create a heck of a lot more jobs than expanding a poxy airport which we should be closing down as soon as possible.

My theory is that by fitting exponential decay curves to the data I would get some invariant values relating to the fabric of the building that would change when you improved its insulation characteristics (eg draught-proofing a window).

The first step is to chop this data into the sections where the temperature is dropping. It took a while to get some working code, but it came out like this:

gw = 30   # half an hour
sampleseqs, sampleseq = [ ], None
for i in range(gw, len(samples)):
    vd = samples[i-gw][1] - samples[i][1]   # positive if past temp higher
    if vd >= 0:
        if not sampleseq or vd >= mvd:   # restart seq at bigger difference
            sampleseq = samples[i-gw:i]
            mvd = vd
        sampleseq.append(samples[i])
    elif sampleseq:
        sampleseqs.append(sampleseq)
        sampleseq = None

Now all these sections of declining temperatures are cut out, we can plot them aligned on the left, like so:

And now it’s a matter of picking each section and fitting an exponential decay curve to it, like so:

That one doesn’t fit so well, so I fitted the curve in a piecewise linear fashion on overlapping two hour sections (advanced by one hour at a time in this diagram).

Sadly, the exponential values are all over the place. We can guess that the outside temperature is declining through the night, which is why the lower limit temperature is going down, but this messes with the exponential decay value considerably. If the outside temperature were accelerating upward (ie its rate of falling was slowing), as it would be at the start of the night because most of the cooling happens early on, then this would bend the curve upwards and make the exponential decay value larger relative to what it will be later in the night when the outside temperature is more constant.

Which is all very well, but a narrative explanation (ie BS) isn’t any good when I really need some numbers.

Here’s plotting the time (compressed) against the lower limit temperature vL, which looks like there’s a decline in the night time each day:

And this is plotting the exponential decay factor b against vL (b is always less than 0):

Not a great correlation, but it sort of shows that the exponential decay constant is smaller in magnitude (closer to zero) when the temperature is low.

So, these aren’t great exponential decay fits, unless you account for the limit temperature changing, which is going for a third order effect (fitting curves is second order as it requires 3 points relating to second differential, fitting lines is first order as it requires 2 points relating to first differential).

Might be worth a try. And I should do it now I’m here, except I’ve been at this since early this morning.

But first, how am I fitting these exponential decay curves?

Well, here’s the simple linear regression calculated for the curve **v = exp(b t + a) + vL**, given **vL** and a sequence **sq = [ (t, v) ]**:

n = len(sq)
sx = sum(t for t, v in sq)
sx2 = sum(t*t for t, v in sq)
sy = sum(math.log(v - vL) for t, v in sq)
sxy = sum(t*math.log(v - vL) for t, v in sq)
b = (sxy*n - sx*sy) / (sx2*n - sx*sx)
a = (sy - b*sx) / n

How do I find **vL**?

Well, I run this calculation for the two half sequences **sq0, sq1 = sq[:len(sq)/2], sq[len(sq)/2:]**, and calculate **b0** and **b1** for each of them.

I need to pick a value of **vL** where **b0 = b1**.

By differentiating **b** with respect to **vL** you can solve for **delta_vL** in:

b0 + (db0/dvL) delta_vL = b1 + (db1/dvL) delta_vL

then add it to **vL**, and it converges pretty quickly.
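Put together, the whole iteration can be sketched like this (my own reconstruction, not the original code: db/dvL is estimated numerically rather than analytically, and I've added a simple guard so vL never steps past the smallest reading, which would make the log blow up):

```python
import math

def fitb(sq, vL):
    # slope b from regressing log(v - vL) against t
    n = len(sq)
    sx = sum(t for t, v in sq)
    sx2 = sum(t*t for t, v in sq)
    sy = sum(math.log(v - vL) for t, v in sq)
    sxy = sum(t*math.log(v - vL) for t, v in sq)
    return (sxy*n - sx*sy) / (sx2*n - sx*sx)

def solve_vL(sq, vL):
    # iterate vL until the two half-sequences agree on the decay constant b
    vmin = min(v for t, v in sq)
    sq0, sq1 = sq[:len(sq)//2], sq[len(sq)//2:]
    eps = 1e-7
    for _ in range(50):
        b0, b1 = fitb(sq0, vL), fitb(sq1, vL)
        db0 = (fitb(sq0, vL + eps) - b0) / eps   # numerical db0/dvL
        db1 = (fitb(sq1, vL + eps) - b1) / eps   # numerical db1/dvL
        # solve b0 + db0*delta_vL = b1 + db1*delta_vL for delta_vL
        delta_vL = (b1 - b0) / (db0 - db1)
        if vL + delta_vL >= vmin:
            vL = (vL + vmin) / 2    # don't step past the smallest reading
        else:
            vL += delta_vL
            if abs(delta_vL) < 1e-9:
                break
    return vL
```

On synthetic data generated exactly from v = exp(b t + a) + vL, the two half-sequence slopes only agree at the true vL, so the iteration homes in on it in a handful of steps.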