## Do I or don’t I use floating point?

Wednesday, November 12th, 2014 at 11:48 am

I’ve been playing around with some geometric signal processing on the ATmega328-based Arduino for my run-time line-fitting routine, when it occurred to me that I ought to work out whether I should be using floats or long ints as the basis for this system.

Short ints are only 2 bytes with a maximum value of 32,767, so you’re always overflowing them and it’s not worth the hassle. Therefore you have to use long ints, which are 4 bytes (the same size as a float), so saving precious memory is not a factor in this decision.

Anyways, I woke up this morning and decided I needed some benchmarking.

After a number of false starts, I got as far as making the following Arduino code:

```
#define P(X) Serial.print(X)

float x1 = 11, x2 = 2;
long n = 100000L;

void setup()
{
  Serial.begin(9600);
  long t = millis();
  float x;
  for (long i = n; i > 0; i--) {
    x = ((x1*x2*13*5-16)*3-1700)*3;
  }
  t = millis() - t;
  P("Force x to exist by printing it: ");  P(x);  P("\n");
  P("floatmult ");  P(t);  P(" ");  P((float)t*1000/n);  P(" microseconds");  P("\n");
}

void loop() {}
```

One of the problems is that the compiler is sh*t-hot: it will make the variable x disappear, along with all the associated calculations, if you don’t print it out, leading to a computation time of zero microseconds for n calculations, for all values of n.

By defining the x1 and x2 variables as floats, all five multiplications in “((x1*x2*13*5-16)*3-1700)*3” are cast to floats, producing a time estimate of about 59 microseconds for that formula. Tagging another “*x1” onto it gets 66 microseconds, so it’s a rough guess of about 10 microseconds per floating point multiplication.

If I make the second line above read “int x1 = 11, x2 = 2;”, we’re down to 3.5 microseconds, or probably 0.5 microseconds per integer multiplication. That’s a factor of 20 difference in speed. (The Arduino Mega and the cheap JeeNode (Uno-compatible) Arduino give exactly the same timings.)

I am not an expert, but this suggests that the floating point arithmetic is being done in software, by library routines the compiler links in, rather than in hardware.

Meanwhile, a press release from Atmel about a completely different device states:

> MUNICH, Nov. 10, 2010 — Today at Electronica 2010, Atmel announced the first 32-bit AVR microcontrollers with a floating point unit (FPU). Targeting industrial control applications, the new Atmel AVR UC3 C MCU series offers a unique mix of high-processing power, true 5V operation, high-speed communication and advanced safety and reliability features packaged in a range of small and miniature packages.
>
> The IEEE 754-1985-compatible FPU increases the performance, precision and dynamic range of calculations offered by the Atmel AVR UC3 CPU. The native support for the floating point arithmetic allows design engineers to use a full-featured toolbox for designing sensor and control applications. In addition, the advanced math can be applied to enhance signal processing, filtering, and noise suppression in a wide range of applications including motor control, robotics and audio.

Now, here’s the problem.

Although it’s not done in hardware, 10 microseconds per floating point calculation is not that bad, if there are few enough of them.

And this press release indicates that the more advanced microcontrollers will have good floating point performance, and that people might find it useful for signal processing. As I believe there is no technical reason you can’t put a good FPU onto a cheap chip, it’s probably going to be there eventually if there is a reason for it.

On the other hand, I’ll be able to process many more signals with one little Arduino if I stick to integer arithmetic. But integer arithmetic is less reliable and harder to debug and develop with if I am attempting to do something interesting, such as fitting exponential curves to data.

It’s a hard choice.

I think I’m going to rewrite my code and go with floating point. Doing things with integer arithmetic is only going to get in the way and slow me down. And anyway, if I do process many signals at once, the I/O is more likely to choke before the arithmetic processing.

I’m wondering if I can get it both ways by using a #define SCALAR set to “float” or “long”, and assume I can switch over to fixed point arithmetic once the floating point version is working.

It’s not that straightforward, as I’d need a second value, say #define FIXEDMULT 1000, which I’d need to divide by whenever I multiplied two fixed point values. Or set FIXEDMULT to a power of 2, like 1024, so you’d write (a*b)>>10 and the division becomes a shift. I’ll stick with pure floats in the initial library then, and not get distracted.

### 1 Comment

• 1. Clive replies at 19th November 2014, 2:44 am:

Spotting if it’s software or hardware should be possible by inspection of the object code, shouldn’t it, especially if viewed in something which can show it as assembler?