Wheezier, squeezier.

I’ve put Raspbian Wheezy onto a separate SD card, having run Squeeze since I received my Raspberry Pi several months ago. Once I’d dd’d the downloaded image onto the card, I mounted it on my main Linux desktop, copied across my lcdinfo binary, and edited /etc/rc.local to start it. Lo and behold, after unmounting, putting the card in the RPi and giving it power, it booted and the LCD worked. Excellent.

Now for some testing…

Floating point performance

My friend John Honniball (also @anachrocomputer) had been playing with the RPi around the same time as he was experimenting with various curve-calculating, vector-drawing routines, and wanted to test floating point performance on the Arduino against his fixed-point approximation code. The MegaAVR family doesn’t have hardware floating point (it is rare at that end of the market, although the much faster STM32F4 series of ARM chips can do 32-bit, single-precision floating point in hardware), so his dedicated algorithm was expected to work better. On the Arduino, it’s no contest.

So he ported it to the RPi, which has hardware floating point, but (at the time) no support in the Linux kernel to take advantage of it, so everything was compiled with software FP support. With software FP, the compiler quietly replaces each floating point operation with a call to a library routine, so the source code doesn’t change.

I decided to compare Debian Squeeze to the newly released Raspbian Wheezy, the latter with hardware FP capability. Don’t worry about the values being spat out for a, b and c; it only matters that a equals ia, b equals ib and c equals ic. Which they do.
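The source of floater.c isn’t reproduced here, so the following is only a minimal sketch of the kind of comparison being made, not John’s actual code. It assumes a Q16.16 fixed-point format (16 integer bits, 16 fractional bits), and every name in it (fix16, fix_mul, float_loop, fixed_loop, cpu_ms) is made up for illustration.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical Q16.16 fixed-point type: the value is stored scaled by 65536. */
typedef int32_t fix16;
#define FIX_ONE 65536

/* Convert between double and Q16.16 (truncates towards zero). */
fix16 fix_from_double(double x) { return (fix16)(x * FIX_ONE); }
double fix_to_double(fix16 x) { return (double)x / FIX_ONE; }

/* Multiply through a 64-bit intermediate so the product cannot overflow,
   then shift back down to Q16.16. Intended for non-negative operands here,
   since right-shifting a negative value is implementation-defined in C. */
fix16 fix_mul(fix16 a, fix16 b) {
    return (fix16)(((int64_t)a * b) >> 16);
}

/* Multiply-accumulate n times using 'float'. The volatile qualifiers stop
   the optimiser from hoisting the multiply or collapsing the loop. */
float float_loop(long n, float x, float y) {
    volatile float vx = x, vy = y, acc = 0.0f;
    for (long i = 0; i < n; i++)
        acc += vx * vy;
    return acc;
}

/* The same multiply-accumulate using integer-only Q16.16 arithmetic. */
fix16 fixed_loop(long n, fix16 x, fix16 y) {
    volatile fix16 vx = x, vy = y, acc = 0;
    for (long i = 0; i < n; i++)
        acc += fix_mul(vx, vy);
    return acc;
}

/* CPU time consumed by the process so far, in milliseconds. */
long cpu_ms(void) {
    return (long)((double)clock() * 1000.0 / CLOCKS_PER_SEC);
}
```

To time either loop, read cpu_ms() before and after the call and subtract. Values such as 1000.125 and 4.25 are exactly representable in both float and Q16.16, which is why a test of this kind can expect the two sets of results to match digit for digit.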

This first run is compiled and run on Squeeze with -O2 optimisation:

Floater
True 'float': 11610ms elapsed
a = 1004.375000, b = 4.250000, c = 1000.125000
Fixed-point: 2200ms elapsed
ia = 1004.375000, ib = 4.250000, ic = 1000.125000

Now, let’s transfer this to the new Raspbian Wheezy release. Compiled with no optimisation:

Floater
True 'float': 7890ms elapsed
a = 1004.375000, b = 4.250000, c = 1000.125000
Fixed-point: 5200ms elapsed
ia = 1004.375000, ib = 4.250000, ic = 1000.125000

That’s…what? The fixed point is slower than it used to be? The true float is still the slower of the two (though a marked improvement on Squeeze), and since both give the same precision, fixed point has to be considered the winner. If flexibility or extra precision were of use to us, it might be a different verdict. Let’s try some values for -O:


-rwxr-xr-x 1 pi pi 9221 Jul 14 17:28 floater
-rw-r--r-- 1 pi pi 1535 Jul 14 17:27 floater.c
-rwxr-xr-x 1 pi pi 6534 Jul 22 17:32 floater_recomp
-rwxr-xr-x 1 pi pi 6206 Jul 22 17:35 floater_recomp1
-rwxr-xr-x 1 pi pi 6158 Jul 22 17:35 floater_recomp2
-rwxr-xr-x 1 pi pi 6158 Jul 22 17:35 floater_recomp3
-rwxr-xr-x 1 pi pi 6206 Jul 22 17:35 floater_recompO

According to cmp, -O produces the same code as -O1, and -O2 the same as -O3, at least for the C we’re feeding it. All are smaller than the binary from the original compilation.

-O / -O1:
Floater
True 'float': 4310ms elapsed
a = 1004.375000, b = 4.250000, c = 1000.125000
Fixed-point: 3120ms elapsed
ia = 1004.375000, ib = 4.250000, ic = 1000.125000

-O2 / -O3:
Floater
True 'float': 4310ms elapsed
a = 1004.375000, b = 4.250000, c = 1000.125000
Fixed-point: 2230ms elapsed
ia = 1004.375000, ib = 4.250000, ic = 1000.125000

Ah, so, finally, we have fixed point running at “original speed”, and real floats “only” about twice as slow (4310ms against 2230ms). It should be reiterated that this is for a specific requirement, where the precision was fixed. A huge improvement though. Free speed!

Boot times

Start up (power on to seeing output from my LCD code), three boots each:

Distro/Release  Boot #1 (s)  Boot #2 (s)  Boot #3 (s)
Squeeze         19.64        19.71        19.73
Wheezy          28.34        28.56        28.55

These were timed with a stopwatch, but the results were consistent enough between runs, and different enough between releases, to be pretty sure about the conclusion: Wheezy takes noticeably longer to boot.
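To put a single number on the difference, the three readings for each release can be averaged. A small helper along these lines (the name mean_seconds is just for illustration) gives about 19.69 seconds for Squeeze and 28.48 seconds for Wheezy, so Wheezy is roughly 8.8 seconds slower to reach the LCD output.

```c
#include <stddef.h>

/* Arithmetic mean of n stopwatch readings, in seconds.
   Returns 0.0 for an empty list rather than dividing by zero. */
double mean_seconds(const double *readings, size_t n) {
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += readings[i];
    return n ? sum / (double)n : 0.0;
}
```

With only three hand-timed samples per release this is a rough guide, but the spread within each release (under a quarter of a second) is tiny next to the gap between them.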

Oddly, the OK LED is now usually off rather than on, and just flashes on SD card accesses. It’s a different SD card, and on first logging in I was asked a bunch of questions, one of which was whether to resize the partition to fill the card. I said yes, and that worked seamlessly, also creating a swap partition. I’m not sure I like having swap on flash memory, so I might switch that off. Maybe the loss of speed is due to the card? I guess I’ll have to put it on the old one too…

…nope. Wheezy on the old card is slightly slower again: I only tested a single boot, but it took 29.66 seconds.

Looking at the serial console output, we have two main sources of delay. Between these two lines:

[    5.861576] smsc95xx 1-1.1:1.0: eth0: register 'smsc95xx' at usb-bcm2708_usb-1.1, smsc95xx USB 2.0 Ethernet, b8:27:eb:94:95:61
[   15.033879] EXT4-fs (mmcblk0p2): re-mounted. Opts: (null)

And between these two:

[   15.443907] ### BCM2835 ALSA driver init OK ###
[   22.864029] smsc95xx 1-1.1:1.0: eth0: link up, 100Mbps, full-duplex, lpa 0xCDE1
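
The size of each pause can be read straight off the bracketed kernel timestamps, which are seconds since boot. As a rough sketch (the function names parse_stamp and stamp_gap are made up here), something like this pulls the timestamp from the front of each line and subtracts:

```c
#include <stdio.h>

/* Read the "[   15.033879]" prefix of a kernel log line.
   Returns 1 and stores the seconds value on success, 0 otherwise. */
int parse_stamp(const char *line, double *secs) {
    /* %lf skips the padding spaces after the opening bracket. */
    return sscanf(line, " [%lf", secs) == 1;
}

/* Seconds elapsed between two log lines, or -1.0 if either line
   does not start with a timestamp. */
double stamp_gap(const char *earlier, const char *later) {
    double a, b;
    if (!parse_stamp(earlier, &a) || !parse_stamp(later, &b))
        return -1.0;
    return b - a;
}
```

Fed the two pairs of lines above, this gives about 9.17 seconds for the first gap and 7.42 seconds for the second, roughly 16.6 seconds of waiting in total.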