It turns out that the device driver design established in the previous part has a serious drawback. While the SpiDriver interface enables unit testing, the cost of polymorphism also has to be paid in production code. In this article you will see how high that cost is, i.e. how much extra code the compiler generates to dispatch virtual method calls. We will also introduce one possible way to avoid that additional cost.

Decision making

What has been implemented so far is known as dynamic polymorphism: the concrete implementation a particular client uses can be exchanged even at runtime. This is a lot more than what we originally asked for, which was exercising the driver with mock objects during testing and using real ones in production. The decision between those two implementations is deferred to runtime. If we were able to make that decision at compile time already, we could apply a technique called static (or parametric) polymorphism.

Analysis of the initial version

This is the driver as we created it in the first part of the series, only slightly renamed:

class Mcp2515CoreDynamic {
public:
	Mcp2515CoreDynamic(SpiDriver& spiDriver) :
		spiDriver(spiDriver) {
	}
	void reset() {
		spiDriver.select();
		spiDriver.write(0xC0);
		spiDriver.deselect();
	}
private:
	SpiDriver& spiDriver;
};

Let's create an instance of that class using some dummy SPI driver and call the reset() method:

DummySpiDriver spiDriver;
Mcp2515CoreDynamic mcpCore{spiDriver};
mcpCore.reset();
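
The SpiDriver interface and the dummy driver come from the first part and are not repeated here. As a rough reminder, the following host-side sketch shows what they might look like. The exact method signatures and the RecordingSpiDriver test double are assumptions made for illustration, not the code from the example project.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Assumed shape of the SpiDriver interface from part one (it is not shown here).
class SpiDriver {
public:
	virtual ~SpiDriver() = default;
	virtual void select() = 0;
	virtual void write(std::uint8_t data) = 0;
	virtual void deselect() = 0;
};

// Hypothetical test double that records what the driver under test does.
// std::vector is acceptable here because this sketch only runs on the host.
class RecordingSpiDriver : public SpiDriver {
public:
	void select() override { ++selectCount; }
	void write(std::uint8_t data) override { written.push_back(data); }
	void deselect() override { ++deselectCount; }

	int selectCount = 0;
	int deselectCount = 0;
	std::vector<std::uint8_t> written;
};

// Same class as above, repeated to keep the sketch self-contained.
class Mcp2515CoreDynamic {
public:
	explicit Mcp2515CoreDynamic(SpiDriver& spiDriver) : spiDriver(spiDriver) {}
	void reset() {
		spiDriver.select();
		spiDriver.write(0xC0);
		spiDriver.deselect();
	}
private:
	SpiDriver& spiDriver;
};

// Host-side check: reset() frames exactly one 0xC0 byte with select/deselect.
void testDynamicReset() {
	RecordingSpiDriver spi;
	Mcp2515CoreDynamic core{spi};
	core.reset();
	assert(spi.selectCount == 1);
	assert(spi.written.size() == 1 && spi.written[0] == 0xC0);
	assert(spi.deselectCount == 1);
}
```

Every call inside reset() goes through the SpiDriver reference, so the compiler has to emit a virtual dispatch for each of them.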

Here is the block of AVR instructions that the compiler generates for calling DummySpiDriver::select() via the base class reference:

ldd r24,Y+1
ldd r25,Y+2
movw r30,r24
ld r24,Z
ldd r25,Z+1
movw r30,r24
ld r24,Z
ldd r25,Z+1
adiw r24,4
movw r30,r24
ld r18,Z
ldd r19,Z+1
ldd r24,Y+1
ldd r25,Y+2
movw r30,r24
ld r24,Z
ldd r25,Z+1
movw r30,r18
icall

After studying the AVR instruction set and the ABI description of avr-gcc, one might get an idea of what is happening here:

The spiDriver object maintains an internal pointer to a table of function pointers (vtable) to enable polymorphism. The very last instruction in the previous assembly listing is an indirect call. It is indirect because it calls the subroutine in program memory whose address is held in the Z pointer (register pair r31:r30). That address has to be determined before the call, and that is what most of the preceding instructions do.
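
To make the mechanism more tangible, here is a hand-written approximation of what a virtual call boils down to. This is only a sketch: the names ManualVtable, ManualSpiDriver and callSelect are made up for illustration, and the real table layout is decided by the compiler and its ABI. The `adiw r24,4` in the listing is presumably the step that picks the slot for select() within the table.

```cpp
#include <cassert>
#include <cstdint>

struct ManualSpiDriver;  // forward declaration for the function pointer types

// One table of function pointers per concrete driver type.
struct ManualVtable {
	void (*select)(ManualSpiDriver* self);
	void (*write)(ManualSpiDriver* self, std::uint8_t data);
	void (*deselect)(ManualSpiDriver* self);
};

// Every object carries a pointer to the table of its type.
struct ManualSpiDriver {
	const ManualVtable* vptr;
	int selectCount;
	std::uint8_t lastByte;
};

// "Methods" of a dummy driver, written as free functions.
void dummySelect(ManualSpiDriver* self) { ++self->selectCount; }
void dummyWrite(ManualSpiDriver* self, std::uint8_t data) { self->lastByte = data; }
void dummyDeselect(ManualSpiDriver*) {}

const ManualVtable dummyVtable = {dummySelect, dummyWrite, dummyDeselect};

// Rough equivalent of spiDriver.select() through a base class reference.
void callSelect(ManualSpiDriver* driver) {
	const ManualVtable* table = driver->vptr;          // load the table pointer
	void (*target)(ManualSpiDriver*) = table->select;  // load the slot
	target(driver);                                    // indirect call (icall on AVR)
}

// Small usage example: returns how often select() was reached.
int demoManualDispatch() {
	ManualSpiDriver driver{&dummyVtable, 0, 0};
	callSelect(&driver);
	return driver.selectCount;
}
```

The two loads in callSelect() have to happen at runtime because the target address depends on the object that is passed in.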

Another approach

Now, let's convert that example to static polymorphism. In the next code block, you can see the modified and renamed Mcp2515Core class that takes the SPI driver type as a template parameter:

template <typename SpiDriver>
class Mcp2515CoreStatic {
public:
	Mcp2515CoreStatic(SpiDriver& spiDriver) :
		spiDriver(spiDriver) {
	}
	void reset() {
		spiDriver.select();
		spiDriver.write(0xC0);
		spiDriver.deselect();
	}
private:
	SpiDriver& spiDriver;
};

This is how it is instantiated:

DummySpiDriver spiDriver;
Mcp2515CoreStatic<DummySpiDriver> mcpCore{spiDriver};
mcpCore.reset();
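
The template version does not give up testability. Any type that provides select(), write() and deselect() can be plugged in, no common base class required. The following host-side sketch uses a hypothetical CountingSpiDriver as a test double; its members are assumptions made for this example only.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical test double: no base class and no virtual functions.
class CountingSpiDriver {
public:
	void select() { ++selectCount; }
	void write(std::uint8_t data) { lastByte = data; ++writeCount; }
	void deselect() { ++deselectCount; }

	int selectCount = 0;
	int writeCount = 0;
	int deselectCount = 0;
	std::uint8_t lastByte = 0;
};

// Same template as above, repeated to keep the sketch self-contained.
template <typename SpiDriver>
class Mcp2515CoreStatic {
public:
	explicit Mcp2515CoreStatic(SpiDriver& spiDriver) : spiDriver(spiDriver) {}
	void reset() {
		spiDriver.select();
		spiDriver.write(0xC0);
		spiDriver.deselect();
	}
private:
	SpiDriver& spiDriver;
};

// Host-side check: the calls are resolved at compile time, yet still observable.
void testStaticReset() {
	CountingSpiDriver spi;
	Mcp2515CoreStatic<CountingSpiDriver> core{spi};
	core.reset();
	assert(spi.selectCount == 1);
	assert(spi.writeCount == 1 && spi.lastByte == 0xC0);
	assert(spi.deselectCount == 1);
}
```

If a driver type lacks one of the three methods, the mistake shows up as a compile error at the point of instantiation rather than at runtime.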

Now let's look at the AVR assembly generated for the call to DummySpiDriver::select() via the template parameter:

ldd r24,Y+1
ldd r25,Y+2
movw r30,r24
ld r24,Z
ldd r25,Z+1
call DummySpiDriver::select()

This time the code block is much shorter. In particular, the icall has been replaced by a plain call instruction. In fact, there is now no difference from calling a normal (non-virtual) class method.
I haven't yet figured out the purpose of the preceding instructions though. They eventually lead to some state in registers 25:24 which is then stored inside of the frame of the select() method. This part of the code is actually identical to the first lines of the dynamic polymorphism example. I would really appreciate any explanation of that via Twitter.

Conclusion

In this example the difference in program size between static and dynamic polymorphism is more than 30%, considering only the virtual method calls themselves. The total ratio is probably even higher, because the vtable also has to be put in place at initialization.

In some applications, increased program size alone might suffice to rule out dynamic polymorphism. The same is probably true for decreased execution speed. In such cases, static polymorphism seems to be an appropriate means to decouple objects and ensure testability. Because we would never sacrifice testability to premature optimization, would we? ;-)
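
One way to make the compile-time decision explicit is a type alias that the build selects. The sketch below is only one possible arrangement: the macro MCP_UNIT_TEST, the HardwareSpiDriver and MockSpiDriver classes and the alias names are all hypothetical and do not come from the example project.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical production driver. On the target these bodies would drive the
// chip-select pin and the SPI peripheral; they are left empty in this sketch.
class HardwareSpiDriver {
public:
	void select() {}
	void write(std::uint8_t) {}
	void deselect() {}
};

// Hypothetical test double for host builds.
class MockSpiDriver {
public:
	void select() { ++calls; }
	void write(std::uint8_t data) { lastByte = data; ++calls; }
	void deselect() { ++calls; }

	int calls = 0;
	std::uint8_t lastByte = 0;
};

template <typename SpiDriver>
class Mcp2515CoreStatic {
public:
	explicit Mcp2515CoreStatic(SpiDriver& spiDriver) : spiDriver(spiDriver) {}
	void reset() {
		spiDriver.select();
		spiDriver.write(0xC0);
		spiDriver.deselect();
	}
private:
	SpiDriver& spiDriver;
};

// MCP_UNIT_TEST is an assumed macro that a test build would define.
#ifdef MCP_UNIT_TEST
using ActiveSpiDriver = MockSpiDriver;
#else
using ActiveSpiDriver = HardwareSpiDriver;
#endif

// Client code only refers to this alias and never names a concrete driver.
using Mcp2515Core = Mcp2515CoreStatic<ActiveSpiDriver>;
```

Client code would then create an ActiveSpiDriver and an Mcp2515Core without knowing which implementation the current build has picked.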

Please let me know what you think about this article via Twitter @ronalterde!

Sample code

To reproduce the examples in this article, you can either look at the code in an online C++ compiler, or install the avr-gcc toolchain, get the example project from github.com/ronalterde/mcp-tdd-part2, and build it yourself.

You can download the AVR toolchain from Atmel (now Microchip) or you can build it from source. There might also be a package available in your OS distribution.

References

Wikipedia: Function prologue

avr-libc user manual: Register usage

Wikipedia: Polymorphism

avr-gcc ABI description