Sometimes software development is a struggle

I'm currently working on the firmware for a new project, an 8-slot smart card reader. I will share more about the architecture and design ideas behind this project soon, but today I'll simply write about how hard it sometimes is to actually get software development done. Seemingly trivial things suddenly take ages. I guess everyone writing code knows this, but today I felt like I had to share this story.

Chapter 1 - Introduction

As I'm quite convinced of test-driven development these days, I don't want to simply write firmware code that can only execute on the target. Instead, I'm working on a USB CCID (USB class for smart card readers) stack which is hardware-independent, and which can also run entirely in userspace on a Linux device with a USB gadget (device) controller. This way it's much easier to instrument, trace, introspect and test the code base, and tests on actual target board hardware are limited to those functions provided by the board.

So the current architecture for development of the CCID implementation looks like this:

  • Implement the USB CCID device using FunctionFS (I did this some months ago, and developing it also took much more time than expected; maybe I'll find time to expand on that)

  • Attach this USB gadget to a virtual USB bus + host controller using the Linux kernel dummy_hcd module

  • Talk to a dumb phoenix style serial SIM card reader attached to a USB UART, which is connected to an actual SIM card (or any smart card, for that matter)

By using a "stupid" UART based smart card reader, I am very close to the target environment on a Cortex-M microcontroller, where I also have to talk to a UART and hence implement all the beauty of ISO 7816-3. Hence, the test / mock / development environment is as close as possible to the target environment.

So I implemented the various bits and pieces and ended up at a point where I wanted to test. And I'm not getting any response from the UART / SIM card at all. I check all my code, add lots of debugging, play around with various RTS / DTR / ... handshake settings (which sometimes control power) - to no avail.

In the end, after many hours of trial + error I actually inserted a different SIM card and finally, I got an ATR from the card. In more than 20 years of working with smart cards and SIM cards, this is the first time I've actually seen a SIM card die in front of me, with no response whatsoever from the card.

Chapter 2 - Linux is broken

Anyway, the next step was to get the T=0 protocol of ISO 7816-3 going. Since there is only one I/O line between SIM card and reader for both directions, the protocol is a half-duplex protocol. This is unlike "normal" RS232-style UART communication, where you have a separate Rx and Tx line.

On the hardware side, this is most often implemented by simply connecting both the Rx and Tx line of the UART to the SIM I/O pin. This in turn means that you're always getting an echo back for every byte you write.

One could discard such bytes, but I'm targeting a microcontroller which should be running eight cards in parallel, preferably at baud rates up to ~1 Mbit/s, so having to read and discard all those bytes seems like a big waste of resources.

The obvious solution around that is to disable the receiver inside the UART before you start transmitting, and re-enable it after you're done transmitting. This is typically done rather easily, as most UART registers in hardware provide some way to selectively enable transmitter and/or receiver independently.

But since I'm working in Linux userspace in my development environment: How do I approximate this kind of behavior? At least the older readers of this blog will remember something called the CREAD flag of termios. Clearing that flag will disable the receiver. Back in the 1990s, I did tons of work with serial ports, and I remembered there was such a flag.
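For illustration, here is a minimal sketch of what toggling CREAD from userspace looks like, written in Python for brevity. The helper names `with_receiver` and `set_receiver` are made up for this sketch and are not taken from my actual code. It only shows the termios calls involved; whether the driver behind the tty honours the flag is a separate question.

```python
import termios

# tcgetattr() returns [iflag, oflag, cflag, lflag, ispeed, ospeed, cc];
# CREAD lives in the control flags (cflag).
CFLAG = 2


def with_receiver(attrs, enabled):
    """Return a copy of a tcgetattr()-style list with CREAD set or cleared.

    Hypothetical helper for this sketch.
    """
    new = list(attrs)
    if enabled:
        new[CFLAG] |= termios.CREAD
    else:
        new[CFLAG] &= ~termios.CREAD
    return new


def set_receiver(fd, enabled):
    """Apply the CREAD change to an open tty file descriptor.

    Hypothetical helper; tcsetattr() reporting success does not prove
    that the underlying driver actually stopped receiving.
    """
    attrs = termios.tcgetattr(fd)
    termios.tcsetattr(fd, termios.TCSANOW, with_receiver(attrs, enabled))
```

The intended use would be `set_receiver(fd, False)` before writing, and `set_receiver(fd, True)` once the transmit data has drained.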

So I implement my userspace UART backend and somehow it simply doesn't want to work. Again, of course, I assume I must be doing something wrong. I'm using strace, I'm single-stepping through code - to no avail.

In the end, it turns out that I've just found a bug in the Linux kernel, one that appears to have been there at least since the git history of linux-2.6.git started. Almost all USB serial device drivers do not implement CREAD, and there is no software fall-back implemented in the core serial (or usb-serial) handling that would discard any received bytes inside the kernel if CREAD is cleared. Interestingly, the non-USB serial drivers for classic UARTs attached to a local bus, PCI, ... seem to support it.

The problem would only be half as bad if the syscall to clear CREAD actually failed with an error. But no, it simply returns success, while bytes continue to be received from the UART/tty :/

So that's the second big surprise of this weekend...

Chapter 3 - Again a broken card?

So I settle for implementing the 'receive as many characters as you wrote' work-around. Once that is done, I continue to test the code. And what happens? My state machine (implemented using osmo-fsm, of course) for reading the ATR (code found here) somehow never wants to complete. The last byte of the ATR is always missing. How can that be?
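The 'receive as many characters as you wrote' work-around can be sketched roughly like this, again in Python. Everything here is an assumption made for illustration: `write_discarding_echo` is a made-up name, and `LoopbackPort` is just a stand-in that mimics a reader whose Rx and Tx are both tied to the SIM I/O line.

```python
class LoopbackPort:
    """Stand-in for a half-duplex reader: every written byte is echoed
    back, followed by whatever the (simulated) card replies."""

    def __init__(self, card_reply=b""):
        self._rx = b""
        self._card_reply = card_reply

    def write(self, data):
        self._rx += data + self._card_reply

    def read(self, n):
        out, self._rx = self._rx[:n], self._rx[n:]
        return out


def write_discarding_echo(port, data):
    """Write `data`, then read back and drop the same number of bytes.

    `port` is any object with write(bytes) and read(n) -> bytes.
    Returns True if the echo matched what was written.
    """
    port.write(data)
    echo = b""
    while len(echo) < len(data):
        chunk = port.read(len(data) - len(echo))
        if not chunk:  # nothing more to read (timeout or closed port)
            break
        echo += chunk
    return echo == data
```

After the call, the next bytes read from the port are those sent by the card rather than the echo. Comparing the echo against the written data is an optional extra in this sketch; a mismatch would hint at a collision on the I/O line.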

Well, guess what, the second SIM card I used is sending a broken, non-spec compliant ATR where the header indicates 9 historical bytes are present, but then in reality only 8 bytes are sent by the card.
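To make this concrete: the length of an ATR follows from its own header bytes. The low nibble of T0 gives the number of historical bytes, the high nibbles of T0 and of each TDi say which interface bytes follow, and a TCK check byte is appended only if a protocol other than T=0 is indicated. The following Python sketch computes the expected length; the function name is made up for this sketch, and the example bytes are synthetic rather than the ATR of the card in question.

```python
def expected_atr_length(atr):
    """Return the total ATR length implied by the bytes seen so far,
    or None if more bytes are needed to decide.

    Hypothetical helper following the ATR layout of ISO 7816-3:
    TS, T0, interface bytes, historical bytes, optional TCK.
    """
    if len(atr) < 2:
        return None
    pos = 1                       # index of T0
    y = atr[pos] >> 4             # Y1: which of TA1..TD1 follow
    k = atr[pos] & 0x0F           # K: number of historical bytes
    need_tck = False
    while True:
        # Bits 0..2 of Y announce TA, TB, TC; bit 3 announces TD.
        pos += bin(y & 0x07).count("1")
        if not (y & 0x08):
            break
        pos += 1                  # index of this TD byte
        if pos >= len(atr):
            return None           # TD not received yet
        y = atr[pos] >> 4
        if (atr[pos] & 0x0F) != 0:
            need_tck = True       # a protocol other than T=0 is indicated
    # pos is the index of the last interface byte (or of T0 if none).
    return pos + 1 + k + (1 if need_tck else 0)


# Synthetic header announcing 9 historical bytes, but only 8 follow.
truncated = bytes([0x3B, 0x09]) + bytes(8)
missing = expected_atr_length(truncated) - len(truncated)
```

With such a header, a state machine that waits for the announced length will sit there forever, one byte short, unless a timeout kicks in.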

Of course every reader has a timeout at that point, but that timeout was not yet implemented in my code, and I also wasn't expecting to hit that timeout.
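Such a timeout can be approximated in the userspace environment along these lines. This is a simplified Python sketch with one overall deadline in wall-clock seconds; a real reader would derive its waiting times from the ISO 7816-3 timing parameters instead. The function name is made up for illustration.

```python
import os
import select
import time


def read_with_deadline(fd, count, timeout_s):
    """Read up to `count` bytes from `fd`, giving up after `timeout_s`.

    Returns the bytes received; a result shorter than `count` means the
    deadline passed or the other side closed the descriptor.
    """
    deadline = time.monotonic() + timeout_s
    buf = b""
    while len(buf) < count:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        readable, _, _ = select.select([fd], [], [], remaining)
        if not readable:
            break                 # timed out waiting for more data
        chunk = os.read(fd, count - len(buf))
        if not chunk:
            break                 # end of file
        buf += chunk
    return buf
```

A caller expecting an 11-byte ATR that only ever receives 10 bytes gets those 10 bytes back after the deadline and can then decide whether to accept or reject the card.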

So after using yet another SIM card (now a sysmoUSIM-SJS1, not sure why I didn't even start with that one), it suddenly works.

After a weekend of detours, none of which I would have anticipated, I finally have code that can obtain the ATR and exchange T=0 TPDUs with cards. Of course I could have had that very easily if I wanted (we do have code in pySim for this, e.g.), but not in an architecture that is as close as it gets to the firmware environment of the microcontroller on my target board.