How not to write a kernel driver



  • During an internship, I was tasked with updating a kernel driver for a PCI card to work with newer Linux versions. The porting was relatively straightforward, but when I took a more closer look to fix some problems, I noticed some… oddities in the driver code.

    • To communicate with the card and receive events from it, the driver needs to know its base memory address and interrupt line ID. Thanks to this newfangled technology called “Plug-n-Play”, a driver can query the system for this information. Not this one, though. To load the driver, the memory address and interrupt ID needed to be provided on its command line. IIRC the driver was accompanied with a small utility which would manually look for the card info in the /sys pseudo-filesystem, and then load the driver.

    • When loaded, the driver registered an interrupt handler with the provided interrupt line ID. Since the card was not supposed to raise interrupts, the original driver used (approximatively) this code:
      void my_interrupt_handler(int interrupt) {
      	printk("Huh?");
      }
      In slightly newer kernel versions, the concept of interrupt sharing was introduced; two devices can now share the same interrupt ID. Since there is no way to know which device raised the interrupt, the kernel calls each registered interrupt handler until one of the handlers responds “the interrupt was from my device and I handled it”. So how was the driver updated to handle interrupt sharing?
      interrupt_status my_interrupt_handler(int interrupt) {
      	printk("Huh?");
      	return INTERRUPT_HANDLED;
      }

      Yes, INTERRUPT_HANDLED. At least the log message was present; without it, I might have missed that the driver was basically eating interrupts destined to other drivers…



    • The card was used to control eight relay switches. Controlling it from a kernel driver was straightforward: for instance, to enable relays #2 and #8 and disable all other relays, the driver just needed to write the byte 01000001 to the card’s base address. Reading from the address would give the value back. Given the simplicity of implementation, you could expect the driver to have a somewhat nice API for programs to control relays (for instance, something like ioctl(relay_card, ENABLE_RELAY, 2);). But of course, it wasn’t the case. The driver implemented two operations:

      When a byte is written to the device file
          copy this byte to the card’s memory address

      When a byte is read from the device file
          copy a byte from the card’s memory address to the read buffer


    • Given the previous API, a user application needs to do a read-bitflip-write operation to change the status of a given relay. This cannot be done atomically; race conditions may arise if two applications try to flip two relays at the same time. There are multiple solutions to the problem:
      • Let multiple applications open the device file. Let them synchronise their operations by using file locking.
      • Only allow one application to open the device file. If another application tries to open it, make it wait for the first to close the file.
      • Only allow one application to open the device file. If another application tries to open it, fail the open() call. Write a huge userland application keeping the device open, and receiving relay toggle commands from other applications through TCP and UDP sockets.
      Guess which one was used?

    • As you have guessed, the driver function handling the open() was approximatively like this:
      int device_open(…) {
      	if (already_opened)
      		return -EBUSY;
      	already_opened = 1;
      	…
      
      Of course, there is a race condition in this code: if two applications try to open the device at the same time, they may both succeed. To fix this, the original developer added a lock to the open function. But instead of creating a proper lock, he used the Big Kernel Lock, a monolithic lock which was used by various parts of the kernel before being replaced by fine-grained locking. I’m not completely sure, but fine-grained locking was probably already available when the driver was originally written.

    Too bad I wasn’t authorized to rewrite (or even make big changes) to this driver. My limited knowledge in kernel drivers would probably have generated something slightly better than this.


  •  @VinDuv said:

    Only allow one application to open the device file. If another application tries to open it, fail the open() call. Write a huge userland application keeping the device open, and receiving relay toggle commands from other applications through TCP and UDP sockets.

     

    I can't fault the original developer. I'm also a big fan of microkernels.

     



  • @VinDuv said:

    you could expect the driver to have a somewhat nice API for programs to control relays (for instance, something like ioctl(relay_card, ENABLE_RELAY, 2);). But of course, it wasn’t the case. The driver implemented two operations:

    When a byte is written to the device file
        copy this byte to the card’s memory address

    When a byte is read from the device file
        copy a byte from the card’s memory address to the read buffer

    IMHO, using read and write is a much better idea than ioctl (big fan of Plan9 here...) Now, if the relays need to be controlled by different applications, it's probably better not to put them all in the same byte...



  • Please tell me this driver is not licensed GPL and has never gone through peer review on lkml or by ldp...

    If yes, TRWTF is using hardware in Linux that only has TAINT_PROPRIETARY drivers (since this post demonstrates quite well the general quality of those...).

    If no, it would have been a much bigger WTF, since I am pretty sure things like this (at least handling interrupts of other devices) would have been flagged in a review.


  • BINNED

    @VinDuv said:

    Since there is no way to know which device raised the interrupt, the kernel calls each registered interrupt handler until one of the handlers responds “the interrupt was from my device and I handled it”.

    At the risk of sounding totally ignorant:

    If there's "no way to know which device raised the interrupt" how does the driver know how to respond/handle the interrupt??



  • @topspin said:

    At the risk of sounding totally ignorant:

    If there's "no way to know which device raised the interrupt" how does the driver know how to respond/handle the interrupt??

    The PCI specification does not provide a way to generally determine which device raised the interrupt. Since most devices can raise an interupt for more than one possible event, there is usually a device specific way to ask that device whether/why it raised an interrupt, so that the driver can handle it (and clear the interrupt status of the device).

    If device A and device B share an interrupt, the driver for device A will ask device A why it raised an interrupt, and if it says it did not raise one, the driver for device B will ask device B. (Note that if both devices raise an interrupt, it is not guaranteed that these two interrupts are handled in the right order (the interrupt that was raised by device B may handle the interrupt for device A and vice versa), but eventually both interrupts will have been handled).


  • BINNED

    @mihi said:

    If device A and device B share an interrupt, the driver for device A will ask device A why it raised an interrupt, and if it says it did not raise one, the driver for device B will ask device B. (Note that if both devices raise an interrupt, it is not guaranteed that these two interrupts are handled in the right order (the interrupt that was raised by device B may handle the interrupt for device A and vice versa), but eventually both interrupts will have been handled).

    Doesn't the first driver ask its device if it raised an interrupt and then say "yes, ok. I handled it" by returning INTERRUPT_HANDLED. So the second one never sees the interrupt?



  • If both A and B raise an interrupt, the interrupt handler will be called twice. On first call, driver A will ask device A and handle its interrupt and clear the interrupt status of device A. On second call, driver A will ask device A and will return that it did not handle the interrupt as device A has no more interrupts raised (assuming a sane implementation and not the WTFy one of this thread of course). Then the driver for device B will get a chance to handle the interrupt of device B.



  • @mihi said:

    If yes, TRWTF is using hardware in Linux that only has TAINT_PROPRIETARY drivers
     

    Hum... What I get from the post is that it's VinDuv's company that wrote the driver. At least, he's the one maintaining it... Thus, I don't think they are concerned about licenses.

    I wouldn't be surprized if the device were proprietary.



  • it is not clear for me if they are writing a driver for a device they bought and use, or they are writing the driver for a device they sell. In the latter case, the license of the driver code does matter, since shipping non-GPLed Linux drivers with your hardware will reduce your customer base (and I'd say you might just skip Linux support altogether if you are trying to go that route in 2013). If they are only developing the driver inhouse for their own use, I don't care (but they might want to procure a relay hardware that is properly supported by Linux when they need the next one - definitely cheaper than writing your own crappy driver).



  • @mihi said:

    it is not clear for me if they are writing a driver for a device they bought and use, or they are writing the driver for a device they sell. In the latter case, the license of the driver code does matter, since shipping non-GPLed Linux drivers with your hardware will reduce your customer base



    I think you forgot to consider the probable case of them selling a compound device with embedded Linux interfacing with their custom hardware.



  • @mihi said:

    shipping non-GPLed Linux drivers with your hardware will reduce your customer base

    Except when it won't. If you're buying a $100K+ camera, and the only way to practically use it is with the included Java/C++ application, I don't think you care whether or not the driver is proprietary.



  • @configurator said:

    Except when it won't. If you're buying a $100K+ camera, and the only way to practically use it is with the included Java/C++ application, I don't think you care whether or not the driver is proprietary.

    You're too busy cussing and ranting at the shitty Java app.



  • @blakeyrat said:

    You're too busy cussing and ranting at the shitty Java app.

    Yeah, probably. What I had in mind has a shitty Java app that does a lot more than the competition and is the camera's main selling point, and it's still pretty shitty.



  • @configurator said:

    @mihi said:
    shipping non-GPLed Linux drivers with your hardware will reduce your customer base

    Except when it won't. If you're buying a $100K+ camera, and the only way to practically use it is with the included Java/C++ application, I don't think you care whether or not the driver is proprietary.

    Only that a relay card is not an "expensive camera" :)



  • @mihi said:

    Only that a relay card is not an "expensive camera" :)

    Fair point.

    What is a relay card, anyway?



  • A card that allows the software to control some electrical relays (i.e. devices that flip a mechanical connector from circuit A to circuit B).



  • @topspin said:

    Doesn't the first driver ask its device if it raised an interrupt and then say "yes, ok. I handled it" by returning INTERRUPT_HANDLED. So the second one never sees the interrupt?

    PCI is using level-triggered interrupts. A device will hold an IRQ line asserted as long as its interrupt is not handled by the driver. The driver needs to explicitly tell the device to deassert the IRQ. If an ISR returns and the device is still holding the IRQ line asserted, another interrupt will happen immediately.

    If a driver fails to deassert the IRQ line properly, an interrupt storm happens. This will happen is some driver in the chain will always return "interrupt handled" even if its device didn't have a request.


Log in to reply