Gratuitously Perverse IO; or, how not to code a driver



  • The Linux driver for the USB TV card I'm using is... interesting in many ways. For example, TV cards generally have several general-purpose IO pins on the main bridge chip, and the em2880 used by this hardware is no exception. In em28xx-based hardware, they usually seem to be hooked up to the standby and reset pins of the other chips, so they need to be set correctly before attempting to talk to any of the other chips. For some reason, this is handled (for most of the supported hardware) by a callback into the em28xx code from the tuner code. (Since a lot of the em28xx-based hardware is hybrid analog/digital stuff, the enabled hardware depends on mode, and since GPIO-twiddling is used to reset the tuner on mode switches anyway...)

    When it starts up, the DVB-T driver needs to power up the demodulator in order to initialise it (the demodulator keeps its settings while in standby, but it doesn't respond to commands). To do this, it switches the tuner into digital mode. It does so via the analog tuner API, since the tuner can't be bound to the digital device until after the demodulator is initialised; a tuner often sits behind a demodulator, but never the other way round. Unfortunately, at some point the driver was modified to skip setting up the tuner for analog use on DVB-T-only devices, and this broke the scheme.
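
To make the standby behaviour concrete, here is a minimal sketch of a chip whose standby pin is wired to a bridge GPIO. Everything in it is an assumption for illustration: the `Bridge` and `Demod` classes, the `DEMOD_STANDBY` bit and its polarity are made up, and none of it is taken from the real em28xx code or the em2880 register map.

```python
# Hypothetical model: the bit position and polarity are illustrative only.
DEMOD_STANDBY = 0x01  # assumed: bit set means the demodulator is held in standby


class Bridge:
    """Stand-in for the bridge chip's general-purpose IO register."""

    def __init__(self):
        self.gpio = DEMOD_STANDBY  # assume the demodulator starts in standby

    def set_gpio(self, mask, value):
        # Update only the bits selected by mask and leave the rest alone.
        self.gpio = (self.gpio & ~mask) | (value & mask)


class Demod:
    """Stand-in for a demodulator whose standby pin hangs off a bridge GPIO."""

    def __init__(self, bridge):
        self.bridge = bridge
        self.registers = {}

    def write(self, reg, value):
        if self.bridge.gpio & DEMOD_STANDBY:
            return False  # in standby: the chip does not answer
        self.registers[reg] = value
        return True


bridge = Bridge()
demod = Demod(bridge)

first_try = demod.write(0x00, 0x42)            # chip is still in standby
bridge.set_gpio(DEMOD_STANDBY, 0)              # clear the bit: power the chip up
second_try = demod.write(0x00, 0x42)           # now the write lands
bridge.set_gpio(DEMOD_STANDBY, DEMOD_STANDBY)  # back to standby, settings kept
```

The point of the sketch is only the ordering constraint: the GPIO has to be set before anything else can talk to the chip.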



  • I was thinking the same thing this morning as I ate breakfast.



  • Did you try reversing the polarity?


  • Considered Harmful

    @bobday said:

    Did you try reversing the polarity?

    Are you crazy?  That could rip a hole in the fabric of space-time!



  • @Pap said:

    I was thinking the same thing this morning as I ate breakfast.

    Yeah, looks like I botched that summary. Let's try again: the developer of the driver (em28xx) decided not to set the tuner driver up to accept analog TV related commands on devices that only supported digital TV. Unfortunately, the driver for digital TV on the device used an analog TV command to tell the tuner driver to call back into the original em28xx driver and ask it to turn on the digital TV demodulator chip. (Actually, this was sort of a side-effect of the way the callback did tuner resets, and probably not what it was intended for.) It then got upset when it couldn't send commands to that chip because it was still in standby; since the analog tuner driver wasn't set up, it didn't receive the command sent to it. As a result, the driver gave up trying to set up the device for digital TV and bailed out.
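
The chain of events in that summary can be sketched roughly as below. This is a toy model under stated assumptions, not the actual driver: the class names, `skip_analog_setup`, `analog_command` and `init_dvb` are all hypothetical, and the real code is C with a rather different shape.

```python
class Tuner:
    """Stand-in for the tuner driver; it holds a callback into the bridge driver."""

    def __init__(self, callback):
        self.callback = callback

    def set_mode(self, mode):
        # A mode switch resets the tuner, which goes through the bridge callback.
        self.callback(mode)


class Em28xx:
    """Stand-in for the bridge driver (hypothetical structure, not the real code)."""

    def __init__(self, skip_analog_setup):
        self.demod_powered = False
        # The change described above: on digital-only boards, the tuner is
        # never set up to receive analog commands.
        self.analog_tuner = None if skip_analog_setup else Tuner(self.tuner_callback)

    def tuner_callback(self, mode):
        # GPIO twiddling for the tuner reset; as a side effect, digital mode
        # also takes the demodulator out of standby.
        self.demod_powered = (mode == "digital")

    def analog_command(self, mode):
        if self.analog_tuner is None:
            return  # nobody is listening, so the command is silently dropped
        self.analog_tuner.set_mode(mode)

    def init_dvb(self):
        self.analog_command("digital")  # ask for digital mode via the analog path
        return self.demod_powered       # False: the demodulator never woke up


working = Em28xx(skip_analog_setup=False).init_dvb()
broken = Em28xx(skip_analog_setup=True).init_dvb()
```

With the analog side set up, the callback fires and the demodulator is powered; with it skipped, the request goes nowhere and the digital setup gives up.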



  • "Now, before I begin the lesson, will those of you who are playing in the match this afternoon move your clothes down onto the lower peg immediately after lunch, before you write your letter home, if you're not getting your hair cut, unless you've got a younger brother who is going out this weekend as the guest of another boy, in which case, collect his note before lunch, put it in your letter after you've had your hair cut, and make sure he moves your clothes down onto the lower peg for you."



  • @joe.edwards@imaginuity.com said:

    @bobday said:

    Did you try reversing the polarity?

    Are you crazy?  That could rip a hole in the fabric of space-time!

    Not if you divert the beam through the main deflector. 



  • @makomk said:

    Yeah, looks like I botched that summary. Let's try again: the developer of the driver (em28xx) decided not to set the tuner driver up to accept analog TV related commands on devices that only supported digital TV. Unfortunately, the driver for digital TV on the device used an analog TV command to tell the tuner driver to call back into the original em28xx driver and ask it to turn on the digital TV demodulator chip. (Actually, this was sort of a side-effect of the way the callback did tuner resets, and probably not what it was intended for.) It then got upset when it couldn't send commands to that chip because it was still in standby; since the analog tuner driver wasn't set up, it didn't receive the command sent to it. As a result, the driver gave up trying to set up the device for digital TV and bailed out.

    Still botchin' it. Lemme try:

    The developer of the EM driver made a premature optimization to not set up any analog TV commands on a digital only board.

    However, the device driver (unclear here) for the digital TV chip uses an analog TV command as a callback to the EM driver to turn on the digital TV chip (it assumes analog was set up first, possibly reasonable.) This was ignored because the EM driver doesn't set up analog commands on digital only boards.

    Since analog TV commands are being ignored, the digital TV driver never got the command to turn on the digital chip, timed out, and dumped core all over Pap's breakfast table.

    The real WTF here is the lack of communication and standards between the two drivers--looks like they were reverse engineered.



  • @operagost said:

    @joe.edwards@imaginuity.com said:

    @bobday said:

    Did you try reversing the polarity?

    Are you crazy?  That could rip a hole in the fabric of space-time!

    Not if you divert the beam through the main deflector. 

    However, that would only create chronotron particles... And you know what that would do!



  • @Benanov said:

    The real WTF here is the lack of communication and standards between the two drivers--looks like they were reverse engineered.

    Nobody has ever accused the v4l drivers of being well-written - they suck. Their only real excuse is that the Windows drivers for those devices are not any better.

    The problem is that all the professional video work is done with pure hardware devices, so the software variants are only used for home/hobby stuff, which means they don't get much attention.



  • Um, so The Real WTF™ is that when they tested the driver they didn't happen to notice that it doesn't actually work?



  • @woodle said:

    Um, so The Real WTF™ is that when they tested the driver they didn't happen to notice that it doesn't actually work?

    lol. Too many things fall under the "didn't happen to notice that it doesn't actually work" category, including most of Mozilla Thunderbird. The best ones are where Save causes the program or indeed the whole operating system to crash. The only one command that really has to work reliably ...



  • @woodle said:

    Um, so The Real WTF™ is that when they tested the driver they didn't happen to notice that it doesn't actually work?

    Linux hardware drivers are not routinely tested with every new upstream kernel release, because there's thousands of the bloody things and nobody has all the hardware - this change was probably tested against a different piece of hardware, where it happened to work. The QA model for linux drivers is based on feedback: if a new version breaks a driver, then either somebody will report the fault or nobody cares. Distributions are expected to maintain their own versions, where they delay acceptance of changes until they are confident in them, and each distribution gets to make its own choices about what degree of testing is sufficient. Upstream kernels are expected to break drivers.

    This is why everybody who complains that a given vendor has an "old" version of the kernel has no clue what they are talking about.



  • The issue has nothing to do with the kernel, the driver design is such that the thing can't be initialised - it needs to use the analog api to initialise, but it's ignoring analog api commands.  It's not even that it's for different hardware (unless I've read it wrong).  It could never have worked.

    Either that or the driver is for some completely different hardware, in which case it would appear to be simple EBCAC.



  • @woodle said:

    The issue has nothing to do with the kernel, the driver design is such that the thing can't be initialised - it needs to use the analog api to initialise, but it's ignoring analog api commands.  It's not even that it's for different hardware (unless I've read it wrong).  It could never have worked.

    It's a combination of several layers of code - some specific to one piece of hardware, some that apply to the entire class of hardware, and some generic. One layer changed and broke another. 



  • @asuffield said:

    @woodle said:

    Um, so The Real WTF™ is that when they tested the driver they didn't happen to notice that it doesn't actually work?

    Linux hardware drivers are not routinely tested with every new upstream kernel release, because there's thousands of the bloody things and nobody has all the hardware - this change was probably tested against a different piece of hardware, where it happened to work. The QA model for linux drivers is based on feedback: if a new version breaks a driver, then either somebody will report the fault or nobody cares. Distributions are expected to maintain their own versions, where they delay acceptance of changes until they are confident in them, and each distribution gets to make its own choices about what degree of testing is sufficient. Upstream kernels are expected to break drivers.

    This is why everybody who complains that a given vendor has an "old" version of the kernel has no clue what they are talking about.

    Actually, looking at the device list, the change which caused this affects exactly two devices (Terratec Cinergy T XS and Kworld 350 U DVB-T) and fails on at least the Terratec. (I'd expect it to fail on the Kworld too, since it uses similar hardware and the same GPIO code, but perhaps they built their hardware slightly differently.) Also, the driver in question is currently not merged into Linux, for reasons unrelated to this bug.

    @asuffield said:

    @woodle said:

    The issue has nothing to do with the kernel, the driver design is such that the thing can't be initialised - it needs to use the analog api to initialise, but it's ignoring analog api commands.  It's not even that it's for different hardware (unless I've read it wrong).  It could never have worked.

    It's a combination of several layers of code - some specific to one piece of hardware, some that apply to the entire class of hardware, and some generic. One layer changed and broke another. 

    The thing is, all the major culprits - the analog driver, the digital driver and the tuner driver - were mostly written by (and are currently sort-of maintained by) one person. The analog and digital drivers are also tightly bound and are designed to be used together; the tuner driver was written for (and is mainly used with) these drivers.

