Subclasses in python: nailed it



  • Just came across this nugget:

    def _compose(hook, func):
        if hook is None:
            return func
        if func is None:
            return hook
    
        def run_hook():
            hook()
            func()
    
        run_hook.__name__ = func.__name__
        return run_hook
    
    (...)
    
    class TestCase(unittest.TestCase):
        pending = globals()['pending']
        ignore = globals()['ignore']
    
        def __init__(self, methodName='runTest'):
            super(TestCase, self).__init__(methodName)
            try:
                subclass_setup = self.setUp
            except AttributeError:
                subclass_setup = None
            try:
                subclass_teardown = self.tearDown
            except AttributeError:
                subclass_teardown = None
    
            self.setUp = _compose(self.__setup, subclass_setup)
            self.tearDown = _compose(self.__teardown, subclass_teardown)

    SVN tells me it dates from 2009, so there's no chance I could get an explanation. My best guess is that it allows the subclasses to never call super() in setUp and tearDown, because this superclass makes sure its own hooks get called by overriding the sub-method with a "composition". And indeed, since the developer didn't know how to check whether a method belongs to the subclass or the superclass, the actual supermethods are prefixed with "__".



  • bummer, it appears to be part of a library called "mocktest". I have no idea why such a thing could be necessary



  • Wow - that's fascinating. Thanks so much for sharing [rollseyes]



  • @aikii said:

    My best guess is that it allows the subclasses to never call super()

    Of course, in Python you can't ever say "never", but yes, that'll replace the setUp and tearDown methods of every instance of every class that inherits from this one with the composition of this class's methods and the subclass's methods. Also, if you instantiate TestCase directly, those methods will run twice.

    Looks like a way to hook a logger into the test cases.



  • @Mcoder said:

    @aikii said:

    My best guess is that it allows the subclasses to never call super()

    Of course, in Python you can't ever say "never", but yes, that'll replace the setUp and tearDown methods of every instance of every class that inherits from this one with the composition of this class's methods and the subclass's methods. Also, if you instantiate TestCase directly, those methods will run twice.

    Looks like a way to hook a logger into the test cases.

    When writing unit tests, it is common practice to move as much of the test setup as you can out of the actual test and into test setup and teardown methods. The test framework ensures that the setup method and teardown method are called before and after each test run.


    Looks to me like this is a way of creating another class, a "test superclass" if you will, that has setup and teardown methods; the setup method would be run before the setup method of the test class.


    This is useful in a lot of ways. For example, I can have my test superclass create a logger. Then each of my unit test classes can inherit from the superclass, and presto -- each test has access to a logger. Since these superclasses can be chained, I could add a superclass that configures mock objects -- a date/time mock, a database backend mock, etc, and chain them. Now presto -- the test class has access to a logger, mocks, etc; and if I want to create a new test class (to test a new feature) I have very little coding to do to get my new test class going.


    That's the name of the game -- make it as frictionless as possible to stand up a new test class, and begin writing tests for a new feature you're developing.


    Rather than ridicule this particular piece of code, it is worth studying to see how it was implemented. It's an elegant way of providing method chaining/overrides/behavior injection that could be useful in your ordinary code.

    Edit:


    While the above is perfectly reasonable and you could write code to implement it, I spent more than a few seconds just now re-studying the above code, and it works slightly differently.

    If you have a class that inherits from TestCase, the above code allows you to write a Setup and Teardown method in your class, that TestCase knows about. Essentially, there is code in TestCase that calls its own Setup and Teardown methods, but TestCase replaces (the references to) its own Setup and Teardown methods with (references to) the subclass's Setup and Teardown methods. Sort of like, in C#, where the superclass has an abstract method, and its subclasses must implement that method.



  • @DrPepper said:

    Essentially, there is code in TestCase that calls its own Setup and Teardown methods, but TestCase replaces (the references to) its own Setup and Teardown methods with (references to) the subclass's Setup and Teardown methods. Sort of like, in C#, where the superclass has an abstract method, and its subclasses must implement that method.

    Wrong. aikii already posted in the OP what the code does, which is the other way around from what you wrote. The TestCase constructor replaces the setUp and tearDown methods of the instance (of the sub-class; remember that the "self" reference in TestCase is the sub-class instance) with a composition that calls the corresponding methods in TestCase first. Essentially, as the OP said, this allows the sub-class to not call super().setUp() in its own setUp().

    This is not like C# abstract methods. This is more like a class hierarchy that wants the base class's method called from the derived class's method, but instead of relying on the programmer to call base.Foo() inside SubClass.Foo(), the base class uses reflection to change SubClass.Foo() to do that.

    Also, no, this is not a good thing and shouldn't be done. If you write a base class that has a method you want sub-classes to override but still call the base implementation, then just document it as such.



  • @Arnavion said:

    Also no this is not a good thing and shouldn't be done. If you write a base class that has a method you want sub-classes to override but still call the base implementation, then just document it to be so.

    Yeah, who would want a framework to automatically perform standard boilerplate operations for them? That's crazy talk.

    The finest frameworks don't have any code, they just have documentation telling you how to write code that acts how the framework would have acted if it actually existed.



  • @aihtdikh said:

    Yeah, who would want a framework to automatically perform standard boilerplate operations for them?

    Inheritance is not a "framework" and calling a base class implementation in a sub-class is not "boilerplate". If the class needs to be designed in a way that the base class's code should always be called first, that is either something that should be documented, or else there is already a design pattern for that - base class has concrete Setup() and abstract DoSetup(), derived class has DoSetup(), base.Setup() calls DoSetup(). There's a reason they're called design patterns - other programmers reading the code recognize what they are and what they do. Ad-hoc monkey-patching methods in derived classes is not one of those.
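    In Python terms, the pattern described above (concrete Setup() on the base, an overridable hook underneath) would look something like this — a sketch with illustrative names, not code from the thread:

```python
calls = []

class Base:
    def setup(self):            # concrete: guarantees base logic runs first
        calls.append('base setup')
        self.do_setup()         # then hands off to the subclass hook

    def do_setup(self):         # "abstract" hook; subclasses fill this in
        pass

class Derived(Base):
    def do_setup(self):
        calls.append('derived setup')

Derived().setup()
print(calls)   # ['base setup', 'derived setup']
```

    The ordering is enforced by the base class's own code, with no reflection and no monkey-patching, which is exactly the point being made.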



  • @Arnavion said:

    Ad-hoc monkey-patching methods in derived classes is not one of those.

    I still look back fondly on the solution I devised for working around the lack of thread safety we discovered too late in some of the low-level functions in the proprietary, closed-source C I/O library we'd bought for a heavily threaded PowerPC based embedded system.

    I wrote a little function called patch(), which took a pointer to the function that needed patching and a pointer to the function you wanted called instead, and returned a pointer you could use to call the original (unpatched) function. It worked by moving the target function's first processor instruction to a trampoline, followed by a jump to the target function's second instruction; then replacing the first instruction with a jump to the replacement function; then returning the address of the trampoline. This was feasible only because PowerPC instructions are all the same length.

    That let me write e.g. safe_open() that would acquire a mutex, call the original open() and then release the mutex, and make that appear to be open() as far as the rest of the code base was concerned - including the library itself. This automatically stopped e.g. open()'s lack of thread-safety being inherited by fopen() and freopen(). Only had to patch() a few library functions to render the whole thing thread-safe, and it saved the rest of the team a shitload of time, effort and errors.
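    Python needs no instruction trampolines for this, but the same wrap-and-rebind idea — keep a reference to the original, install a locking wrapper under the old name — can be sketched like so (the function names here are stand-ins, not the actual library's API):

```python
import threading

_io_lock = threading.Lock()

def unsafe_open(path):
    """Stand-in for a library function that isn't thread-safe."""
    return 'handle:' + path

# Save the original, like the pointer patch() returned.
_original_open = unsafe_open

def safe_open(path):
    with _io_lock:              # serialize access to the unsafe code
        return _original_open(path)

# Rebind the public name: later callers get the safe wrapper.
unsafe_open = safe_open

print(unsafe_open('/dev/null'))   # handle:/dev/null
```

    The key property is the same as in the PowerPC version: callers keep using the old name, and only the patching code knows a wrapper was slid in.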



  • @Arnavion said:

    @DrPepper said:
    Essentially, there is code in TestCase that calls its own Setup and Teardown methods, but TestCase replaces (the references to) its own Setup and Teardown methods with (references to) the subclass's Setup and Teardown methods. Sort of like, in C#, where the superclass has an abstract method, and its subclasses must implement that method.

    Wrong. aikii already posted in the OP what the code does, which is the other way around from what you wrote. The TestCase constructor replaces the setUp and tearDown methods of the instance (of the sub-class; remember that the "self" reference in TestCase is the sub-class instance) with a composition that calls the corresponding methods in TestCase first. Essentially, as the OP said, this allows the sub-class to not call super().setUp() in its own setUp().

    This is not like C# abstract methods. This is more like a class hierarchy that wants the base class's method called from the derived class's method, but instead of relying on the programmer to call base.Foo() inside SubClass.Foo(), the base class uses reflection to change SubClass.Foo() to do that.

    Also, no, this is not a good thing and shouldn't be done. If you write a base class that has a method you want sub-classes to override but still call the base implementation, then just document it as such.

    Yeah, what Arnavion said. But the point I made is still valid -- the goal is to have things "just work" when writing unit tests, without having to worry about wiring up your setup/teardown methods to those of the base class. It's meant to make writing new unit test classes as frictionless as possible.



  • @flabdablet said:

    I wrote a little function called patch(), which took a pointer to the function that needed patching and a pointer to the function you wanted called instead, and returned a pointer you could use to call the original (unpatched) function. It worked by moving the target function's first processor instruction to a trampoline, followed by a jump to the target function's second instruction; then replacing the first instruction with a jump to the replacement function; then returning the address of the trampoline. This was feasible only because PowerPC instructions are all the same length.


    Modern ELF linkers and libraries like Glibc solve that problem with weak and strong symbols, so you could have just written a library which wrapped the functions you cared about and then overrode the weak entry points (like open()) with strong symbols (er, open()) which in turn call the original weak symbol (e.g. __open()). That's basically how stuff like LD_PRELOAD works.

    Of course the Amiga had the best system ever: every entry point into a library was via a jump table, so if you wanted to patch a function, you just changed the address of the function in the jump table. You could do monkey patching at runtime.
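    The jump-table scheme can be sketched with a plain dict standing in for the table (illustrative names throughout): every call goes through the table, so patching is nothing more than replacing an entry.

```python
def real_open(path):
    return 'opened ' + path

# The "jump table": callers never hold a direct function reference.
jump_table = {'open': real_open}

def call(name, *args):
    return jump_table[name](*args)

def patched_open(path):
    # Wrap the original, Amiga SetFunction()-style.
    return 'patched ' + real_open(path)

jump_table['open'] = patched_open   # runtime monkey patch

print(call('open', 'file'))   # patched opened file
```

    The indirection is the whole trick: because callers only ever go through the table, one assignment retargets every call site at once.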



  • @Vanders said:

    Modern ELF linkers and libraries like Glibc solve that problem with weak & strong symbols

    Yeah. We weren't using one of those. We were using a shitty expensive closed source proprietary library instead of one of the many excellent open source ones. Not my decision, obviously.

    @Vanders said:
    You could do monkey patching at runtime.
    My patch() function did do its monkey patching at runtime.

    Jump tables are indeed beautiful things, and should be used more often. They're a much underappreciated way to enable truly independent compilation and linking, and in that project I did in fact use them to define the entry points for our two stages of boot loader. But in our specific instance they wouldn't actually have made it much easier to patch the library: first because we had no reasonable way to modify the library binary blob to include them, and second because PowerPC instructions really are all the same length, so it's always safe just to rip the first word out of whatever function you're patching and replace it with a jump.

    Very fortunately, the code didn't have to run from ROM.



  • @Vanders said:

    You could do monkey patching at runtime.

    Perl calls it Test::Resub. Ruby has Test::Redef. Both are quite handy for simple things, though of course for big serious things you'd do well to consider dependency-injection of mock objects et cetera.



  • @fennec said:

    @Vanders said:
    You could do monkey patching at runtime.

    Perl calls it Test::Resub. Ruby has Test::Redef. Both are quite handy for simple things, though of course for big serious things you'd do well to consider dependency-injection of mock objects et cetera.

    Use MiniTest. Test is deprecated.



  •  The problem I see with the code is that it should compose the teardown calls in reverse order. Also, it should be written with metaclasses for bonus fun.
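    The reverse-order fix is a one-line swap in a variant of the hypothetical helper — subclass teardown first, then the base hook, mirroring setUp:

```python
def _compose_teardown(hook, func):
    """Like _compose, but runs func() (the subclass) before hook() (the base)."""
    if hook is None:
        return func
    if func is None:
        return hook
    def run_hook():
        func()     # subclass teardown first...
        hook()     # ...then the base class's hook
    run_hook.__name__ = func.__name__
    return run_hook

order = []
composed = _compose_teardown(lambda: order.append('base'),
                             lambda: order.append('child'))
composed()
print(order)   # ['child', 'base']
```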



  • @Ben L. said:

    @fennec said:
    @Vanders said:
    You could do monkey patching at runtime.

    Perl calls it Test::Resub. Ruby has Test::Redef. Both are quite handy for simple things, though of course for big serious things you'd do well to consider dependency-injection of mock objects et cetera.

    Use MiniTest. Test is deprecated.

    Test::Redef is not a part of Test::Unit and is wholly compatible with any testing framework. Your argument is valid, but irrelephant.

    And mock objects are useful, but fulfill a different role than redef/resub (which mock out method calls on existing classes instead of new ones).



  • @fennec said:

    Your argument is valid, but irrelephant.



  • @Vanders said:

    Of course the Amiga had the best system ever: every entry point into a library was via a jump table, so if you wanted to patch a function, you just changed the address of the function in the jump table. You could do monkey patching at runtime.
    ...except for the few exec.library functions that were "optimized" by squeezing all of their code into the jump table entry. Try to SetFunction() those, and the most likely outcome was the friendly Guru Meditation box. You could still monkey-patch them by hand, of course, but that required copying the full six-byte jump table entry, rather than just the four-byte address.

    (Indirect addressing? Who wants that? The Amiga library jump tables contained actual executable jmp.l instructions: two bytes for the instruction and four for the 32-bit address. This meant that, if your library function could be implemented in four bytes of 68k assembly, and if you were crazy enough to do it, you could squeeze it into the jump table entry itself and still have two bytes left over for an rts, allowing you to save several clock cycles per call.)

    I'm not sure what it says about me that I still remember all that after nearly 20 years. I think I even still have the RKRMs somewhere up in the attic.

