Any Python multiprocessing gurus around here?
Trying to track down a bizarre problem at work.
We've got a Python script that's throwing a bunch of errors in a piece of the code where they shouldn't be coming from. Digging into it, I've managed to trace it down to the
All that that's supposed to do is set up a MP pool object, but somewhere, somehow, when I call that method, a bunch of other code ends up running that's not supposed to run until much later in the script! A ton of error messages get printed before the
Does anyone have any idea how that can happen?
I don't claim to be a guru (in Python), but it seems that making a pool makes a bunch of other objects, notably including several threads, several queues, and several processes. It also depends on the pickling of objects, forking (on non-Windows) and other tricks. The way in which these coordinate is non-trivial, and not necessarily stable internally between releases of Python as a bunch of it involves undocumented support classes. In fact, there's a whole fucking lot of them; it's a damn snake pit down there.
Can you be a little more precise about the nature of what's detonating? Some things, such as signal handling, I would expect to be extremely problematic in combination with such trickiness, and “it runs lots of code that it shouldn't yet and prints lots of errors” really doesn't make for an actionable issue.
I'm doing a bunch of multiprocessing.Pool stuff now.
If you have code that's running that you don't think should be running, have you checked that all code is either in classes, functions or behind
if __name__ == '__main__':
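For illustration, here is a minimal sketch of that layout. The names `square` and `main` are made up for the example and are not from the script in question. The point is that anything left at module level may run again in each worker process whenever the worker has to re-import the main file (which always happens under the "spawn" start method, e.g. on Windows), whereas code behind the guard runs only in the parent.

```python
import multiprocessing


def square(n):
    # Worker function: defined at module level so child processes can find it.
    return n * n


def main():
    # The pool is created only when main() is called, never on import.
    with multiprocessing.Pool(processes=2) as pool:
        print(pool.map(square, range(5)))


if __name__ == "__main__":
    # Skipped when a worker process re-imports this file, so the pool setup
    # and everything after it runs exactly once, in the parent.
    main()
```

If moving the late-running code behind a guard like this makes the early errors disappear, the workers re-importing the script was the likely cause.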
If you think it's the constructor, is it possible you're shadowing the Pool class? If you have a local file with the same name as the module (like multiprocessing.py), Python will import that one instead of the standard library one.
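A quick way to check for shadowing is to look at where the imported module actually lives. This is only a sketch: the helper name `is_from_stdlib` is made up, and it assumes a regular on-disk standard library (not a zipped or embedded one).

```python
import multiprocessing
import os
import sysconfig


def is_from_stdlib(module):
    # Compare the module's file location with the interpreter's stdlib directory.
    stdlib_dir = os.path.realpath(sysconfig.get_paths()["stdlib"])
    module_file = os.path.realpath(getattr(module, "__file__", "") or "")
    return module_file.startswith(stdlib_dir + os.sep)


# If this prints a path inside your project, a local file is shadowing the module.
print(multiprocessing.__file__)
print(is_from_stdlib(multiprocessing))
```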
You can also debug the code and step through it. That's easy in Eclipse with PyDev, but remote debugging is possible too.
@masonwheeler This isn't specific to multiprocessing, but here's a bug that runs unexpected code: if your code uses the 'future' module (not the built-in
__future__), in certain cases it can import unintended modules.
For example, if your app starts from main.py and a test.py is in the same directory, and one of your modules imports future, then in certain cases that test.py gets imported.
There are some other problematic names besides test.py.
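To see which file a given module name would resolve to from your script's directory, something like the following can help. It doesn't reproduce the 'future' behaviour itself; it only shows what Python would pick up for a name such as `test`. The helper name `resolve` is made up for the example.

```python
import importlib.util


def resolve(name):
    # Return the file a top-level module name resolves to, or None if not found.
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Run this from the directory your app starts in; a path inside that
# directory means a local file would win over the standard library.
for name in ("test", "multiprocessing"):
    print(name, "->", resolve(name))
```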