Python project structure



  • I want to start my adventure with Python (for real; not like the previous seven tries). I want to make an "executable" project - not necessarily compiled to actual binary, but it will behave like a complete application, not a library.

    So, I have several very basic questions about what files I need to make and where.

    • If my project is named Foo, what would be the path to file containing main() function, relative to repo's root directory? Let's say that repo is also named Foo.
    • If I want to add module named bar which is all in a single file, where do I put it?
    • If I want to add module named qux with a submodule quz, what files do I have to create and where?
    • When and where should I make __init__.py files? Are there other special files I have to remember?
    • What do I have to do to make the project installable (and uninstallable) via pip? Or is it a bad idea?

    I tried to look it all up, but the articles I've found are all very old, very confusing and/or contradict each other. So, please help me here :grin:

    fucking markdown... fucking double underscores...



  • There's a real language that this has to be asked about?



  • No, there isn't. But scripting languages are a different beast.



  • I don't know anything about python so I can't help, but I'm curious; what will the program do?



  • A little GUI tool to make my life easier at work that no one except me and possibly my teammates will ever use. But I want to make it as "professional" as possible because why not? It's a great learning opportunity!




  • Doesn't the documentation answer most of these questions?



  • No, it answers only about 0.8 of 5 questions - specifically, how to name single-file module files, where to put child modules, and how to make scripts executable. It doesn't say where to put single-file modules in the project tree, where to put the code of a module that has both its own items and submodules (directories cannot have text data in them, after all), or how to make an executable project.



  • Basically you need to structure it like you would structure a well-behaved Python library package, and use setuptools entry points so that python setup.py install (and by extension, pip install) does the right thing and creates executables for you. Also, it will do the right thing on Windows.

    For gorier details, you might want to read the Setuptools documentation.



  • @wft said:

    Basically you need to structure it like you would structure a well-behaved Python library package

    That raises another five questions...



  • Is a plain ol' HTML file with a bunch of Javascript out of the question here?



  • That link. Have you read it?

    Also: http://click.pocoo.org/5/setuptools/#setuptools-integration


  • Winner of the 2016 Presidential Election

    I'm not exactly a Python guru (mostly use it for small scripts and have some experience with Flask), but here's what I know:

    @Gaska said:

    If my project is named Foo, what would be the path to file containing main() function, relative to repo's root directory? Let's say that repo is also named Foo.

    I don't know of any convention, and I don't think it matters. In some python file, write:

    if __name__ == '__main__':
        main_function()
    

    This is the script you use to start the application now.

    • If I want to add module named bar which is all in a single file, where do I put it?
    • If I want to add module named qux with a submodule quz, what files do I have to create and where?
    • When and where should I make __init__.py files? Are there other special files I have to remember?

    See http://programmers.stackexchange.com/questions/111871/module-vs-package and https://docs.python.org/2/tutorial/modules.html#packages:

    • A package is a directory full of Python modules with some optional initialization code (which you put in the __init__.py).
    • A module is a Python file containing any kind of definitions (functions, classes, ...) and/or code
    • It doesn't really matter what you put where, just structure your code into modules and packages however you want.

    What do I have to do to make the project installable (and uninstallable) via pip? Or is it a bad idea?


  • I would presume/suggest that you would typically want to make a module correspond to a class (or a group of tightly related classes, if they're related enough), and then go nuts with the object oriented patterns to organize code.



  • @anotherusername said:

    Is a plain ol' HTML file with a bunch of Javascript out of the question here?

    Javascript doesn't integrate well with shell commands.

    @wft said:

    That link. Have you read it?

    No, in fact, I didn't until now. And after I did read it, I'm none the wiser - one example there is for a single file, and another is for a single module - both of these are too small to be useful for me.

    @asdf said:

    • A package is a directory full of Python modules with some optional initialization code (which you put in the __init__.py).

    • A module is a Python file containing any kind of definitions (functions, classes, ...) and/or code
    • It doesn't really matter what you put where, just structure your code into modules and packages however you want.

    The natural follow-up question is: if a package is a directory and a module is a file, and (if I understood correctly) modules can't have child modules because they're files, not directories, then how do I make some functions available directly in the package's namespace?

    Also, if I'm going to use pip, do I still need the __main__ trick?


  • Winner of the 2016 Presidential Election

    @Gaska said:

    how do I make some functions available directly in package's namespace?

    You don't necessarily need to. Let's say you have a package foo containing a module baz defining a class bar. You can just write:

    from foo.baz import bar
    

    If you really want to make a symbol from a module available in the containing package's namespace, just import it in __init__.py:
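    A minimal sketch of that, assuming the foo/baz/bar names from the example above. The temporary directory is only there so the snippet can be run as-is; in a real project the two files would simply live in your repo, and the relative import in __init__.py is one possible spelling (from foo.baz import bar would work too):

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build the hypothetical package in a throwaway directory.
root = Path(tempfile.mkdtemp())
pkg = root / "foo"
pkg.mkdir()

# foo/baz.py defines the class.
(pkg / "baz.py").write_text("class bar:\n    pass\n")

# foo/__init__.py re-exports it into the package's namespace.
(pkg / "__init__.py").write_text("from .baz import bar\n")

sys.path.insert(0, str(root))
foo = importlib.import_module("foo")

# Both spellings now refer to the same class object.
from foo.baz import bar
same_object = foo.bar is bar
```

    After this, callers can write from foo import bar without knowing which module the class lives in.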

    @Gaska said:

    Also, if I'm going to use pip, do I still need the __main__ trick?

    You don't need that trick at all, you can just use "normal" scripts as entry points for your application all the time. But that trick can be quite handy: It ensures that you can import the class/function definitions from the "main" script in another module without executing the actual script.
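    To see the difference, here is a small sketch. The file name app.py and the helper run_as are made up for illustration, and runpy is used only to run the same file under two different values of __name__ without leaving the snippet:

```python
import contextlib
import io
import runpy
import tempfile
from pathlib import Path

# Hypothetical script; the file name app.py is only for illustration.
source = '''
def main_function():
    print("running the application")

if __name__ == '__main__':
    main_function()
'''
script = Path(tempfile.mkdtemp()) / "app.py"
script.write_text(source)


def run_as(name):
    """Execute the script with __name__ set to `name`; return what it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        runpy.run_path(str(script), run_name=name)
    return buffer.getvalue()


printed_as_script = run_as("__main__")  # what `python app.py` does
printed_as_import = run_as("app")       # roughly what `import app` does
```

    Only the first run prints anything; in the second, main_function is defined but never called.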



  • @asdf said:

    You don't need that trick at all, you can just write a "normal" script, but that trick can be handy.

    Let me ask again. The __main__ trick is for differentiating between being run as import vs. being run as executable script. It works by having some code on top-level (outside of any function) in an if block that runs when script is executed but not when it's imported. My question is, if I'm going to use pip, and pip uses setup.py and the list of entry points defined there instead of just running the script's top-level code, and I'm not caring about non-pip users, do I get any benefit from this __main__ trick?


  • Winner of the 2016 Presidential Election

    If you don't write scripts at all, but declare a function in a certain module as a console_scripts entry point instead (in setup.py), then using that trick doesn't make any sense at all, of course, since setuptools will automatically create a wrapper script for you and you'll never need to execute the module_xy.py directly.



  • I see. Thanks!

    So, to summarize:

    • the Python Way of organizing a project is to have a directory in your repo named exactly like the repo itself (assuming the repo is named exactly like the project, which it should be), in which all (or most) implementation files go
    • the Python Way doesn't say anything at all about the insides of this directory - it just has to be a valid package (i.e. have an __init__.py file)
    • I can do what the fuck I want and I won't be breaking any conventions because there aren't any?

    Am I mostly right here?


  • Winner of the 2016 Presidential Election

    AFAIK, yes.

    BTW:

    @Gaska said:

    Are there other special files I have to remember?

    There is one other special file I can think of: __main__.py

    https://docs.python.org/3/library/main.html#module-main

    It's basically the __name__ == '__main__' equivalent for packages (instead of modules). I've never needed it, just thought I'd mention it since you asked.



  • @wft said:

    Basically you need to structure it like you would structure a well-behaved Python library package,

    Saying "you do it how you do it" is not very helpful.

    @wft said:

    http://click.pocoo.org/5/setuptools/#setuptools-integration

    That seems to be more useful.



  • @Gaska said:

    how do I make some functions available directly in package's namespace?

    You import them in your package's __init__.py



  • @blakeyrat said:

    Saying "you do it how you do it" is not very helpful.

    If I read the OP correctly, it was implied he knows a bit how to structure a library project that other code uses and the unclear bit was how it's different when one needs to ship actual executables.



  • @Gaska said:

    • I can do what the fuck I want and I won't be breaking any conventions because there aren't any?

    Well, you're gonna make people upset if you mess with, say, the __builtins__ namespace in the __init__.py. "Why did my code go haywire after I imported your shit?!"



  • @wft said:

    If I read the OP correctly, it was implied he knows a bit how to structure a library project that other code uses

    Well, you read it wrong. Sorry if I made it not very clear that I have virtually zero Python experience.


  • Winner of the 2016 Presidential Election

    @wft said:

    Well, you're gonna make people upset if you mess with, say, the __builtins__ namespace in the __init__.py. "Why did my code go haywire after I imported your shit?!"

    a.k.a. "don't use ugly hacks unless you understand the consequences"
    a.k.a. common sense



  • I do a lot of Python and this post is probably going to be long.

    To start with, use Python 3.5. That is the newest release of the official implementation from python.org (also known as CPython). Don't bother with alternative implementations or Python 2. Read the official tutorial, too; it's fairly good.

    @Gaska said:

    If my project is named Foo, what would be the path to file containing main() function, relative to repo's root directory?

    I'd recommend going with this kind of basic structure:

    repo/
        <main package name>/
            __init__.py
            __main__.py
            <other modules and packages>
        setup.py
    

    Package names follow the rules for Python identifiers: they must start with a letter or an underscore, may contain only letters, digits and underscores, and can't be a keyword (see https://docs.python.org/3.5/reference/lexical_analysis.html#identifiers). While it's possible not to have a single root package for everything, having one eliminates the problem of name collisions, so it's worth doing (that's pretty universal though).
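    If you want to check a candidate name quickly, something like this works. The helper name is made up; str.isidentifier() and keyword.iskeyword() are standard library. Note that isidentifier() also accepts non-ASCII letters, which Python 3 allows in identifiers:

```python
import keyword


def is_valid_package_name(name):
    """Return True if `name` can be used in an import statement."""
    return name.isidentifier() and not keyword.iskeyword(name)


checks = {
    name: is_valid_package_name(name)
    for name in ["foo", "_private", "my_app2", "2fast", "my-app", "class"]
}
```

    The last three fail: one starts with a digit, one contains a hyphen, and one is a keyword.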

    A Python package is a directory that contains an __init__.py file. That file is the module that corresponds to the package name, so e.g.

    # package/__init__.py
    foo = 'bar'
    
    # somewhere else
    import package
    print(package.foo)
    

    Otherwise a module is any .py file. You import nested things by using dot notation, so package/package2/module.py is package.package2.module.
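    The mapping from file paths to import names can be sketched as a tiny function. dotted_name is a hypothetical helper, not something Python provides; it assumes the path is relative to a directory on the module search path:

```python
from pathlib import PurePosixPath


def dotted_name(relative_path):
    """Map a path relative to a search-path directory to its import name."""
    parts = list(PurePosixPath(relative_path).with_suffix("").parts)
    # A package's __init__.py is imported under the directory's own name.
    if parts[-1] == "__init__":
        parts.pop()
    return ".".join(parts)


nested = dotted_name("package/package2/module.py")
package_init = dotted_name("package/package2/__init__.py")
```

    Here nested is the name you would put after import, and package_init shows that __init__.py takes the name of its directory.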

    @Gaska said:

    If my project is named Foo, what would be the path to file containing main() function, relative to repo's root directory?

    That's the __main__.py in the main package. It should look like this:

    def main(): 
        pass
    
    if __name__ == '__main__':
        main()
    

    Others explained what the if means, but you want the __main__ module and a function for two things:

    • It makes the package directly executable (via python -m package and, if you ZIP it, via python package.zip)
    • It makes it easier to create an entry script for your project through setup.py

    Plus it's a well-known location so it's easier for people to find your entry points.
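    A rough way to try the python -m part without leaving the interpreter, assuming a made-up package called demo_app. runpy is the standard library module that implements the -m switch, so this is close to what python -m demo_app does:

```python
import contextlib
import io
import runpy
import sys
import tempfile
from pathlib import Path

# Hypothetical package, created in a throwaway directory.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_app"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "__main__.py").write_text(
    "def main():\n"
    "    print('hello from demo_app')\n"
    "\n"
    "if __name__ == '__main__':\n"
    "    main()\n"
)

sys.path.insert(0, str(root))

# Run the package's __main__ module the way the -m switch would.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    runpy.run_module("demo_app", run_name="__main__")
output = buffer.getvalue()
```

    From a shell, the equivalent would be running python -m demo_app in the directory that contains demo_app/.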

    @Gaska said:

    • If I want to add module named bar which is all in a single file, where do I put it?

    • If I want to add module named qux with a submodule quz, what files do I have to create and where?

    As examples, bar.py and qux/__init__.py, qux/quz.py.
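    Put together, and assuming the main package is called foo as in the question, the tree could look like the keys of the dict below. The file contents are placeholders and the temporary directory stands in for the repo root:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Files relative to the repo root; contents are placeholders.
files = {
    "foo/__init__.py": "",
    "foo/bar.py": "NAME = 'bar'\n",
    "foo/qux/__init__.py": "NAME = 'qux'\n",
    "foo/qux/quz.py": "NAME = 'quz'\n",
}

repo = Path(tempfile.mkdtemp())
for relative, content in files.items():
    target = repo / relative
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content)

sys.path.insert(0, str(repo))

# Each file is importable under its dotted name.
names = [
    importlib.import_module(module).NAME
    for module in ("foo.bar", "foo.qux", "foo.qux.quz")
]
```

    Note that qux's own code lives in foo/qux/__init__.py, which answers the "module with both its own items and submodules" case.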

    The search path for modules is in the sys.path variable. It is determined by a few things (you can change it from code, since it's an ordinary list, or from the environment by setting PYTHONPATH), but you don't have to worry about it if you structure the project like this and write a proper setup.py.
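    A quick illustration of sys.path being an ordinary list, using a made-up module name in a temporary directory. This is only to show the mechanism, not something you'd normally do in application code:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# A module in a directory Python does not know about yet.
# The module name is hypothetical and chosen to avoid clashes.
directory = Path(tempfile.mkdtemp())
(directory / "lonely_module.py").write_text("ANSWER = 42\n")

try:
    importlib.import_module("lonely_module")
    found_before = True
except ImportError:
    found_before = False

# sys.path is a plain list, so it can be edited at runtime.
sys.path.append(str(directory))
importlib.invalidate_caches()
found_after = importlib.import_module("lonely_module").ANSWER
```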

    @Gaska said:

    • What do I have to do to make the project installable (and uninstallable) via pip? Or is it a bad idea?

    You need a setup.py file that calls setuptools.setup(). The most basic form looks like this:

    from setuptools import setup, find_packages
    
    package_name = '<main package name>'
    packages = find_packages(include = [package_name, package_name + '.*'])
    
    setup(
        name = '<project name>',
        version = '0.1.0',
        packages = packages,
    )
    

    Then you can install the project in editable mode (that is, without copying the files, only making it a part of sys.path without manually adjusting PYTHONPATH, so you only have to do it once) via pip install -e <repo>.

    You can use entry_points argument to make use of that __main__ module like this:

    setup(
        ...,
        entry_points = {
            'console_scripts': [
                '<executable name> = <main project package>.__main__:main'
            ]
        }
    )
    

    Then pip install will generate a wrapper executable that calls the given function. This is not standalone though; it will still require a full Python installation.
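    Conceptually, the wrapper has to turn the 'name = module:function' string into a callable and call it. The following is only a sketch of that idea, not setuptools' actual code; resolve_entry_point is a made-up helper, and a standard library function stands in for your own main:

```python
import importlib


def resolve_entry_point(spec):
    """Turn 'name = package.module:function' into (name, callable)."""
    name, _, target = (part.strip() for part in spec.partition("="))
    module_name, _, attribute = target.partition(":")
    module = importlib.import_module(module_name)
    return name, getattr(module, attribute)


# A standard-library function stands in for <main package>.__main__:main.
name, function = resolve_entry_point("join-paths = posixpath:join")
result = function("repo", "setup.py")
```

    A generated wrapper would then call the function with no arguments and pass its return value to sys.exit(), which is why main() usually takes no parameters.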

    You can declare dependencies using the install_requires argument to setup(); it takes a list. Dependencies are installed from PyPI (https://pypi.python.org/pypi), so search there. You can also install directly from Git/Hg/SVN repos (you'll need the dependency_links argument).

    Among other things, I recommend using virtualenvs from the start: in Python 3.5 this is built in, just run python -m venv <directory> and it will create an isolated environment there. Then activate it in your shell with . <directory>/bin/activate (*nixes), . <directory>/Scripts/Activate.ps1 (Windows, PowerShell) or <directory>\Scripts\activate (Windows, cmd). After that, pip will know what to do.

    There's some more story to creating distributions and uploading them to PyPI, not sure if you need that at the moment.

    @wft said:

    You import them in your package's __init__.py

    Note here: careful with this. Python executes scripts as it goes, and that's also true for imports. And modules are created before the code of the module starts executing (because the names have to go somewhere). Which has a nasty consequence of circular imports having rather unintuitive behaviour:

    # a.py
    import b
    x = 42
    
    # b.py
    import a
    print(a.x) # AttributeError, because import above returns an already-created but yet-unfilled module
    

    Circular imports are otherwise fine, as long as the module code has a chance to execute to the end before it's imported (import statements don't have to be at the top-level). But I'd avoid them if possible.
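    Both behaviours can be reproduced in one go. The module names below are made up and written to a temporary directory; the first pair mirrors the a.py/b.py example above, the second pair defers the import into a function so it only runs after the other module has finished executing:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module names, written to a throwaway directory.
root = Path(tempfile.mkdtemp())

# Broken pair: circ_b reads circ_a.x while circ_a is still half-executed.
(root / "circ_a.py").write_text("import circ_b\nx = 42\n")
(root / "circ_b.py").write_text("import circ_a\nvalue = circ_a.x\n")

# Deferred variant: the import happens when the function is called,
# by which time good_a has finished executing.
(root / "good_a.py").write_text("import good_b\nx = 42\n")
(root / "good_b.py").write_text(
    "def read_x():\n"
    "    import good_a\n"
    "    return good_a.x\n"
)

sys.path.insert(0, str(root))

try:
    importlib.import_module("circ_a")
    error = None
except AttributeError as exc:
    error = exc

importlib.import_module("good_a")
deferred_value = sys.modules["good_b"].read_x()
```

    The first import fails with AttributeError; the second pair works because good_a.x exists by the time read_x() looks it up.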



  • @CatPlusPlus said:

    Among other things, I recommend using virtualenvs from the start: in Python 3.5 this is built in, just run python -m venv <directory> and it will create an isolated environment there. Then activate it in your shell with . <directory>/bin/activate (*nixes), . <directory>/Scripts/Activate.ps1 (Windows, PowerShell) or <directory>\Scripts\activate (Windows, cmd). After that, pip will know what to do.

    Uuuh, nice.

    I didn't know they added that.



  • That was a great post - both in size and in quality. Thank you very much!

    Just one more question: from what I understood, __init__.py is just a regular Python module that assumes the name of the directory it's in, and otherwise there's nothing special about it? I can treat it just like any other file and put any code I want there? Or would it be against some convention?



  • @Gaska said:

    Just one more question: from what I understood, __init__.py is just a regular Python module that assumes the name of the directory it's in, and otherwise there's nothing special about it? I can treat it just like any other file and put any code I want there?

    Yes to all of those. Sometimes projects put metadata in their root package (see for example requests: https://github.com/kennethreitz/requests/blob/master/requests/__init__.py), but that's it as far as conventions go. It might be worth looking at some other popular projects (http://pypi-ranking.info/alltime), too.

