How to best deal with Python's dying batteries?



  • Python used to boast that

    Python comes with batteries included

    and that was great for various utilities, in my case usually parts of building and deploying something. But lately the batteries have been pretty flat and I end up needing additional libraries (semver, ruamel-yaml, sgqlc, requests, azure-identity …).

    When the libraries were at least available and reasonably up to date in Ubuntu, it wasn't too big of a deal, but the last batch either isn't available or is out of date. This is compounded by the fact that some smart ass at Microsoft decided that the azure-cli package they create for Ubuntu not only installs everything into a non-standard location, but even carries its own python executable.

    Now I know python's been pushing this venv thing, but I don't have experience with it.

    So how do I best set up something that I can easily run anywhere that has a Python installed? There are two sub-cases:

    1. The git repository is checked out, after which the script should just run with a simple python3 …path/to/script.py
    2. The git repository cannot be checked out where the script should run, so the script needs to be exported from the git repository, preferably as a single file that can be executed (on any platform, using the system python).

    Does anybody have experience with an appropriate approach for these tasks?


  • And then the murders began.

    @Bulb How does the azure-cli install impact what you want to do? Its Python install should just be an implementation detail that you never see. (i.e. go ahead and write shell scripts that call it, and it’ll just be a black box; but from within Python you should be using their SDK, not trying to use azure-cli’s internals.)


  • Discourse touched me in a no-no place

    @Bulb said in How to best deal with Python's dying batteries?:

    Now I know python's been pushing this venv thing, but I don't have experience with it.

    A virtual environment is really just a directory structure in a particular pattern and some environment variables to point into it. You can tell it to use a system-provided Python (probably some variant of python3) and if necessary the system-provided packages that go with it; that's not recommended for anything that doesn't come with Python.

    Once you've set one up (made the directory structure) and activated it (set the environment variables, usually by running the bin/activate script in the directory structure) then it's straightforward; you just install everything you need in there. Probably by starting with getting an up-to-date pip. Yes, it uses more disk space than sharing everything, but disk capacity is really very cheap for the amount of room you're going to need. (It's all much easier now that there are wheels for most packages that need them, and you don't have to do nasty things like building SciPy from source.)
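
    For what it's worth, the same steps can be scripted without the interactive activate step, by driving the venv's own interpreter directly; a minimal sketch, assuming a Unix-like layout (".venv" is just an example name, the packages are from the list above):

      # Create the venv (directory structure + pip), then call its python
      # directly instead of activating; that interpreter already knows its venv.
      import subprocess
      import venv

      venv.EnvBuilder(with_pip=True).create(".venv")
      py = ".venv/bin/python"          # .venv\Scripts\python.exe on Windows
      subprocess.run([py, "-m", "pip", "install", "--upgrade", "pip"], check=True)
      subprocess.run([py, "-m", "pip", "install", "requests", "ruamel.yaml"], check=True)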



  • @Unperverted-Vixen said in How to best deal with Python's dying batteries?:

    @Bulb How does the azure-cli install impact what you want to do? Its Python install should just be an implementation detail that you never see. (i.e. go ahead and write shell scripts that call it, and it’ll just be a black box; but from within Python you should be using their SDK, not trying to use azure-cli’s internals.)

    It already installs half of the libraries I want to use, so it would save me quite a few manual installs.



  • @dkf said in How to best deal with Python's dying batteries?:

    @Bulb said in How to best deal with Python's dying batteries?:

    Now I know python's been pushing this venv thing, but I don't have experience with it.

    A virtual environment is really just a directory structure in a particular pattern and some environment variables to point into it. You can tell it to use a system-provided Python (probably some variant of python3)

    It “does”, but not in the sense I mean. What it does is symlink (or wrap, on systems that don't have symlinks) python into the venv directory, and only if you run it through that symlink is it configured to work in that venv. Which I consider the antithesis of easily executing a script.

    It also uses absolute paths in a bunch of places, making the venv non-portable, which means it has to be set up after checkout—on the build server the build directory is different every time (it contains the build id) and it starts out empty, because the build runs in a freshly booted VM anyway—and that's again the antithesis of easily executing a script.

    and if necessary the system-provided packages that go with it; that's not recommended for anything that doesn't come with Python.

    I'm fine with including all non-stdlib packages; most of the system-provided ones are out of date anyway (well, the python is too, but there it's less of a problem).

    Once you've set one up (made the directory structure) and activated it (set the environment variables, usually by running the bin/activate script in the directory structure) then it's straightforward;

    The need to “activate” it is a :wtf:.

    The activation is dependent on which shell you run it from. With a handful of terminal windows inside the IDE and a handful more outside, it's just too easy to run it in the wrong one, and then it just doesn't work, probably in a way it doesn't have good diagnostics for. And then there is all the automation, which all needs to understand this Python feature to work. That's a pain in the ass, not simply running a script.

    It does work consistently if you put the venv's local bin/python on the #!-line, but the #!-line doesn't support relative paths, making it another thing that needs to be updated after checkout.

    you just install everything you need in there. Probably by starting with getting an up-to-date pip. Yes, it uses more disk space than sharing everything, but disk capacity is really very cheap for the amount of room you're going to need.

    Well, I have to include it if I want to distribute it easily anyway; that's not the problem.

    (It's all much easier now that there are wheels for most packages that need them, and you don't have to do nasty things like building SciPy from source.)

    I'd only allow pure python packages. The script is supposed to run (almost) anywhere…



  • @dkf Have you used the pip wheel command? That, combined with the Python default of inserting the script's directory at the beginning of sys.path, looks like it might address the issue here.
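
    A minimal check of that default, nothing project-specific assumed:

      # sys_path_demo.py - run as: python3 some/dir/sys_path_demo.py
      # Python puts the directory containing the invoked script at sys.path[0],
      # so wheels or modules dropped next to the script are found first.
      import sys
      print(sys.path[0])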


  • Java Dev

    @Bulb said in How to best deal with Python's dying batteries?:

    It does work consistently if you put the venv's local bin/python on the #!-line, but the #!-line doesn't support relative paths, making it another thing that needs to be updated after checkout.

    What if you add a separate wrapper which activates the venv and then, inside it, invokes python with the script? Then none of your other code needs to run in the venv.

    Just guessing, I don't have experience with this, but I know I've got one of these on the backlog as well.
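
    Something along these lines, maybe (the .venv and script.py names are placeholders, and it assumes a Unix-style venv layout):

      #!/usr/bin/env python3
      # run.py - wrapper sketch: find the venv sitting next to this file and
      # re-exec the real script with the venv's interpreter, so nothing else
      # ever needs to activate anything.
      import os
      import sys

      here = os.path.dirname(os.path.abspath(__file__))
      venv_python = os.path.join(here, ".venv", "bin", "python")
      os.execv(venv_python, [venv_python, os.path.join(here, "script.py"), *sys.argv[1:]])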



  • @PleegWat

    1. That's another blow to the “simply run a script” concept.
    2. It doesn't work for the IDE. Ok, the IDE has a plugin that can be easy-ish-ly told to use the venv python link, but still, E2MUCHSETUP.


  • Going to throw a real grenade into the mix here: Dockerise it.

    You can absolutely design a Docker container to run once and terminate; you encapsulate all the things you want, in the versions you want, in the container.

    The part that I’m wary of is getting credentials into it for the purposes of handing that to AWS, but depending on how you plan on doing that, it might not be insurmountable.



  • @Arantor said in How to best deal with Python's dying batteries?:

    Going to throw a real grenade into the mix here: Dockerise it.

    Actually … that's what it currently is. Except

    • I seem to be the only one mostly sticking to actually developing in devcontainers,
    • the configuration to run the build in that container is a special-❄ hack (because there is no simple tool to interpret the .devcontainer/devcontainer.json), and
    • it's easier to just slap a bunch of .py files in a ConfigMap and mount them in a container based on the standard python:3.11 (or whichever) image than to go through the extra trouble of building a bunch of special images, which means setting up their own builds, propagating the tags to the deployment, etc.

    You can absolutely design a Docker container to run once and terminate; you encapsulate all the things you want, in the versions you want, in the container.

    The part that I’m wary of is getting credentials into it for the purposes of handing that to AWS, but depending on how you plan on doing that, it might not be insurmountable.

    Credentials can be passed at runtime either in environment variables, or by binding/mounting appropriate files. I do that all over the place.



  • @Bulb to the last point, I do that too, but I wasn’t sure from the context whether there was some other, more magic way of it working. I’ve seen people do weird shit with the AWS cli before now.

    I just thought I’d throw the idea in because this is the sort of problem Docker is good at (and where venv really isn’t that helpful)



  • @Arantor I don't know the AWS cli, but the Azure cli has its five or so standard methods of logging in (credentials from arguments, credentials from environment, managed identity, browser interactive and device code grant) and at least one is usable in any of the use-cases. And then the azure-identity library (there's a version of it for every language) can use the same methods directly, or pick them up from the azure cli.
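
    Roughly like this in the Python flavour of the library; DefaultAzureCredential walks the standard methods (environment, managed identity, azure cli login, optionally browser) until one works. The scope below is just the ARM example:

      from azure.identity import DefaultAzureCredential

      # Tries the configured methods in order and caches whichever one succeeds.
      credential = DefaultAzureCredential()
      token = credential.get_token("https://management.azure.com/.default")
      # Any Azure SDK client then takes the same credential object, e.g.
      # BlobServiceClient(account_url, credential=credential).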


  • Discourse touched me in a no-no place

    @Bulb said in How to best deal with Python's dying batteries?:

    I'd only allow pure python packages. The script is supposed to run (almost) anywhere…

    But when the wheels exist for pretty much any package and platform you're likely to need them for, why restrict yourself? The "(almost) anywhere" is really fantastically unlikely to be more than a really quite small set of platforms: three different operating systems most likely, and at most four CPU architectures (though 32-bit x86 is getting rare now). Do you really need to worry about an old Sun UltraSPARC workstation, when that's something you're probably never going to see now outside a museum or the lower levels of the city trash dump? There's a limit on how flexible you need to be. (I can guarantee that there are platforms that won't run your code. Mostly because you can't fit Python on them in the first place...)

    Avoiding custom C++ code for your project is far more understandable.



  • The best I came up with so far is:

    • deps.py (in the same directory as the other scripts):

      # Put every wheel from the pydeps directory (next to this file) at the
      # front of sys.path; pure-python wheels are zip-importable as-is.
      from os.path import dirname, join
      import os
      import sys
      d = join(dirname(__file__), 'pydeps')
      sys.path[0:0] = [join(d, f) for f in os.listdir(d)]
      
    • script:

      import deps
      import …
      
    • updating:

      update the dependency list in requirements.txt, clear out the pydeps directory, then run pip wheel -r requirements.txt -w pydeps and check the result in.



  • @dkf said in How to best deal with Python's dying batteries?:

    @Bulb said in How to best deal with Python's dying batteries?:

    I'd only allow pure python packages. The script is supposed to run (almost) anywhere…

    But when the wheels exist for pretty much any package and platform you're likely to need them for, why restrict yourself?

    Because the part that is allowed access to the infernet does not know where it will eventually run and the part where it eventually runs is not allowed to go fetch them from anywhere (well, I could allow it, but that again defeats the purpose of simply running a script if it starts by downloading something).



  • @Bulb said in How to best deal with Python's dying batteries?:

    The git repository cannot be checked out where the script should run, so the script needs to be exported from the git repository, preferably as a single file that can be executed (on any platform, using the system python).

    … I knew I was already looking for something for that once: pex, which zips up a package with an entry point and all its dependencies into the zipped bundle python knows how to execute – without also bundling the interpreter itself the way py2exe does, so that suits that use-case.
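
    The stdlib's zipapp module does the bare-bones version of the same trick; a minimal sketch, with made-up directory and entry-point names:

      import zipapp

      # Bundle the contents of build/app (package plus vendored pure-python deps)
      # into a single mytool.pyz that any system python can run directly.
      zipapp.create_archive(
          "build/app",                         # directory that becomes the archive root
          target="mytool.pyz",
          interpreter="/usr/bin/env python3",  # written into the archive's #! line
          main="mytool.cli:main",              # module:function importable from the root
          compressed=True,
      )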

    … I'll probably stick to the devcontainer for running from git, I've got that set up anyway.


  • BINNED

    @Bulb said in How to best deal with Python's dying batteries?:

    Still unsure about what pex does or how it works? Watch this quick lightning talk: WTF is PEX?.

    Lightning talk. 16:51.



  • @topspin Yeah, when after 10 seconds it was still playing a jingle, I closed it and went to look at the documentation instead.

