Recommended environments for machine learning on Arch Linux with Python


Arch Linux is a rolling-release distribution, so you get the most up-to-date packages. It also means you may need to move certain drivers or libraries to new versions as the system updates.

By default, Python dependencies are installed into the system or user packages for the installed Python version. This becomes a problem when your projects need different versions of the same Python package, because other software on the system depends on those files too.

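Note: newer versions of pip (23.0 and later) support the "externally managed environment" marker from PEP 668. When the distribution ships that marker, pip refuses to install Python dependencies into the system and user packages unless you explicitly override it.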
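To see whether this applies to your interpreter, here is a minimal sketch that looks for the marker file PEP 668 describes. The function names are made up for illustration, and the final check is only a rough approximation of what pip does.

```python
import sys
import sysconfig
from pathlib import Path


def externally_managed_marker() -> Path:
    """Path where PEP 668 places the EXTERNALLY-MANAGED marker file."""
    stdlib_dir = sysconfig.get_path("stdlib")
    return Path(stdlib_dir) / "EXTERNALLY-MANAGED"


def in_virtual_environment() -> bool:
    """True inside a venv, where the prefix differs from the base install."""
    return sys.prefix != sys.base_prefix


def pip_would_likely_refuse() -> bool:
    """Rough check: the marker exists and we are not inside a venv."""
    return externally_managed_marker().is_file() and not in_virtual_environment()


print("marker file:", externally_managed_marker())
print("inside a venv:", in_virtual_environment())
print("pip likely to refuse installs:", pip_would_likely_refuse())
```

If the last line prints True, installing into a project environment is the way forward.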

I recommend that you create an environment for your project and then install your dependencies into that environment. For projects that depend on special drivers or libraries, such as GPU drivers, CUDA, or a specific Python version, you want a more encapsulating environment that can pin those as well.

Here are some options for creating a new environment for your project.

Anaconda

Anaconda is a distribution of Python that aims to simplify package management. It lets you install specific CUDA versions, Python versions, and other dependencies in each environment.

Look into keeping an environment.yml for your Anaconda environment. It plays the same role that requirements.txt plays for pip.
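As a rough sketch of what such a file contains, the snippet below renders a small environment.yml from Python. The environment name, the channel, and the pinned versions (including cudatoolkit=11.8) are assumptions for illustration, not recommendations. Check which versions your project and GPU driver actually need.

```python
def render_environment_yml(name, channels, conda_deps, pip_deps=()):
    """Build the text of a conda environment.yml from plain lists."""
    lines = [f"name: {name}", "channels:"]
    lines += [f"  - {channel}" for channel in channels]
    lines.append("dependencies:")
    lines += [f"  - {dep}" for dep in conda_deps]
    if pip_deps:
        # Packages installed with pip inside the conda environment.
        lines.append("  - pip:")
        lines += [f"    - {dep}" for dep in pip_deps]
    return "\n".join(lines) + "\n"


# Hypothetical project values, chosen only to show the layout.
environment_text = render_environment_yml(
    name="ml-project",
    channels=["conda-forge"],
    conda_deps=["python=3.11", "cudatoolkit=11.8", "pip"],
    pip_deps=["numpy"],
)
print(environment_text)
```

Save the printed text as environment.yml and create the environment with `conda env create -f environment.yml`.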

Docker/Podman

Docker and Podman are another option: you build a container image that installs specific versions of any dependencies. Containers are also useful for deploying your application to cloud services that support Docker containers, such as Amazon SageMaker or Vast.ai.
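To make the idea concrete, here is a minimal sketch that renders a Dockerfile for a Python project. The requirements.txt file and the train.py entry point are hypothetical names, and the official python slim image is only one possible base. A GPU project would likely start from a CUDA-enabled image instead.

```python
def render_dockerfile(python_version, requirements_file="requirements.txt",
                      entrypoint="train.py"):
    """Return Dockerfile text that pins the Python version and installs deps."""
    lines = [
        # Pin the interpreter through the base image tag.
        f"FROM python:{python_version}-slim",
        "WORKDIR /app",
        # Copy the requirements first so the install layer can be cached.
        f"COPY {requirements_file} .",
        f"RUN pip install --no-cache-dir -r {requirements_file}",
        "COPY . .",
        f'CMD ["python", "{entrypoint}"]',
    ]
    return "\n".join(lines) + "\n"


print(render_dockerfile("3.11"))
```

Write the output to a file named Dockerfile, then build it with `docker build -t ml-project .` or `podman build -t ml-project .` (the tag is an example).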

Cog

Cog is an open-source tool that lets you package machine learning models in a standard, production-ready container.

It is designed from the ground up to support containerizing machine learning models, so it could be an option for you.

replicate/cog

Distrobox

Use any linux distribution inside your terminal. Enable both backward and forward compatibility with software and freedom to use whatever distribution you’re more comfortable with.

Distrobox lets you run another Linux distro inside a container, using Podman or Docker underneath. You can then install all your dependencies and run your project in there.
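The basic workflow is to create a box, then enter it. The sketch below only builds and prints the two commands. The box name and the Ubuntu image are assumptions for illustration, so pick whichever distribution your project targets.

```python
import shutil


def distrobox_commands(name, image):
    """Return the argv lists that create a box and then enter it."""
    return [
        ["distrobox", "create", "--name", name, "--image", image],
        ["distrobox", "enter", name],
    ]


# Hypothetical box name and image, for illustration only.
commands = distrobox_commands("ml-box", "docker.io/library/ubuntu:22.04")

print("distrobox installed:", shutil.which("distrobox") is not None)
for argv in commands:
    print(" ".join(argv))
```

Run the printed commands in your terminal. Once inside the box, you install dependencies with that distribution's own package manager.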

distrobox

Considerations

Consider these options for what may work for you.

venv

In a lot of cases a venv is fine, provided your project works with the more up-to-date versions. Keep in mind that a venv isolates Python packages only: it still uses the system Python and the system drivers. The problem comes when you need to update to a new version and then have to hack up the requirements to work across multiple versions of things like PyTorch or TensorFlow.
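For reference, a venv can be created from Python with the standard library venv module. This is a minimal sketch: the .venv directory name is a common convention rather than a requirement, and the bin/python path assumes Linux.

```python
import venv
from pathlib import Path


def create_project_env(project_dir, with_pip=True):
    """Create a virtual environment in <project_dir>/.venv and return its path."""
    env_dir = Path(project_dir) / ".venv"
    builder = venv.EnvBuilder(with_pip=with_pip)
    builder.create(env_dir)
    return env_dir


def env_python(env_dir):
    """Path to the environment's interpreter (Linux layout)."""
    return Path(env_dir) / "bin" / "python"
```

Calling `create_project_env(".")` in your project directory does the same thing as `python -m venv .venv`. After that, `.venv/bin/python -m pip install -r requirements.txt` installs into the environment instead of the system packages.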