Modules, packages, and libraries

Author

Marie-Hélène Burle

Definitions

“Modules” are Python files containing reusable code (e.g. functions, constants, utilities).

“Packages” are collections of modules.

“Libraries”, technically, are collections of packages, although “packages” and “libraries” are often used loosely and interchangeably in Python.

Installing packages on your machine

You can install external packages containing additional functions, constants, and utilities to extend the capabilities of Python.

The Python Package Index is a public repository of open source packages contributed by users.

Installation of packages can be done via pip.

Instead of installing packages system wide or for your user, you can create a semi-isolated Python environment in which you install the packages needed for a particular project. This makes reproducibility and collaboration easier. It also helps handle dependency conflicts. Some Linux distributions will not let you use pip outside a virtual environment anymore. It is a great practice to always use virtual environments.

Create a Python virtual environment called env:

python -m venv ~/env

Activate it:

source ~/env/bin/activate

Update pip:

python -m pip install --upgrade pip

Install packages:

python -m pip install <package>

On your local machine, particularly if you are on Windows and want to install a complex software stack, conda can makes things easy by installing from the Anaconda Distribution.

Installing packages on the clusters

Don’t use conda or Anaconda on the Alliance clusters. If you really must, do it in a container with Apptainer.

On the Alliance clusters, install packages inside a virtual environment and use Python wheels whenever possible.

You can see whether a wheel is available with avail_wheels <package> or look at the list of available wheels. To install from wheels instead of downloading from PyPI, add the --no-index flag to the install command.

Advantages of wheels:

  • compiled for the clusters hardware,
  • ensures no missing or conflicting dependencies,
  • much faster installation.

The workflow thus looks like:

python -m venv ~/env
source ~/env/bin/activate
python -m pip install --upgrade --no-index pip
python -m pip install --no-index <package>