Conda/Mamba¶
Conda¶
For the most up-to-date and detailed documentation please refer to the official conda documentation at https://docs.conda.io/en/latest/
If you only need to manage Python packages, consider using venv, which can be simpler.
Mamba
Mamba is a drop-in replacement for conda with a faster package and dependency resolver. Mamba is available if you self-install using the Miniforge instructions below.
Loading Conda¶
Conda is available through the module system:
Terminal
module load miniconda3
source $(conda info --base)/etc/profile.d/conda.sh
This will load the latest version of conda installed on the system.
Note
After loading conda, you need to run source $(conda info --base)/etc/profile.d/conda.sh to load the conda functions into your environment. This must be done every time the miniconda3 module is loaded. It is done instead of using conda init. The use of conda init is not recommended, as it hard-codes a specific version of conda into your ~/.bashrc, which can cause issues if you are not managing the installation of conda yourself.
This also mirrors what is required when using conda in a SLURM script, as your ~/.bashrc is not parsed as part of a SLURM job.
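A quick way to confirm that the profile script has been sourced is to check that conda is defined as a shell function rather than only as an executable on your PATH. The following is a minimal sketch, assuming a POSIX-compatible shell such as bash; the helper name conda_is_loaded is hypothetical and not part of conda.

```shell
# Hypothetical helper: succeeds only when `conda` is a shell function,
# which is what sourcing etc/profile.d/conda.sh provides.
conda_is_loaded() {
    # For a shell function, `command -v` prints just the name;
    # for an executable on PATH it prints the full path instead.
    [ "$(command -v conda 2>/dev/null)" = "conda" ]
}

if conda_is_loaded; then
    echo "conda shell functions are loaded"
else
    echo "conda functions not loaded: source the conda.sh profile script first"
fi
```

The same check can be placed near the top of a job script to fail early when the source line has been forgotten.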
Self installing¶
If you would prefer to install and manage your own version of conda, you can do so. We recommend using Miniforge to manage conda environments and packages. Miniforge is a community-led, minimal conda/mamba installer that uses conda-forge as the default channel.
To install Miniforge under your user account, you can use the following commands:
Terminal
wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh
bash Miniforge3-Linux-x86_64.sh -b -u
Once installed (the default location is /home/<user>/miniforge3), load conda using:
Terminal
source ~/miniforge3/etc/profile.d/conda.sh
Note
Installing Miniforge and loading it as above will also give you access to mamba, which is a faster drop-in replacement for conda.
Extra Configuration¶
Cache location¶
By default, conda downloads package files into a cache in your home directory. It is a good idea to change this to your project directory instead (if you have one):
Terminal
conda config --add pkgs_dirs /path/to/project/conda_pkgs
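It can help to create the cache directory first and then confirm that conda picked up the setting. The following is a minimal sketch; set_conda_cache is a hypothetical helper (not a conda command), and the path in the usage comment is a placeholder for your own project directory.

```shell
# Hypothetical helper: point conda's package cache at <project_dir>/conda_pkgs.
set_conda_cache() {
    pkgs_dir="$1/conda_pkgs"
    # Create the cache directory up front so conda can write to it.
    mkdir -p "$pkgs_dir" || return 1
    # --add puts the new directory at the front of the pkgs_dirs list;
    # --show then prints the resulting setting for confirmation.
    conda config --add pkgs_dirs "$pkgs_dir" &&
        conda config --show pkgs_dirs
}

# Usage (replace the path with your own project directory):
#   set_conda_cache /path/to/project
```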
Bioconda¶
Bioconda (https://bioconda.github.io) is a popular repository for bioinformatics software. To make use of the bioconda repository, you must configure conda to know about it. The following commands are from https://bioconda.github.io/#usage and will configure conda to search and download from the bioconda repository when installing into environments.
Terminal
conda config --add channels bioconda
conda config --add channels conda-forge
conda config --set channel_priority strict
These commands modify your ~/.condarc file.
Conda environments¶
Conda environments let you manage software (and its dependencies). Ultimately, conda installs software into specific directories and then alters your PATH for you to make it accessible. To use an environment, it must first exist and then be activated.
There are two types of environments: named and prefix.
- Named environments are installed within your home directory (subdirectories within ~/.conda/envs) and can be activated by name with conda activate <environment_name>.
- Prefix environments are installed into the location you specify at creation. If this location is accessible by others, they too can use the environment. For reproducibility it is useful to create an environment in a project directory and use that for operating on data there. When you change to a different project, you can activate the corresponding environment. This lets you manage your software at the project level.
Note
As named environments are stored in your home directory, they can take up a large portion of your home directory storage quota. They are also not accessible to others.
The use of prefix environments is encouraged, as you can specify where they are stored (ideally in your project directory).
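To see how much of your home quota named environments are using, you can measure each environment directory. The following is a minimal sketch, assuming named environments live under ~/.conda/envs as described above; the helper name env_sizes is hypothetical.

```shell
# Hypothetical helper: print the disk usage of each environment directory
# under the given location (defaults to ~/.conda/envs).
env_sizes() {
    envs_dir="${1:-$HOME/.conda/envs}"
    if [ ! -d "$envs_dir" ]; then
        echo "no environments directory at $envs_dir" >&2
        return 1
    fi
    for env in "$envs_dir"/*/; do
        # Skip the unexpanded pattern when the directory is empty.
        [ -d "$env" ] || continue
        du -sh "$env"
    done
}

# Report on the default location; ignore the error if it does not exist.
env_sizes || true
```

Passing a different directory as the first argument, such as a project directory that holds prefix environments, reports on that location instead.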
Warning
We strongly recommend against using conda init. It inserts a snippet into your ~/.bashrc file that freezes the version of conda used, bypassing the environment module system.
Creating and activating a sub-environment
Once you have activated the base conda environment, you can in principle start installing packages immediately. However, your use of conda will generally be better organised if you do not install packages directly into the base environment, but instead use a named sub-environment. You can have multiple sub-environments under a single base environment, and activate whichever one is required at any one time. Unless you install packages directly into the base environment, your sub-environments will work independently.
Managing Conda Environments to Conserve Home Directory Storage
To save home directory storage space, it is recommended to create Conda environments in a shared project directory. This approach allows you to manage your Conda environments within your project directory and if needed share them with collaborators. If you do not yet have a shared project directory, please contact RTIS Solutions to request one.
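To create an environment, ensure that the base environment is activated (the command prompt should start with "(base) "), and type one of the following. The first command creates a prefix environment at the path you give; the second creates a named environment (for example, called "myenv"):
Creating Conda Environments
# prefix environment, stored at the given path
conda create -p /path/to/create/environment/
# named environment, stored under your home directory
conda create -n myenv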
It will show the proposed installation location and, once you answer the prompt to proceed, will do the installation. If you have followed these instructions, this location should be /home/users/<your_username>/miniconda3/envs/myenv. You can alternatively give it a different location using the option -p <path> instead of -n <name>.
Warning
Do not create conda environments in subdirectories of /mnt/auto-hcs/: conda will either fail or run into issues.
Once you have created your sub-environment, you can activate it using conda activate followed by its path or name, for example:
Terminal
# prefix environment
conda activate /path/to/conda/environment
# named environment
conda activate myenv
The command prompt will then change (e.g. to start with "(myenv) ") to reflect this. Typing conda deactivate once will return you to the base environment; typing it a second time will deactivate conda completely.
To list your conda environments, type the following:
Terminal
conda env list
Installing conda packages¶
Once you have activated an environment, you can install packages with the conda install command, for example:
Terminal
conda install gcc
You can also force particular versions to be installed. See the conda cheat sheet for details.
To list the packages installed in the currently activated environment, you can type conda list.
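A particular version can be requested by appending =<version> to the package name. The following is a minimal sketch; install_pinned is a hypothetical helper rather than a conda command, and the package and version in the usage comment are illustrative only.

```shell
# Hypothetical helper: install one package at a specific version into the
# currently activated environment.
install_pinned() {
    # "$1=$2" forms a spec such as numpy=1.26; --yes skips the prompt.
    conda install --yes "$1=$2"
}

# Usage, with an environment already activated:
#   install_pinned numpy 1.26
#   conda list numpy    # confirm which version was installed
```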
Cleaning up¶
Cleaning Cache¶
Once you've made your environments, it can be a good idea to clean up your cache:
Terminal
# remove index cache, lock files, unused cache packages, tarballs, and logfiles
conda clean --all
Migrating an Existing Conda Environment¶
To move an existing Conda environment to a new location:
- Export your current environment to a YAML file:
Terminal
conda env export --name existing_env > environment.yml
- Create a new environment from the exported YAML file at your chosen location:
Terminal
conda env create --prefix /path/to/project_directory/env/conda_envs/myenv --file environment.yml
- Activate the newly created environment:
Terminal
conda activate /path/to/project_directory/env/conda_envs/myenv
Creating an Alias for Easy Activation¶
To simplify environment activation, consider adding an alias to your shell configuration file (e.g., .bashrc or .bash_profile):
Terminal
alias activate_myenv="conda activate /path/to/project_directory/env/conda_envs/myenv"
Activate your environment using the alias:
Terminal
activate_myenv
This method is Python-version agnostic and provides a convenient way to manage Conda environments in shared or collaborative project directories.
Removing environments¶
If you no longer need an environment, the easiest way to remove it is one of the following:
Terminal
# prefix environment
conda env remove -p /path/to/env
# named environment
conda env remove -n env_name
Running packages from your conda environment¶
To run packages from a conda environment that you installed previously, you first need to activate the environment in your current session. You do not need to repeat the steps that created the environment or installed the software, but the following may be needed again:
Terminal
conda deactivate
conda activate myenv
Installing pip packages¶
Many Python packages that are available via PyPI are also available as conda packages in conda-forge, and it is generally best to install these with conda install as above.
Nonetheless, you can also install pip packages (as opposed to conda packages) into your conda environment. However, first you should type:
Terminal
conda install pip
before typing the desired commands such as
Terminal
pip install numpy
If you do not install pip into your sub-environment, then either:
- your shell will fail to find the pip executable, or
- your shell will find pip in your base environment, which will lead to pip packages being installed into the base environment, resulting in potential interference between your conda environments.
Explicitly installing pip into your sub-environment will guard against this.
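One way to confirm which pip will run is to compare its location with the directory of the active environment. The following is a minimal sketch that relies on the CONDA_PREFIX variable, which conda sets when an environment is activated; the helper name pip_in_env is hypothetical.

```shell
# Hypothetical helper: succeeds only when the pip found on PATH lives inside
# the currently activated conda environment ($CONDA_PREFIX).
pip_in_env() {
    if [ -z "${CONDA_PREFIX:-}" ]; then
        echo "no conda environment is active" >&2
        return 1
    fi
    pip_path="$(command -v pip 2>/dev/null)" || {
        echo "pip not found on PATH" >&2
        return 1
    }
    case "$pip_path" in
        "$CONDA_PREFIX"/*) echo "pip belongs to this environment: $pip_path" ;;
        *) echo "pip is outside this environment: $pip_path" >&2; return 1 ;;
    esac
}

pip_in_env || echo "run 'conda install pip' in the activated environment first"
```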
Using conda with SLURM¶
To use conda environments within your SLURM script, you need to source the conda profile script so that the conda paths get set. Use only one of the two variants below, depending on how conda was installed:
Terminal
# Variant 1: conda from the module system
module load miniconda3
source $(conda info --base)/etc/profile.d/conda.sh
export PYTHONNOUSERSITE=1 # don't add python user site library to path
conda activate /path/to/env/
# Variant 2: self-installed Miniforge
source ~/miniforge3/etc/profile.d/conda.sh
export PYTHONNOUSERSITE=1 # don't add python user site library to path
conda activate /path/to/env/
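Put together, a complete batch script might look like the sketch below, which writes a job file that uses the module-based setup. The file name conda_job.sh, the job name, the resource requests, the environment path and my_analysis.py are all placeholders rather than site defaults; adjust them for your own work.

```shell
# Write a hypothetical job script; every #SBATCH value is a placeholder.
cat > conda_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=conda_example
#SBATCH --time=00:10:00
#SBATCH --cpus-per-task=1
#SBATCH --mem=1G

# ~/.bashrc is not read inside the job, so load conda explicitly.
module load miniconda3
source "$(conda info --base)/etc/profile.d/conda.sh"
export PYTHONNOUSERSITE=1   # don't add python user site library to path
conda activate /path/to/env/

# Placeholder workload: replace with your own command.
python my_analysis.py
EOF

# Submit it on the cluster with:
#   sbatch conda_job.sh
```

If you installed Miniforge yourself, replace the module load and source lines in the job file with source ~/miniforge3/etc/profile.d/conda.sh.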
Adding custom conda environments to Jupyter¶
On the command line, first create a conda environment and install the packages/software you wish into it.
Then install ipykernel and register the environment with Jupyter.
Terminal
conda create --prefix /path/to/env
conda activate /path/to/env
conda install <packages/software of interest>
conda install ipykernel
python -m ipykernel install --user --name=myCondaEnvironment
Then, in Jupyter, the custom environment can be selected via Kernel -> Change Kernel.
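The registration command also accepts a --display-name option, which sets the label shown in the Jupyter kernel menu. The following is a minimal sketch; register_kernel is a hypothetical helper, and the names in the usage comment are placeholders.

```shell
# Hypothetical helper: register the active environment as a Jupyter kernel
# with an internal name ($1) and the label shown in the Jupyter menu ($2).
register_kernel() {
    python -m ipykernel install --user --name "$1" --display-name "$2"
}

# Usage, with the environment activated and ipykernel installed:
#   register_kernel myCondaEnvironment "Python (my project)"
#   jupyter kernelspec list    # the new kernel should appear in this list
```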