Skip to content

Installing Software

Note

Most of the below examples are tailored to the DSH HPC Cluster -- for non-cluster machines, simply replace all instances of /hpchome/username with your machine's local home path (e.g. /home/username).

Downloading software from Artifactory

When working in the DSH Research Compute environment, you can install software available via Artifactory into your own space (either the DSH HPC Cluster, or your Customer Specialist Server). You can only access the DSH Artifactory instance from inside the DSH.

To download and install software available in Artifactory, you must use your DSH credentials to log in to the Artifactory website, e.g. using DSH Desktop web browser (https://artifactory.idhs.ucl.ac.uk/): jFrog login

After you log in, you will be shown the Packages page by default, which allows you to browse the available packages. Note that due to the security-focused nature of the DSH, there will be some packages or versions that are unavailable. If the package that you are looking for is not present you may be able to request it. Note, however, that there may be some software that simply poses an unacceptable risk. jFrog packages

If the package you want to install is listed, then you can proceed to download it to your DSH Desktop by pressing the download icon. jFrog download

Downloading and installing packages with R, Conda, and Pip.

In addition to using the website directly, you can also download and install packages into an environment using R, Conda, and Pip. To do this, you will need to generate an Artifactory token and set up the appropriate configuration files.

Generating an Artifactory token

To generate an Artifactory token for your configuration files, browse to the Artifactory website using your DSH Desktop web browser, use the left panel to navigate to the "Artifactory > Artifacts" page, and click the "Set Me Up" button in the top-right corner. jFrog artifacts jFrog setmeup

Inside the "Set Me Up" interface, select a package type (doesn't matter which) and generate a personal Artifactory token by typing your password in the text box and clicking "Generate Token & Create Instructions". jFrog token jFrog token

Copy this token, and paste it into the appropriate configuration file(s), as described in the sections below. You can use the same token for all package types/configuration files.

Note

While DSH Desktop uses a Windows environment, the DSH Research Compute environment uses Linux. The keyboard combination "Ctrl+V" will not work to paste things into Linux -- use "Shift+Insert" instead, which works in both Windows and Linux.

Important

The Artifactory token is tied to your current DSH account password. While they are less sensitive than your password, tokens should nonetheless be treated similarly to passwords in terms of keeping them secret. If you change your DSH account password you will also need to generate a new Artifactory token and update your configuration files.

Setting up your configuration files

Note

The guidance below focuses on the DSH HPC Cluster. If you are working in a Customer Specialist Server, you can typically just substitute your machine's local home directory in place of the DSH HPC Cluster home for all instructions below.

If you have already set up your configuration files for use in the DSH Desktop environment, then you can simply copy those files into your DSH HPC Cluster home environment, e.g. using WinSCP. However, note that the files may be located or named slightly differently for the Linux environment -- see the sections below for details.

If you do not already have a suitable configuration file available, you can create it yourself using the guidance below and a text editor such as nano or vi.

Note

Since most of these configuration files and directory names begin with a ., they will be treated as hidden in Linux by default. To see these items in your DSH HPC Cluster home, you can use a command like ls -ahlp ~ which lists all files in your home directory, including hidden files.

For all of the example configurations below, remember to replace the placeholder variables with your own information -- that is, replace "YOURUSERID" with your DSH user ID, and "TOKEN" with the Artifactory token generated above.

Conda: ~/.condarc

To use Conda (Miniconda) in the DSH HPC Cluster, you must have a .condarc configuration file in the root of your DSH HPC Cluster home directory: ~/.condarc.

The .condarc file should generally follow the format outlined below:

channel_alias: https://YOURUSERID:TOKEN@artifactory.idhs.ucl.ac.uk/artifactory/api/conda/conda
channels:
  - defaults
  - https://YOURUSERID:TOKEN@artifactory.idhs.ucl.ac.uk/artifactory/api/conda/conda
default_channels:
  - https://YOURUSERID:TOKEN@artifactory.idhs.ucl.ac.uk/artifactory/api/conda/conda

Pip: ~/.pip/pip.conf

To use Pip in the DSH HPC Cluster, you must have a pip.conf configuration file inside the .pip subdirectory of your DSH HPC Cluster home: ~/.pip/pip.conf. If it doesn't already exist, you may need to create the .pip directory manually, e.g. mkdir ~/.pip.

Note

The Pip configuration file location and name are different from the DSH Desktop environment. Make sure you've named the file correctly and placed it in the correct folder structure.

The pip.conf file should generally follow the format outlined below:

[global]
index-url = https://YOURUSERID:TOKEN@artifactory.idhs.ucl.ac.uk/artifactory/api/pypi/pypi/simple

R: ~/.Rconfig

To use R packages (with R and Rstudio) in the DSH HPC Cluster, you must have an .Rprofile configuration file in the root of your DSH HPC Cluster home directory: ~/.Rprofile.

The .Rprofile file should generally follow the format outlined below:

local({
    r <- list("cran" = "https://YOURUSERID:TOKEN@artifactory.idhs.ucl.ac.uk/artifactory/cran/")
    options(repos = r)
})
local({
    r <- list("cran-remote" = "https://YOURUSERID:TOKEN@artifactory.idhs.ucl.ac.uk/artifactory/cran-remote/")
    options(repos = r)
})

Using virtual environments

We strongly encourage our users to make use of virtual environments to install their packages, as this gives greater control over package interactions and version management, and makes troubleshooting much easier.

To create a virtual environment using Conda, do the following:

# create the new virtual environment and give it a name
conda create --name myenv python=3.10 pip

This environment is called "myenv", uses Python version 3.10, and installs "pip" directly into the new environment.

For Python environments, we recommend specifying a Python version and directly installing pip, to ensure that both Conda and Pip are in sync with one another for a consistent experience.

Once the environment is created, you can activate it with conda activate myenv (replacing "myenv" with your environment's name). Your bash prompt will change to show you which virtual environment is active:

(myenv) [uccacxx@dsh-sge2log01 ~]$ 

To deactivate the current virtual environment and return your prompt to normal, use conda deactivate. You only need to create the virtualenv the first time, simply activate it using conda activate <name> thereafter.

Installing packages using Conda, Pip, and R

Once you have set up your configuration files, and created and activated a virtual environment, you can install your desired packages in the following ways.

For Conda, use the following command in a terminal to install MYPACKAGE to your current Conda environment:
conda install MYPACKAGE

For Pip, use the following command in a terminal to install MYPACKAGE to your current Python environment:
pip install MYPACKAGE

For R, use the following code in an R console to install MYPACKAGE to your active cluster R library: install.packages("MYPACKAGE")

Installing your own R packages

If we do not have R packages installed centrally that you wish to use, you can install them in your space on the cluster and tell R where to find them. First you need to tell R where to install your package to and where to look for user-installed packages, using the R library path.

Set your R library path

There are several ways to modify your R library path so you can pick up packages that you have installed in your own space.

The easiest way is to add them to the R_LIBS environment variable (insert the correct path):

export R_LIBS=/your/local/R/library/path:$R_LIBS

This is a colon-separated list of directories that R will search through.

Setting that in your terminal will let you install to that path from inside R and should also be put in your jobscript (or your .bashrc) when you submit a job to the cluster using those libraries. This appends your directory to the existing value of $R_LIBS rather than overwriting it so the centrally-installed libraries can still be found. You can also change the library path for a session from within R:

.libPaths(c('~/MyRlibs',.libPaths()))

This puts your directory at the beginning of R's search path, and means that install.packages() will automatically put packages there and the library() function will find libraries in your local directory.

Install an R package

To install, after setting your library path:

From inside R, you can do

install.packages("MYPACKAGE")

Or if you have downloaded the tar file, you can do

R CMD INSTALL -l ~/your_R_libs_directory MYPACKAGE.tar.gz

If you want to keep some libraries separate, you can have multiple colon-separated paths in your $R_LIBS and specify which one you want to install into with R CMD INSTALL.

Installing your own Python packages

Python virtual environments

Similar to Conda environments described above, you can create and use Python's own virtualenvs functionality to create a virtual environment:

virtualenv <DIR> 
source <DIR>/bin/activate

Your bash prompt will show you that a different virtualenv is active.

Installing via setup.py

If you need to install a Python package using setup.py, you can use the --user flag and as long as one of the Python bundles is loaded, it will install into the same .python2local or .python3local as Pip and you won't need to add any new paths to your environment.

python setup.py install --user

You can alternatively use --prefix in which case you will have to set the install prefix to somewhere in your space, and also set PYTHONPATH and PATH to include your install location. Some installs won't create the prefix directory for you, in which case create it first. This is useful if you want to keep this package entirely separate and only in your paths on demand.

# For Python 2.7
export PYTHONPATH=/hpchome/username/your/path/lib/python2.7/site-packages:$PYTHONPATH  
# if necessary, create install path  
mkdir -p hpchome/username/your/path/lib/python2.7/site-packages  
python setup.py install --prefix=/hpchome/username/your/path

# add these to your .bashrc or jobscript  
export PYTHONPATH=/hpchome/username/your/path/lib/python2.7/site-packages:$PYTHONPATH  
export PATH=/hpchome/username/your/path/bin:$PATH
# For Python 3.7
# add location to PYTHONPATH so Python can find it
export PYTHONPATH=/hpchome/username/your/path/lib/python3.7/site-packages:$PYTHONPATH
# if necessary, create lib/pythonx.x/site-packages in your desired install location
mkdir -p /hpchome/username/your/path/lib/python3.7/site-packages
# do the install
python setup.py install --prefix=/hpchome/username/your/path

# It will tend to tell you at install time if you need to change or create the `$PYTHONPATH` directory.
# To use this package, you'll need to add it to your paths in your jobscript or `.bashrc`.
#Check that the `PATH` is where your Python executables were installed.

export PYTHONPATH=/hpchome/username/your/path/lib/python3.7/site-packages:$PYTHONPATH
export PATH=/hpchome/username/your/path/bin:$PATH

Check that the PATH is where your Python executables were installed, and the PYTHONPATH is correct. It is very important that you keep the :$PYTHONPATH or :$PATH at the end of these - you are putting your location at the front of the existing contents of the path. If you leave them out, then only your package location will be found and nothing else.

Python script executable paths

If your executable Python script specifies a location for Python, e.g. #!/usr/bin/python2.7, it may fail if that particular Python installation doesn't exist at that location, or doesn't have the necessary additional packages installed. If this happens, you should change it so it uses the first Python found in your environment, e.g.:

#!/usr/bin/env python 

Installing software with no sudo.

Since our users do not have administrative privileges on DSH Research Compute machines, you cannot install anything requiring sudo. If the instructions tell you to do that, read further to see if they also have instructions for installing in user space, or for doing an install from source if they are RPMs.

Alternatively, just leave off the sudo from the command they tell you to run and look for an alternative way to give it an install location if it tries to install somewhere that isn't in your space. Examples for some common build systems are discussed below.

Automake configure

Automake will generate the Makefile for you and hopefully pick up sensible options through configuration. You can give it an install prefix to tell it where to install (or you can build it in place and not use make install at all).

./configure --prefix=/hpchome/username/place/you/want/to/install
make
# if it has a test suite, good idea to use it
make test 
make install

If it has more configuration flags, you can use ./configure --help to view them.

Usually configure will create a config.log: you can look in there to find if any tests have failed or things you think should have been picked up haven't.

CMake

CMake is another build system. It will have a CMakeFile or the instructions will ask you to use cmake or ccmake rather than make. It also generates Makefiles for you. ccmake is a terminal-based interactive interface where you can see what variables are set to and change them, then repeatedly configure until everything is correct, generate the Makefile and quit. cmake is the commandline version. The interactive process tends to go like this:

ccmake CMakeLists.txt
# press c to configure - will pick up some options
# press t to toggle advanced options
# keep making changes and configuring until no more errors or changes
# press g to generate and exit
make
# if it has a test suite, good idea to use it
make test 
make install

The options that you set using ccmake can also be passed on the commandline to cmake with -D. This allows you to script an install and run it again later. CMAKE_INSTALL_PREFIX is how you tell it where to install.

# making a build directory allows you to clean it up more easily
mkdir build
cd build
cmake .. -DCMAKE_INSTALL_PREFIX=/hpchome/username/place/you/want/to/install

If you need to rerun cmake/ccmake and reconfigure, remember to delete the CMakeCache.txt file first or it will still use your old options. Turning on verbose Makefiles in cmake is also useful if your code didn't compile first time - you'll be able to see what flags the compiler or linker is actually being given when it fails.

Make

Your code may come with a Makefile and have no configure, in which case the generic way to compile it is as follows:

make targetname

There's usually a default target, which make on its own will use. make all is also frequently used. If you need to change any configuration options, you'll need to edit those sections of the Makefile (usually near the top, where the variables/flags are defined).

Here are some typical variables you may want to change in a Makefile.

These are what compilers/mpi wrappers to use - these are also defined by the compiler modules, so you can see what they should be. Intel would be icc, icpc, ifort, while the GNU compiler would be gcc, g++, gfortran. If this is a program that can be compiled using MPI and only has a variable for CC, then set that to mpicc.

CC=gcc
CXX=g++
FC=gfortran
MPICC=mpicc
MPICXX=mpicxx
MPIF90=mpif90

CFLAGS and LDFLAGS are flags for the compiler and linker respectively, and there might be LIBS or INCLUDE in the Makefile as well. When linking a library with the name libfoo, use -lfoo.

CFLAGS="-I/path/to/include"
LDFLAGS="-L/path/to/foo/lib -L/path/to/bar/lib"
LDLIBS="-lfoo -lbar"

Remember to make clean first if you are recompiling with new options. This will delete object files from previous attempts.

Set your PATH and other environment variables

After you have installed your software, you'll need to add it to your PATH environment variable so you can run it without having to give the full path to its location.

Put this in your ~/.bashrc file so it will set this with every new session you create. Replace username with your username and point to the directory your binary was built in (frequently program/bin). This adds it to the front of your PATH, so if you install a newer version of something, it will be found before the system one.

export PATH=/hpchome/username/location/of/software/binary:$PATH

If you built a library that you'll go on to compile other software with, you probably want to also add the lib directory to your LD_LIBRARY_PATH and LIBRARY_PATH, and the include directory to CPATH (add export statements as above). This may mean your configure step will pick your library up correctly without any further effort on your part.

To make these changes to your .bashrc take effect in your current session:

source ~/.bashrc

Troubleshooting

Remove your Pip cache

If you built something and it went wrong, and are trying to reinstall it with pip and keep getting errors that you think you should have fixed, you may still be using a previous cached version. The cache is in .cache/pip in your home directory, and you can delete it.

You can prevent caching entirely by installing using pip3 install --user --no-cache-dir <python3pkg>