Quick Start

SSH Login

Interaction with the supercomputer is typically performed with command-line tools, run from a command prompt, also known as a shell. SSH is used to establish a secure shell session with the supercomputer. In general, users should log in via the hostname

liger.ec-nantes.fr

Programs can be tested from the interactive nodes, but anything left running for more than an hour will be killed automatically. Read the batch job submission tutorial when you are ready to run your jobs.

Windows

Windows does not have SSH capabilities built in. Download PuTTY, Bitvise Tunnelier or MobaXterm, enter the hostname liger.ec-nantes.fr, then click Connect and enter your username and password when prompted. Once connected you can run commands.

Linux / Mac OS

Linux and Mac OS have SSH built in. Open a terminal to get a command prompt on your local system, then connect to liger.ec-nantes.fr by running

ssh -X <login>@liger.ec-nantes.fr

where <login> is your username. Once connected you can run commands.

The -X option in the ssh command above enables transparent forwarding of X applications to your local screen (this assumes that you have an X server running on your local machine). The OpenSSH version of ssh sometimes requires -Y instead of -X (try -Y if applications don't appear or die with errors). These options direct the X communication through an encrypted tunnel and should just work. In the bad old days before ssh, the highly insecure method of persuading X windows to display on remote screens involved commands such as xauth and xhost - you do not and never will need to use either of these with ssh, and if you try to use them you may expose everything you type (including passwords) to being read, and even changed, by evil-doers.
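
If you connect regularly, you can record these options in your SSH client configuration rather than typing them every time. Below is a minimal sketch of a ~/.ssh/config entry on your local machine; the alias liger and the username abc123 are placeholders to replace with your own values:

# ~/.ssh/config (on your local machine)
Host liger
    HostName liger.ec-nantes.fr
    User abc123
    ForwardX11 yes
    # equivalent to the -Y option; enable it if applications fail with plain -X
    ForwardX11Trusted yes

With this entry in place, running ssh liger is equivalent to the full command above.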

Common Command Line Tools

If you're new to using a shell, look over this list of common commands; if you're really curious about what can be done, here is a comprehensive list of shell commands.
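
As a quick taste, here are a few commands you will use constantly once logged in (the directory and file names are just placeholders):

pwd                      # print the current working directory
ls -l                    # list files in the current directory with details
cd my_project            # change into the directory my_project
cp input.dat backup.dat  # copy a file
mv old.txt new.txt       # rename (move) a file
less output.log          # page through a file (press q to quit)
man rsync                # read the manual page for a command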

More details about the access environment

XCS Web Portal

The extreme factory computing studio (XCS) secure portal gives users and system administrators alike direct access to all resources and applications. The user interface grew out of a continual dialogue between Bull and its customers, so extreme factory computing studio closely reflects users' expectations for HPC applications. Once logged on, users can access a complete work environment customized for their job role and applications. From there, they can load and manage data, set simulation parameters, run the calculation, track its progress, and then proceed to post-processing and visualization of results.

Use the same username/password credentials to access the XCS portal as when connecting to the SSH login node for an interactive session. To connect, browse to the following URL:

https://liger.ec-nantes.fr

As a single gateway to the HPC environment, extreme factory computing studio simplifies the handling and management of an HPC cluster, and lets you focus on the actual calculation and results.

Password

It is possible to change your initial password on the XCS web portal or by using the script Ichpasswd on a login node. Note that the security of both users' data and the service itself depends strongly on choosing passwords sensibly, which in the age of automated cracking programs unfortunately means the following:

  • Use at least 10 characters
  • Use a mixture of upper and lower case letters, numbers and at least one non-alphanumeric character
  • Do not use dictionary words, common proper nouns or simple rearrangements of these
  • Do not use family names, user identifiers, car registrations, media references,…
  • Do not re-use a password in use on another system (this is for damage limitation in case of a compromise somewhere).
  • Passwords should be treated like credit card numbers (and not left around, emailed or shared etc). The above rules are similar to those which apply to systems elsewhere.

More about passwords

Remote desktops

It is also possible to connect to a remote VNC desktop session on a Liger login node. Please see the page on XCS Web Portal for details.

Filesystems

Please see here for a brief summary of the available filesystems, the rules governing them, and the related storage policy.

File transfers

Please contact support before transferring large amounts of data from the login nodes: ICI-SC does not have a dedicated data transfer server with high-bandwidth connectivity.

Any method of file transfer that operates over SSH (e.g. scp, sftp, rsync) should work to or from Liger, provided SSH access works in the same direction. Thus systems from which it is possible to log in should likewise have no difficulty using scp/sftp/rsync, and from Liger out to remote machines such connections should also work provided the other system does not block SSH (unfortunately, some sites do, and even more unfortunately, some even block outgoing SSH). In whichever direction the initial connection is made, files can then be transferred either way. Note that obsolete and insecure methods such as ftp and rcp will not work (nor should you wish to use them).

Any UNIX-like system (such as a Linux or Mac OS X machine) should already have scp, sftp and rsync (or be able to install them from native media). These tools can also be installed on Windows systems as part of the Cygwin environment. An alternative providing drag-and-drop operation under Windows is WinSCP, and in the same vein Mac OS X or Windows users might consider Cyberduck.
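
For example, assuming your username is abc123 (as in the rsync example below), a single file or a whole directory can be copied with scp as follows; the file and directory names are placeholders:

# copy a local file to your home directory on Liger
scp results.tar.gz abc123@liger.ec-nantes.fr:
# copy a directory from Liger back to the current local directory
scp -r abc123@liger.ec-nantes.fr:results .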

Of the command-line tools mentioned here, rsync is possibly the fastest, the most sophisticated and also the most dangerous. The man page is extensive but for example the following command will copy a directory called results in your home directory on liger to the directory from_liger/results on the local side (where rsync is being run on your local machine and your username is assumed to be abc123):

rsync -av abc123@liger.ec-nantes.fr:results from_liger

Note that a final / on the source directory is significant for rsync: it indicates that only the contents of the directory should be transferred (so specifying results/ in the above example would cause the contents to be copied straight into from_liger instead of into from_liger/results). A pleasant feature of rsync is that repeating the same command will transfer only files which appear to have been updated (based on the size and modification timestamp). Rsync also validates each actual transfer by comparing checksums.
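
To make the trailing-slash behaviour concrete, compare the following two commands (same assumptions as above):

# copies the directory itself, producing from_liger/results/...
rsync -av abc123@liger.ec-nantes.fr:results  from_liger
# copies only its contents, which land directly inside from_liger/
rsync -av abc123@liger.ec-nantes.fr:results/ from_liger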

On directories containing many files rsync can be slow (as it has to examine each file individually). A less sophisticated but faster way to transfer such things may be to pipe tar through ssh, although the final copy should probably be verified by explicitly computing and comparing checksums, or perhaps by using rsync -avc between the original and the copy (which will do the equivalent thing and automatically re-transfer any files which fail the comparison). For example, here is the same copy of /home/abc123/results on liger copied to from_liger/results on the local machine using this method:

cd from_liger
# (older guides added '-o cipher=arcfour' here for speed; that cipher option is no longer supported by modern OpenSSH)
ssh -e none abc123@liger.ec-nantes.fr 'cd /home/abc123 ; tar -cf - results' | tar -xvBf -

In the above the cd command is not actually necessary, but serves to illustrate how to navigate to a transfer directory in a different location to the /home directory.
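
One way to verify such a copy, as suggested above, is to compute checksums on both sides and compare them. The following sketch uses md5sum (present on most Linux systems; sha256sum works identically) and assumes the same paths and username as the example above:

# checksums of the original files on Liger, collected locally
ssh abc123@liger.ec-nantes.fr 'cd /home/abc123 && find results -type f -exec md5sum {} \; | sort -k 2' > liger.md5
# checksums of the local copy
cd from_liger
find results -type f -exec md5sum {} \; | sort -k 2 > ../local.md5
cd ..
# any line reported by diff points at a file that should be re-transferred
diff liger.md5 local.md5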

Modules

We use the modules environment extensively. A module can for instance be associated with a particular version of Intel compiler, or different MPI libraries etc. Loading a module establishes the environment required to find the related include and library files at compile-time and run-time.

By default the environment is such that the most commonly required modules are already loaded. It is possible to see what modules are loaded by using the command module list

Currently Loaded Modulefiles:
  1) python/3.4.3/gcc/4.4.7

The above shows that a Python module is loaded (it is loaded as a result of the default module, which is loaded automatically on login): Python 3.4.3, built with GCC 4.4.7. To permanently change which modules are loaded by default, edit your ~/.bashrc file, e.g. adding

module load python/3.4.3/gcc/4.4.7

Further commands:

module load <module>         -> load module
module unload <module>       -> unload module
module purge                 -> unload all modules
module list                  -> show currently loaded modules
module avail                 -> show available modules
module whatis                -> show available modules with brief explanation
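
For example, to start from a clean environment and load a matching compiler and MPI pair (module versions taken from the listing shown in the Compiling section below):

module purge                      # unload everything, start clean
module load intel/2015.3.187      # Intel compiler suite
module load intelmpi/5.0.3.048    # matching Intel MPI library
module list                       # confirm what is now loaded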

Group membership

Your account has membership in one or more Unix groups. On Liger, groups are usually (but not always) organized by lab group and project name. The primary purpose of these groups is to facilitate sharing of files with other users, through the Unix permissions system. To see your Unix groups, try the following command:

user1# groups
group1 group2 L121212

In the example above, user1 is a member of three groups, one of which is a project group.
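
As a sketch of how groups enable file sharing, the following commands would open a directory to the members of the project group from the example above (the directory name shared_data is a placeholder):

chgrp -R L121212 shared_data    # hand the directory over to the project group
chmod -R g+rX shared_data       # let group members read files and enter subdirectories
chmod g+s shared_data           # files created inside will inherit the group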

.bashrc

The default environment should be correctly established automatically via the modules system and the shell initialization scripts. For example, essential system software for compilation, credit and quota management, job execution and scheduling, error-correcting wrappers and MPI recommended settings are all applied in this way. This works by setting the PATH and LD_LIBRARY_PATH environment variables, amongst others, to particular values. Please be careful when editing your ~/.bashrc file, if you wish to do so, as this can wreck the default settings and create many problems if done incorrectly, potentially rendering the account unusable until administrative intervention. In particular, if you wish to modify PATH or LD_LIBRARY_PATH please be sure to preserve the existing settings, e.g. do

export PATH=/your/custom/path/element:$PATH
export LD_LIBRARY_PATH=/your/custom/librarypath/element:$LD_LIBRARY_PATH

and don't simply overwrite the existing values, or you will have problems. If you are trying to add directories relating to centrally-installed software, please note that there is probably a module available which can be loaded to adjust the environment correctly.

Compiling

Note that the modules you need for compiling are not necessarily loaded by default; load the ones you need, choosing from the list shown by this command:

module avail
----------------------------------------- /usr/share/Modules/modulefiles -----------------------------------------
cmake/2.8.9/gcc/4.4.7             hdf5/1.8.15/intel-2015.3.187      petsc/intelmpi-5.0.3/3.1-p8
cmake/3.1.0/gcc/4.4.7             iciplayer/1.2                     petsc/intelmpi-5.0.3/3.1-p8-debug
cmake/3.2.3/gcc/4.4.7             intel/2015.3.187                  python/2.7.10/gcc/4.4.7
gcc/4.9.3                         intelmpi/5.0.3.048                python/3.4.3/gcc/4.4.7
hdf5/1.8.15/gcc-4.9.3             lapack/3.5.0/gcc/4.9.3
 
---------------------------------------- /opt/mellanox/bupc/2.20.2/modules ---------------------------------------
bupc/2.20.2

When compiling code, it is usually possible to remove any direct MPI library references in your Makefile as mpicc & mpif90 will take care of these details. In the Makefile, simply set

CC=mpicc

etc, or define

export CC=mpicc

etc before compilation.
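
Putting this together, a minimal sketch of building an MPI program with the Intel toolchain from the module listing above (the source and executable names are placeholders):

module purge
module load intel/2015.3.187 intelmpi/5.0.3.048
export CC=mpicc
export FC=mpif90
# compile directly, or run 'make' with CC=mpicc set in the Makefile
mpicc -O2 -o mpi_program mpi_program.c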

If some required libraries are missing, please let us know and we can try to install them centrally (as a module).

Submit a job to the batch system

A computing task submitted to a batch system is called a job. Jobs can be submitted in two ways:

  • from the command line, or
  • using a job script. We recommend using a job script as it makes troubleshooting easier and also allows you to keep track of the batch system parameters you used in the past.

Liger runs the SLURM batch system.

To create a new job script (also called a submit script or a submit file) you need to:

  1. open a new file in one of the available text editors (nano, emacs or vim);
  2. write a job script, including batch system directives;
  3. remember to add any 'module load' commands that are needed - the environment of your shell is not transferred unless you explicitly ask for it;
  4. save the file and submit it to the batch system queue using the command sbatch.

There are several examples and more information about using the batch system and writing scripts in the subsection for the batch system.

Here we will demonstrate the usage of the batch system directives on the following simple submit file example.

sbatch example

You can also put these options inside your slurm job scripts in the following form

#!/bin/sh
#SBATCH  [option 1]
#SBATCH  [option 2]
...
#SBATCH  [option N]
...
# put here shell commands and variables
...
srun name_of_executable

MPI job (8 processors, 1 hour)

#!/bin/bash
#SBATCH -A <account>
#SBATCH --ntasks=8
# use --exclusive to get the whole nodes exclusively for this job
#SBATCH --exclusive
#SBATCH --time=01:00:00
 
# Load module for MPI and compiler, as needed/desired.
module purge 
module add intel
module add intelmpi
 
srun -n 8 ./mpi_program

Batch system directives start with #SBATCH. The first line says that the Linux shell bash will be used to interpret the job script.

  • If applicable, -A specifies your Liger project ID, formatted as {an upper-case letter}[7 numbers] (if omitted it defaults to your main default project; spaces and slashes are not allowed in the project ID). Note that if you are running with no project/in the default queue, your job will take a VERY long time to start.
  • -J (or --job-name=) sets the job name (default is the name of the submit file);
  • --output= and --error= specify paths to the standard output and standard error files (by default both streams go to slurm-<jobid>.out);
  • --ntasks= specifies requested number of tasks/cores
  • --time= is the real time (as opposed to the processor time) that should be reserved for the job.
  • -c specifies the number of CPUs per task, i.e. requests that this many CPUs be allocated per process. This can be useful if the job is multi-threaded and requires more than one CPU per task for optimal performance (see the example script after this list). The default is one CPU per task.
  • --exclusive requests the whole node exclusively for this job. By default your job's working directory is the directory you start the job in.
  • Before running the program it is necessary to load the appropriate MPI module so that the relevant parallel libraries are available (see our modules page). Generally, you should run your parallel program with srun.
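
As a further illustration of these directives, here is a sketch of a submit file for a multi-threaded (single-task) job; the project ID, job name, file names and executable are placeholders:

#!/bin/bash
#SBATCH -A A1234567                  # project ID (placeholder)
#SBATCH -J my_threaded_job           # job name
#SBATCH --output=my_threaded_job.out
#SBATCH --error=my_threaded_job.err
#SBATCH --ntasks=1                   # a single task...
#SBATCH -c 8                         # ...with 8 CPUs for its threads
#SBATCH --time=02:00:00              # 2 hours of real (wall-clock) time

module purge
module add intel

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
srun ./threaded_program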

Batch system commands

There is a set of batch system commands available to users for managing their jobs. The following is a list of commands useful to end-users:

  • sbatch <submit_file> submits a job to the batch system (if there were no syntax errors in the submit file the job is processed and inserted into the job queue, the integer job ID is printed on the screen);
  • squeue shows the current job queue (grouped by running, idle and blocked jobs); shown columns are:
    JOBID PARTITION NAME USER STATE TIME NODES QOS PRIORITY NODELIST(REASON)
  • scontrol show job <jobid> shows detailed information about a specific job;
  • scancel <jobid> deletes a job from the queue;
  • sinfo -s shows a summary of the Liger partitions (resource queues) available.
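
A typical session with these commands might look like this (the job ID 12345 is just an example value printed by sbatch):

sbatch job.sh               # submit; prints e.g. "Submitted batch job 12345"
squeue -u $USER             # show only your own jobs
scontrol show job 12345     # inspect the job in detail
scancel 12345               # remove it from the queue if needed
sinfo -s                    # summary of the available partitions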

You can find here a list of useful everyday commands.

To check your account credits (core hours)

Run the user-command

Mybalance

This reports the current balance of credits, i.e. the number of core hours still available in the current accounting period, which runs from the beginning of the current year (including expired credits if present).

Problems

Please check first of all whether there is an answer to your question in the FAQ. If not, please request support.
