There are a number of file systems on ALICE, each of which should be used in a different way.

Warning

Files within /scratch and /local are not backed up. Furthermore, there is an automated process which deletes any files that haven't been accessed in more than 60 days. This process runs at the administrators' discretion should the file system become close to full. There will be no prior warning that files will be deleted; it is up to users to ensure that important data is not kept in /scratch or /local for long-term storage.

Attempts to use tools such as 'touch' to defeat the scratch sweeper are unacceptable and may lead to your account being locked.

/home

Every user has a home directory within the /home file system. This will be your current working directory when you log into ALICE. The home directory path is derived from your user name, e.g. user nye1 would have a home directory located at

/home/n/nye1

The home directory should only be used to store program code, job submission scripts, configuration files and small amounts of data.

The home directory can be referenced on the Linux command line with the shorthand ~. The simple command cd with no arguments will return you to your home directory.
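
For example, from anywhere on the system (the scripts directory name here is purely illustrative):

cd            # return to your home directory
ls ~/scripts  # list a directory under your home directory using the ~ shorthand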

  • Users' home directories are backed up nightly. The backups are retained for 28 days.
  • Users cannot share content in their home directory with other users.
  • When a user leaves the university, their home directory will be deleted three months after their leaving date.

There is a hard quota of 60GB for all users; you will not be able to write further data to the file system if you exceed this quota. You can check your usage relative to quota with the command quotacheck, e.g.:

quotacheck
/home/n/nye1: 25GiB used of 60GiB (hard limit 60GiB)
Overall scratch and data usage: 150GiB, 74943 files

ALICE Shared area                 Use       Files        Size        Used       Avail
/scratch/project                   0%         868        0KiB      151GiB     -151GiB
/data/project                      0%         868        0KiB      151GiB     -151GiB

File number quota: 74943 of 1000000 files used (7.49%) across /scratch and /data

Note that shared area values are currently combined for /scratch and /data.

You can see which files or directories in your home directory are using the most space with the command homeusage, e.g.:

homeusage

The homeusage command can take a significant amount of time to return results, depending on the number of files in your home directory.

Home directory quotas will not be increased. If you need more storage space you should use one of the other file systems available.

/scratch

Warning

Files within /scratch are not backed up. Furthermore, there is an automated process which deletes any files that haven't been accessed in more than 60 days. This process runs at the administrators' discretion should the file system become close to full. There will be no prior warning that files will be deleted; it is up to users to ensure that irreplaceable data is not kept in /scratch for long-term storage.

You will have a scratch directory on the scratch file system for each project that you are associated with.

If you are not a member of an HPC project, you will have a scratch area under /scratch/alice, e.g. /scratch/alice/a/abc1.

The environment variable SCRATCHDIR is automatically set on login and points to the first available scratch directory. This is displayed every time a new SSH or NoMachine session is started.
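
For example, to confirm where it points and make it your working directory:

echo $SCRATCHDIR
cd $SCRATCHDIR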

For project members, the location will be /scratch/project/username - so users in multiple projects will have several to choose from. As each user's scratch directory is readable by other members of the same project, it is important to choose the correct one depending on which project is being worked on.

The directory /scratch/project/shared has special permissions to ensure that all files within are always owned by the project group.
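
For example, to make a results file available to the rest of your project (the file name is illustrative, and project should be replaced by your project's name):

cp results.tar.gz /scratch/project/shared/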

The scratch directory used should be the main location for job files, and generally should be used as the working directory for jobs.
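
A minimal sketch of this inside a job script (scheduler directives are omitted; the mysim program, the input file and the project name are purely illustrative):

# run from the project scratch directory rather than from /home
cd /scratch/project/nye1
./mysim input.dat > mysim.log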

In order to keep the file system from filling up, /scratch is automatically cleaned of old files periodically.

You can check your usage of /scratch (and your home directory and any project /data directories) with the quotacheck command as described in the /home section.

/data

This file system is provided for reasonably static data that is shared between users within a project and is actively used.

It is not to be used for intermediate or temporary data or for archiving data which is no longer actively accessed.

Directories are created on request from an HPC project PI (Principal Investigator) by the administrators. Please contact the Service Desk if you have a requirement to use this file system for shared data, specifying:

  • Which ALICE project the space is being requested for.
  • What the space will be used for.
  • How much space is being requested.

Folders in the /data file system are backed up twice a week. The backups are retained for 28 days.

There is a special folder within your project's /data area called nobackup. Files within this folder are not backed up. The nobackup folder can be thought of as something between a /scratch and /data area - it is not subject to our cleaning process, but is also not backed up. It can be a useful place to store data that can be recreated or redownloaded but may not be used for more than the 60 day period that is safe for files in the /scratch area.
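
For example, a reference dataset that can be redownloaded at any time is a good fit for nobackup (the directory name and URL below are purely illustrative):

mkdir -p /data/project/nobackup/reference_data
wget -P /data/project/nobackup/reference_data https://example.org/dataset.tar.gz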

You can check your usage of /data (and your home directory and any /scratch directories) with the quotacheck command as described in the /home section.

/local

Each compute node has a local disc mounted on /local. For some jobs there may be a performance gain over /scratch in using this file system for intermediate files.

The local file system has 400GB of space, which is shared by all jobs running on the node. The preferred way to use /local within a job is to refer to it using the environment variable $TMPDIR. This way the job's files are segregated from those of other users' jobs, and are automatically cleaned up when the job finishes. If you use /local directly instead, it is your job's responsibility to remove its files from the local file system when it ends.
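
A minimal sketch of this pattern in a job script (the mysim program, file names and project name are hypothetical, and scheduler directives are omitted):

# stage input data onto the node-local disk
cp /scratch/project/nye1/input.dat $TMPDIR/
cd $TMPDIR

# run the program so that its intermediate files are written to /local
/scratch/project/nye1/mysim input.dat

# copy only the results worth keeping back to /scratch before the job ends
cp results.dat /scratch/project/nye1/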

The /local file system is not backed up, and the contents are deleted whenever the compute node reboots for any reason.

Files on /local should only be considered safe for the duration of the job which they belong to.

There is a 100GB per user quota on /local on login nodes. You can check your usage of /local on a login node by running quota -l -f /local.

/rfs

If your project has data stored on the Research File Store (RFS), you will be able to access the data from the login nodes only, via /rfs.

As /rfs is only available from the login nodes, it cannot be used for jobs submitted via the scheduler.

Important

An ongoing issue is preventing authorisation to access /rfs areas when first logging on via NoMachine. The temporary fix for this is to lock your session screen and then unlock it:

From 'System' on your start panel select 'Lock Screen', then enter your password to unlock your session.

Alternatively, start a terminal window and follow the instructions below for running the 'kinit' command.

Due to the way the RFS is secured on the login nodes, you will only be able to access it if you have logged into ALICE using your UOL account password. If you have logged in using an SSH key then the RFS is likely to be inaccessible. In such cases you can get the necessary authentication token to access the RFS by entering the command kinit at the ALICE prompt and entering your password when prompted.

Below we show an example of the RFS being inaccessible until the kinit command has been executed.

nye1@alice-login01:~> ls -l /rfs/Project/
/bin/ls: cannot access /rfs/Project/: Permission denied

nye1@alice-login01:~> kinit
Password for nye1@UOL.LE.AC.UK:

nye1@alice-login01:~> ls -l /rfs/Project/
d---------  2 nye1   ii_staff  7 2015-09-18 15:35 Data
drwxr-xr-x  2 nye1   ii_staff  3 2017-06-08 17:09 Papers
drwxr-xr-x  2 nye1   ii_staff 11 2019-08-01 11:48 Work

Note that there is no access to the RFS from the compute nodes at all; the RFS is not designed for the load that parallel jobs on ALICE's compute nodes could generate. If you need to process data stored on the RFS you must first copy the data to one of ALICE's local file systems.

As file permissions on the RFS are controlled by NFSv4 access control lists (ACLs) there can be issues when copying between RFS and ALICE file systems. For data copied from ALICE into the RFS, it may be necessary to tidy the ACLs with the nfs4_setfacl command. Contact the Service Desk if you have problems.
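
For example, you can inspect the ACL on a file or folder with nfs4_getfacl (from the same toolset as nfs4_setfacl); the entries you see, and whether any tidying is needed, will depend on how your RFS area is configured:

nfs4_getfacl /rfs/Project/Data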

When copying data from the RFS to ALICE, permissions information can be lost as the traditional Unix permissions that appear on files and folders on the RFS do not show the whole picture. The safest way to copy files from the RFS to ALICE is to preserve all file attributes except the file permissions, e.g.

cp -a --no-preserve=mode /rfs/Project/folder /scratch/project/shared/

Beware that this has the side effect of removing the execute permission from executable files; this would have to be reapplied manually to the destination files after the copy.
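
For example, if the copied folder contains shell scripts that need to be executable (the .sh pattern is only an illustration), execute permission could be restored with:

find /scratch/project/shared/folder -name '*.sh' -exec chmod u+x {} +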

See Research File Store for more information about RFS.

Using /tmp

The /tmp (temporary) file system should not be used. Well-behaved codes will use the $TMPDIR environment variable to find temporary disk space. This will be your user local directory (/local/$USERNAME) on login nodes, or a job temporary directory (/local/$JOBID) on compute nodes.
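
If a program does not pick up $TMPDIR automatically, you can usually point it there yourself; for example, GNU sort accepts a temporary-directory option (the input file name is illustrative):

sort -T $TMPDIR bigfile.txt > bigfile.sorted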

To prevent usage of /tmp from causing problems, we set a low (100MB) quota per user on /tmp.

You can check if you are using space on /tmp on a node by running: quota -l -f /tmp.