Backup¶
For Pelle users: Changed procedures for NEW Gorilla storage versus OLD Crex storage
For projects on Crex, a project directory had the following default content:
- nobackup
- private
and everything except what was in 'nobackup' was backed up according to the backup processes.
On Gorilla, the default content is instead:
- backup
- private
And only what is in the 'backup' directory is backed up according to the backup processes.
Bianca storage works as before.
A backup allows you to restore your data after it has been (accidentally) lost.
This page describes how UPPMAX does backups.
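On Bianca and the long-term storage clusters, all folders have a backup, except those in a folder called nobackup. On Pelle, within a project directory only the backup folder is backed up.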
While UPPMAX systems may have backup, these are not designed to act as the sole repository of primary data, e.g. raw data or originals.
The PI is the main responsible person
A PI and their academic institution are ultimately responsible for the PI's data.
We recommend that PIs maintain a primary copy of their data on a system they control, when possible.
If that is not possible, ensure that collaborators can only use the data in a responsible way. See the best practices on an UPPMAX filesystem.
How can I access my backups?¶
Contact UPPMAX support and ask for help. Provide as much information as possible, especially directory and file names.
What is the UPPMAX backup procedure?¶
At Pelle¶
On the Pelle cluster, only the backup folder of a project has a backup. Home folders are backed up in full:
| Folder | Example | Description | Backed up | Not backed up |
|---|---|---|---|---|
| /home/[username] | /home/sven | Your home folder | Everything | No exception! |
| /proj/UPPMAX2025-2-262 | /proj/uppmax2025-2-262 | UPPMAX projects | backup | All other folders |
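The rule in the table above can be written down as a small check. This is a minimal sketch, not an UPPMAX tool: the function name `is_backed_up_on_pelle` is hypothetical, and it assumes that paths follow the `/home/<username>` and `/proj/<project>` layout shown in the examples and that only the top-level `backup` folder of a project counts.

```python
from pathlib import PurePosixPath


def is_backed_up_on_pelle(path: str) -> bool:
    """Return True if `path` is covered by backup according to the table above.

    Assumptions (not confirmed by UPPMAX): home folders live under
    /home/<username>, projects under /proj/<project>, and only the
    project's top-level 'backup' folder is backed up.
    """
    # e.g. ('/', 'proj', 'uppmax2025-2-262', 'backup', 'raw.csv')
    parts = PurePosixPath(path).parts
    if len(parts) >= 3 and parts[1] == "home":
        return True  # everything in a home folder is backed up
    if len(parts) >= 4 and parts[1] == "proj":
        return parts[3] == "backup"  # only the 'backup' folder of a project
    return False  # locations not listed in the table


print(is_backed_up_on_pelle("/proj/uppmax2025-2-262/backup/raw/data.csv"))
print(is_backed_up_on_pelle("/proj/uppmax2025-2-262/private/tmp.bam"))
```

The first call refers to a file inside the project's `backup` folder, the second to a file in `private`, which the table lists under "All other folders".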
At Bianca and long-term storage clusters¶
All folders on Bianca and long-term storage have a backup, except those in a folder called nobackup,
such as:
| Folder | Example | Description | Exceptions |
|---|---|---|---|
| /home/[username] | /home/sven | Your home folder | No exception! |
| /proj/sensYYYYXXX | /proj/sens2016001 | Sensitive data project | Folders named nobackup |
| /proj/NAISSYYYY-X-ZZ | /proj/naissYYYY-4-ZZ | Sensitive data project | Folders named nobackup |
| /proj/sllstoreYYYYXXX | /proj/sllstore2017096 | SciLifeLab Storage | Folders named nobackup |
| /proj/uppoff20YYXXX | /proj/uppoff2021003 | UPPMAX offload storage | Folders named nobackup |
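The exception rule in the table above can be sketched as a path check. This is an illustration only: the function name `is_backed_up_on_bianca` is hypothetical, and it assumes the `/home/<username>` and `/proj/<project>` layout from the examples. As a simplification, any path component named `nobackup` below the project root excludes the path.

```python
from pathlib import PurePosixPath


def is_backed_up_on_bianca(path: str) -> bool:
    """Return True if `path` is covered by backup according to the table above.

    Assumptions (not confirmed by UPPMAX): home folders have no
    exceptions, and in a project any path component named 'nobackup'
    below /proj/<project> excludes the path from backup.
    """
    # e.g. ('/', 'proj', 'sens2016001', 'nobackup', 'tmp.bam')
    parts = PurePosixPath(path).parts
    if len(parts) >= 3 and parts[1] == "home":
        return True  # home folder: no exception
    if len(parts) >= 3 and parts[1] == "proj":
        return "nobackup" not in parts[3:]  # skip anything under 'nobackup'
    return False  # locations not listed in the table


print(is_backed_up_on_bianca("/proj/sens2016001/raw/sample.fastq"))
print(is_backed_up_on_bianca("/proj/sens2016001/nobackup/tmp.bam"))
```

The first call refers to a file outside any `nobackup` folder, the second to a file inside one.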
Additionally, snapshots of your home folder are taken more often than backups, and you can recover files from them yourself. See the UPPMAX documentation on snapshots.
UPPMAX performs an incremental backup with 30-day retention.
This means:
- After 30 days: your data is irretrievably gone
- Within 30 days: you can get your data back. If you've edited data, there is a chance you may be able to retrieve the newest version
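The 30-day window can be turned into a concrete date. The sketch below is only date arithmetic under stated assumptions: the function names are hypothetical, and it assumes that the retention period is counted in calendar days from the day the data was lost and that the last day is inclusive. The actual behaviour of the backup service at the boundary may differ.

```python
from datetime import date, timedelta

RETENTION_DAYS = 30  # retention period stated above


def last_restore_date(lost_on: date) -> date:
    """Last day on which lost data may still be in the backup.

    Assumption: retention is counted in calendar days from the day
    of loss, with the final day included.
    """
    return lost_on + timedelta(days=RETENTION_DAYS)


def can_still_restore(lost_on: date, today: date) -> bool:
    """True if `today` falls within the assumed retention window."""
    return lost_on <= today <= last_restore_date(lost_on)


print(last_restore_date(date(2025, 3, 1)))  # 2025-03-31
print(can_still_restore(date(2025, 3, 1), date(2025, 4, 5)))  # False
```

In practice, contact support as soon as you notice the loss rather than relying on the last day of the window.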
What determines if a newly-edited file gets a backup?
- How long the change persisted. For example, a file that is created and deleted within a day is unlikely to get a backup. The longer the change persisted, the likelier it is that its latest version is in the backup
- The workload of the backup service. The lower the workload, the likelier it is that the backup holds recent versions of your files
The backup service works best when it can keep up with the changes on files that have a backup.
One important way to help the backup service is to put intermediate/temporary data in a directory that is not backed up.
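To find candidates for moving, it can help to see which files under a backed-up directory changed recently. The following is a minimal sketch, not an UPPMAX tool: the function name `recently_changed_files` and the 24-hour default are assumptions made for illustration.

```python
import os
import time
from typing import List


def recently_changed_files(root: str, max_age_hours: float = 24.0) -> List[str]:
    """List files under `root` modified within the last `max_age_hours`.

    Files that change often are candidates for a directory without
    backup. The 24-hour default is an arbitrary choice.
    """
    cutoff = time.time() - max_age_hours * 3600
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                if os.stat(full).st_mtime >= cutoff:
                    changed.append(full)
            except OSError:
                continue  # file vanished or is unreadable; skip it
    return sorted(changed)


if __name__ == "__main__":
    # Scans the current directory; replace "." with a backed-up folder.
    for path in recently_changed_files("."):
        print(path)
```

Walking a large project tree is itself I/O heavy, so run such a scan on a subdirectory rather than on a whole project when possible.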
What should I put in directories with backup?¶
Irreplaceable data that you are not actively working on.
What are examples of irreplaceable data?
Examples of irreplaceable data are:
- Raw/unprocessed measurements, which cannot be reproduced from a script
- Scripts for your analysis
Why should I not work actively on my data in a folder with backup?
The backup mechanisms cannot keep up with large amounts of files changing on a rapid basis.
What should I put in directories without backup?¶
Reproducible/intermediate data that you are actively working on.
The backup mechanisms cannot keep up with large amounts of files changing on a rapid basis.
How robust is UPPMAX storage?¶
The hardware setup of UPPMAX storage is robust and unlikely to be the cause of lost data.
How is the hardware set up?
All UPPMAX storage systems use RAID technology to make storage more robust through redundancy.
This means that two or more disks must fail in the same 'RAID volume' before there is a risk of data loss, which is unlikely.
Still, this does not protect against disasters, e.g. a fire in the computer hall.
To take this into account, backups are sent off-site to either KTH or LiU, depending on the storage system.
This setup, however, does not protect against user error (e.g. removing all files in your project directory).