TSD downtime on Thursday 29/01-15 from 09:00 to 16:00
TSD services will be unavailable from 09:00 to 16:00 on this day.
We consider 16:00 a fair estimate of when TSD will be up and running again, but this is not a guarantee. We will inform you when TSD is available again.
Jobs on Colossus have a maximum run time of 7 days. Please keep this in mind: jobs that have not finished before 09:00 on 29/1 will not complete and must be restarted.
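As a rough illustration of the arithmetic, the sketch below checks whether a job with a given wall-time limit would end before the downtime begins. It assumes the date 29/01-15 means 29 January 2015; the function names are hypothetical and are not part of any TSD or Colossus tool.

```python
from datetime import datetime, timedelta

# Start of the downtime, assuming 29/01-15 means 29 January 2015 at 09:00.
DOWNTIME_START = datetime(2015, 1, 29, 9, 0)


def ends_before_downtime(start, walltime):
    """Return True if a job starting at `start` with limit `walltime`
    would end strictly before the downtime begins."""
    return start + walltime < DOWNTIME_START


def start_deadline(walltime):
    """Return the time before which a job with limit `walltime` must start
    in order to end before the downtime begins."""
    return DOWNTIME_START - walltime


if __name__ == "__main__":
    # A job using the full 7-day limit must start before this time.
    print(start_deadline(timedelta(days=7)))
```

A job that needs less time can start later: pass its own wall-time limit to `start_deadline` to see how late it can begin.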
We will:
- change major parts of the storage system to increase speed, enable larger filesystems for projects (>256 TB), and close the remaining parts of some security issues
- upgrade SLURM (the queue system on our HPC cluster Colossus)
- upgrade the FhGFS filesystem on Colossus to the same version as on Abel
- upgrade the jumphosts and implement a better failover procedure
- raise the limit on the number of projects (from the previous 90 to several thousand)
We are sorry for the inconvenience this will cause, but it is necessary to shut down TSD to complete these tasks.
You will not have to do anything for these changes to take effect in your project once we have restarted TSD after the upgrade.