Norwegian version of this page

TSD Operational Log - Page 6

Published Jan. 19, 2022 2:12 PM

Due to a software error, the Windows access of some users was extended past the period to which the project Administrator has approved. Now this is corrected. If you have lost your Windows machine access, please contact your project administrator to add you to the appropriate group.

f4transkript is not working

Published Dec. 16, 2021 9:00 AM

f4transkript is currently not available in TSD-projects.

We are working to solve the problem as quickly as possible.
Our apologies for the inconvenience.

[PLANNED] Network maintenance Mon 13/12/21

Published Dec. 6, 2021 9:28 AM

We will be performing network maintenance on Monday 13/12/21 at 12:00, the downtime will last 10-15mins.

We do not expect any downtime, but there might be some interruptions.

[Solved] Modules unavailable on submit and compute nodes

Published Nov. 8, 2021 10:58 AM

Between approximately 10:00 and 10:30, the module load command did not work on the submit hosts and Colossus nodes. Please resubmit any jobs that may have failed. This has been resolved.

[DONE] view-ous.tsd.usit.no downtime (Tue 16/11 12:00)

Published Nov. 8, 2021 8:59 AM

We will be performing maintenance on view-ous.tsd.usit.no on Tue 16/11 12:00.

The downtime will last about an hour.

BankID downtime 2021-10-21 00:01–08:01

Published Oct. 20, 2021 10:57 AM

BankID is having planned maintenance downtime on Thursday 2021-10-21 00:01–08:01, and will not be usable for authentication while this work is ongoing.

For TSD, this means that login to TSD Selfservice via BankID and authentication to Nettskjema forms using BankID will be unavailable in the above specified timeframe.

For further information please see Digdir's downtime notice:
https://status.digdir.no/incidents/pnlr6gn69dy4

-----

Update: BankID for mobile is working again. BankID with “kodebrikke” is still unavailable. This will also be unavailable on Oct 23

[SOLVED] Login problems in TSD

Published Oct. 15, 2021 8:34 AM

Login to TSD is currently unavailable.
We are working to solve the problem as quickly as possible.
Our apologies for the inconvenience.

--
The TSD Team

[SOLVED] Colossus downtime Wednesday 13/10 08:00-20:00

Published Oct. 13, 2021 8:31 AM

Update 20:30: Maintenance on the ESS storage system has been completed and the submit hosts are available again. Some submit hosts have been rebooted.

Update 18:58: Computation on Colossus has been resumed.

Update 18:48: The downtime has been extended, since the final update will result in additional NFS hangs.

Update 16:20: The downtime will be extended by approximately 2 hours. There may be additional NFS hangs on hosts mounting /cluster during this period.

Update 16:00: The last few Colossus nodes are being upgraded. Colossus will resume after that.

Colossus will have downtime on Wednesday 13/10 due to third-party maintenance on the /cluster file system. The affects /cluster on Colossus as well as NFS mounts of /cluster on the submit hosts.

Submit hosts will experience intermittent NFS hangs across the day.

Any jobs submitted to Colossus that cannot completed before the scheduled downtime w...

[Solved] /cluster NFS hangs on submit hosts

Published Oct. 10, 2021 8:37 PM

Hosts mounting /cluster/projects/pXX and /cluster/software experienced several NFS hangs on Saturday 09/10 and Sunday 10/10. User processes accessing these shares were killed upon remounting the NFS mounts.

The issue has been resolved.

[DONE] Horizon view login server downtime (Tue 19/10-21 12:00)

Published Oct. 7, 2021 10:22 AM

We will be performing maintennance on the horizon view login servers, the downtime is estimated to last 1 hour.

https://view.tsd.usit.no and https://view-ous.tsd.usit.no will be unavailable during the downtime.

The maintennance is now complete.

From 29.sept 2021 Nettskjema will no longer deliver JSON to TSD

Published Sep. 28, 2021 1:39 PM

Nettskjema will no longer automatically deliver encrypted JSON-files to all TSD-projects.

This will not affect the use of Dataloader.

If you still want to recieve encrypted JSON files for each submission, please notify tsd-drift@usit.uio.no

[PLANNED] Colossus downtime Wednesday 13/10 08:00-16:00

Published Sep. 27, 2021 3:37 PM

Colossus will have downtime on Wednesday 13/10 due to third-party maintenance on the /cluster file system. The affects /cluster on Colossus as well as NFS mounts of /cluster on the submit hosts.

Any jobs submitted to Colossus that cannot completed before the scheduled downtime will remain pending until after the downtime. They will be rescheduled automatically once the downtime ends.

During the downtime we advice you to keep an eye on our operational log for any updates.

[SOLVED]Login to TSD unavailable

Published Sep. 21, 2021 1:45 PM

VMware View is down. We are working on it.

Best

TSD

[SOLVED] Login unavailable

Published Sep. 17, 2021 8:15 AM

Login to TSD is currently unavailable.
We are working to solve the problem as quickly as possible.
Our apologies for the inconvenience.

--
The TSD Team

[DONE] Network maintenance Monday 30/8 10:00-15:00 . (no downtime expected)

Published Aug. 23, 2021 9:57 AM

We are doing network maintenance on Monday 30/8.

We do not expect any downtime during the maintenance.

[Solved] NFS hangs on durable and cluster

Published Aug. 9, 2021 3:11 PM

We're experiencing NFS hangs on data/durable and cluster which started at 14:15-15:00. We're actively working on a solution.

A few hosts were rebooted.

[Solved] Dragen updated to version 3.8.4 and new licenses installed.

Published July 15, 2021 11:44 AM

Dragen has been updated to version 3.8.4 and new licenses have been installed.

[Solved] Colossus unresponsive between 01:00 and 09:30 on 2021-06-30.

Published June 30, 2021 9:52 AM

The Slurm queue system was unresponsive between approximately 01:00 and 09:30 on 2021-06-30, indicated by this Slurm error: "slurm_load_jobs error: Socket timed out on send/recv operation"

No jobs have started in that period, but running jobs should not have been affected. If you had running jobs in this period we advice you to check your job results for errors.

The issue has been resolved.

[Done] Colossus downtime 21-06-29 0800 - 1600

Published June 29, 2021 8:06 AM

Colossus will have downtime today, 2021-06-29 from 08:00-16:00 to upgrade the Slurm job scheduler software.

We have set a reservation on the cluster so that jobs which request running time during the maintenance windows will not be scheduled from now on. These jobs will remain pending until after the downtime, when they will be rescheduled automatically. The submit hosts will be accessible, but cannot be used to submit jobs to Colossus.

During the downtime we advice you to keep an eye on this operational log for any updates.

[Solved] Selfservice Group management problems

Published June 18, 2021 4:46 PM

The group management pages in selfservice hasn't worked properly after the maintenance earlier this week.

There can be cases where a users haven't been added to groups.

[Solved] Colossus job submission failures

Published June 17, 2021 9:40 AM

Users experience random job submission failures with an error message similar to:

"sbatch: error: Batch job submission failed: Socket timed out on send/recv operation."

We're actively working on a solution.

[DONE] Maintenance June 15

Published June 15, 2021 7:35 AM

Selfservice is down for planned maintenance June 15.

Update June 15, 18:28: To ensure that all functions are working as normal, we will keep Selfservice in maintenance mode until tomorrow.

Update June 16, 10:42: We will reenable selfservice around 12.00 today.

Update June 17, 11:15: Most parts of selfservice should work as normal. Please contact tsd-drift@usit.uio.no if you experience any problems.

[Planned] Colossus downtime 21-06-29 08:00-16:00

Published June 7, 2021 8:57 AM

Colossus will have downtime today, 2021-06-29 from 08:00-16:00 to upgrade the Slurm job scheduler software.

During the downtime we advice you to keep an eye on this operational log for any updates.

[Solved] Windows login problems for project ids greater than p1575

Published June 1, 2021 11:39 AM

Projects with a project id (pXX) greater than p1575 may be experiencing problems logging in to Windows hosts. Linux hosts are not affected. We're actively working on a solution.

[Solved] NFS hangs on clients mounting /cluster

Published May 26, 2021 2:59 PM

Hosts mounting /cluster may be experiencing NFS hangs at the moment. We're actively working on a solution.

Previous page 2 3 4 5 6 7 8 9 10 11 Next page

Feed from this page