
TSD Operational Log - Page 2

Published Apr. 25, 2024 9:16 AM

We're experiencing some issues with our data portal, but we're on it and working to get things back to normal. Thanks for your patience while we fix the problem.

Published Apr. 1, 2024 10:24 AM

Several reservations (including tsd) on Colossus are currently unavailable. The Sigma2 allocation is not affected.

Published Mar. 25, 2024 12:44 PM

Dear TSD user,

We need to temporarily halt self-service for maintenance. We apologize for the inconvenience.

TSD Team

Published Mar. 20, 2024 10:59 AM

From 2024-03-20 17:30, the first 100 (of 250) submit hosts will be migrated to a new virtualization cluster overnight. This requires the VMs to be powered off, migrated, and powered on again. The downtime per host will be about 5-10 minutes.

Update 2024-03-21: The remaining submit hosts will be migrated from 2024-03-21 17:30.

Published Mar. 13, 2024 2:00 PM

Applicants who use TSD Self Service to apply for membership in a TSD project currently end up in a loop when they return from ID-porten, which prevents them from submitting their application.

We are working on the problem.

Published Mar. 12, 2024 3:04 PM

[2024-03-18 11:00 update] The maintenance is now over, and jobs are running again.

On Monday 2024-03-18 at 10:00 there will be a short maintenance stop to apply a critical configuration change.

A maintenance reservation has been set in Slurm. Any submitted jobs that cannot complete before the downtime will remain pending until after the downtime.
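As a rough illustration of the rule above, the sketch below checks whether a job's requested time limit fits before the maintenance window opens. The function name `fits_before_maintenance` and the example job times are hypothetical; the actual decision is made by Slurm and depends on the site configuration.

```python
from datetime import datetime, timedelta


def fits_before_maintenance(now: datetime, time_limit: timedelta,
                            maintenance_start: datetime) -> bool:
    """Return True if a job started at `now` would reach its time limit
    no later than `maintenance_start` (illustrative check only)."""
    return now + time_limit <= maintenance_start


# Maintenance start taken from the notice above; the other values are made up.
maintenance_start = datetime(2024, 3, 18, 10, 0)
now = datetime(2024, 3, 18, 8, 0)

short_job = fits_before_maintenance(now, timedelta(hours=1), maintenance_start)
long_job = fits_before_maintenance(now, timedelta(hours=4), maintenance_start)
print(short_job, long_job)
```

On the cluster itself, `scontrol show reservation` lists the active reservations, and `squeue --start` shows the estimated start times of pending jobs.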

Published Mar. 11, 2024 11:07 AM

The TSD IAM system will undergo a brief upgrade today between 15:00 and 15:15. The following services will be unavailable during that period:

1) Selfservice

2) Nettskjema new forms activation

3) Command line UR

Published Feb. 28, 2024 12:17 PM

We're currently facing storage issues causing input/output errors that are affecting many software applications, including Stata. Our team is actively working on resolving this matter.

Published Feb. 15, 2024 3:32 PM

[2024-02-21 17:49 update] All affected jobs have been requeued. 85 jobs had to be cancelled, so please inspect the output of your jobs to see if they're affected.

[2024-02-21 15:45 update] Several jobs that were running at the start of the upgrade did not successfully resume. We're trying to resolve the issue. New jobs are not affected.

[2024-02-21 10:35 update] The upgrade is now done, and seems to have gone well.

[2024-02-21 10:00 update] The upgrade has now started.

The queue system on Colossus will be upgraded on Wednesday (February 21) at 10:00. During the upgrade, running jobs will be suspended, and Slurm commands (squeue, sbatch, etc.) will not work. We expect the upgrade to take no more than 20 minutes.

Published Feb. 14, 2024 7:23 AM

TSD is performing network maintenance (on the DNS service) at 10:00 CET today.

Published Feb. 7, 2024 7:30 AM

We're experiencing technical difficulties with our project creation service. Our team is actively working on resolving the issue to restore full functionality as soon as possible. We apologize for any inconvenience this may cause and appreciate your patience during this time.

Published Jan. 31, 2024 1:30 PM

We are currently experiencing technical difficulties with core services, and are working to restore operations. Sorry for the inconvenience caused by this.

Published Jan. 29, 2024 12:50 PM

Slurm has been restarted on several compute nodes to resolve an issue. Please check the output of your jobs to see if they've been affected.

Published Jan. 26, 2024 10:10 AM

We're currently experiencing issues with some nodes on Colossus. Jobs on these nodes might have crashed and been requeued. Please check the output of your jobs to see if they've been affected.

Published Jan. 26, 2024 8:59 AM

Some users are currently facing issues logging in to the Data Portal to export/import data. The specific error message they encounter is "An unexpected error has occurred which may affect the proper functioning of the application." If you also experience this error while attempting to log in to the Data Portal, please notify us by emailing tsd-drift@usit.uio.no.

Published Jan. 11, 2024 12:10 PM

TSD will be upgrading the storage system, which may cause some instability on the Windows and Linux VMs.

Published Jan. 10, 2024 1:35 PM

We've updated our password policy, effective January 8th, 2024. This change is part of our commitment to enhancing security protocols and safeguarding sensitive information.

All TSD users are now required to update their passwords at least once every year. This practice is essential to maintain a high level of security. You may change your password at any time by logging into TSD's Selfservice Portal: https://selfservice.tsd.usit.no/profile/change-password

You will receive an email notification 30 days before your password expiration date, providing sufficient time for a timely update.
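The timing described above can be sketched as a small date calculation. This is only an illustration: it assumes "once every year" means 365 days from the last change, and the names `PASSWORD_LIFETIME`, `NOTICE_PERIOD`, and `expiry_and_reminder` are hypothetical, not part of any TSD system.

```python
from datetime import date, timedelta

PASSWORD_LIFETIME = timedelta(days=365)  # assumption: "every year" = 365 days
NOTICE_PERIOD = timedelta(days=30)       # from the policy text


def expiry_and_reminder(last_changed: date) -> tuple[date, date]:
    """Return (expiry date, reminder date) for a password last changed
    on `last_changed`, under the assumptions stated above."""
    expires = last_changed + PASSWORD_LIFETIME
    return expires, expires - NOTICE_PERIOD


# Example: a password changed on the day the policy took effect.
expires, reminder = expiry_and_reminder(date(2024, 1, 8))
print(expires, reminder)
```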

Users with overdue password changes will be contacted. The first group of users was contacted on December 11th, 2023, and must complete a mandatory password change by January 8th, 2024.

Accounts that have not complied with the password update requirement by the deadline will be temporarily suspended. Access will be restored upon u...

Published Jan. 8, 2024 12:30 PM

ID-porten is experiencing login problems. Please follow the incident at:

https://status.digdir.no/incidents/ctml93xm9lnh

It affects both TSD and Nettskjema logins.

Published Dec. 22, 2023 9:50 AM

We are currently experiencing some issues with file import through the Data Portal and are looking into the cause of the problem.

Published Dec. 15, 2023 7:58 AM

This affected TSD systems that relied on NFS.

[Update 08:48]

The core problem is resolved and most systems are up again. We are still investigating the cause of the problems, and some systems may still be unstable.

[Update 11:00]

All systems should work as normal.

Published Nov. 28, 2023 10:11 AM

TSD will be upgrading software on the storage system on Thursday, 2023-11-30 from 08:00 CET. We expect storage instability on the Windows and Linux VMs throughout the day.

Around 10:00 the storage system will be shut down for an estimated 15 minutes, during which network storage will be inaccessible on all TSD hosts (Windows and Linux) as well as on our central services (file import/export, etc.). To be on the safe side, please close any programs and log off from your VM prior to the downtime.

A maintenance reservation has been set on Colossus from 08:00. This means any jobs that cannot complete before the downtime will remain pending until after the maintenance completes. They'll resume automatically.

Our automation should fix any file system hangs that may occur, and we will be on standby to fix any remaining issues that do not automatically recover.

Apologies for the short notice, we've been in dialogue with IBM to alleviate storage instabi...

Published Nov. 22, 2023 3:00 PM

TSD will be upgrading software on the storage system tomorrow, 2023-11-17 08:00 - 09:00 CET. Our automation should fix any file system hangs that may occur, and we will be on standby to fix any remaining issues that do not automatically recover. Apologies for the short notice; we've been in dialogue with IBM to alleviate storage instability and want to act on their latest recommendations as fast as possible.

Published Nov. 16, 2023 1:33 PM

IBM will be upgrading software on the storage system tomorrow, 2023-11-17 07:00 - 09:00 CET. This upgrade is being done on short notice to remove bugs that have caused instability. We are taking the opportunity to improve stability as soon as we can, and apologize for any inconvenience. Our automation should fix any file system hangs that may occur, and we will be on standby to fix any remaining issues that do not automatically recover.

Published Nov. 9, 2023 9:41 AM

Some users are reporting login issues and problems setting one-time codes. We are working to debug and fix the issue.

Published Nov. 8, 2023 3:39 PM

TSD Internal Publication is undergoing a short maintenance.