TSD Operational Log - Page 2
The TSD IAM system will undergo a brief upgrade today between 15:00 and 15:15. The following services will be unavailable during that period:
1) Selfservice
2) Nettskjema new forms activation
3) Command line UR
We're currently facing storage issues causing input/output errors that are affecting many software applications, including Stata. Our team is actively working on resolving this matter.
[2024-02-21 17:49 update] All affected jobs have been requeued. 85 jobs had to be cancelled, so please inspect the output of your jobs to see if they're affected.
[2024-02-21 15:45 update] Several jobs that were running at the start of the upgrade did not successfully resume. We're trying to resolve the issue. New jobs are not affected.
[2024-02-21 10:35 update] The upgrade is now done, and seems to have gone well.
[2024-02-21 10:00 update] The upgrade has now started.
The queue system on Colossus will be upgraded on Wednesday (February 21) at 10:00. During the upgrade, running jobs will be suspended, and Slurm commands (squeue, sbatch, etc.) will not work. We expect the upgrade to take no more than 20 minutes.
TSD is performing network maintenance (on the DNS service) at 10:00 CET today.
We're experiencing technical difficulties with our project creation service. Our team is actively working on resolving the issue to restore full functionality as soon as possible. We apologize for any inconvenience this may cause and appreciate your patience during this time.
We are currently experiencing technical difficulties with core services, and are working to restore operations. Sorry for the inconvenience caused by this.
Slurm has been restarted on several compute nodes to resolve an issue. Please check the output of your jobs to see if they've been affected.
We're currently experiencing issues with some nodes on Colossus. Jobs on these nodes might have crashed and been requeued. Please check the output of your jobs to see if they've been affected.
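One way to check which jobs may have been affected is to filter Slurm accounting output by job state. The sketch below is a minimal illustration, not an official TSD tool: the function names and the set of "suspect" states are assumptions chosen for this example, and it presumes that `sacct` is available to you on the cluster and that pipe-delimited output (`--parsable2`) is used.

```python
import subprocess

# States treated as "worth inspecting" in this sketch (an assumption;
# adjust to taste). sacct may report e.g. "CANCELLED by 1234", so only
# the first word of the State field is compared.
SUSPECT_STATES = {"CANCELLED", "FAILED", "NODE_FAIL", "REQUEUED", "TIMEOUT"}


def find_suspect_jobs(sacct_text):
    """Return rows (as dicts) whose State suggests the job was disrupted.

    sacct_text is the pipe-delimited output of `sacct --parsable2`,
    including its header line.
    """
    lines = [ln for ln in sacct_text.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split("|")
    suspects = []
    for line in lines[1:]:
        row = dict(zip(header, line.split("|")))
        state_words = row.get("State", "").split()
        if state_words and state_words[0] in SUSPECT_STATES:
            suspects.append(row)
    return suspects


def query_sacct(start):
    """Run sacct for your own jobs since `start` (e.g. "2024-02-21")."""
    cmd = [
        "sacct",
        "-X",  # one line per job allocation, no individual steps
        "--parsable2",
        "--format=JobID,JobName,State,ExitCode",
        f"--starttime={start}",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

On a login node this could be used as `find_suspect_jobs(query_sacct("2024-02-21"))`. A job listed here is not necessarily broken, and a job not listed is not guaranteed to be fine, so the job's own output files remain the authoritative check.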
Some users are currently facing issues logging in to the Data Portal to export/import data. The specific error message they encounter is "An unexpected error has occurred which may affect the proper functioning of the application." If you also experience this error while attempting to log in to the Data Portal, please notify us by emailing tsd-drift@usit.uio.no.
TSD will be upgrading the storage system, which may cause some instability on the Windows and Linux VMs.
We've updated our password policy. The change takes effect on January 8th, 2024, and is part of our commitment to enhancing security protocols and safeguarding sensitive information.
All TSD users are now required to update their passwords at least once every year. This practice is essential to maintain a high level of security. You may change your password at any time by logging into TSD's Selfservice Portal: https://selfservice.tsd.usit.no/profile/change-password
You will receive an email notification 30 days before your password expires, giving you sufficient time to update it.
Users with overdue password changes will be contacted. The first group of users will be contacted on December 11th, 2023 and must complete a mandatory password change by January 8th, 2024.
Accounts that have not complied with the password update requirement by the deadline will be temporarily suspended. Access will be restored upon u...
ID Porten is having login problems; please follow the status at:
https://status.digdir.no/incidents/ctml93xm9lnh
This impacts both TSD and Nettskjema logins.
We are currently experiencing some issues with file import through the Data Portal and are looking into the cause of the problem.
This affected TSD systems that relied on NFS.
[Update 08:48]
The core problem is resolved and most systems are up again. We are still investigating the cause of the problems, and some systems may still be unstable.
[Update 11:00]
All systems should work as normal.
TSD will be upgrading software on the storage system on Thursday, 2023-11-30, from 08:00 CET. We expect storage instability on the Windows and Linux VMs throughout the day.
Around 10:00 the storage system will be shut down for an estimated 15 minutes, which means network storage will be inaccessible on all TSD hosts (Windows and Linux) as well as on our central services (file import/export, etc.). To be on the safe side, please close any programs and log off from your VM prior to the downtime.
A maintenance reservation has been set on Colossus from 08:00. This means any jobs that cannot complete before the downtime will remain pending until after the maintenance completes. They'll resume automatically.
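The effect of the reservation on scheduling can be illustrated with a small sketch. This is a simplified model rather than Slurm's actual scheduler code: the function name is hypothetical, and it assumes the only relevant factor is whether the job's requested time limit would run into the start of the maintenance window.

```python
from datetime import datetime, timedelta


def can_start_before_maintenance(now, time_limit, maintenance_start):
    """Return True if a job started at `now` with the given time limit
    would finish before the maintenance window begins.

    Simplified model: a job only starts if its full requested time limit
    fits before the reservation. Treating an exact fit as allowed is an
    assumption of this sketch.
    """
    return now + time_limit <= maintenance_start


# Example: maintenance begins 2023-11-30 08:00, and it is 20:00 the evening before.
maintenance = datetime(2023, 11, 30, 8, 0)
now = datetime(2023, 11, 29, 20, 0)

short_job = can_start_before_maintenance(now, timedelta(hours=10), maintenance)
long_job = can_start_before_maintenance(now, timedelta(hours=13), maintenance)
```

Here the 10-hour job would end at 06:00 and could start, while the 13-hour job would overlap the reservation and stay pending. In practice this means that requesting a shorter, realistic time limit can let a job run before the downtime instead of waiting for it to pass.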
Our automation should fix any file system hangs that may occur, and we will be on standby to fix any remaining issues that do not automatically recover.
Apologies for the short notice, we've been in dialogue with IBM to alleviate storage instabi...
TSD will be upgrading software on the storage system tomorrow, 2023-11-17 08:00 - 09:00 CET. Our automation should fix any file system hangs that may occur, and we will be on standby to fix any remaining issues that do not automatically recover. Apologies for the short notice. We've been in dialogue with IBM to alleviate storage instability and want to act on their latest recommendations as fast as possible.
IBM will be upgrading software on the storage system tomorrow, 2023-11-17 07:00 - 09:00 CET. This upgrade is being done on short notice to remove bugs that have caused instability. We are taking the opportunity to improve stability as soon as we can, and apologize for any inconvenience. Our automation should fix any file system hangs that may occur, and we will be on standby to fix any remaining issues that do not automatically recover.
Some users are reporting login issues and problems setting one-time codes. We are working to debug and fix the issue.
TSD Internal Publication is undergoing brief maintenance.
Due to an ESS upgrade at 13:30 we're experiencing some storage instability. This affects the internal mirrors (CRAN) too. Some VMs will be rebooted in the process.
TSD services may be unstable at the moment. We are working to fix it.
Dear TSD-users,
At 07:00 this coming Tuesday we will upgrade the databases of our core services and the databases in the following projects:
p11
p14
p23
p47
p57
p58
p96
p110
p166
p174
p189
p206
p302
p588
p594
p827
p874
p969
p1075
p1859
p2184
If all goes according to plan, we should be done by 11:00 at the latest.
During this time our services will be partially or fully unavailable.
--
The TSD team
The backend of several of our services is down.
This will affect file import and export, the publication portal, Nettskjema delivery and more.
--
TSD
We're experiencing instability in TSD, which is affecting the timeliness of Nettskjema attachment delivery and file import and export. We're working on resolving it.
TSD Self Service is currently unreachable.