TSD Operational Log
On the 11th of April at 15:00, TSD will migrate the shared directory to the new storage system. The shared directory is a read-only export available to all TSD projects ("/tsd/shared" on Linux or "\\tsd-evs\shared" on Windows).
Please note that the migration will not impact regular shared project directories such as pXsharedpY and its variants.
We need to conduct a routine upgrade of TSD's storage system between 10:00 and 11:00 on Wednesday, April 19, 2023.
During the downtime, it will still be possible to log in to VMs, but durable storage (M:\ and N:\ in Windows VM) will be unavailable. Therefore, we recommend closing all open files and potentially logging out to avoid data loss. Colossus is unaffected and will operate normally throughout the upgrade.
ESS storage is currently unavailable due to a technical problem. There was a firmware bug in our IBM setup; the bug caused a metadata server to crash and a global disk failure. As there was a minor risk of data loss or data corruption, we had to spend about 9 hours yesterday working together with the international crisis team at IBM. The system was up at about 19:00 yesterday (2023-03-23), and we believe there was no data loss or corruption.
We sincerely apologize for this unplanned downtime, but third-party firmware bugs are difficult to foresee, and we could not easily have prevented this one.
During the downtime we also fixed the MTU settings, which IBM had misconfigured and which may have been the root cause of previous trouble. We also attached more storage, which will soon be put into production.
Due to an earlier misconfiguration by IBM, we will reconfigure our storage on Thursday, March 23, from 12:00. This may cause network storage downtime on Windows and Linux clients. The Colossus compute nodes might be affected, but we consider that very unlikely.
We're experiencing NFS hangs due to third-party maintenance on our storage system.
We'll be rebooting hosts to resolve the issue.
We're experiencing NFS hangs.
We're working to fix it and will have to reboot hosts in the process.
Affected services:
- login to Nettskjema with TSD
- login to TSD virtual machines
- selfservice portal
- data portal (and command-line imports)
- publication portal
- consent portal
We are working to fix this.
ESS NFS clients experience NFS hangs. Some clients may be rebooted to resolve the issue.
We're actively working on resolving the issue.
Starting Wednesday 2023-02-08 15:00 the NFS servers in TSD will be upgraded by a third-party vendor.
We expect this to take approximately 1 hour; it may cause network/NFS interruptions.
We advise you to keep an eye on this operational log for any updates.
ESS NFS clients experienced NFS hangs on 2023-01-30 between 16:00 and 18:00. Clients were forced to remount NFS.
For some projects not yet migrated to ESS, NFS hangs persisted until 2023-01-31 09:30.
We apologize for the inconvenience.
Update 2023-01-31 12:00: Several hosts are still experiencing hangs. These hosts will be rebooted as soon as possible.
The new dataloader is currently unavailable.
This only affects Nettskjema forms that have been enabled for the new version, in projects p2336 and up.
We're currently working on a fix.
SPSS users will get a warning about an expiring license. Please ignore this message. TSD has requested a new license, and it will be in place soon.
We're experiencing storage instability on ESS, affecting Windows and Linux hosts of projects that have been migrated to ESS.
Update 2022-12-08: We experienced a new incident between 09:20 and 10:30.
We're working with third-party service providers to resolve the instability.
Since 15:10 there has been an issue with network traffic in TSD, affecting NFS, among other services.
We're actively working to fix it.
Update 16:25: The issue has been resolved.
We are experiencing instability with the storage system. We are investigating and have contacted the vendor for support.
We are working to fix it.
We are having issues with the data portal https://data.tsd.usit.no/ - we are working to fix it.
Some users get the error message "500 Internal Server Error - Negative response from the server". We are investigating the issue.
The SCCM group will install a hotfix rollup for CM 2203 (KB14244456) on Wednesday 2022-09-07.
Software Center on all Windows VMs in TSD will be unavailable between 09:00 and 16:00.
Starting 2022-09-05 15:00, we'll migrate approximately 1000 projects from the old to the new storage solution. Affected projects will have a downtime from 15:00 until 15:00 the next day, unless specified otherwise in the notification email. For more information about the migration see our FAQ.
Update: Not all projects were migrated by 15:00. The downtime for these projects has been extended.
The Software Center on all Windows VMs will be unavailable between 09:00 and 16:00.
/tjenester/it/aktuelt/planlagte-tjenesteavbrudd/2022/2022-08-31-sccm-oppgradering-tsd.html
As of 2022-08-15 10:30, TSD Self Service is experiencing technical difficulties. The issue affects various update operations.
We are working on a solution.
The project database machine will receive a minor database security upgrade.
The upgrade will disrupt login from 09:00 to 10:00 on Friday 15.07.
Affected services: Self Service, File Transfer and Logging. We apologize for the inconvenience. We will work to make the process as smooth as possible.
We are doing network maintenance in TSD on Monday and Tuesday, 2022-05-23 and 2022-05-24.
No downtime is expected, but interruptions may occur.