Maintenance break – 3rd phase HW – LUMI is back to regular service 24.8.2023 – LUMI-F is back online 31.8.2023

Thursday 31 August 13:30 CEST (14:30 EEST)

LUMI-F is back online. Sorry for the inconvenience.

Thursday 24 August 16:30 CEST (17:30 EEST)

We are pleased to let you know that LUMI is back in regular service. However, LUMI-F is not yet available; we will let you know as soon as the flash-based storage is available to users again. Jobs that are currently in the queue and use LUMI-F will need to be resubmitted once it is back online.

The system has been boosted with additional LUMI-G nodes (368 nodes = 1472 AMD MI250X GPUs), network capability and additional flash storage (1 PB). This integration comes with several changes, which we ask you to review carefully: https://lumi-supercomputer.github.io/LUMI-training-materials/User-Updates/Update-202308/ Pay particular attention to the changes related to the low-noise mode on LUMI-G, which have implications for job scripts.

We have noticed that some LUMI-G jobs currently in the queue are incompatible with the new configuration. Jobs whose script explicitly requested more than 56 cores per node will remain stuck in the queue indefinitely and will have to be removed with scancel. Other incompatible jobs will fail while running, likely with an error message about conflicting bindings or cores not being available for binding.
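As a rough aid for checking your own job scripts against the 56-core figure above, the following Python sketch multiplies the `--ntasks-per-node` and `--cpus-per-task` (or `-c`) values found on `#SBATCH` lines. It is only a heuristic under stated assumptions: the function names are hypothetical, it ignores options passed on the command line or set through environment variables, and it does not reproduce how Slurm itself decides whether a job is schedulable. Please refer to the update page linked above for the authoritative description of the new configuration.

```python
import re

# Per-node core limit for LUMI-G quoted in the announcement above.
MAX_CORES_PER_NODE = 56

# Matches "#SBATCH --ntasks-per-node=8", "#SBATCH --cpus-per-task 7", "#SBATCH -c 7".
# Assumption: only these directives are inspected; anything else is ignored.
_DIRECTIVE = re.compile(
    r"^#SBATCH\s+(--ntasks-per-node|--cpus-per-task|-c)(?:=|\s+)(\d+)"
)


def requested_cores_per_node(script_text: str) -> int:
    """Return tasks-per-node times cpus-per-task as read from #SBATCH lines.

    Both values default to 1 when the directive is absent.
    """
    tasks_per_node = 1
    cpus_per_task = 1
    for line in script_text.splitlines():
        match = _DIRECTIVE.match(line.strip())
        if match is None:
            continue
        option, value = match.group(1), int(match.group(2))
        if option == "--ntasks-per-node":
            tasks_per_node = value
        else:  # --cpus-per-task or its short form -c
            cpus_per_task = value
    return tasks_per_node * cpus_per_task


def exceeds_limit(script_text: str, limit: int = MAX_CORES_PER_NODE) -> bool:
    """True if the script explicitly asks for more cores per node than the limit."""
    return requested_cores_per_node(script_text) > limit


if __name__ == "__main__":
    # Illustrative job script, not taken from the announcement.
    sample = "\n".join([
        "#!/bin/bash",
        "#SBATCH --ntasks-per-node=8",
        "#SBATCH --cpus-per-task=8",
        "srun ./my_app",
    ])
    print(requested_cores_per_node(sample), exceeds_limit(sample))
```

A queued job flagged this way would need to be removed with scancel and resubmitted with a corrected script.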

Please note that 512 additional LUMI-C nodes will be made available at a later date.

If you need any assistance, please do not hesitate to contact the LUMI User Support Team: https://lumi-supercomputer.eu/user-support/need-help/

The integration of the third-phase hardware has been scheduled to start on 15.8 at 7:00 CET. This process requires downtime of the entire system, including storage and login nodes, meaning LUMI will be completely unavailable during the process. We are reserving 10 days for this process, but we hope to return the system to service sooner than that.

The system will be boosted with additional LUMI-G nodes (368 nodes = 1472 AMD MI250X GPUs), network capability and additional flash storage (1 PB).

We will make a maintenance reservation to ensure no jobs are running when the break starts. Shorter jobs may run before the break if they can finish before the start of the downtime.

If you need any assistance, please do not hesitate to contact the LUMI User Support Team: https://lumi-supercomputer.eu/user-support/need-help/

3.8.2023 17:00