UTHPC - HPC Cluster upgrade – Maintenance details

HPC Cluster upgrade

Completed
Time: 15 July 2025, 6:00:00 – 6:00:00

Affects

rocket.hpc.ut.ee

Maintenance from 6:00 AM to 6:00 AM

Updates
  • Update
    22 July 2025, 6:00:00
    Maintenance is now in progress.
  • Update
    16 July 2025, 13:11:18

    We will be directing SSH to login1 today. The login2 internal route will stay open until the 22nd, when we will perform maintenance and reboot the machine.

  • In progress
    15 July 2025, 8:41:38
    Maintenance is now in progress.
  • Added
    15 July 2025, 6:00:00

    Updates to the Rocket HPC cluster are scheduled for July 2025. They will improve the cluster's performance and capabilities.

    1) Login Node Updates

    We'll be performing system updates on both login nodes this month:

    • Login1: July 15th

    • Login2: July 22nd

    To minimize disruption, we'll stop accepting new SSH connections on each login node one week before its update, allowing existing connections to expire naturally. One of the login nodes will remain available at all times, so you won't experience any service downtime.

    2) Slurm Update

    On July 22nd, starting at 15:00, we'll upgrade Slurm from version 23.02 to 23.11. Running jobs won't be affected, and you'll be able to submit new jobs during the update. However, sacct, sacctmgr, and related tools will be unavailable while the update runs. The process should take about two hours but may run longer.
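Scripts that call sacct on a schedule may fail during the upgrade window. The following is a minimal Python sketch of one way to cope, assuming the tool exits with a non-zero status while the accounting service is unavailable. The helper name `run_with_retry`, its parameters, and the job ID in the usage note are hypothetical and not part of the cluster's tooling.

```python
import subprocess
import time


def run_with_retry(cmd, attempts=5, delay_s=60.0, sleep=time.sleep):
    """Run cmd, retrying on a non-zero exit status or a missing binary.

    Returns the command's stdout on success, or None if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        try:
            result = subprocess.run(cmd, capture_output=True, text=True)
        except FileNotFoundError:
            result = None  # the command is not installed on this machine
        if result is not None and result.returncode == 0:
            return result.stdout
        if attempt < attempts:
            sleep(delay_s)  # wait before the next attempt
    return None
```

For example, `run_with_retry(["sacct", "-j", "12345", "--format=JobID,State,Elapsed"])` would retry the query up to five times, one minute apart, and return None if sacct never succeeds. Since the update may take two hours or longer, a longer delay or simply rerunning the script later may be more practical.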

    After July 22nd, the compute nodes will be updated in a rolling fashion. This means some nodes will be temporarily drained until all updates are complete, which may result in longer queue times depending on cluster usage.
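To get a rough idea of how many nodes are drained at a given moment, the output of `sinfo -h -N -o "%N %T"` (one line per node with its state) can be summarized. The sketch below parses such output from a string. The function name and the sample node names are hypothetical, and the set of trailing flag characters it strips is an assumption about how sinfo may mark states (for example `*` for a non-responding node).

```python
from collections import Counter


def count_nodes_by_state(sinfo_output):
    """Count unique nodes per state from `sinfo -h -N -o "%N %T"` output."""
    state_by_node = {}
    for line in sinfo_output.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        node, state = parts
        # A node listed under several partitions appears more than once;
        # keeping one entry per node name avoids double counting.
        state_by_node[node] = state.rstrip("*~#!%$@^-+")
    return Counter(state_by_node.values())


# Hypothetical sample output; real node names on Rocket will differ.
sample = """\
node1 idle
node2 draining
node2 draining
node3 drained*
node4 mixed
"""
counts = count_nodes_by_state(sample)
unavailable = counts["draining"] + counts["drained"]
print(f"{unavailable} of {sum(counts.values())} nodes draining or drained")
```

On the cluster, the string could come from `subprocess.run(["sinfo", "-h", "-N", "-o", "%N %T"], capture_output=True, text=True).stdout`. A high share of draining or drained nodes suggests that queue times may be longer than usual.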

  • Completed
    15 July 2025, 6:00:00
    Maintenance has completed successfully.