HPC Users@UiO Newsletter #1, 2020

New national HPC system Betzy, a new local HPC cluster, upcoming courses and training events, machine learning resources, Appnodes, and the EESSI initiative for scientific software.

  • USIT's Underavdeling for IT i forskning (ITF), in English the Division for Research Computing (RC), is responsible for delivering IT support for research at the University of Oslo.
  • The division's departments operate infrastructure for research, and support researchers in the use of computational resources, data storage, application portals, parallelization and optimization of code, and advanced user support.
  • This newsletter is announced on the hpc-users mailing list. To join the list, send an email to sympa@usit.uio.no with the subject "subscribe hpc-users Niels Henrik Abel" (if your name is Niels Henrik Abel); see the example below. The newsletter will be issued at least twice a year.
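
If you prefer the command line, subscribing could look roughly like the sketch below, assuming a Unix machine with a mail client (such as mailx) configured to send email; the name is of course a placeholder:

    # Send the subscribe command to the list server (replace the name with your own).
    echo "" | mail -s "subscribe hpc-users Niels Henrik Abel" sympa@usit.uio.no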


 

News and announcements

The HPC Users@UiO Newsletter is back

While we may have been a little silent after we turned off Abel and COVID-19 broke loose (two events that have not been shown to be connected in any way), we have not been sitting still. On the national level, the new HPC cluster Betzy is very close to entering production, while on the local level we are setting up a 3000+ core HPC cluster. Additionally, aiming at tying the local and national infrastructures together, we are actively taking part in a very interesting initiative for scientific software. Read more about these and other topics below.

Betzy

Betzy is right around the corner. The next-generation HPC system for Norwegian research is now almost ready for production under the Sigma2 umbrella. It is built by Atos/BULL and will be based on AMD Rome processors. With a total of 172,000 cores (equivalent to more than 1,500 million core hours) interconnected with InfiniBand, it will represent a giant leap in processing capacity for Norwegian scientists.

The programming environment will not differ significantly from what is known from Fram and Saga.

[Images: Atos Bull HPC cluster; B1 architecture]

To stay up to date with current developments, please watch the Sigma2 web pages.

New local HPC cluster

The local setup targets small but persistent computing needs, filling the gap between laptops, workstations and single servers on one side and the national Sigma2 infrastructure on the other. It is a platform for small-scale computation, and for developing and testing applications before eventually moving on to the national systems. With a selection of nodes for CPU-only computation, GPU-accelerated computation and novel architectures, this lineup will serve as a nice starting point for scientific computation.

[Image: Overview of the compute infrastructure]

We're aiming at having the new local infrastructure for computing operational before Christmas. This is infrastructure in a broad sense, comprising a Slurm-scheduled batch cluster, a line of interactive nodes, and a small farm of GPU-accelerated nodes.

The cluster will have about 3,000 cores, made up of 24 compute nodes, each with 128 AMD cores, 512 GiB of memory, 3+ TiB of NVMe scratch disk, and an InfiniBand interconnect.

The interactive lineup will have 500+ cores across 4 nodes, each with 128 cores (the same CPUs as the cluster), 1024 GiB of memory, 7.5 TiB of NVMe scratch disk, and an InfiniBand interconnect.

The GPU-accelerated part is made up of today's nodes plus new NVIDIA A100-based nodes: to the 5 existing nodes with NVIDIA RTX 2080 cards we'll add two new nodes with 4 NVIDIA A100 GPUs each.

The nodes we offer today, freebio and bioint01, will be continued. We also aim to have some novel technology as a stepping stone to the next generation; hence, ARM-based systems will be available.

This infrastructure is available to all UiO users. We aim for a very simple self-service sign-up with little administrative overhead, and the plan is to have no quotas for UiO users. Users whose computational needs exceed the capacity of this infrastructure will be assisted in moving their workloads to the national infrastructure at Sigma2. The local setup is intended for relatively small but persistent computing needs, but to make any required transition to the national infrastructure as smooth as possible, we will aim for a software environment and batch system that closely follows the national infrastructure.
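
As a taste of what this will look like, here is a minimal Slurm batch script sketch. The account, module and application names are placeholders, since the final local configuration has not yet been announced:

    #!/bin/bash
    # Minimal Slurm job sketch for the new local cluster.
    # Account, module and application names are placeholders.
    #SBATCH --job-name=test-job
    #SBATCH --account=myproject      # placeholder account
    #SBATCH --nodes=1
    #SBATCH --ntasks-per-node=128    # one full 128-core AMD node
    #SBATCH --mem=480G               # headroom below the 512 GiB per node
    #SBATCH --time=01:00:00

    module purge
    module load MyApplication/1.0    # hypothetical module name
    srun my_application input.dat    # hypothetical MPI application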

HPC Course week (Nov 30, 2020)

On the 30th of November, we are organizing an online training event for newcomers to HPC. It will cover the basic topics needed to get started on the Saga compute cluster.

More details: /english/services/it/research/events/HPC_for_research_November_2020

Register for the course: https://nettskjema.no/a/170960

CodeRefinery online workshop (Nov 17 - 26, 2020)

This course is about best practices in research software development and management.

Dates: Nov. 17 - 26, 2020 (09:00 - 12:00)

More details: https://www.ub.uio.no/english/courses-events/courses/other/coderefinery/time-and-place/2020-11-17-coderefinery.html

Nordic-RSE Get-together online event (Nov 30 - Dec 2, 2020)

Are you employed to develop software for research? Or are you simply spending
more time developing software than conducting research? Then you have much in
common with a growing international network of research software engineers
(RSEs). The Nordic-RSE initiative aims to build a community of RSEs in the
Nordics, plan meetings and workshops where knowledge can be shared, organize a
conference biannually, and provide assistance in starting local RSE groups or
hiring RSE staff in Nordic universities.

Please join the first online get-together event of the Nordic Research Software
Engineer initiative on Nov 30 to Dec 2:
https://nordic-rse.org/events/2020-online-get-together/

NeIC training calendar

Looking for more training events? NeIC is maintaining a shared calendar for training events in the Nordics, see https://neic.no/training/ for more information.

ML resources getting ready for production

We are moving our experimental machine learning infrastructure to a production setup. This includes RHEL 8 as the operating system and software provisioned as modules with EasyBuild. A new machine, ml7.hpc.uio.no, with 8 GPUs has been put into production with this setup, and another one will follow soon.

We are opening up the new system to users gradually; please send a request to itf-ai-support@usit.uio.no if you want immediate access.
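
Once you have access, software provisioned with EasyBuild is used through environment modules. A quick session could look like the sketch below; the module name and version are only an illustration:

    # Log in to the new ML node and inspect the EasyBuild-provided modules.
    ssh ml7.hpc.uio.no
    module avail                    # list the available software
    module load TensorFlow/2.3.1    # hypothetical module name and version
    nvidia-smi                      # check that the GPUs are visible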

So, what is an Appnode?

HPC clusters may not always be the "holy grail" of research computing. An Appnode is a standalone compute resource, flexibly configured to meet a research group's demands when clusters cannot, for instance because of:

  • instant access (no queue)
  • memory requirements beyond what clusters can provide
  • compute resources dedicated to your research group only
  • interactive work/testing/programming

The Appnode Team takes care of the whole lifecycle of the Appnode: procurement, racking, infrastructure, installation, backup, configuration, and operation of the operating system (Red Hat Enterprise Linux) and the software packages required for your research. If you want to know more, contact us at hpc-drift@usit.uio.no.

EESSI: Scientific software — everywhere

Software empowers and drives our work. We rely on rich scientific software stacks for teaching, research and studying. Different tasks may require different machines. We might use an HPC cluster for large-scale simulations, a workstation for post-processing and visualization, a laptop for software development on the go, and a Cloud infrastructure to set up short-lived environments for testing or teaching. However, moving from one system to another can be frustrating. Is the software we want to use compatible between the different systems? Is it optimized to make the best use of the locally available hardware? In addition, we might not even have the knowledge or the permission to install software packages ourselves.

There is a current trend towards more diverse CPU architectures: the revival of AMD (Rome), new ARM-based systems (#1 on top500.org), exotic POWER-based systems, and, let's not forget, the emerging RISC-V. This development makes building, provisioning and maintaining a uniform software stack optimized for specific microarchitectures very demanding for local system administrators. However, there is light at the end of the tunnel.

An international group of HPC sysadmins has joined forces to tackle the problem. The European Environment for Scientific Software Installations (EESSI) is building a service that provides a uniform software stack for all architectures, optimized for the microarchitecture in your system, available on-demand, from anywhere. This means you can have the same software on your laptop, server, local cluster, national HPC system or even in virtual machines running in a commercial Cloud infrastructure. You can run the same software on your Raspberry Pi or your Mac. Even modern Windows installations will be supported, and of course any Linux system.

Sounds like magic? It's not. With the few simple steps sketched below you can try out the pilot today, and the stack will be available on your machine in minutes.
https://eessi.github.io/docs/pilot/
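
A minimal session could look like this, assuming CernVM-FS is already installed and configured for EESSI on your machine; see the pilot documentation above for the authoritative instructions, and note that the module name is only an example:

    # Activate the EESSI pilot stack (path as described in the pilot docs).
    source /cvmfs/pilot.eessi-hpc.org/latest/init/bash
    module avail           # browse the optimized software stack
    module load GROMACS    # hypothetical example application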

UiO is a proud contributor to the EESSI project.


Availability of other computing resources

If you want to explore ARM-based compute systems, including next-generation vector/SIMD/SVE units and their impact on your code, please come forward: we have a nice set of ARM and Allinea tools to run on our ARM testbed. UiO is working closely with Simula, which runs a project looking at novel hardware for exascale. If your interests are along those lines, we are happy to introduce you.

Other hardware needs

If you need particular types of hardware (fancy GPUs, Kunluns, dragons, Graphcore, etc.) not provided through our local infrastructure, please do contact us (hpc-drift@usit.uio.no) and we'll try to help you as best we can (cf. the work with Simula above).

Also, if you have a computational challenge where your laptop is too small but a full-blown HPC solution is overkill, it might be worth checking out NREC. This service can provide you with your own dedicated server, with a range of operating systems to choose from.

With the ongoing turmoil around computing architectures, we are also looking into RISC-V; the European Processor Initiative is aiming at both ARM and RISC-V, and UiO needs to keep up.

Publication tracker

The USIT Department for Research Computing (RC) is interested in keeping track of publications where computation on RC services is involved. We greatly appreciate an email to:

hpc-publications@usit.uio.no

about any publications (including in the general media). If you would like to cite the use of our services, please follow this information.

Published Nov. 4, 2020