Computing setup

This course has a heavy programming component and will turn increasingly data- and compute-intensive throughout the semester.

Access to Fox

Our reference computing environment will be the Fox high-performance computing cluster at the University of Oslo. All course participants will be granted access to Fox, which offers about 3,000 CPU cores, compute nodes with up to 512 GB of main memory, and a (small) rack of massively parallel GPUs. To register for Fox usage, please submit an online form, where you should apply for membership in the ec30 project (Language Technology Group; LTG) until June 30, 2024. Account activation usually takes a day or two, and you will receive status updates by email and text message.

Once you have received confirmation of account activation on Fox, connect using ssh (from the Linux command line or any suitable secure shell client), e.g.,

 ssh fox.educloud.no

This will establish an interactive session on one of the Fox login nodes, which we can use for development, debugging, and testing.  In a nutshell, moderate computation is fair game on the interactive login nodes, where we interpret ‘moderate’ as using at most a handful of cores, up to 16 gigabytes of main memory, and run-times best measured in minutes.

If you feel the need to familiarize yourself with working in a Linux command line environment, we recommend reading this tutorial and/or taking an online Linux basics course. The Missing Semester of Your CS Education by MIT is another good course on mastering the command line and other basics.

Python Modules

Python 3 is the main programming language for this course. Once you are logged into Fox, an NLP-specific repository of the relevant Python 3 add-on modules is available. Most of them are "branded" with NLPL, that is, "Nordic Language Processing Laboratory". To activate this environment, execute the following commands:

module purge
module use -a /fp/projects01/ec30/software/easybuild/modules/all/
module load nlpl-nlptools/01-foss-2022b-Python-3.10.8
module load nlpl-gensim/4.3.2-foss-2022b-Python-3.10.8
module load nlpl-pytorch/2.1.2-foss-2022b-cuda-12.0.0-Python-3.10.8

We recommend that you add the above lines to your personal .bashrc configuration file in your home directory. You can check that you have a sane working environment by issuing the commands above and then running our sanity test script, which tries to import all the necessary Python packages. If the test script produces any warnings or errors, send a question to our collective mailbox or raise an issue in our UiO GitHub repository.
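
For illustration, an import check of this kind can be sketched in a few lines of Python. This is not the course's actual sanity test script: the package list below is an assumption inferred from the module names loaded above (nlpl-gensim, nlpl-pytorch), and the function name check_imports is hypothetical.

```python
import importlib

# Assumption: import names guessed from the loaded modules (nlpl-gensim,
# nlpl-pytorch); the course's own sanity script may check a different set.
PACKAGES = ["gensim", "torch"]


def check_imports(names):
    """Try to import each package; return {name: "ok (...)" or "FAILED (...)"}."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception as error:  # ImportError, or e.g. a broken native library
            report[name] = f"FAILED ({type(error).__name__}: {error})"
        else:
            version = getattr(module, "__version__", "unknown version")
            report[name] = f"ok ({version})"
    return report


if __name__ == "__main__":
    for name, status in check_imports(PACKAGES).items():
        print(f"{name:10s} {status}")
```

Any line reporting FAILED suggests that the corresponding module load command was not run in the current shell.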

List of all modules available in the NLPL Virtual Laboratory

Please do not try to install any Python modules locally into your user directory on Fox. This may conflict (in subtle, non-transparent ways) with the environment we have prepared for the course. In other words, make sure that the ~/.local/ directory inside your home directory on Fox is empty (unless you are absolutely sure you know what you are doing).
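
One way to check this is a small Python sketch like the one below, which simply lists whatever sits directly inside ~/.local/. The function name list_user_local is hypothetical and not part of any course tooling; the script only reads the directory and deletes nothing.

```python
from pathlib import Path


def list_user_local(home=None):
    """Return the sorted names of entries directly inside <home>/.local."""
    base = Path(home) if home is not None else Path.home()
    local_dir = base / ".local"
    if not local_dir.is_dir():
        return []  # a missing directory counts as empty
    return sorted(entry.name for entry in local_dir.iterdir())


if __name__ == "__main__":
    entries = list_user_local()
    if entries:
        print("~/.local is not empty:", ", ".join(entries))
    else:
        print("~/.local is empty or missing")
```

If the script reports entries such as lib, locally installed packages may shadow the ones provided by the course modules.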

Fox Foundations

Computations that run for more than a few minutes, or that require a GPU, multiple CPUs, or very large amounts of memory, must be submitted through the Fox queue management system using Slurm job scripts. Read the Fox job system overview to learn how to work with this system. We will hold a short workshop on Fox technicalities during one of the first group sessions. We also provide an example Slurm script, which you can use as a template.
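
Inside a submitted job, a Python program can find out what it has been allocated by reading the environment variables that Slurm sets. The sketch below relies on SLURM_JOB_ID and SLURM_CPUS_PER_TASK, which are standard Slurm variables; the latter is only set when the job requests --cpus-per-task. The function names are hypothetical, and nothing here is specific to the course template.

```python
import os


def allocated_cpus(environ=os.environ):
    """CPU cores per task as granted by Slurm, or 1 if the variable is unset."""
    value = environ.get("SLURM_CPUS_PER_TASK", "")
    # The variable is absent on login nodes and when --cpus-per-task was not given.
    return int(value) if value.isdigit() and int(value) > 0 else 1


def job_summary(environ=os.environ):
    """One-line description of the current Slurm job, if any."""
    job_id = environ.get("SLURM_JOB_ID")
    if job_id is None:
        return "not inside a Slurm job (login node or local machine)"
    return f"Slurm job {job_id} with {allocated_cpus(environ)} CPU core(s) per task"


if __name__ == "__main__":
    print(job_summary())
```

Using the allocated core count (rather than the total number of cores on the node) to size thread or worker pools keeps a job within the resources it asked for.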

Running jobs in your web browser

It is also possible to run (small) jobs on Fox as Jupyter notebooks in your web browser, without the need to connect via SSH. For this, you should use the EduCloud OnDemand web service:

  1. Log in to EduCloud OnDemand using your Fox credentials.
  2. In the app dashboard, choose "Jupyter".
  3. In the "Interactive apps" list, choose "Jupyter (BETA)".
  4. Make sure that your Jupyter version is "JupyterLab/4.0.3-GCCore-12.2.0", that your project is "ec30", and that you have specified the additional module path as "/fp/projects01/ec30/software/easybuild/modules/all/" (this is required to be able to load NLPL modules). All in all, it should look like the image below:

[Screenshot: Jupyter launch form in EduCloud OnDemand]

  5. In the "additional modules" field, list all the Fox modules you want to use in your lab, for example, "nlpl-pytorch/2.1.2-foss-2022b-cuda-12.0.0-Python-3.10.8 nlpl-nlptools/01-foss-2022b-Python-3.10.8".
  6. Press "Launch".
  7. Wait until the job status changes to "running" (this might take a minute or less) and press "Connect to Jupyter".
  8. Et voilà! You are in a Jupyter Lab with easy access to your home directory on Fox, and you can run any Jupyter notebooks, whether located on Fox or uploaded via the browser.
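
As a first notebook cell, a quick check along the following lines can confirm that the modules listed in the launch form were picked up. It is only a sketch: the function name describe_torch is hypothetical, and whether a GPU is visible depends on the resources requested for the session.

```python
def describe_torch():
    """Return a one-line summary of the PyTorch installation and GPU visibility."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not importable; was the nlpl-pytorch module listed?"
    if torch.cuda.is_available():
        # Device 0 is the first GPU visible to this session.
        return f"PyTorch {torch.__version__}, GPU: {torch.cuda.get_device_name(0)}"
    return f"PyTorch {torch.__version__}, no GPU visible (CPU only)"


print(describe_torch())
```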

Recommended Editors

You can either develop your code on your own computer and then copy it to Fox for larger runs, or you can work with the code on Fox itself. There is no shortage of text editors with various levels of Python support, and we will respect whatever choice you make. Most of the course teachers are fond of the Vim editor and will be happy to help you get the most out of it.

If you need help working with Vim, run the vimtutor command. An example Vim configuration for working conveniently with Python code can be found in the repository. Copy this file to your home directory on Fox if you do not yet have a file by that name, or merge the contents of our file into yours.

It is also possible to develop remotely with VS Code, if you prefer a more modern environment. More information on how to set up VS Code can be found here: https://code.visualstudio.com/docs/remote/ssh

Published Dec. 5, 2023 10:31 PM - Last modified Feb. 12, 2024 8:49 PM