eResearch Tips

How to choose a HPC cluster

By Dr Marlies Hankel, QCIF eResearch Analyst at UQ

With easy access to high-performance computing (HPC) clusters, many users may feel confused about which cluster to choose and whether HPC is actually useful for their research. I will outline some basic considerations users should go through to see if HPC is suitable and, if so, which type.

Are my calculations suitable for HPC?
 
HPC clusters are often designed for specific usages, for example:

  • large distributed calculations
  • high throughput jobs (lots of small calculations)
  • high Input/Output (I/O) calculations
  • large memory calculations
  • calculations that benefit from accelerators like GPUs.

But there are also generic-type clusters that try to cater to the broadest range of applications and use cases (typical for institutional clusters/facilities), and these are often the best entry system for new users.
 
Firstly, it is essential that the user actually knows what resources the calculations need. If you are not sure what resources might be needed then I would suggest starting with a local Linux server (a bigger type of desktop running an environment similar to a cluster) or, if that is not available, a local HPC cluster for preliminary testing. In general, users should discuss their requirements with their local eResearch support staff or drop into a local Hacky Hour.
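
As a rough illustration of such preliminary testing, the sketch below runs a command on a Linux server and reports its wall time and peak memory. It is only a minimal example under stated assumptions: the function name measure and the test command are hypothetical stand-ins for a real calculation, and the units of ru_maxrss are those reported on Linux.

```python
import resource
import subprocess
import sys
import time


def measure(command):
    """Run a command and return (wall_seconds, peak_memory_mb) for it."""
    start = time.perf_counter()
    subprocess.run(command, check=True)
    wall_seconds = time.perf_counter() - start
    # ru_maxrss is the largest resident memory of any finished child process.
    # On Linux it is reported in kilobytes (macOS reports bytes instead).
    peak_kb = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return wall_seconds, peak_kb / 1024


if __name__ == "__main__":
    # Hypothetical stand-in for a real calculation: hold about 200 MB in memory.
    test_command = [sys.executable, "-c", "data = b'x' * (200 * 1024 * 1024)"]
    seconds, peak_mb = measure(test_command)
    print(f"wall time: {seconds:.1f} s, peak memory: {peak_mb:.0f} MB")
```

Replacing the test command with the real calculation gives a first estimate of the memory and run time to discuss with support staff.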
 
In a lot of cases when users consider the move from desktop to cluster for the first time, the software does work on a Linux system but requires a graphical user interface (GUI) to set up and often run the calculation. Linux is the operating system most HPC clusters use and it is command line-based and very different to Windows. If the GUI can save input files and the software has a command line option then all is good, but if this option does not exist then things are a bit more complicated. While HPC is still an option, depending on a GUI usually negates the main advantage of HPC, which is that a large number of calculations can be run at the same time.
 
When a GUI is needed, users can then consider a Virtual Machine (VM) as an option. A VM offers resources that are not on your desktop. Users can use the GUI just like on their desktop, and if more resources are needed additional VMs could be requested to extend the number of calculations that can be run at any one time.

HPC clusters often work with a queue system and so GUI use is not standard. However, on most clusters a user can request what is called an interactive session via the queue which gives them direct access to one of the cluster’s nodes with restricted resources (cores, memory, time limit, etc.) as requested from the queue. The user can then bring up the GUI and run the calculation on that node. However, this has some limitations. Interactive sessions are often restricted to a few at the same time, putting a limit on how many calculations can be done. Users also need an X Windows session when accessing the cluster to be able to display the GUI (usually a problem for users’ Windows desktops).
 
Some clusters offer access via a Virtual Network Computing (VNC) server, which gives the user a virtual desktop to work from and allows easy GUI access. However, as the interactive session is time limited, users will have to request the resources from the queue again and again, and when the cluster is busy waiting times might go up. I would therefore suggest that anyone dependent on a GUI to run their calculations investigate whether there is other software that does the same thing but runs from the command line, and thereby gives all the benefits HPC can offer.
 
If a GUI is essential then local HPC clusters are often the best option, as they tend to offer VNC access and could be more supportive of non-standard workloads.
 
There are several cases where users might think to start using HPC:

  1. Calculations are running fine on a desktop but the user wants to do a lot more of them and the desktop is not enough. Here a HPC cluster is of great help as a large number of jobs can be run at the same time. If the core and memory requirements are similar to those of a desktop then a HPC cluster that supports single core or single node (often called high throughput or array) calculations is the way to go.
     
    Users with single core or single node jobs should not use clusters optimised for larger or more parallel jobs (for example, clusters with specialised networks to support MPI, extra-fast disk, extra-large memory or accelerators such as GPUs). While single core (single node) calculations can be run on most clusters, the specialised clusters might not favour, or might not even allow, single core/node calculations. On some clusters only a small share of the resources is set up to take single core jobs, and users will experience frustration with queue wait times and throughput. So, users should read the specs and descriptions of clusters and also ask eResearch support staff whether their type and number of calculations are favoured or not.


  2. Calculations run out of memory on a desktop. Most clusters will offer more memory than a desktop, so one of the first steps would be to get access to a local cluster and run the calculations there. Again, users should check the specs on the clusters’ webpages to see how much memory is offered. Once the calculations can run on a cluster, the user can use the standard outputs from the system/queue, which are commonly provided, to check on memory usage for their jobs and build a picture of what is needed.
     
    If the calculations still run out of memory, this information can then be used to gain access to specialised clusters that offer more memory. This information is often essential to have an application considered. Just stating that the calculations need a lot of memory is often not enough. Usually specific details and proof of memory usage are needed.


  3. Calculations are too slow and the user wants them to run faster. This is a case where HPC might struggle to help. In a lot of cases, the CPU of a new desktop/laptop will be faster than the CPU in a HPC cluster. Speed ups can be achieved via HPC if a lot of calculations can be run at the same time instead of one after the other. Other options include looking at different software that makes better use of HPC's advantages. This includes being able to use more than a single core at a time, or using MPI (Message Passing Interface) to spread the calculations across multiple nodes in a cluster. Some investigation by the user into HPC-suitable software, or some work to modify their existing code to use OpenMP, MPI or GPUs, might be needed.

     
  4. The calculations use MPI. Users need to look for a cluster that has a fast internode network. Clusters with a gigabit Ethernet network will work but calculations will pay a large penalty in run times due to the slow network. MPI calculations usually do not need a lot of memory per core or large disk quotas. This is where users might have to look beyond local clusters due to the network requirements.
     
    If the calculations use only a few nodes (fewer than 100 cores) then small and medium clusters should still be able to give reasonable throughput. However, larger scale MPI calculations might be hard to schedule. Here it is important to use a cluster that actually favours MPI calculations. Trying to run large MPI calculations on a small-ish cluster, where the number of cores you are asking for is a large percentage of the total core count, will be frustrating (for example, asking for 256 cores on a 1,000 core cluster is not a good idea). So if you need a lot of cores, check how many are actually available in the cluster and, if necessary, look for and apply for access to larger MPI clusters.

     
  5. Calculations use large temporary files or a large number of them. Temporary files are created during the calculation, are only needed while it runs, and will be deleted by the software once the calculation is done. Here, the calculations need to use what is called ‘local scratch space’. In modern clusters, user data will sit on a shared file system that can be seen by all nodes via the network. However, each node usually also has its own disk with space available. This is used as local scratch space and should be used for temporary files. The reason for this is that local scratch space often does not count towards a user’s disk space quota, and temporary files are faster to write there as they do not have to be written over the network to the shared file system. If local scratch space is needed, then the user needs to check the cluster’s webpages to see how large the local disk on each node is, to make sure it can accommodate their calculations.
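
The local scratch pattern from case 5 can be sketched roughly as follows. This is a minimal illustration, not a recipe for any particular cluster: it assumes the node-local scratch location is exposed through the TMPDIR environment variable (which Python's tempfile module honours), and the function names and file names are hypothetical. Check the cluster's documentation for the actual scratch location.

```python
import shutil
import tempfile
from pathlib import Path


def free_space_gb(path):
    """Return the free space, in GB, on the disk that holds the given path."""
    return shutil.disk_usage(path).free / 1024**3


def run_with_local_scratch(result_dir):
    """Write temporary files to local scratch and keep only the final result."""
    # Assumption: TMPDIR points at node-local scratch; gettempdir() reads it
    # and otherwise falls back to the system default temporary directory.
    scratch_root = tempfile.gettempdir()
    with tempfile.TemporaryDirectory(dir=scratch_root) as scratch:
        scratch_path = Path(scratch)
        # Stand-in for the temporary files a real calculation would create.
        temp_file = scratch_path / "intermediate.dat"
        temp_file.write_bytes(b"\0" * 1024)
        # Stand-in for the final result of the calculation.
        result_file = scratch_path / "result.txt"
        result_file.write_text(f"processed {temp_file.stat().st_size} bytes\n")
        # Only the final result is copied over to the shared file system.
        result_dir = Path(result_dir)
        result_dir.mkdir(parents=True, exist_ok=True)
        final = result_dir / result_file.name
        shutil.copy2(result_file, final)
    # Leaving the with-block deletes the scratch directory and its contents.
    return final


if __name__ == "__main__":
    print(f"free scratch space: {free_space_gb(tempfile.gettempdir()):.1f} GB")
    print(run_with_local_scratch("results"))
```

Only the small result file ends up under the user's quota on the shared file system; the temporary files never leave the node.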

In most cases, the resources the calculations need, or the reasons why a user is considering HPC, are a mixture of those above. What is important is that users have some idea about the resources needed and, if not, that they use local servers or clusters to find out.
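
Two of the rules of thumb above (running many calculations at the same time, and not asking for a large share of a small cluster's cores) come down to simple arithmetic, sketched below. The function names and the figures of 200 jobs at 3 hours each are hypothetical; the 256 cores on a 1,000 core cluster is the example from case 4. The estimate ignores queue wait times, so real turnaround will be longer.

```python
import math


def batch_wall_hours(n_jobs, hours_per_job, concurrent_jobs):
    """Estimate total wall time when jobs run in equal-sized waves."""
    waves = math.ceil(n_jobs / concurrent_jobs)
    return waves * hours_per_job


def core_share(requested_cores, cluster_cores):
    """Return the fraction of a cluster's cores that a job asks for."""
    return requested_cores / cluster_cores


if __name__ == "__main__":
    # Hypothetical workload: 200 independent calculations of 3 hours each.
    print(batch_wall_hours(200, 3, 1))     # one after the other: 600 hours
    print(batch_wall_hours(200, 3, 50))    # 50 at a time: 12 hours
    # The example from case 4: 256 cores on a 1,000 core cluster.
    print(f"{core_share(256, 1000):.1%}")  # 25.6%
```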
 
HPC in general is specialised computing and very different to running calculations on a desktop. Gaining experience and confidence in using server or HPC cluster-style computing is highly advisable before considering non-local, national or specialised HPC clusters. Talk to your local eResearch support staff to discuss your options further.