What is HPC?

What is High Performance Computing?

High performance computing (HPC) is the practice of aggregating computing power in a way that delivers much higher performance than one could get out of a typical desktop computer or workstation in order to solve large problems in science, engineering, or business. High performance computing is the foundation for many scientific, industrial, and societal advancements.

As technologies like the Internet of Things (IoT), artificial intelligence (AI), and 3-D imaging evolve, the size and amount of data that organizations work with are growing exponentially. For many purposes, such as streaming a live sporting event, tracking a developing storm, testing new products, or analyzing stock trends, the ability to process data in real time is crucial.

To keep a step ahead of the competition, organizations need lightning-fast processing power to analyze and store massive amounts of data.

HPC solutions have three main components:

1 - Compute
2 - Network
3 - Storage

To build a high-performance computing architecture, compute servers, commonly referred to as 'nodes', are networked together into what is commonly referred to as a 'cluster'. Software programs and algorithms run simultaneously on the nodes in the cluster; because far more resources are available to the software, it can work on data at a much greater scale than it could on a single computer. The cluster is networked to data storage to capture the output. Together, these components operate seamlessly to complete a diverse set of tasks.

A cluster, in other words, is a group of interconnected computers that work together to perform computationally intensive tasks.
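To make the idea of programs running simultaneously across cluster nodes concrete, here is a minimal sketch, not drawn from HPCCF documentation, that assumes mpi4py and an MPI library are available on the cluster. Each process sums its own slice of a large range of numbers, and one process combines the partial results.

    # parallel_sum.py -- minimal sketch of splitting work across cluster processes.
    # Assumes mpi4py and an MPI implementation are installed (an assumption, not a
    # statement about any particular cluster's configuration).
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()   # this process's ID (0 .. size-1)
    size = comm.Get_size()   # total number of processes, possibly spanning nodes

    # Each process sums every size-th number starting at its own rank.
    n = 1_000_000
    local_sum = sum(range(rank, n, size))

    # Combine the partial sums on process 0.
    total = comm.reduce(local_sum, op=MPI.SUM, root=0)

    if rank == 0:
        print(f"Sum of 0..{n - 1} computed by {size} processes: {total}")

Launched with a command such as mpirun -n 4 python parallel_sum.py (or through the cluster's scheduler), the same script runs as several cooperating processes and produces a single result.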

You can read more about the different clusters that the High Performance Computing Core Facility (HPCCF) supports at: hpc.ucdavis.edu/clusters

 

How do I contact HPC?

To help us provide better and more consistent support, per department policy, requests emailed directly to team members will not be addressed. Such requests must be submitted through the ticketing system. See below for information. 

There are various support options for HPC users:

  • Any general service inquiries and all HPC support questions and concerns related to compute and storage infrastructure, user account support, login help, software installs, and other technical issues can be submitted to the HPC-CF Helpdesk at hpc-help@ucdavis.edu
  • Farm cluster cases can also be directed to farm-hpc@ucdavis.edu
  • The HPCCF FAQ page and the campus Service Hub Knowledge Base provide answers to many frequently asked questions, including hardware specs, pricing, availability, software options, and how to get started. Additional information can be found on the HPCCF Help Documents page.
  • The #hpc channel on the UC Davis Slack can help you connect with other HPC users at UC Davis and join the latest discussions. Please visit https://ucdavis.slack.com/ and log in with your @ucdavis.edu email address to get started. Please note that this is not an official support channel.
  • Ask.Cyberinfrastructure (ask.ci), an external cross-university HPC user forum, offers questions and answers for researchers, facilitators, sysadmins, and others who do research computing.

 

Glossary

 

  • Cluster - many connected nodes that coordinate with one another to handle massive amounts of data at high speed with parallel performance.
  • Node - a single computer or server. It can be a head node/login node or a regular compute node.
  • Core (processor) - a single unit within a node that executes a single chain of instructions.
  • Headnode - also called a login node; the node that the user logs in to.
  • Slurm job - a scheduled process submitted by the user that is allocated, managed, and monitored by the Slurm workload manager (a small example follows this list).
  • SSH key pair - a pair of cryptographic keys, one private and one public, used to authenticate the user and secure communication between user and server (a key-generation example also follows this list).
  • CLI - Command Line Interface, an application that lets you interact with the machine by typing commands to accomplish a task.
  • X11 - a windowing system that provides graphical user interfaces on Unix machines. It is sometimes required to run GUI software/modules on HPC clusters/servers.
  • Module - software installed on a cluster and made available to users through the cluster's module system.
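
As a companion to the Slurm job entry above, here is a minimal sketch, not drawn from HPCCF documentation, of a Python script running inside a Slurm job. It only reads a few of the standard SLURM_* environment variables that Slurm sets for every job; outside a job those variables are simply absent.

    # job_info.py -- minimal sketch: inspect the environment Slurm gives a job.
    # Intended to run inside a Slurm job (e.g. submitted with sbatch or started
    # with srun); the SLURM_* variables below are standard Slurm names.
    import os

    job_id = os.environ.get("SLURM_JOB_ID", "not inside a Slurm job")
    ntasks = os.environ.get("SLURM_NTASKS", "unknown")
    nodes = os.environ.get("SLURM_JOB_NODELIST", "unknown")

    print(f"Job ID:    {job_id}")
    print(f"Tasks:     {ntasks}")
    print(f"Node list: {nodes}")

In practice such a script is submitted through a batch script with sbatch, which is also where resource requests (nodes, tasks, time, memory) are declared.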
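
For the SSH key pair entry, the following sketch simply drives the standard ssh-keygen tool from Python; the file name and comment are placeholders, and running ssh-keygen directly at the command line is just as valid.

    # make_ssh_key.py -- minimal sketch: generate an SSH key pair with ssh-keygen.
    # The private key stays on your machine; the public key (the .pub file) is the
    # part you provide to the cluster so it can recognize you.
    import subprocess
    from pathlib import Path

    key_path = Path.home() / ".ssh" / "id_ed25519_hpc"   # hypothetical file name

    # ssh-keygen will prompt for an optional passphrase for the private key.
    subprocess.run(
        ["ssh-keygen",
         "-t", "ed25519",        # key type
         "-f", str(key_path),    # where to write the key files
         "-C", "my-hpc-key"],    # free-form comment attached to the public key
        check=True,
    )

    print(f"Private key: {key_path}")
    print(f"Public key:  {key_path}.pub")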