Model for HPC Hardware Support

The High Performance Computing Core Facility (HPCCF) seeks to provide the best possible support for HPC on campus. To ensure that we can continue to support campus efficiently, we are pleased to announce the creation of a new campus-wide cluster, Hive. A centrally managed cluster with standardized hardware and connectivity and a defined support life cycle gives a larger number of users on campus access to HPC while maintaining support for college-level needs. Moving forward, HPCCF offers two tiers of support, along with incentives to merge existing hardware with the new cluster when possible. Below, we describe our model and its two tiers of support: a campus-funded Priority Tier for hardware on Hive, and a PI-supported Maintenance Tier for hardware that is neither on Hive nor under a maintenance contract. Details about HPCCF Priority (Tier 1) and Maintenance (Tier 2) support can be found at https://hpc.ucdavis.edu/supported-services

Existing HPC Hardware

We will merge HPC hardware less than 5 years old with the Hive cluster. Clients will receive Priority support, as described above, until the hardware reaches 5 years of age. Clients who are part of existing supported clusters may opt out of merging their hardware with Hive; they will be offered Priority support at no cost while the hardware remains under a maintenance contract.

Existing HPC hardware that is no longer under a vendor maintenance contract and is less than 7 years old will not be merged with Hive. Clients can opt to maintain this hardware by paying rack fees and purchasing Maintenance support from HPCCF at an hourly rate for up to two additional years.

Existing hardware older than 7 years will not be supported. Hardware in racks maintained by HPCCF will be decommissioned. Clients will be entirely responsible for paying rack fees should they choose to maintain old hardware in other racks. Should the client wish to retain possession of decommissioned hardware outside of HPCCF, they may do so at their own expense and with the approval of their Dean’s office and college IT.
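The rules above for existing hardware can be read as a small decision procedure. The following Python sketch is illustrative only and is not an HPCCF tool. The function name, parameter names, and return labels are hypothetical. The text does not say how hardware at exactly 5 or exactly 7 years is treated, nor how hardware under 5 years old without a vendor contract is handled, so the choices made for those cases here are assumptions.

```python
def existing_hardware_support(age_years, under_vendor_contract, opted_out=False):
    """Return an illustrative support label for existing HPC hardware.

    Hypothetical helper; labels and boundary handling are assumptions.
    """
    # Hardware older than 7 years is not supported.
    if age_years > 7:
        return "none"
    # Hardware under 5 years old is merged with Hive unless the client opts out.
    # Assumption: contract status does not block the merge.
    if age_years < 5 and not opted_out:
        return "priority_on_hive"
    # Clients who opt out keep Priority support while the vendor
    # maintenance contract is active.
    if opted_out and under_vendor_contract:
        return "priority_off_hive"
    # Everything else up to 7 years may be kept on paid Maintenance support
    # (hourly rate plus rack fees, for up to two additional years).
    return "maintenance"


if __name__ == "__main__":
    for case in [(3, True, False), (3, True, True), (6, False, False), (8, False, False)]:
        print(case, existing_hardware_support(*case))
```

Maintenance support remains optional for the client; the sketch only reports which tier is available, not whether the client chooses to purchase it.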

New Purchases

Starting January 2025, new purchases of HPC hardware will only be supported if approved by the High Performance Computing Core Facility (HPCCF). Most new purchases are expected to become part of the university cluster, Hive. Hive will function under the “condo” model, where a client (such as a PI, or a unit such as a college or center) can purchase any number of cores for 5 years. During those 5 years, HPCCF will maintain the hardware and provide Priority support for users in the client’s group.

After 5 years, HPCCF will no longer support the hardware, and the client and users in the client’s group will lose access to it. At the discretion of HPCCF, the client can maintain the hardware with maintenance support at the established labor rate for up to two additional years. The client will be expected to pay for rack fees and HPCCF-approved supplies necessary for physical maintenance. See here for current rates and fees.

After 7 years, HPCCF will not offer any hardware support. 

Hardware that is not supported will be decommissioned and removed from HPCCF racks at the discretion of HPCCF. Should the client wish to retain possession of their hardware outside of HPCCF, they may do so at their own expense and with the approval of their Dean’s office and college IT.

Buyout Proposal

In coordination with the START Research Computing task force, HPCCF has prepared a proposal to campus to buy out some portion of existing hardware older than 5 years and replace it with new hardware on Hive under the model above. Funds from this proposal would be managed by each unit’s IT in conjunction with HPCCF.

|  | New Hardware through HPCCF | Existing Hardware <5 years old | Existing Hardware >5 years old | Existing Hardware >7 years old |
|---|---|---|---|---|
| HPCCF | Priority support for 5 years on Hive | Merge with Hive; Priority support up until 5 years | Maintenance support | No support |
| Outside HPCCF | No support | Priority support for the life of the maintenance contract |  |  |