HEPCloud: a new paradigm for particle physics computing

HEPCloud is the next step in the evolution of scientific computing, a scientific gateway to resources beyond local worker nodes and grids, expanding into high performance computing (HPC) centers and the cloud.

Particle physics requires copious computing resources to extract scientific results. Fermilab is pursuing a new paradigm in particle physics computing through HEPCloud.

[Figure: HEPCloud infographic]

How does HEPCloud work?

HEPCloud routes jobs to local or remote computing resources based on the policy for a particular experiment, the workflow's requirements, and the cost and efficiency of accessing the various resources. In addition to local worker nodes and grids, the available resources include HPC centers and commercial clouds. You provide HEPCloud with your resource requirements, and the HEPCloud Decision Engine routes your jobs to the best available resources that meet them.
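The routing idea described above can be illustrated with a small sketch. This is not the HEPCloud Decision Engine or its API; every name here (`Resource`, `Job`, `route`, the policy dictionary, and the cost-per-efficiency score) is a hypothetical stand-in chosen for illustration. It assumes each experiment's policy lists the kinds of resources it may use, and that "best" means the lowest cost per unit of efficiency among resources that satisfy the job's requirements.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class Resource:
    """Hypothetical description of a place where jobs can run."""
    name: str
    kind: str                  # "local", "hpc", or "cloud"
    cores_per_node: int
    memory_gb: float
    cost_per_core_hour: float  # illustrative units
    efficiency: float          # fraction of useful work, 0 < efficiency <= 1


@dataclass(frozen=True)
class Job:
    """Hypothetical job with its resource requirements."""
    experiment: str
    cores: int
    memory_gb: float


def route(job: Job, resources: List[Resource],
          policy: Dict[str, Set[str]]) -> Optional[Resource]:
    """Pick a resource for the job, or None if nothing fits.

    Assumption: experiments missing from the policy (not onboarded)
    may only use local resources.
    """
    allowed_kinds = policy.get(job.experiment, {"local"})
    candidates = [
        r for r in resources
        if r.kind in allowed_kinds
        and r.cores_per_node >= job.cores
        and r.memory_gb >= job.memory_gb
        and r.efficiency > 0
    ]
    if not candidates:
        return None
    # Lower cost per unit of efficiency is treated as "better" here.
    return min(candidates, key=lambda r: r.cost_per_core_hour / r.efficiency)
```

In this sketch, an onboarded experiment could be routed to an HPC or cloud resource when that scores better, while an experiment absent from the policy only ever matches local resources.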

Who can use HEPCloud?

All Fermilab experiments will submit jobs to HEPCloud behind the scenes. However, to use the extended resources available via HEPCloud, an experiment must be enabled to run on these expanded resources. This process is known as “onboarding.” In the initial phase after rollout, only production jobs will be able to access these expanded resources.

How do I get onboarded?

Experiments that wish to be onboarded to use HEPCloud must make HPC and commercial resource requests to the Scientific Project Portfolio Management (SPPM) committee in the Fermilab Scientific Computing Division. The HEPCloud team will help you determine whether your experiment’s workflows are compatible with these resources.

Requests to increase the allotted resources must also be made to the SPPM, either by submitting a Service Desk request or by contacting your experiment liaison.

How will I run jobs if my experiment is not onboarded?

Jobs will continue to run as they did before HEPCloud. You do not need to do anything differently. Technically, you will be submitting jobs through HEPCloud; you just won’t have access to the expanded HPC or cloud resources. You’ll continue to use JobSub. Your jobs will still run where they do today, and data transfer mechanisms will remain the same.

As a user who is part of an onboarded experiment, what will I have to do differently?

The HEPCloud onboarding document includes specifications for running jobs.

In addition, you will be required to closely monitor your jobs, particularly the data output and CPU utilization. Exceeding your experiment’s allotted usage may have financial consequences. HEPCloud reserves the right to turn off access to commercial or HPC resources.
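As a rough illustration of the kind of monitoring described above, the sketch below computes CPU utilization for a job and compares usage against an allotment. It is not a HEPCloud tool: the function names, the core-hour unit, and the 80% warning threshold are assumptions made for this example, and real accounting figures would come from your experiment's own monitoring.

```python
def cpu_efficiency(cpu_seconds: float, wall_seconds: float, cores: int = 1) -> float:
    """Fraction of the reserved CPU time that the job actually used."""
    if wall_seconds <= 0 or cores <= 0:
        raise ValueError("wall_seconds and cores must be positive")
    return cpu_seconds / (wall_seconds * cores)


def check_usage(used_core_hours: float, allotted_core_hours: float,
                warn_at: float = 0.8) -> str:
    """Classify usage against an allotment (hypothetical thresholds)."""
    if allotted_core_hours <= 0:
        raise ValueError("allotted_core_hours must be positive")
    if used_core_hours > allotted_core_hours:
        return "exceeded"
    if used_core_hours >= warn_at * allotted_core_hours:
        return "warning"
    return "ok"
```

A job that used one CPU-hour over two wall-clock hours on a single core has an efficiency of 0.5 under this definition, which would usually be worth investigating before running at scale on paid resources.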

Who pays for HEPCloud?

Funding is discussed on a case-by-case basis with the HEPCloud team.

How do I get user support?

Open a Service Desk ticket as you do today.

What are the plans for the future of HEPCloud?

The HEPCloud team will continue to work with HPC centers, commercial cloud providers and funding agencies to maximize the resources available to experiments of all sizes.