From Compact Plasma Particle Sources to Advanced Accelerators with Modeling at Exascale

Authors: Axel Huebl, Remi Lehe, Edoardo Zoni, Olga Shapoval, Ryan T. Sandberg, Marco Garten, Arianna Formenti, Revathi Jambunathan, Prabhat Kumar, Kevin Gott, Andrew Myers, Weiqun Zhang, Ann Almgren, Chad E. Mitchell, Ji Qiang, David Grote, Alexander Sinn, Severin Diederichs, Maxence Thevenet, Luca Fedeli, Thomas Clark, Neil Zaim, Henri Vincenti, Jean-Luc Vay

Abstract: Developing complex, reliable advanced accelerators requires a coordinated, extensible, and comprehensive approach in modeling, from source to the end of beam lifetime. We present highlights in Exascale Computing to scale accelerator modeling software to the requirements set for contemporary science drivers. In particular, we present the first laser-plasma modeling on an exaflop supercomputer using the US DOE Exascale Computing Project WarpX. Leveraging developments for Exascale, the new DOE SCIDAC-5 Consortium for Advanced Modeling of Particle Accelerators (CAMPA) will advance numerical algorithms and accelerate community modeling codes in a cohesive manner: from beam source, over energy boost, transport, injection, storage, to application or interaction. Such start-to-end modeling will enable the exploration of hybrid accelerators, with conventional and advanced elements, as the next step for advanced accelerator modeling. Following open community standards, we seed an open ecosystem of codes that can be readily combined with each other and machine learning frameworks. These will cover ultrafast to ultraprecise modeling for future hybrid accelerator design, even enabling virtual test stands and twins of accelerators that can be used in operations.

Submitted 18 April, 2023; v1 submitted 22 March, 2023; originally announced March 2023.

Comments: 4 pages, 3 figures, presented at the 20th Advanced Accelerator Concepts Workshop (AAC22)

arXiv:2104.11385 [pdf, other]

doi 10.1145/3468267.3470614

In-Situ Assessment of Device-Side Compute Work for Dynamic Load Balancing in a GPU-Accelerated PIC Code

Authors: Michael E. Rowan, Axel Huebl, Kevin N. Gott, Jack Deslippe, Maxence Thévenet, Remi Lehe, Jean-Luc Vay

Abstract: Maintaining computational load balance is important to the performant behavior of codes which operate under a distributed computing model. This is especially true for GPU architectures, which can suffer from memory oversubscription if improperly load balanced. We present enhancements to traditional load balancing approaches and explicitly target GPU architectures, exploring the resulting performance. A key component of our enhancements is the introduction of several GPU-amenable strategies for assessing compute work. These strategies are implemented and benchmarked to find the most optimal data collection methodology for in-situ assessment of GPU compute work. For the fully kinetic particle-in-cell code WarpX, which supports MPI+CUDA parallelism, we investigate the performance of the improved dynamic load balancing via a strong scaling-based performance model and show that, for a laser-ion acceleration test problem run with up to 6144 GPUs on Summit, the enhanced dynamic load balancing achieves from 62%–74% (88% when running on 6 GPUs) of the theoretically predicted maximum speedup; for the 96-GPU case, we find that dynamic load balancing improves performance relative to baselines without load balancing (3.8x speedup) and with static load balancing (1.2x speedup). Our results provide important insights into dynamic load balancing and performance assessment, and are particularly relevant in the context of distributed memory applications ran on GPUs.

Submitted 22 April, 2021; originally announced April 2021.

Comments: 11 pages, 8 figures. Paper accepted in the Platform for Advanced Scientific Computing Conference (PASC '21), July 5 to 9, 2021, Geneva, Switzerland

Journal ref: PASC 2021: Proceedings of the Platform for Advanced Scientific Computing Conference

arXiv:2101.12149 [pdf, other]

doi 10.1016/j.parco.2021.102833

Porting WarpX to GPU-accelerated platforms

Authors: A. Myers, A. Almgren, L. D. Amorim, J. Bell, L. Fedeli, L. Ge, K. Gott, D. P. Grote, M. Hogan, A. Huebl, R. Jambunathan, R. Lehe, C. Ng, M. Rowan, O. Shapoval, M. Thévenet, J. -L. Vay, H. Vincenti, E. Yang, N. Zaïm, W. Zhang, Y. Zhao, E. Zoni

Abstract: WarpX is a general purpose electromagnetic particle-in-cell code that was originally designed to run on many-core CPU architectures. We describe the strategy followed to allow WarpX to use the GPU-accelerated nodes on OLCF's Summit supercomputer, a strategy we believe will extend to the upcoming machines Frontier and Aurora. We summarize the challenges encountered, lessons learned, and give current performance results on a series of relevant benchmark problems.

Submitted 2 September, 2021; v1 submitted 28 January, 2021; originally announced January 2021.

Comments: 11 pages, 5 figures, accepted by Parallel Computing. Minor revisions, results unchanged

Journal ref: Parallel Computing, Volume 108, 2021, 102833

arXiv:2009.12009 [pdf, other]

AMReX: Block-Structured Adaptive Mesh Refinement for Multiphysics Applications

Authors: Weiqun Zhang, Andrew Myers, Kevin Gott, Ann Almgren, John Bell

Abstract: Block-structured adaptive mesh refinement (AMR) provides the basis for the temporal and spatial discretization strategy for a number of ECP applications in the areas of accelerator design, additive manufacturing, astrophysics, combustion, cosmology, multiphase flow, and wind plant modelling. AMReX is a software framework that provides a unified infrastructure with the functionality needed for these and other AMR applications to be able to effectively and efficiently utilize machines from laptops to exascale architectures. AMR reduces the computational cost and memory footprint compared to a uniform mesh while preserving accurate descriptions of different physical processes in complex multi-physics algorithms. AMReX supports algorithms that solve systems of partial differential equations (PDEs) in simple or complex geometries, and those that use particles and/or particle-mesh operations to represent component physical processes. In this paper, we will discuss the core elements of the AMReX framework such as data containers and iterators as well as several specialized operations to meet the needs of the application projects. In addition we will highlight the strategy that the AMReX team is pursuing to achieve highly performant code across a range of accelerator-based architectures for a variety of different applications.

Submitted 24 September, 2020; originally announced September 2020.

Comments: 16 pages, 9 figures, submitted to IJHPCA

arXiv:2007.05218 [pdf, other]

Preparing Nuclear Astrophysics for Exascale

Authors: Max P. Katz, Ann Almgren, Maria Barrios Sazo, Kiran Eiden, Kevin Gott, Alice Harpole, Jean M. Sexton, Don E. Willcox, Weiqun Zhang, Michael Zingale

Abstract: Astrophysical explosions such as supernovae are fascinating events that require sophisticated algorithms and substantial computational power to model. Castro and MAESTROeX are nuclear astrophysics codes that simulate thermonuclear fusion in the context of supernovae and X-ray bursts. Examining these nuclear burning processes using high resolution simulations is critical for understanding how these astrophysical explosions occur. In this paper we describe the changes that have been made to these codes to transform them from standard MPI + OpenMP codes targeted at petascale CPU-based systems into a form compatible with the pre-exascale systems now online and the exascale systems coming soon. We then discuss what new science is possible to run on systems such as Summit and Perlmutter that could not have been achieved on the previous generation of supercomputers.

Submitted 10 July, 2020; originally announced July 2020.

Comments: Accepted for publication at SC20

Showing 1–5 of 5 results for author: Gott, K