-
Calibrating Bayesian Generative Machine Learning for Bayesiamplification
Authors:
Sebastian Bieringer,
Sascha Diefenbacher,
Gregor Kasieczka,
Mathias Trabs
Abstract:
Recently, combinations of generative and Bayesian machine learning have been introduced in particle physics for both fast detector simulation and inference tasks. These neural networks aim to quantify the uncertainty on the generated distribution originating from limited training statistics. The interpretation of a distribution-wide uncertainty, however, remains ill-defined. We show a clear scheme for quantifying the calibration of Bayesian generative machine learning models. For a Continuous Normalizing Flow applied to a low-dimensional toy example, we evaluate the calibration of Bayesian uncertainties from either a mean-field Gaussian weight posterior, or Monte Carlo sampling network weights, to gauge their behaviour on unsteady distribution edges. Well-calibrated uncertainties can then be used to roughly estimate the number of uncorrelated truth samples that are equivalent to the generated sample and clearly indicate data amplification for smooth features of the distribution.
Submitted 1 August, 2024;
originally announced August 2024.
-
Universal New Physics Latent Space
Authors:
Anna Hallin,
Gregor Kasieczka,
Sabine Kraml,
André Lessa,
Louis Moureaux,
Tore von Schwartz,
David Shih
Abstract:
We develop a machine learning method for mapping data originating from both Standard Model processes and various theories beyond the Standard Model into a unified representation (latent) space while conserving information about the relationship between the underlying theories. We apply our method to three examples of new physics at the LHC of increasing complexity, showing that models can be clustered according to their LHC phenomenology: different models are mapped to distinct regions in latent space, while indistinguishable models are mapped to the same region. This opens interesting new avenues on several fronts, such as model discrimination, selection of representative benchmark scenarios, and identifying gaps in the coverage of model space.
Submitted 29 July, 2024;
originally announced July 2024.
-
Shower Separation in Five Dimensions for Highly Granular Calorimeters using Machine Learning
Authors:
S. Lai,
J. Utehs,
A. Wilhahn,
M. C. Fouz,
O. Bach,
E. Brianne,
A. Ebrahimi,
K. Gadow,
P. Göttlicher,
O. Hartbrich,
D. Heuchel,
A. Irles,
K. Krüger,
J. Kvasnicka,
S. Lu,
C. Neubüser,
A. Provenza,
M. Reinecke,
F. Sefkow,
S. Schuwalow,
M. De Silva,
Y. Sudo,
H. L. Tran,
L. Liu,
R. Masuda,
et al. (26 additional authors not shown)
Abstract:
To achieve state-of-the-art jet energy resolution for Particle Flow, sophisticated energy clustering algorithms must be developed that can fully exploit available information to separate energy deposits from charged and neutral particles. Three published neural network-based shower separation models were applied to simulation and experimental data to measure the performance of the highly granular CALICE Analogue Hadronic Calorimeter (AHCAL) technological prototype in distinguishing the energy deposited by a single charged and single neutral hadron for Particle Flow. The performance of models trained using only standard spatial and energy and charged track position information from an event was compared to models trained using timing information available from AHCAL, which is expected to improve sensitivity to shower development and, therefore, aid in clustering. Both simulation and experimental data were used to train and test the models and their performances were compared. The best-performing neural network achieved significantly superior event reconstruction when timing information was utilised in training for the case where the charged hadron had more energy than the neutral one, motivating temporally sensitive calorimeters. All models under test were observed to tend to allocate energy deposited by the more energetic of the two showers to the less energetic one. Similar shower reconstruction performance was observed for a model trained on simulation and applied to data and a model trained and applied to data.
Submitted 28 June, 2024;
originally announced July 2024.
-
Convolutional L2LFlows: Generating Accurate Showers in Highly Granular Calorimeters Using Convolutional Normalizing Flows
Authors:
Thorsten Buss,
Frank Gaede,
Gregor Kasieczka,
Claudius Krause,
David Shih
Abstract:
In the quest to build generative surrogate models as computationally efficient alternatives to rule-based simulations, the quality of the generated samples remains a crucial frontier. So far, normalizing flows have been among the models with the best fidelity. However, as the latent space in such models is required to have the same dimensionality as the data space, scaling up normalizing flows to high dimensional datasets is not straightforward. The prior L2LFlows approach successfully used a series of separate normalizing flows and sequence of conditioning steps to circumvent this problem. In this work, we extend L2LFlows to simulate showers with a 9-times larger profile in the lateral direction. To achieve this, we introduce convolutional layers and U-Net-type connections, move from masked autoregressive flows to coupling layers, and demonstrate the successful modelling of showers in the ILD Electromagnetic Calorimeter as well as Dataset 3 from the public CaloChallenge dataset.
Submitted 3 June, 2024; v1 submitted 30 May, 2024;
originally announced May 2024.
-
Accelerating Resonance Searches via Signature-Oriented Pre-training
Authors:
Congqiao Li,
Antonios Agapitos,
Jovin Drews,
Javier Duarte,
Dawei Fu,
Leyun Gao,
Raghav Kansal,
Gregor Kasieczka,
Louis Moureaux,
Huilin Qu,
Cristina Mantilla Suarez,
Qiang Li
Abstract:
The search for heavy resonances beyond the Standard Model (BSM) is a key objective at the LHC. While the recent use of advanced deep neural networks for boosted-jet tagging significantly enhances the sensitivity of dedicated searches, it is limited to specific final states, leaving vast potential BSM phase space underexplored. We introduce a novel experimental method, Signature-Oriented Pre-training for Heavy-resonance ObservatioN (Sophon), which leverages deep learning to cover an extensive number of boosted final states. Pre-trained on the comprehensive JetClass-II dataset, the Sophon model learns intricate jet signatures, ensuring the optimal construction of various jet tagging discriminants and enabling high-performance transfer learning capabilities. We show that the method can not only push widespread model-specific searches to their sensitivity frontier, but also greatly improve model-agnostic approaches, accelerating LHC resonance searches in a broad sense.
Submitted 21 May, 2024;
originally announced May 2024.
-
Complete Optimal Non-Resonant Anomaly Detection
Authors:
Gregor Kasieczka,
John Andrew Raine,
David Shih,
Aman Upadhyay
Abstract:
We propose the first-ever complete, model-agnostic search strategy based on the optimal anomaly score, for new physics on the tails of distributions. Signal sensitivity is achieved via a classifier trained on auxiliary features in a weakly-supervised fashion, and backgrounds are predicted using the ABCD method in the classifier output and the primary tail feature. The independence between the classifier output and the tail feature required for ABCD is achieved by first training a conditional normalizing flow that yields a decorrelated version of the auxiliary features; the classifier is then trained on these features. Both the signal sensitivity and background prediction require a sample of events accurately approximating the SM background; we assume this can be furnished by closely related control processes in the data or by accurate simulations, as is the case in countless conventional analyses. The viability of our approach is demonstrated for signatures consisting of (mono)jets and missing transverse energy, where the main SM background is $Z(νν) +\text{jets}$, and the data-driven control process is $γ+\text{jets}$.
Submitted 10 April, 2024;
originally announced April 2024.
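For readers unfamiliar with the ABCD background estimate mentioned in the abstract above, the following is a minimal Python sketch of the general idea. It assumes two selections that are independent for background, splitting events into a signal region A and control regions B, C and D. The function name `abcd_estimate` and the example counts are hypothetical and are not taken from the paper.

```python
def abcd_estimate(n_b: float, n_c: float, n_d: float) -> float:
    """Predict the background yield in signal region A from control regions.

    Assumes the two selection variables are independent for background,
    so that N_A / N_B = N_C / N_D, i.e. N_A = N_B * N_C / N_D.
    """
    if n_d <= 0:
        raise ValueError("region D must contain events")
    return n_b * n_c / n_d


# Hypothetical counts, chosen only for illustration.
predicted_a = abcd_estimate(n_b=200.0, n_c=50.0, n_d=1000.0)
print(predicted_a)  # 10.0
```

The estimate is only as good as the independence assumption, which is why the abstract stresses decorrelating the classifier output from the tail feature.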
-
OmniJet-$α$: The first cross-task foundation model for particle physics
Authors:
Joschka Birk,
Anna Hallin,
Gregor Kasieczka
Abstract:
Foundation models are multi-dataset and multi-task machine learning methods that once pre-trained can be fine-tuned for a large variety of downstream applications. The successful development of such general-purpose models for physics data would be a major breakthrough as they could improve the achievable physics performance while at the same time drastically reduce the required amount of training time and data.
We report significant progress on this challenge on several fronts. First, a comprehensive set of evaluation methods is introduced to judge the quality of an encoding from physics data into a representation suitable for the autoregressive generation of particle jets with transformer architectures (the common backbone of foundation models). These measures motivate the choice of a higher-fidelity tokenization compared to previous works. Finally, we demonstrate transfer learning between an unsupervised problem (jet generation) and a classic supervised task (jet tagging) with our new OmniJet-$α$ model. This is the first successful transfer between two different and actively studied classes of tasks and constitutes a major step in the building of foundation models for particle physics.
Submitted 8 March, 2024;
originally announced March 2024.
-
Software Compensation for Highly Granular Calorimeters using Machine Learning
Authors:
S. Lai,
J. Utehs,
A. Wilhahn,
O. Bach,
E. Brianne,
A. Ebrahimi,
K. Gadow,
P. Göttlicher,
O. Hartbrich,
D. Heuchel,
A. Irles,
K. Krüger,
J. Kvasnicka,
S. Lu,
C. Neubüser,
A. Provenza,
M. Reinecke,
F. Sefkow,
S. Schuwalow,
M. De Silva,
Y. Sudo,
H. L. Tran,
E. Buhmann,
E. Garutti,
S. Huck,
et al. (39 additional authors not shown)
Abstract:
A neural network for software compensation was developed for the highly granular CALICE Analogue Hadronic Calorimeter (AHCAL). The neural network uses spatial and temporal event information from the AHCAL and energy information, which is expected to improve sensitivity to shower development and the neutron fraction of the hadron shower. The neural network method produced a depth-dependent energy weighting and a time-dependent threshold for enhancing energy deposits consistent with the timescale of evaporation neutrons. Additionally, it was observed to learn an energy-weighting indicative of longitudinal leakage correction. In addition, the method produced a linear detector response and outperformed a published control method regarding resolution for every particle energy studied.
Submitted 7 March, 2024;
originally announced March 2024.
-
Classifier Surrogates: Sharing AI-based Searches with the World
Authors:
Sebastian Bieringer,
Gregor Kasieczka,
Jan Kieseler,
Mathias Trabs
Abstract:
In recent years, neural network-based classification has been used to improve data analysis at collider experiments. While this strategy proves to be hugely successful, the underlying models are not commonly shared with the public and rely on experiment-internal data as well as full detector simulations. We show a concrete implementation of a newly proposed strategy, so-called Classifier Surrogates, to be trained inside the experiments, that only utilise publicly accessible features and truth information. These surrogates approximate the original classifier distribution, and can be shared with the public. Subsequently, such a model can be evaluated by sampling the classification output from high-level information without requiring a sophisticated detector simulation. Technically, we show that Continuous Normalizing Flows are a suitable generative architecture that can be efficiently trained to sample classification results using Conditional Flow Matching. We further demonstrate that these models can be easily extended by Bayesian uncertainties to indicate their degree of validity when confronted with unknown inputs by the user. For a concrete example of tagging jets from hadronically decaying top quarks, we demonstrate the application of flows in combination with uncertainty estimation through either inference of a mean-field Gaussian weight posterior, or Monte Carlo sampling network weights.
Submitted 2 July, 2024; v1 submitted 23 February, 2024;
originally announced February 2024.
-
Ultrafast jet classification on FPGAs for the HL-LHC
Authors:
Patrick Odagiu,
Zhiqiang Que,
Javier Duarte,
Johannes Haller,
Gregor Kasieczka,
Artur Lobanov,
Vladimir Loncar,
Wayne Luk,
Jennifer Ngadiuba,
Maurizio Pierini,
Philipp Rincke,
Arpita Seksaria,
Sioni Summers,
Andre Sznajder,
Alexander Tapper,
Thea K. Aarrestad
Abstract:
Three machine learning models are used to perform jet origin classification. These models are optimized for deployment on a field-programmable gate array device. In this context, we demonstrate how latency and resource consumption scale with the input size and choice of algorithm. Moreover, the models proposed here are designed to work on the type of data and under the foreseen conditions at the CERN LHC during its high-luminosity phase. Through quantization-aware training and efficient synthetization for a specific field programmable gate array, we show that $O(100)$ ns inference of complex architectures such as Deep Sets and Interaction Networks is feasible at a relatively low computational resource cost.
Submitted 4 July, 2024; v1 submitted 2 February, 2024;
originally announced February 2024.
-
Les Houches guide to reusable ML models in LHC analyses
Authors:
Jack Y. Araz,
Andy Buckley,
Gregor Kasieczka,
Jan Kieseler,
Sabine Kraml,
Anders Kvellestad,
Andre Lessa,
Tomasz Procter,
Are Raklev,
Humberto Reyes-Gonzalez,
Krzysztof Rolbiecki,
Sezen Sekmen,
Gokhan Unel
Abstract:
With the increasing usage of machine-learning in high-energy physics analyses, the publication of the trained models in a reusable form has become a crucial question for analysis preservation and reuse. The complexity of these models creates practical issues for both reporting them accurately and for ensuring the stability of their behaviours in different environments and over extended timescales. In this note we discuss the current state of affairs, highlighting specific practical issues and focusing on the most promising technical and strategic approaches to ensure trustworthy analysis-preservation. This material originated from discussions in the LHC Reinterpretation Forum and the 2023 PhysTeV workshop at Les Houches.
Submitted 10 January, 2024; v1 submitted 22 December, 2023;
originally announced December 2023.
-
AdamMCMC: Combining Metropolis Adjusted Langevin with Momentum-based Optimization
Authors:
Sebastian Bieringer,
Gregor Kasieczka,
Maximilian F. Steffen,
Mathias Trabs
Abstract:
Uncertainty estimation is a key issue when considering the application of deep neural network methods in science and engineering. In this work, we introduce a novel algorithm that quantifies epistemic uncertainty via Monte Carlo sampling from a tempered posterior distribution. It combines the well established Metropolis Adjusted Langevin Algorithm (MALA) with momentum-based optimization using Adam and leverages a prolate proposal distribution, to efficiently draw from the posterior. We prove that the constructed chain admits the Gibbs posterior as an invariant distribution and converges to this Gibbs posterior in total variation distance. Numerical evaluations are postponed to a first revision.
Submitted 21 December, 2023;
originally announced December 2023.
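As background for the abstract above, here is a minimal Python sketch of a plain Metropolis Adjusted Langevin Algorithm (MALA) step. It is not the AdamMCMC algorithm: it omits the Adam-based momentum, the prolate proposal and the tempered posterior. The one-dimensional standard normal target, the step size and all function names are assumptions made for illustration.

```python
import math
import random


def log_target(x: float) -> float:
    """Unnormalised log-density of a standard normal (toy target)."""
    return -0.5 * x * x


def grad_log_target(x: float) -> float:
    """Gradient of log_target with respect to x."""
    return -x


def mala_step(x: float, step: float, rng: random.Random) -> float:
    """One MALA update: Langevin proposal plus Metropolis-Hastings correction."""

    def proposal_mean(z: float) -> float:
        return z + 0.5 * step * grad_log_target(z)

    def log_q(to: float, frm: float) -> float:
        # Log-density (up to a constant) of proposing `to` when at `frm`.
        return -((to - proposal_mean(frm)) ** 2) / (2.0 * step)

    proposal = proposal_mean(x) + math.sqrt(step) * rng.gauss(0.0, 1.0)
    log_alpha = (
        log_target(proposal) + log_q(x, proposal)
        - log_target(x) - log_q(proposal, x)
    )
    accept_prob = math.exp(min(0.0, log_alpha))
    return proposal if rng.random() < accept_prob else x


def run_chain(n_steps: int, step: float = 0.5, seed: int = 0) -> list[float]:
    """Run a chain from x = 0 and return every visited state."""
    rng = random.Random(seed)
    x, samples = 0.0, []
    for _ in range(n_steps):
        x = mala_step(x, step, rng)
        samples.append(x)
    return samples
```

Calling `run_chain(20000)` should give samples whose mean and variance are close to 0 and 1. The accept/reject correction is what keeps the target distribution invariant despite the discretised Langevin proposal.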
-
Residual ANODE
Authors:
Ranit Das,
Gregor Kasieczka,
David Shih
Abstract:
We present R-ANODE, a new method for data-driven, model-agnostic resonant anomaly detection that raises the bar for both performance and interpretability. The key to R-ANODE is to enhance the inductive bias of the anomaly detection task by fitting a normalizing flow directly to the small and unknown signal component, while holding fixed a background model (also a normalizing flow) learned from sidebands. In doing so, R-ANODE is able to outperform all classifier-based, weakly-supervised approaches, as well as the previous ANODE method which fit a density estimator to all of the data in the signal region instead of just the signal. We show that the method works equally well whether the unknown signal fraction is learned or fixed, and is even robust to signal fraction misspecification. Finally, with the learned signal model we can sample and gain qualitative insights into the underlying anomaly, which greatly enhances the interpretability of resonant anomaly detection and offers the possibility of simultaneously discovering and characterizing the new physics that could be hiding in the data.
Submitted 18 December, 2023;
originally announced December 2023.
-
Flow Matching Beyond Kinematics: Generating Jets with Particle-ID and Trajectory Displacement Information
Authors:
Joschka Birk,
Erik Buhmann,
Cedric Ewen,
Gregor Kasieczka,
David Shih
Abstract:
We introduce the first generative model trained on the JetClass dataset. Our model generates jets at the constituent level, and it is a permutation-equivariant continuous normalizing flow (CNF) trained with the flow matching technique. It is conditioned on the jet type, so that a single model can be used to generate the ten different jet types of JetClass. For the first time, we also introduce a generative model that goes beyond the kinematic features of jet constituents. The JetClass dataset includes more features, such as particle-ID and track impact parameter, and we demonstrate that our CNF can accurately model all of these additional features as well. Our generative model for JetClass expands on the versatility of existing jet generation techniques, enhancing their potential utility in high-energy physics research, and offering a more comprehensive understanding of the generated jets.
Submitted 30 November, 2023;
originally announced December 2023.
-
Statistical guarantees for stochastic Metropolis-Hastings
Authors:
Sebastian Bieringer,
Gregor Kasieczka,
Maximilian F. Steffen,
Mathias Trabs
Abstract:
A Metropolis-Hastings step is widely used for gradient-based Markov chain Monte Carlo methods in uncertainty quantification. By calculating acceptance probabilities on batches, a stochastic Metropolis-Hastings step saves computational costs, but reduces the effective sample size. We show that this obstacle can be avoided by a simple correction term. We study statistical properties of the resulting stationary distribution of the chain if the corrected stochastic Metropolis-Hastings approach is applied to sample from a Gibbs posterior distribution in a nonparametric regression setting. Focusing on deep neural network regression, we prove a PAC-Bayes oracle inequality which yields optimal contraction rates and we analyze the diameter and show high coverage probability of the resulting credible sets. With a numerical example in a high-dimensional parameter space, we illustrate that credible sets and contraction rates of the stochastic Metropolis-Hastings algorithm indeed behave similarly to those obtained from the classical Metropolis-adjusted Langevin algorithm.
Submitted 13 October, 2023;
originally announced October 2023.
-
Full Phase Space Resonant Anomaly Detection
Authors:
Erik Buhmann,
Cedric Ewen,
Gregor Kasieczka,
Vinicius Mikuni,
Benjamin Nachman,
David Shih
Abstract:
Physics beyond the Standard Model that is resonant in one or more dimensions has been a longstanding focus of countless searches at colliders and beyond. Recently, many new strategies for resonant anomaly detection have been developed, where sideband information can be used in conjunction with modern machine learning, in order to generate synthetic datasets representing the Standard Model background. Until now, this approach was only able to accommodate a relatively small number of dimensions, limiting the breadth of the search sensitivity. Using recent innovations in point cloud generative models, we show that this strategy can also be applied to the full phase space, using all relevant particles for the anomaly detection. As a proof of principle, we show that the signal from the R&D dataset from the LHC Olympics is findable with this method, opening up the door to future studies that explore the interplay between depth and breadth in the representation of the data for anomaly detection.
Submitted 9 February, 2024; v1 submitted 10 October, 2023;
originally announced October 2023.
-
EPiC-ly Fast Particle Cloud Generation with Flow-Matching and Diffusion
Authors:
Erik Buhmann,
Cedric Ewen,
Darius A. Faroughy,
Tobias Golling,
Gregor Kasieczka,
Matthew Leigh,
Guillaume Quétant,
John Andrew Raine,
Debajyoti Sengupta,
David Shih
Abstract:
Jets at the LHC, typically consisting of a large number of highly correlated particles, are a fascinating laboratory for deep generative modeling. In this paper, we present two novel methods that generate LHC jets as point clouds efficiently and accurately. We introduce EPiC-JeDi, which combines score-matching diffusion models with the Equivariant Point Cloud (EPiC) architecture based on the deep sets framework. This model offers a much faster alternative to previous transformer-based diffusion models without reducing the quality of the generated jets. In addition, we introduce EPiC-FM, the first permutation equivariant continuous normalizing flow (CNF) for particle cloud generation. This model is trained with flow-matching, a scalable and easy-to-train objective based on optimal transport that directly regresses the vector fields connecting the Gaussian noise prior to the data distribution. Our experiments demonstrate that EPiC-JeDi and EPiC-FM both achieve state-of-the-art performance on the top-quark JetNet datasets whilst maintaining fast generation speed. Most notably, we find that the EPiC-FM model consistently outperforms all the other generative models considered here across every metric. Finally, we also introduce two new particle cloud performance metrics: the first based on the Kullback-Leibler divergence between feature distributions, the second is the negative log-posterior of a multi-model ParticleNet classifier.
Submitted 29 September, 2023;
originally announced October 2023.
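The flow-matching objective described in the abstract above can be sketched in a few lines of Python. This is a one-dimensional toy version, not the paper's permutation-equivariant model. It assumes straight-line paths between a noise sample and a data point (a common choice, not confirmed by the abstract), and the name `flow_matching_loss` and the `velocity` callable are hypothetical.

```python
import random
from typing import Callable, Sequence


def flow_matching_loss(
    velocity: Callable[[float, float], float],
    data: Sequence[float],
    rng: random.Random,
) -> float:
    """Monte Carlo estimate of a flow-matching loss for 1D data.

    `velocity(x, t)` is the model's vector field. For each data point x1 a
    noise sample x0 and a time t are drawn, and the model is compared with
    the velocity of the straight path from x0 to x1.
    """
    total = 0.0
    for x1 in data:
        x0 = rng.gauss(0.0, 1.0)      # sample from the Gaussian noise prior
        t = rng.random()              # time drawn uniformly from [0, 1)
        xt = (1.0 - t) * x0 + t * x1  # point on the straight path at time t
        target = x1 - x0              # constant velocity along that path
        total += (velocity(xt, t) - target) ** 2
    return total / len(data)


# Toy check: if every data point equals c, the field (c - x) / (1 - t)
# reproduces the target velocity exactly, so its loss is essentially zero.
c = 2.0
toy_data = [c] * 1000
loss_exact = flow_matching_loss(
    lambda x, t: (c - x) / (1.0 - t), toy_data, random.Random(1)
)
loss_zero_field = flow_matching_loss(
    lambda x, t: 0.0, toy_data, random.Random(1)
)
```

In practice `velocity` would be a neural network and the loss would be minimised by gradient descent. Samples are then generated by integrating the learned field from noise at t = 0 to t = 1.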
-
Back To The Roots: Tree-Based Algorithms for Weakly Supervised Anomaly Detection
Authors:
Thorben Finke,
Marie Hein,
Gregor Kasieczka,
Michael Krämer,
Alexander Mück,
Parada Prangchaikul,
Tobias Quadfasel,
David Shih,
Manuel Sommerhalder
Abstract:
Weakly supervised methods have emerged as a powerful tool for model-agnostic anomaly detection at the Large Hadron Collider (LHC). While these methods have shown remarkable performance on specific signatures such as di-jet resonances, their application in a more model-agnostic manner requires dealing with a larger number of potentially noisy input features. In this paper, we show that using boosted decision trees as classifiers in weakly supervised anomaly detection gives superior performance compared to deep neural networks. Boosted decision trees are well known for their effectiveness in tabular data analysis. Our results show that they not only offer significantly faster training and evaluation times, but they are also robust to a large number of noisy input features. By using advanced gradient boosted decision trees in combination with ensembling techniques and an extended set of features, we significantly improve the performance of weakly supervised methods for anomaly detection at the LHC. This advance is a crucial step towards a more model-agnostic search for new physics.
Submitted 22 September, 2023;
originally announced September 2023.
-
Combining Resonant and Tail-based Anomaly Detection
Authors:
Gerrit Bickendorf,
Manuel Drees,
Gregor Kasieczka,
Claudius Krause,
David Shih
Abstract:
In many well-motivated models of the electroweak scale, cascade decays of new particles can result in highly boosted hadronic resonances (e.g. $Z/W/h$). This can make these models rich and promising targets for recently developed resonant anomaly detection methods powered by modern machine learning. We demonstrate this using the state-of-the-art CATHODE method applied to supersymmetry scenarios with gluino pair production. We show that CATHODE, despite being model-agnostic, is nevertheless competitive with dedicated cut-based searches, while simultaneously covering a much wider region of parameter space. The gluino events also populate the tails of the missing energy and $H_T$ distributions, making this a novel combination of resonant and tail-based anomaly detection.
Submitted 28 May, 2024; v1 submitted 22 September, 2023;
originally announced September 2023.
-
CaloClouds II: Ultra-Fast Geometry-Independent Highly-Granular Calorimeter Simulation
Authors:
Erik Buhmann,
Frank Gaede,
Gregor Kasieczka,
Anatolii Korol,
William Korcari,
Katja Krüger,
Peter McKeown
Abstract:
Fast simulation of the energy depositions in high-granular detectors is needed for future collider experiments with ever-increasing luminosities. Generative machine learning (ML) models have been shown to speed up and augment the traditional simulation chain in physics analysis. However, the majority of previous efforts were limited to models relying on fixed, regular detector readout geometries. A major advancement is the recently introduced CaloClouds model, a geometry-independent diffusion model, which generates calorimeter showers as point clouds for the electromagnetic calorimeter of the envisioned International Large Detector (ILD).
In this work, we introduce CaloClouds II which features a number of key improvements. This includes continuous time score-based modelling, which allows for a 25-step sampling with comparable fidelity to CaloClouds while yielding a $6\times$ speed-up over Geant4 on a single CPU ($5\times$ over CaloClouds). We further distill the diffusion model into a consistency model allowing for accurate sampling in a single step and resulting in a $46\times$ ($37\times$ over CaloClouds) speed-up. This constitutes the first application of consistency distillation for the generation of calorimeter showers.
Submitted 26 February, 2024; v1 submitted 11 September, 2023;
originally announced September 2023.
-
The Interplay of Machine Learning--based Resonant Anomaly Detection Methods
Authors:
Tobias Golling,
Gregor Kasieczka,
Claudius Krause,
Radha Mastandrea,
Benjamin Nachman,
John Andrew Raine,
Debajyoti Sengupta,
David Shih,
Manuel Sommerhalder
Abstract:
Machine learning--based anomaly detection (AD) methods are promising tools for extending the coverage of searches for physics beyond the Standard Model (BSM). One class of AD methods that has received significant attention is resonant anomaly detection, where the BSM is assumed to be localized in at least one known variable. While there have been many methods proposed to identify such a BSM signal that make use of simulated or detected data in different ways, there has not yet been a study of the methods' complementarity. To this end, we address two questions. First, in the absence of any signal, do different methods pick the same events as signal-like? If not, then we can significantly reduce the false-positive rate by comparing different methods on the same dataset. Second, if there is a signal, are different methods fully correlated? Even if their maximum performance is the same, since we do not know how much signal is present, it may be beneficial to combine approaches. Using the Large Hadron Collider (LHC) Olympics dataset, we provide quantitative answers to these questions. We find that there are significant gains possible by combining multiple methods, which will strengthen the search program at the LHC and beyond.
Submitted 14 March, 2024; v1 submitted 20 July, 2023;
originally announced July 2023.
-
CaloClouds: Fast Geometry-Independent Highly-Granular Calorimeter Simulation
Authors:
Erik Buhmann,
Sascha Diefenbacher,
Engin Eren,
Frank Gaede,
Gregor Kasieczka,
Anatolii Korol,
William Korcari,
Katja Krüger,
Peter McKeown
Abstract:
Simulating showers of particles in highly-granular detectors is a key frontier in the application of machine learning to particle physics. Achieving high accuracy and speed with generative machine learning models would enable them to augment traditional simulations and alleviate a major computing constraint. This work achieves a major breakthrough in this task by, for the first time, directly generating a point cloud of a few thousand space points with energy depositions in the detector in 3D space without relying on a fixed-grid structure. This is made possible by two key innovations: i) Using recent improvements in generative modeling we apply a diffusion model to generate photon showers as high-cardinality point clouds. ii) These point clouds of up to $6,000$ space points are largely geometry-independent as they are down-sampled from initial even higher-resolution point clouds of up to $40,000$ so-called Geant4 steps. We showcase the performance of this approach using the specific example of simulating photon showers in the planned electromagnetic calorimeter of the International Large Detector (ILD) and achieve overall good modeling of physically relevant distributions.
Submitted 26 February, 2024; v1 submitted 8 May, 2023;
originally announced May 2023.
-
New Angles on Fast Calorimeter Shower Simulation
Authors:
Sascha Diefenbacher,
Engin Eren,
Frank Gaede,
Gregor Kasieczka,
Anatolii Korol,
Katja Krüger,
Peter McKeown,
Lennart Rustige
Abstract:
The demands placed on computational resources by the simulation requirements of high energy physics experiments motivate the development of novel simulation tools. Machine learning based generative models offer a solution that is both fast and accurate. In this work we extend the Bounded Information Bottleneck Autoencoder (BIB-AE) architecture, designed for the simulation of particle showers in highly granular calorimeters, in two key directions. First, we generalise the model to a multi-parameter conditioning scenario, while retaining a high degree of physics fidelity. In a second step, we perform a detailed study of the effect of applying a state-of-the-art particle flow-based reconstruction procedure to the generated showers. We demonstrate that the performance of the model remains high after reconstruction. These results are an important step towards creating a more general simulation tool, where maintaining physics performance after reconstruction is the ultimate target.
Submitted 31 March, 2023;
originally announced March 2023.
-
L2LFlows: Generating High-Fidelity 3D Calorimeter Images
Authors:
Sascha Diefenbacher,
Engin Eren,
Frank Gaede,
Gregor Kasieczka,
Claudius Krause,
Imahn Shekhzadeh,
David Shih
Abstract:
We explore the use of normalizing flows to emulate Monte Carlo detector simulations of photon showers in a high-granularity electromagnetic calorimeter prototype for the International Large Detector (ILD). Our proposed method -- which we refer to as "Layer-to-Layer-Flows" (L$2$LFlows) -- is an evolution of the CaloFlow architecture adapted to a higher-dimensional setting (30 layers of $10\times 10$ voxels each). The main innovation of L$2$LFlows consists of introducing $30$ separate normalizing flows, one for each layer of the calorimeter, where each flow is conditioned on the previous five layers in order to learn the layer-to-layer correlations. We compare our results to the BIB-AE, a state-of-the-art generative network trained on the same dataset and find our model has a significantly improved fidelity.
Submitted 20 October, 2023; v1 submitted 22 February, 2023;
originally announced February 2023.
-
EPiC-GAN: Equivariant Point Cloud Generation for Particle Jets
Authors:
Erik Buhmann,
Gregor Kasieczka,
Jesse Thaler
Abstract:
With the vast data-collecting capabilities of current and future high-energy collider experiments, there is an increasing demand for computationally efficient simulations. Generative machine learning models enable fast event generation, yet so far these approaches are largely constrained to fixed data structures and rigid detector geometries. In this paper, we introduce EPiC-GAN - equivariant point cloud generative adversarial network - which can produce point clouds of variable multiplicity. This flexible framework is based on deep sets and is well suited for simulating sprays of particles called jets. The generator and discriminator utilize multiple EPiC layers with an interpretable global latent vector. Crucially, the EPiC layers do not rely on pairwise information sharing between particles, which leads to a significant speed-up over graph- and transformer-based approaches with more complex relation diagrams. We demonstrate that EPiC-GAN scales well to large particle multiplicities and achieves high generation fidelity on benchmark jet generation tasks.
Submitted 12 July, 2023; v1 submitted 17 January, 2023;
originally announced January 2023.
-
Morphological Classification of Radio Galaxies with wGAN-supported Augmentation
Authors:
Lennart Rustige,
Janis Kummer,
Florian Griese,
Kerstin Borras,
Marcus Brüggen,
Patrick L. S. Connor,
Frank Gaede,
Gregor Kasieczka,
Tobias Knopp,
Peter Schleper
Abstract:
Machine learning techniques that perform morphological classification of astronomical sources often suffer from a scarcity of labelled training data. Here, we focus on the case of supervised deep learning models for the morphological classification of radio galaxies, which is particularly topical for the forthcoming large radio surveys. We demonstrate the use of generative models, specifically Wasserstein GANs (wGANs), to generate data for different classes of radio galaxies. Further, we study the impact of augmenting the training data with images from our wGAN on three different classification architectures. We find that this technique makes it possible to improve models for the morphological classification of radio galaxies. A simple Fully Connected Neural Network (FCN) benefits most from including generated images into the training set, with a considerable improvement of its classification accuracy. In addition, we find it is more difficult to improve complex classifiers. The classification performance of a Convolutional Neural Network (CNN) can be improved slightly. However, this is not the case for a Vision Transformer (ViT).
△ Less
Submitted 14 June, 2023; v1 submitted 16 December, 2022;
originally announced December 2022.
-
Feature Selection with Distance Correlation
Authors:
Ranit Das,
Gregor Kasieczka,
David Shih
Abstract:
Choosing which properties of the data to use as input to multivariate decision algorithms -- a.k.a. feature selection -- is an important step in solving any problem with machine learning. While there is a clear trend towards training sophisticated deep networks on large numbers of relatively unprocessed inputs (so-called automated feature engineering), for many tasks in physics, sets of theoretically well-motivated and well-understood features already exist. Working with such features can bring many benefits, including greater interpretability, reduced training and run time, and enhanced stability and robustness. We develop a new feature selection method based on Distance Correlation (DisCo), and demonstrate its effectiveness on the tasks of boosted top- and $W$-tagging. Using our method to select features from a set of over 7,000 energy flow polynomials, we show that we can match the performance of much deeper architectures, by using only ten features and two orders-of-magnitude fewer model parameters.
△ Less
Submitted 30 November, 2022;
originally announced December 2022.
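As an illustration of the quantity this abstract builds on, here is a minimal pure-Python sketch of the sample distance correlation between two one-dimensional variables. It is not the paper's feature selection procedure: the function names and the toy data are assumptions made for this example, and only the standard double-centred-distance definition of distance correlation is used.

```python
import math
import random


def _centered_distances(v):
    """Double-centred matrix of pairwise absolute distances for a 1-D sample."""
    n = len(v)
    d = [[abs(v[i] - v[j]) for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in d]
    grand_mean = sum(row_mean) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [[d[i][j] - row_mean[i] - row_mean[j] + grand_mean
             for j in range(n)] for i in range(n)]


def distance_correlation(x, y):
    """Sample distance correlation (biased V-statistic form) of two 1-D lists."""
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)

    def mean_product(p, q):
        return sum(p[i][j] * q[i][j] for i in range(n) for j in range(n)) / (n * n)

    dcov2 = mean_product(a, b)                       # squared distance covariance
    dvar2 = mean_product(a, a) * mean_product(b, b)  # product of squared distance variances
    if dvar2 <= 0.0:
        return 0.0
    return math.sqrt(max(dcov2, 0.0) / math.sqrt(dvar2))


# Toy data (hypothetical): y_dep depends on x non-linearly, y_ind is independent noise.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]
y_dep = [v * v for v in x]
y_ind = [rng.gauss(0.0, 1.0) for _ in range(200)]

dcor_self = distance_correlation(x, x)
dcor_dep = distance_correlation(x, y_dep)
dcor_ind = distance_correlation(x, y_ind)
```

Unlike a linear correlation coefficient, distance correlation is sensitive to non-linear dependence such as the quadratic relation in `y_dep`, which is the property that makes it a candidate score for ranking features.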
-
Resonant anomaly detection without background sculpting
Authors:
Anna Hallin,
Gregor Kasieczka,
Tobias Quadfasel,
David Shih,
Manuel Sommerhalder
Abstract:
We introduce a new technique named Latent CATHODE (LaCATHODE) for performing "enhanced bump hunts", a type of resonant anomaly search that combines conventional one-dimensional bump hunts with a model-agnostic anomaly score in an auxiliary feature space where potential signals could also be localized. The main advantage of LaCATHODE over existing methods is that it provides an anomaly score that is well behaved when evaluating it beyond the signal region, which is essential to prevent the sculpting of background distributions in the bump hunt. LaCATHODE accomplishes this by constructing the anomaly score directly in the latent space learned by a conditional normalizing flow trained on sideband regions. We demonstrate the superior stability and comparable performance of LaCATHODE for enhanced bump hunting in an illustrative toy example as well as on the LHC Olympics R&D dataset.
Submitted 10 July, 2023; v1 submitted 26 October, 2022;
originally announced October 2022.
-
Anomaly Detection under Coordinate Transformations
Authors:
Gregor Kasieczka,
Radha Mastandrea,
Vinicius Mikuni,
Benjamin Nachman,
Mariel Pettee,
David Shih
Abstract:
There is a growing need for machine learning-based anomaly detection strategies to broaden the search for Beyond-the-Standard-Model (BSM) physics at the Large Hadron Collider (LHC) and elsewhere. The first step of any anomaly detection approach is to specify observables and then use them to decide on a set of anomalous events. One common choice is to select events that have low probability density. It is a well-known fact that probability densities are not invariant under coordinate transformations, so the sensitivity can depend on the initial choice of coordinates. The broader machine learning community has recently connected coordinate sensitivity with anomaly detection and our goal is to bring awareness of this issue to the growing high energy physics literature on anomaly detection. In addition to analytical explanations, we provide numerical examples from simple random variables and from the LHC Olympics Dataset that show how using probability density as an anomaly score can lead to events being classified as anomalous or not depending on the coordinate frame.
△ Less
Submitted 13 September, 2022;
originally announced September 2022.
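The coordinate dependence described in this abstract can be illustrated with a small sketch. The example below is not taken from the paper: it assumes a standard normal variable x, the transformation y = exp(x), and two hand-picked events, and it uses the change-of-variables formula p_y(y) = p_x(log y) / y.

```python
import math


def normal_pdf(x):
    """Density of a standard normal variable."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def lognormal_pdf(y):
    """Density of y = exp(x): p_y(y) = p_x(log y) * |dx/dy| = p_x(log y) / y."""
    return normal_pdf(math.log(y)) / y


# Two hypothetical events, written in the x coordinate and in y = exp(x).
x_a, x_b = -2.0, 1.5
y_a, y_b = math.exp(x_a), math.exp(x_b)

# Treat the event with the lower probability density as the more anomalous one.
more_anomalous_in_x = "a" if normal_pdf(x_a) < normal_pdf(x_b) else "b"        # -> "a"
more_anomalous_in_y = "a" if lognormal_pdf(y_a) < lognormal_pdf(y_b) else "b"  # -> "b"
```

The same two events swap their anomaly ranking when the coordinate changes, because the Jacobian factor 1/y rescales the density differently at each point.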
-
Data Science and Machine Learning in Education
Authors:
Gabriele Benelli,
Thomas Y. Chen,
Javier Duarte,
Matthew Feickert,
Matthew Graham,
Lindsey Gray,
Dan Hackett,
Phil Harris,
Shih-Chieh Hsu,
Gregor Kasieczka,
Elham E. Khoda,
Matthias Komm,
Mia Liu,
Mark S. Neubauer,
Scarlet Norberg,
Alexx Perloff,
Marcel Rieger,
Claire Savard,
Kazuhiro Terao,
Savannah Thais,
Avik Roy,
Jean-Roch Vlimant,
Grigorios Chachamis
Abstract:
The growing role of data science (DS) and machine learning (ML) in high-energy physics (HEP) is well established and pertinent given the complex detectors, large data sets, and sophisticated analyses at the heart of HEP research. Moreover, exploiting symmetries inherent in physics data has inspired physics-informed ML as a vibrant sub-field of computer science research. HEP researchers benefit greatly from widely available materials for use in education, training and workforce development. They are also contributing to these materials and providing software to DS/ML-related fields. Increasingly, physics departments are offering courses at the intersection of DS, ML and physics, often using curricula developed by HEP researchers and involving open software and data used in HEP. In this white paper, we explore synergies between HEP research and DS/ML education, discuss opportunities and challenges at this intersection, and propose community activities that will be mutually beneficial.
Submitted 19 July, 2022;
originally announced July 2022.
-
Radio Galaxy Classification with wGAN-Supported Augmentation
Authors:
Janis Kummer,
Lennart Rustige,
Florian Griese,
Kerstin Borras,
Marcus Brüggen,
Patrick L. S. Connor,
Frank Gaede,
Gregor Kasieczka,
Peter Schleper
Abstract:
Novel techniques are indispensable to process the flood of data from the new generation of radio telescopes. In particular, the classification of astronomical sources in images is challenging. Morphological classification of radio galaxies could be automated with deep learning models that require large sets of labelled training data. Here, we demonstrate the use of generative models, specifically Wasserstein GANs (wGAN), to generate artificial data for different classes of radio galaxies. Subsequently, we augment the training data with images from our wGAN. We find that a simple fully-connected neural network for classification can be improved significantly by including generated images into the training set.
Submitted 7 October, 2022; v1 submitted 30 June, 2022;
originally announced June 2022.
-
New directions for surrogate models and differentiable programming for High Energy Physics detector simulation
Authors:
Andreas Adelmann,
Walter Hopkins,
Evangelos Kourlitis,
Michael Kagan,
Gregor Kasieczka,
Claudius Krause,
David Shih,
Vinicius Mikuni,
Benjamin Nachman,
Kevin Pedro,
Daniel Winklehner
Abstract:
The computational cost for high energy physics detector simulation in future experimental facilities is going to exceed the currently available resources. To overcome this challenge, new ideas on surrogate models using machine learning methods are being explored to replace computationally expensive components. Additionally, differentiable programming has been proposed as a complementary approach, providing controllable and scalable simulation routines. In this document, new and ongoing efforts for surrogate models and differentiable programming applied to detector simulation are discussed in the context of the 2021 Particle Physics Community Planning Exercise ('Snowmass').
Submitted 15 March, 2022;
originally announced March 2022.
-
Machine Learning and LHC Event Generation
Authors:
Anja Butter,
Tilman Plehn,
Steffen Schumann,
Simon Badger,
Sascha Caron,
Kyle Cranmer,
Francesco Armando Di Bello,
Etienne Dreyer,
Stefano Forte,
Sanmay Ganguly,
Dorival Gonçalves,
Eilam Gross,
Theo Heimel,
Gudrun Heinrich,
Lukas Heinrich,
Alexander Held,
Stefan Höche,
Jessica N. Howard,
Philip Ilten,
Joshua Isaacson,
Timo Janßen,
Stephen Jones,
Marumi Kado,
Michael Kagan,
Gregor Kasieczka
, et al. (26 additional authors not shown)
Abstract:
First-principle simulations are at the heart of the high-energy physics research program. They link the vast data output of multi-purpose detectors with fundamental theory predictions and interpretation. This review illustrates a wide range of applications of modern machine learning to event generation and simulation-based inference, including conceptual developments driven by the specific requirements of particle physics. New ideas and tools developed at the interface of particle physics and machine learning will improve the speed and precision of forward simulations, handle the complexity of collision data, and enhance inference as an inverse simulation problem.
Submitted 28 December, 2022; v1 submitted 14 March, 2022;
originally announced March 2022.
-
Ephemeral Learning -- Augmenting Triggers with Online-Trained Normalizing Flows
Authors:
Anja Butter,
Sascha Diefenbacher,
Gregor Kasieczka,
Benjamin Nachman,
Tilman Plehn,
David Shih,
Ramon Winterhalder
Abstract:
The large data rates at the LHC require an online trigger system to select relevant collisions. Rather than compressing individual events, we propose to compress an entire data set at once. We use a normalizing flow as a deep generative model to learn the probability density of the data online. The events are then represented by the generative neural network and can be inspected offline for anomalies or used for other analysis purposes. We demonstrate our new approach for a toy model and a correlation-enhanced bump hunt.
Submitted 28 June, 2022; v1 submitted 18 February, 2022;
originally announced February 2022.
-
Calomplification -- The Power of Generative Calorimeter Models
Authors:
Sebastian Bieringer,
Anja Butter,
Sascha Diefenbacher,
Engin Eren,
Frank Gaede,
Daniel Hundhausen,
Gregor Kasieczka,
Benjamin Nachman,
Tilman Plehn,
Mathias Trabs
Abstract:
Motivated by the high computational costs of classical simulations, machine-learned generative models can be extremely useful in particle physics and elsewhere. They become especially attractive when surrogate models can efficiently learn the underlying distribution, such that a generated sample outperforms a training sample of limited size. This kind of GANplification has been observed for simple Gaussian models. We show the same effect for a physics simulation, specifically photon showers in an electromagnetic calorimeter.
△ Less
Submitted 25 January, 2023; v1 submitted 15 February, 2022;
originally announced February 2022.
-
Hadrons, Better, Faster, Stronger
Authors:
Erik Buhmann,
Sascha Diefenbacher,
Engin Eren,
Frank Gaede,
Daniel Hundhausen,
Gregor Kasieczka,
William Korcari,
Katja Krüger,
Peter McKeown,
Lennart Rustige
Abstract:
Motivated by the computational limitations of simulating interactions of particles in highly-granular detectors, there exists a concerted effort to build fast and exact machine-learning-based shower simulators. This work reports progress on two important fronts. First, the previously investigated WGAN and BIB-AE generative models are improved and successful learning of hadronic showers initiated by charged pions in a segment of the hadronic calorimeter of the International Large Detector (ILD) is demonstrated for the first time. Second, we consider how state-of-the-art reconstruction software applied to generated shower energies affects the obtainable energy response and resolution. While many challenges remain, these results constitute an important milestone in using generative models in a realistic setting.
Submitted 17 December, 2021;
originally announced December 2021.
-
Machine Learning in the Search for New Fundamental Physics
Authors:
Georgia Karagiorgi,
Gregor Kasieczka,
Scott Kravitz,
Benjamin Nachman,
David Shih
Abstract:
Machine learning plays a crucial role in enhancing and accelerating the search for new fundamental physics. We review the state of machine learning methods and applications for new physics searches in the context of terrestrial high energy physics experiments, including the Large Hadron Collider, rare event searches, and neutrino experiments. While machine learning has a long history in these fields, the deep learning revolution (early 2010s) has yielded a qualitative shift in terms of the scope and ambition of research. These modern machine learning developments are the focus of the present review.
Submitted 7 December, 2021;
originally announced December 2021.
-
Classifying Anomalies THrough Outer Density Estimation (CATHODE)
Authors:
Anna Hallin,
Joshua Isaacson,
Gregor Kasieczka,
Claudius Krause,
Benjamin Nachman,
Tobias Quadfasel,
Matthias Schlaffer,
David Shih,
Manuel Sommerhalder
Abstract:
We propose a new model-agnostic search strategy for physics beyond the standard model (BSM) at the LHC, based on a novel application of neural density estimation to anomaly detection. Our approach, which we call Classifying Anomalies THrough Outer Density Estimation (CATHODE), assumes the BSM signal is localized in a signal region (defined e.g. using invariant mass). By training a conditional density estimator on a collection of additional features outside the signal region, interpolating it into the signal region, and sampling from it, we produce a collection of events that follow the background model. We can then train a classifier to distinguish the data from the events sampled from the background model, thereby approaching the optimal anomaly detector. Using the LHC Olympics R&D dataset, we demonstrate that CATHODE nearly saturates the best possible performance, and significantly outperforms other approaches that aim to enhance the bump hunt (CWoLa Hunting and ANODE). Finally, we demonstrate that CATHODE is very robust against correlations between the features and maintains nearly-optimal performance even in this more challenging setting.
Submitted 11 September, 2022; v1 submitted 1 September, 2021;
originally announced September 2021.
-
Symmetries, Safety, and Self-Supervision
Authors:
Barry M. Dillon,
Gregor Kasieczka,
Hans Olischlager,
Tilman Plehn,
Peter Sorrenson,
Lorenz Vogel
Abstract:
Collider searches face the challenge of defining a representation of high-dimensional data such that physical symmetries are manifest, the discriminating features are retained, and the choice of representation is new-physics agnostic. We introduce JetCLR to solve the mapping from low-level data to optimized observables through self-supervised contrastive learning. As an example, we construct a data representation for top and QCD jets using a permutation-invariant transformer-encoder network and visualize its symmetry properties. We compare the JetCLR representation with alternative representations using linear classifier tests and find it to work quite well.
Submitted 9 August, 2021;
originally announced August 2021.
-
Unsupervised Hadronic SUEP at the LHC
Authors:
Jared Barron,
David Curtin,
Gregor Kasieczka,
Tilman Plehn,
Aris Spourdalakis
Abstract:
Confining dark sectors with pseudo-conformal dynamics produce SUEP, or Soft Unclustered Energy Patterns, at colliders: isotropic dark hadrons with soft and democratic energies. We target the experimental nightmare scenario, SUEPs in exotic Higgs decays, where all dark hadrons decay promptly to SM hadrons. First, we identify three promising observables, the charged particle multiplicity, the event ring isotropy, and the matrix of geometric distances between charged tracks. Their patterns can be exploited through a cut-and-count search, supervised machine learning, or an unsupervised autoencoder. We find that the HL-LHC will probe exotic Higgs branching ratios at the per-cent level, even without a detailed knowledge of the signal features. Our techniques can be applied to other SUEP searches, especially the unsupervised strategy, which is independent of overly specific model assumptions and the corresponding precision simulations.
Submitted 4 November, 2021; v1 submitted 26 July, 2021;
originally announced July 2021.
-
New Methods and Datasets for Group Anomaly Detection From Fundamental Physics
Authors:
Gregor Kasieczka,
Benjamin Nachman,
David Shih
Abstract:
The identification of anomalous overdensities in data - group or collective anomaly detection - is a rich problem with a large number of real world applications. However, it has received relatively little attention in the broader ML community, as compared to point anomalies or other types of single instance outliers. One reason for this is the lack of powerful benchmark datasets. In this paper, we first explain how, after the Nobel-prize winning discovery of the Higgs boson, unsupervised group anomaly detection has become a new frontier of fundamental physics (where the motivation is to find new particles and forces). Then we propose a realistic synthetic benchmark dataset (LHCO2020) for the development of group anomaly detection algorithms. Finally, we compare several existing statistically-sound techniques for unsupervised group anomaly detection, and demonstrate their performance on the LHCO2020 dataset.
Submitted 6 July, 2021;
originally announced July 2021.
-
Shared Data and Algorithms for Deep Learning in Fundamental Physics
Authors:
Lisa Benato,
Erik Buhmann,
Martin Erdmann,
Peter Fackeldey,
Jonas Glombitza,
Nikolai Hartmann,
Gregor Kasieczka,
William Korcari,
Thomas Kuhr,
Jan Steinheimer,
Horst Stöcker,
Tilman Plehn,
Kai Zhou
Abstract:
We introduce a Python package that provides simple and unified access to a collection of datasets from fundamental physics research - including particle physics, astroparticle physics, and hadron and nuclear physics - for supervised machine learning studies. The datasets contain hadronic top quarks, cosmic-ray induced air showers, phase transitions in hadronic matter, and generator-level histories. While public datasets from multiple fundamental physics disciplines already exist, the common interface and provided reference models simplify future work on cross-disciplinary machine learning and transfer learning in fundamental physics. We discuss the design and structure and outline how additional datasets can be submitted for inclusion.
As a showcase application, we present a simple yet flexible graph-based neural network architecture that can easily be applied to a wide range of supervised learning tasks. We show that our approach reaches performance close to dedicated methods on all datasets. To simplify adaptation for various problems, we provide easy-to-follow instructions on how graph-based representations of data structures, relevant for fundamental physics, can be constructed and provide code implementations for several of them. Implementations are also provided for our proposed method and all reference algorithms.
Submitted 24 March, 2022; v1 submitted 1 July, 2021;
originally announced July 2021.
-
Decoding Photons: Physics in the Latent Space of a BIB-AE Generative Network
Authors:
Erik Buhmann,
Sascha Diefenbacher,
Engin Eren,
Frank Gaede,
Gregor Kasieczka,
Anatolii Korol,
Katja Krüger
Abstract:
Given the increasing data collection capabilities and limited computing resources of future collider experiments, interest in using generative neural networks for the fast simulation of collider events is growing. In our previous study, the Bounded Information Bottleneck Autoencoder (BIB-AE) architecture for generating photon showers in a high-granularity calorimeter showed highly accurate modeling of various global differential shower distributions. In this work, we investigate how the BIB-AE encodes this physics information in its latent space. Our understanding of this encoding allows us to propose methods to optimize the generation performance further, for example, by altering latent space sampling or by suggesting specific changes to hyperparameters. In particular, we improve the modeling of the shower shape along the particle incident axis.
Submitted 29 June, 2021; v1 submitted 24 February, 2021;
originally announced February 2021.
-
The LHC Olympics 2020: A Community Challenge for Anomaly Detection in High Energy Physics
Authors:
Gregor Kasieczka,
Benjamin Nachman,
David Shih,
Oz Amram,
Anders Andreassen,
Kees Benkendorfer,
Blaz Bortolato,
Gustaaf Brooijmans,
Florencia Canelli,
Jack H. Collins,
Biwei Dai,
Felipe F. De Freitas,
Barry M. Dillon,
Ioan-Mihail Dinu,
Zhongtian Dong,
Julien Donini,
Javier Duarte,
D. A. Faroughy,
Julia Gonski,
Philip Harris,
Alan Kahn,
Jernej F. Kamenik,
Charanjit K. Khosa,
Patrick Komiske,
Luc Le Pottier
, et al. (22 additional authors not shown)
Abstract:
A new paradigm for data-driven, model-agnostic new physics searches at colliders is emerging, and aims to leverage recent breakthroughs in anomaly detection and machine learning. In order to develop and benchmark new anomaly detection methods within this framework, it is essential to have standard datasets. To this end, we have created the LHC Olympics 2020, a community challenge accompanied by a set of simulated collider events. Participants in these Olympics have developed their methods using an R&D dataset and then tested them on black boxes: datasets with an unknown anomaly (or not). This paper will review the LHC Olympics 2020 challenge, including an overview of the competition, a description of methods deployed in the competition, lessons learned from the experience, and implications for data analyses with future datasets as well as future colliders.
Submitted 20 January, 2021;
originally announced January 2021.
-
How to GAN Higher Jet Resolution
Authors:
Pierre Baldi,
Lukas Blecher,
Anja Butter,
Julian Collado,
Jessica N. Howard,
Fabian Keilbach,
Tilman Plehn,
Gregor Kasieczka,
Daniel Whiteson
Abstract:
QCD-jets at the LHC are described by simple physics principles. We show how super-resolution generative networks can learn the underlying structures and use them to improve the resolution of jet images. We test this approach on massless QCD-jets and on fat top-jets and find that the network reproduces their main features even without training on pure samples. In addition, we show how a slim network architecture can be constructed once we have control of the full network performance.
Submitted 2 December, 2021; v1 submitted 22 December, 2020;
originally announced December 2020.
-
DCTRGAN: Improving the Precision of Generative Models with Reweighting
Authors:
Sascha Diefenbacher,
Engin Eren,
Gregor Kasieczka,
Anatolii Korol,
Benjamin Nachman,
David Shih
Abstract:
Significant advances in deep learning have led to more widely used and precise neural network-based generative models such as Generative Adversarial Networks (GANs). We introduce a post-hoc correction to deep generative models to further improve their fidelity, based on the Deep neural networks using the Classification for Tuning and Reweighting (DCTR) protocol. The correction takes the form of a reweighting function that can be applied to generated examples when making predictions from the simulation. We illustrate this approach using GANs trained on standard multimodal probability densities as well as calorimeter simulations from high energy physics. We show that the weighted GAN examples significantly improve the accuracy of the generated samples without a large loss in statistical power. This approach could be applied to any generative model and is a promising refinement method for high energy physics applications and beyond.
Submitted 3 September, 2020;
originally announced September 2020.
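The reweighting idea in this abstract can be illustrated with a minimal sketch. It assumes a classifier trained on equal numbers of real and generated examples whose output `s` approximates the probability that an example is real, so that `s / (1 - s)` approximates the likelihood ratio. The function names, the clipping parameter `eps`, and the use of the Kish effective sample size as a proxy for statistical power are illustrative assumptions, not the authors' implementation.

```python
from typing import List, Sequence


def likelihood_ratio_weights(scores: Sequence[float], eps: float = 1e-6) -> List[float]:
    """Turn classifier scores into per-example weights w = s / (1 - s).

    Assumption: each score is the classifier's estimate of P(real | x) and
    the classifier was trained on balanced real/generated samples.
    """
    weights = []
    for s in scores:
        # Clip away from 0 and 1 so the ratio stays finite.
        s = min(max(s, eps), 1.0 - eps)
        weights.append(s / (1.0 - s))
    return weights


def effective_sample_size(weights: Sequence[float]) -> float:
    """Kish effective sample size: (sum w)^2 / sum(w^2).

    Equals the number of examples when all weights are identical and
    shrinks as the weights become more uneven.
    """
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return total * total / total_sq


if __name__ == "__main__":
    # Hypothetical classifier outputs for four generated examples.
    example_scores = [0.5, 0.75, 0.5, 0.25]
    w = likelihood_ratio_weights(example_scores)
    print("weights:", w)
    print("effective sample size:", effective_sample_size(w))
```

A weighted histogram of the generated examples then uses these weights in place of unit counts; an effective sample size close to the raw count indicates that little statistical power is lost.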
-
GANplifying Event Samples
Authors:
Anja Butter,
Sascha Diefenbacher,
Gregor Kasieczka,
Benjamin Nachman,
Tilman Plehn
Abstract:
A critical question concerning generative networks applied to event generation in particle physics is if the generated events add statistical precision beyond the training sample. We show for a simple example with increasing dimensionality how generative networks indeed amplify the training statistics. We quantify their impact through an amplification factor or equivalent numbers of sampled events.
Submitted 25 March, 2021; v1 submitted 14 August, 2020;
originally announced August 2020.
-
ABCDisCo: Automating the ABCD Method with Machine Learning
Authors:
Gregor Kasieczka,
Benjamin Nachman,
Matthew D. Schwartz,
David Shih
Abstract:
The ABCD method is one of the most widely used data-driven background estimation techniques in high energy physics. Cuts on two statistically-independent classifiers separate signal and background into four regions, so that background in the signal region can be estimated simply using the other three control regions. Typically, the independent classifiers are chosen "by hand" to be intuitive and physically motivated variables. Here, we explore the possibility of automating the design of one or both of these classifiers using machine learning. We show how to use state-of-the-art decorrelation methods to construct powerful yet independent discriminators. Along the way, we uncover a previously unappreciated aspect of the ABCD method: its accuracy hinges on having low signal contamination in control regions not just overall, but relative to the signal fraction in the signal region. We demonstrate the method with three examples: a simple model consisting of three-dimensional Gaussians; boosted hadronic top jet tagging; and a recast search for paired dijet resonances. In all cases, automating the ABCD method with machine learning significantly improves performance in terms of ABCD closure, background rejection and signal contamination.
Submitted 28 July, 2020;
originally announced July 2020.
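The background estimate described in this abstract can be sketched in a few lines. The sketch assumes the common labeling in which region A passes both classifier cuts (the signal region), B and C each pass exactly one cut, and D passes neither; if the two classifiers are independent for background, then N_A / N_B = N_C / N_D. The region labels, function names, and cut values are illustrative assumptions and are not taken from the paper.

```python
from typing import Dict, Iterable, Tuple


def count_regions(events: Iterable[Tuple[float, float]],
                  cut_1: float, cut_2: float) -> Dict[str, int]:
    """Count events in the four ABCD regions.

    Each event is a pair (score_1, score_2) from the two classifiers.
    Assumed labeling: A passes both cuts, B passes only the first,
    C passes only the second, D passes neither.
    """
    counts = {"A": 0, "B": 0, "C": 0, "D": 0}
    for score_1, score_2 in events:
        pass_1 = score_1 > cut_1
        pass_2 = score_2 > cut_2
        if pass_1 and pass_2:
            counts["A"] += 1
        elif pass_1:
            counts["B"] += 1
        elif pass_2:
            counts["C"] += 1
        else:
            counts["D"] += 1
    return counts


def abcd_background_estimate(n_b: float, n_c: float, n_d: float) -> float:
    """Predicted background in region A: N_B * N_C / N_D.

    Valid only if the two classifiers are independent for background
    and the control regions contain negligible signal.
    """
    if n_d <= 0:
        raise ValueError("control region D must contain events")
    return n_b * n_c / n_d


if __name__ == "__main__":
    # Hypothetical control-region counts.
    print(abcd_background_estimate(200, 300, 600))
```

Comparing the prediction with the observed count in region A on a background-only sample gives a simple closure check of the independence assumption.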
-
Towards Machine Learning Analytics for Jet Substructure
Authors:
Gregor Kasieczka,
Simone Marzani,
Gregory Soyez,
Giovanni Stagnitto
Abstract:
The past few years have seen a rapid development of machine-learning algorithms. While surely augmenting performance, these complex tools are often treated as black boxes and may impair our understanding of the physical processes under study. The aim of this paper is to take a first step towards applying expert knowledge in particle physics to calculate the optimal decision function and test whether it is achieved by standard training, thus making the aforementioned black box more transparent. In particular, we consider the binary classification problem of discriminating quark-initiated jets from gluon-initiated ones. We construct a new version of the widely used N-subjettiness, which features a simpler theoretical behaviour than the original one, while maintaining, if not exceeding, the discrimination power. We input these new observables to the simplest possible neural network, i.e. the one made by a single neuron, or perceptron, and we analytically study the network behaviour at leading logarithmic accuracy. We are able to determine under which circumstances the perceptron achieves optimal performance. We also compare our analytic findings to an actual implementation of a perceptron and to a more realistic neural network and find very good agreement.
Submitted 22 September, 2020; v1 submitted 8 July, 2020;
originally announced July 2020.
-
Invertible Networks or Partons to Detector and Back Again
Authors:
Marco Bellagente,
Anja Butter,
Gregor Kasieczka,
Tilman Plehn,
Armand Rousselot,
Ramon Winterhalder,
Lynton Ardizzone,
Ullrich Köthe
Abstract:
For simulations where the forward and the inverse directions have a physics meaning, invertible neural networks are especially useful. A conditional INN can invert a detector simulation in terms of high-level observables, specifically for ZW production at the LHC. It allows for a per-event statistical interpretation. Next, we allow for a variable number of QCD jets. We unfold detector effects and QCD radiation to a pre-defined hard process, again with a per-event probabilistic interpretation over parton-level phase space.
Submitted 1 October, 2020; v1 submitted 11 June, 2020;
originally announced June 2020.