Photon Reconstruction in the Belle II Calorimeter Using Graph Neural Networks

License: CC BY 4.0
arXiv:2306.04179v2 [hep-ex] 03 Mar 2024

F. Wemmer 0000-0002-6475-0834    I. Haide 0000-0003-0962-6344    J. Eppelt 0000-0001-8368-3721    T. Ferber 0000-0002-6849-0427    A. Beaubien 0000-0001-9438-089X    P. Branchini 0000-0002-2270-9673    M. Campajola 0000-0003-2518-7134    C. Cecchi 0000-0002-2192-8233    P. Cheema 0000-0001-8472-5727    G. De Nardo 0000-0002-2047-9675    C. Hearty 0000-0001-6568-0252    A. Kuzmin 0000-0002-7011-5044    S. Longo 0000-0002-8124-8969    E. Manoni 0000-0002-9826-7947    F. Meier 0000-0002-6088-0412    M. Merola 0000-0002-7082-8108    K. Miyabayashi 0000-0003-4352-734X    S. Moneta 0000-0003-2184-7510    M. Remnev 0000-0001-6975-1724    J. M. Roney 0000-0001-7802-4617    J.-G. Shiu 0000-0002-8478-5639    B. Shwartz 0000-0002-1456-1496    Y. Unno 0000-0003-3355-765X    R. van Tonder 0000-0002-7448-4816    R. Volpe 0000-0003-1782-2978
Abstract

We present the study of a fuzzy clustering algorithm for the Belle II electromagnetic calorimeter using Graph Neural Networks. We use a realistic detector simulation including simulated beam backgrounds and focus on the reconstruction of both isolated and overlapping photons. We find significant improvements of the energy resolution compared to the currently used reconstruction algorithm for both isolated and overlapping photons of more than 30% for photons with energies $E_{\gamma} < 0.5\,\mathrm{GeV}$ and high levels of beam backgrounds. Overall, the GNN reconstruction improves the resolution and reduces the tails of the reconstructed energy distribution and therefore is a promising option for the upcoming high luminosity running of Belle II.

keywords:
calorimeter, photon reconstruction, overlapping clusters, high background, fuzzy clustering, machine learning, deep learning, graph neural networks, end-to-end representation spaces

1 Introduction

The Belle II experiment is located at the high-intensity, asymmetric electron-positron collider SuperKEKB in Tsukuba, Japan. SuperKEKB is colliding 4 GeV positron and 7 GeV electron beams at a center-of-mass energy of around 10.58 GeV to search for rare meson decays and new physics phenomena. Many of these decays include photons in the final state that are reconstructed exclusively in the electromagnetic calorimeter. The experimental program of Belle II targets a significantly increased instantaneous luminosity that ultimately exceeds that of the predecessor experiment by a factor of 30. This increase in luminosity also leads to a significant increase in beam-induced backgrounds [1]. These background processes produce both high-energy particle interactions that could be misidentified as physics signals and energy depositions of low-energy particles that degrade the energy resolution of the electromagnetic crystal calorimeter. The electronic signals from the calorimeter are interpreted during a process called reconstruction to determine the properties of the particles that created the signals.

In this paper, we describe a fuzzy clustering algorithm based on Graph Neural Networks (GNNs) to reconstruct photons. The term fuzzy clustering [2] refers to the partial assignment of individual calorimeter crystals to several clustering classes. In our case, these are potentially overlapping, different signal photons, but also a beam background class.

The paper is organized as follows: Section 2 gives an overview of related work on Machine Learning for calorimeter reconstruction. Section 3 describes the Belle II electromagnetic calorimeter. The event simulation and details of the beam background simulation are discussed in Section 4. The conventional Belle II reconstruction algorithm and the new GNN algorithm are described in Section 5. We introduce the metrics used to measure the performance of the GNN algorithm in Section 6. The main performance studies and results are discussed in Section 7. We summarize our results in Section 8.

2 Related work

Machine Learning is widely used in high energy physics for calorimeter signals, covering clustering [3, 4], energy regression [5, 6], particle identification [7, 8], and fast simulation [9, 10, 11]. Most of the recent work has been performed in the context of the high-granularity calorimeter (HGCAL) at CMS [12, 13]. For Belle II, the use of machine learning utilizing the electromagnetic calorimeter is so far limited to image-based particle identification in the barrel [14, 8].

GNNs are now widely recognized as one possible solution for irregular geometries in high energy physics [15, 16, 17]. GNN architectures that are able to learn a latent space representation of the detector geometry itself [18, 19] are the basis of the work presented in this paper.

Previous work has focused on simplified and idealized detector geometries, often approximated as a regular grid of readout cells expressed as 2D or 3D images. Additionally, geometry changes and overlaps between barrel and endcap regions, large variations of cell sizes, and very high, spatially non-uniform noise levels induced by beam background energy depositions are neglected.

For a complete list of works in particle physics that utilize machine learning, we refer to the review [20].

3 The Belle II Electromagnetic Calorimeter

The Belle II detector consists of several subdetectors arranged around the beam pipe in a cylindrical structure that is described in detail in Ref. [21, 22]. We define the $z$-axis of the laboratory frame as the central axis of the solenoid. The positive direction is pointing in the direction of the electron beam. The $x$ axis is horizontal and points away from the accelerator center, while the $y$ axis is vertical and points upwards. The longitudinal direction, the transverse plane with azimuthal angle $\phi$, and the polar angle $\theta$ are defined with respect to the detector’s solenoidal axis.

The Belle II electromagnetic calorimeter (ECL) consists of 8736 Thallium-doped CsI (CsI(Tl)) crystals that are grouped in a forward endcap, covering a polar angle $12.4^{\circ} < \theta < 31.4^{\circ}$, a barrel, covering a polar angle $32.2^{\circ} < \theta < 128.7^{\circ}$, and a backward endcap, covering a polar angle $130.7^{\circ} < \theta < 155.1^{\circ}$. The crystals have a trapezoidal geometry with a nominal cross-sectional area of approximately $6 \times 6$ cm$^{2}$ and a length of 30 cm, providing 16.1 radiation lengths of material. While crystals in the barrel are similar in cross-section and shape, the crystals in the endcaps vary with masses between 4.03 kg and 5.94 kg [23]; crystals in the endcaps also have significantly more passive material in front of the crystals. Each crystal is aligned in the direction of the collision point with a small tilt in polar angle $\theta$ to reduce detection inefficiencies from particles passing between two crystals. Crystals in the barrel additionally have a small tilt in azimuthal angle $\phi$. The scintillation light produced in the CsI(Tl) crystals is read out by two photodiodes glued to the back of each crystal.
After shaping electronics, the waveform is digitized and the crystal energy $E^{\mathrm{crystal}}_{\mathrm{rec}}$ over baseline and time $t^{\mathrm{crystal}}_{\mathrm{rec}}$ since trigger time of the energy deposition are reconstructed online using FPGAs [24]. Waveforms of crystals with energy depositions above 50 MeV are stored for offline processing to allow for electromagnetic vs. hadronic shower identification through pulse shape discrimination (PSD) [25]. Available information from PSD is

  • the fit type ID of a multi-template fit indicating which of the possible templates provides the best goodness-of-fit,

  • the respective $\chi^{2}$ value as an indicator of the goodness-of-fit,

  • and the ratio of reconstructed hadronic and photon template energies, referred to as PSD hadronic energy ratio in the following.

4 Data Set

In this work, we use simulated events to train and evaluate the reconstruction algorithms. The detector geometry and interactions of final-state particles with detector materials are simulated using Geant4 [26] combined with a dedicated detector response simulation. Simulated events are reconstructed and analyzed using the Belle II Analysis Software Framework (basf2) [27, 28]. We simulate isolated photons, with energy $0.1 < E_{\mathrm{gen}} < 1.5\,\text{GeV}$, and direction $17^{\circ} < \theta_{\mathrm{gen}} < 150^{\circ}$ and $0^{\circ} < \phi_{\mathrm{gen}} < 360^{\circ}$ drawn randomly from independent uniform distributions in $E$, $\theta$, and $\phi$. The generation vertex of the photons is $x=0$, $y=0$, and $z=0$. For events with two overlapping photons, we first randomly draw one photon from independent uniform distributions as outlined above. We then simulate a second photon with an angular separation $2.9 < \Delta\alpha < 9.7^{\circ}$ drawn randomly from uniform distributions in $\Delta\alpha$ and in $E$. This angular separation covers approximately the distance needed to create two overlapping clusters. These two cases are typical calorimeter signatures in Belle II that describe the majority of photons.
We note that the reconstruction of hadrons is a more difficult task not yet covered by our algorithm.
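
The sampling described above can be illustrated with a short NumPy sketch. It is not the generator used in basf2. The function names are hypothetical, and two details are assumptions that the text does not specify: the second photon is rotated uniformly around the direction of the first photon, and its energy is drawn from the same range as the first.

```python
import numpy as np

def unit_vector(theta_deg, phi_deg):
    """Unit direction vector for polar angle theta and azimuthal angle phi (degrees)."""
    t, p = np.radians(theta_deg), np.radians(phi_deg)
    return np.array([np.sin(t) * np.cos(p), np.sin(t) * np.sin(p), np.cos(t)])

def sample_first_photon(rng):
    """Draw E [GeV], theta [deg], phi [deg] from independent uniform distributions."""
    return rng.uniform(0.1, 1.5), rng.uniform(17.0, 150.0), rng.uniform(0.0, 360.0)

def sample_second_photon(rng, theta_deg, phi_deg):
    """Draw a second photon at an opening angle uniform in [2.9, 9.7] degrees."""
    u = unit_vector(theta_deg, phi_deg)
    # Build an orthonormal basis (e1, e2) perpendicular to the first photon.
    helper = np.array([0.0, 0.0, 1.0]) if abs(u[2]) < 0.99 else np.array([1.0, 0.0, 0.0])
    e1 = np.cross(u, helper)
    e1 /= np.linalg.norm(e1)
    e2 = np.cross(u, e1)
    delta_alpha = rng.uniform(2.9, 9.7)
    psi = rng.uniform(0.0, 2.0 * np.pi)  # assumption: uniform rotation about the first photon
    a = np.radians(delta_alpha)
    v = np.cos(a) * u + np.sin(a) * (np.cos(psi) * e1 + np.sin(psi) * e2)
    energy = rng.uniform(0.1, 1.5)       # assumption: same energy range as the first photon
    return energy, v, delta_alpha
```

By construction the returned direction `v` is a unit vector whose angle to the first photon equals `delta_alpha`.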

As part of the simulation, we overlay simulated beam background events corresponding to different collision conditions to our signal particles [29, 1]. The simulated beam backgrounds correspond to an instantaneous luminosity of $\mathcal{L}_{\text{beam}} = 1.06 \times 10^{34}$ cm$^{-2}$s$^{-1}$ (called low beam background), and $\mathcal{L}_{\text{beam}} = 8 \times 10^{35}$ cm$^{-2}$s$^{-1}$ (called high beam background). Those two values approximately correspond to the conditions in 2021, and the expected conditions slightly above the design luminosity, respectively. The spatial distribution of beam backgrounds is asymmetric: They are much higher in the backward endcap than in the forward endcap, and they are slightly higher in the barrel than in the forward endcap. Additional electronics noise per crystal of about 0.35 MeV is included in our simulation as well.

The supervised training and the performance evaluation both use labeled information that relies on matching reconstructed information with the simulated truth information. For each of the four configurations, isolated and overlapping photons with low and high beam backgrounds, we use 1.8 million events for training and 200 000 events for validation. The performance evaluation is carried out on a large number of statistically independent samples simulated with various energies and in different detector regions.

We then study the performance of the GNN clustering algorithm in all four scenarios and compare it to the baseline basf2 reconstruction. Both reconstruction algorithms are described in detail in Sec. 5.

4.1 Isolated Photon

To study isolated photons, we use the simulated events with a generated isolated photon only. For each event, we select a region of interest (ROI): We first determine the azimuthal angles of the fourth neighbour on either side of the local maximum (LM), and the polar angles of the fourth neighbours in either direction of the LM. We then include all crystals in that angular range. In the barrel this defines a regular $9 \times 9$ array of crystals centered around a LM, while in the endcaps this array is not necessarily regular, but can contain a few crystals more or less. The LM is a crystal with at least 10 MeV of reconstructed crystal energy, and energy higher than all its direct eight neighbors. The LM must be the only LM in the ROI, and the matched truth particle must be a simulated photon responsible for at least 20% of the reconstructed crystal energy. Precisely, for the LM we require the ratio

$$r^{\gamma_{1}}_{\mathrm{LM}} = \frac{E^{\gamma_{1},\mathrm{crystal}_{\mathrm{LM}}}_{\mathrm{dep}}}{E^{\mathrm{crystal}_{\mathrm{LM}}}_{\mathrm{rec}}} \geq 0.2. \tag{1}$$

Here, $E^{\gamma_{1},\mathrm{crystal}_{\mathrm{LM}}}_{\mathrm{dep}}$ denotes the truth energy deposition of photon 1 in the LM, and $E^{\mathrm{crystal}_{\mathrm{LM}}}_{\mathrm{rec}}$ the reconstructed crystal energy in the LM. The crystals contained in the ROI are considered for the clustering by the GNN algorithm and significantly extend the $5 \times 5$ area considered by the baseline algorithm (Sec. 5). Furthermore, the ROI represents the area of the local coordinate system later used as an input feature, with the LM as the origin. Figure 1 (top) shows a typical isolated photon event with high beam background.
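
As an illustration of the LM definition and the truth-matching ratio of Eq. (1), the following sketch works on a regular two-dimensional grid of crystal energies in GeV, which is an idealization of the barrel ROI. The function names and the array layout are hypothetical and are not taken from basf2.

```python
import numpy as np

LM_THRESHOLD_GEV = 0.010     # LM needs at least 10 MeV
TRUTH_FRACTION_MIN = 0.2     # Eq. (1)

def find_local_maxima(e_rec):
    """Return (row, col) of crystals that exceed all eight direct neighbours."""
    padded = np.pad(e_rec.astype(float), 1, constant_values=-np.inf)
    maxima = []
    for r in range(e_rec.shape[0]):
        for c in range(e_rec.shape[1]):
            window = padded[r:r + 3, c:c + 3].copy()
            centre = window[1, 1]
            window[1, 1] = -np.inf  # compare against the neighbours only
            if centre >= LM_THRESHOLD_GEV and centre > window.max():
                maxima.append((r, c))
    return maxima

def passes_isolated_selection(e_rec, e_dep_photon):
    """Exactly one LM in the ROI, truth-matched to the photon via Eq. (1)."""
    maxima = find_local_maxima(e_rec)
    if len(maxima) != 1:
        return False
    r, c = maxima[0]
    return e_dep_photon[r, c] / e_rec[r, c] >= TRUTH_FRACTION_MIN
```

Here `e_rec` holds the reconstructed crystal energies and `e_dep_photon` the truth energy depositions of the photon on the same grid.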

(a) Truth assignment, colors indicate the fraction belonging to each of the photons and beam background.
(b) Reconstructed time $t$ since trigger time.
(c) Reconstructed PSD hadronic energy ratio. Gray markers indicate that no PSD information is available.
Figure 1: Typical event displays showing (left) simulated truth assignments, (center) input variables time, and (right) PSD hadronic energy ratio for (top) isolated and (bottom) overlapping photons for two example events with high beam background. The marker centers indicate the crystal centers, the marker area is proportional to the truth energy deposition for the left plots; it is proportional to the reconstructed crystal energy for the other plots.

4.2 Overlapping Photons

Two different photons that deposit some of their energy in identical crystals are referred to as overlapping photons. To study overlapping photons, we use the simulated events with two overlapping photons only. We select events that have exactly two LMs that must fulfill the following selection criteria:

  a) each LM must have reconstructed crystal energies greater than 10 MeV,

  b) $r^{\gamma_{1}}_{\mathrm{LM_{1}}} \geq 0.2$ and $r^{\gamma_{1}}_{\mathrm{LM_{1}}} > r^{\gamma_{2}}_{\mathrm{LM_{1}}}$,

  c) $r^{\gamma_{2}}_{\mathrm{LM_{2}}} \geq 0.2$ and $r^{\gamma_{2}}_{\mathrm{LM_{2}}} > r^{\gamma_{1}}_{\mathrm{LM_{2}}}$.
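
Criteria a) to c) can be written as a small check on the two LM crystals. The sketch below is only meant to make the logic explicit. The dictionary-based interface is a hypothetical choice for illustration, with all energies in GeV.

```python
def passes_lm_separation(e_rec, e_dep):
    """Check criteria a)-c) for two local maxima.

    e_rec: {lm: reconstructed energy of the LM crystal}, lm in (1, 2)
    e_dep: {(photon, lm): truth energy deposited by the photon in the LM crystal}
    """
    # a) both LMs above 10 MeV
    if not all(e_rec[lm] > 0.010 for lm in (1, 2)):
        return False
    # Truth ratios r^{gamma_g}_{LM_lm}, defined as in Eq. (1)
    r = {(g, lm): e_dep[(g, lm)] / e_rec[lm] for g in (1, 2) for lm in (1, 2)}
    # b) photon 1 dominates LM 1
    crit_b = r[(1, 1)] >= 0.2 and r[(1, 1)] > r[(2, 1)]
    # c) photon 2 dominates LM 2
    crit_c = r[(2, 2)] >= 0.2 and r[(2, 2)] > r[(1, 2)]
    return crit_b and crit_c
```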

We refer to criteria a)-c) as LM separation criteria since they ensure that the particles form two separate LMs. Additionally, events must meet the overlap criterion:

  d) each of the two photons must deposit at least 10 MeV energy in shared crystals within a $5 \times 5$ area around its respective LM.

Figure 2: Fraction of selected overlapping photon events in the barrel as a function of generated opening angle. The orange markers correspond to events fulfilling LM separation criteria a)-c); the blue markers correspond to events that additionally pass the overlap criterion d) (see text for details).

Figure 2 shows the fraction of events accepted by these selections as a function of the simulated opening angle. In the scope of this paper, we additionally require LMs to exclusively originate from simulated particles without additional LMs, e.g. from beam background, in the ROI, that is:

  e) the two LMs must be the only ones in the ROI and they must be truth-matched to the simulated photons.

Finally, we remove rare cases of small truth energy depositions and large backgrounds by requiring:

  f) the crystal with the largest truth energy deposition of a photon must be within a $5 \times 5$ area around its corresponding LM.

We then create a ROI centered at the midpoint between the two LMs, calculated using the shortest distance between two LMs projected onto the surface of a sphere. The crystal closest to the midpoint is defined as the ROI center. The LM positions for this are determined by interpreting the global LM coordinates of their associated crystals as latitude and longitude. Figure 1 (bottom) shows an overlapping photon event with high beam background.
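
One way to compute such a midpoint is to convert both LM directions to unit vectors, add them, and normalize the sum, which gives the midpoint of the shortest great-circle arc. The sketch below assumes this construction is equivalent to the latitude/longitude treatment described above. The function names are hypothetical, and the crystal coordinates passed in are placeholders rather than actual ECL geometry.

```python
import numpy as np

def unit_vector(theta_deg, phi_deg):
    """Unit vector for polar angle theta and azimuthal angle phi (degrees)."""
    t, p = np.radians(theta_deg), np.radians(phi_deg)
    return np.array([np.sin(t) * np.cos(p), np.sin(t) * np.sin(p), np.cos(t)])

def great_circle_midpoint(theta1, phi1, theta2, phi2):
    """Midpoint of the shortest arc between two directions, as (theta, phi) in degrees."""
    m = unit_vector(theta1, phi1) + unit_vector(theta2, phi2)
    m /= np.linalg.norm(m)  # undefined for exactly opposite directions
    theta = np.degrees(np.arccos(np.clip(m[2], -1.0, 1.0)))
    phi = np.degrees(np.arctan2(m[1], m[0])) % 360.0
    return theta, phi

def closest_crystal(theta_mid, phi_mid, crystal_theta, crystal_phi):
    """Index of the crystal with the smallest angular distance to the midpoint."""
    mid = unit_vector(theta_mid, phi_mid)
    centres = np.stack([unit_vector(t, p) for t, p in zip(crystal_theta, crystal_phi)])
    return int(np.argmax(centres @ mid))  # largest dot product = smallest angle
```

Working with unit vectors avoids special handling of the wrap-around in $\phi$ when the two LMs sit on either side of $\phi = 0^{\circ}$.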

The truth energy deposition per photon and the reconstructed crystal energy $E^{\mathrm{crystal}}_{\mathrm{rec}}$, crystal time $t^{\mathrm{crystal}}_{\mathrm{rec}}$, crystal PSD information (see Sec. 3), and the LM positions within the ROI are recorded for each event.

5 Reconstruction Algorithms

Interactions of energetic photons in the Belle II ECL typically deposit energy in up to $5 \times 5$ crystals. The task of the clustering reconstruction algorithms is to select a set of crystals that contains all the energy of the incoming photon, but no energy from other particles or from beam background. Low beam background results in approximately $17\,\%$ of all crystals in the ECL having significant reconstructed energy $E^{\mathrm{crystal}}_{\mathrm{rec}} \geq 1\,$MeV; for high beam backgrounds this number is expected to increase to about $40\,\%$. This increase in the number of crystals to consider in the clustering adds to the complexity of the reconstruction.

5.1 Baseline

The baseline algorithm is designed to provide maximum efficiency for cluster finding, contain all crystals from the incoming particle for particle identification, and select an optimal subset of the cluster crystals that provides the best energy resolution [21]. The clustering is performed in three steps. In the first step, all crystals are grouped into connected sets of crystals, so-called connected regions, starting with LMs as defined previously. In an iterative procedure all direct neighbors with energies above 0.5 MeV are added to this LM, and the process is continued if any neighbor itself has energy above 10 MeV. Overlapping connected regions are merged into one.
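
The first step can be sketched as a breadth-first search on a regular grid. This is one reading of the description above and not the basf2 implementation: it assumes that "direct neighbors" means the eight surrounding crystals and that the search continues only from added crystals above 10 MeV. All names are hypothetical and energies are in GeV.

```python
from collections import deque
import numpy as np

ADD_THRESHOLD_GEV = 0.0005   # neighbours above 0.5 MeV are added
GROW_THRESHOLD_GEV = 0.010   # growth continues from neighbours above 10 MeV

def neighbours(r, c, shape):
    """Yield the up to eight direct neighbours of crystal (r, c)."""
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr or dc) and 0 <= r + dr < shape[0] and 0 <= c + dc < shape[1]:
                yield r + dr, c + dc

def grow_connected_region(e_rec, seed):
    """Grow one connected region starting from an LM seed."""
    region, queue = {seed}, deque([seed])
    while queue:
        r, c = queue.popleft()
        for n in neighbours(r, c, e_rec.shape):
            if n in region or e_rec[n] <= ADD_THRESHOLD_GEV:
                continue
            region.add(n)
            if e_rec[n] > GROW_THRESHOLD_GEV:
                queue.append(n)  # energetic neighbour: keep growing from it
    return region

def connected_regions(e_rec, seeds):
    """Build regions for all seeds and merge those that share crystals."""
    regions = []
    for seed in seeds:
        new = grow_connected_region(e_rec, seed)
        for reg in [reg for reg in regions if reg & new]:
            new |= reg
            regions.remove(reg)
        regions.append(new)
    return regions
```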

In the second step, each connected region is split into clusters, one per LM. If there is only one LM in the connected region, up to 21 crystals in a $5 \times 5$ area excluding corners centered at the local maximum are grouped into a cluster. If there is more than one LM in a connected region, the energy in each crystal of the connected region is assigned a distance-dependent weight and can be shared between different clusters. The distance is calculated from the cluster centroid to each crystal center, where the cluster centroid is updated iteratively using logarithmic energy weights. This process is repeated until all cluster centroids in a connected region are stable within 1 mm.

In a third step, an optimal subset, including the $n$ highest energetic crystals of all non-zero weighted crystals that minimize the energy resolution, is used to predict the cluster energy $E_{\mathrm{rec}}^{\mathrm{basf2}}$. $n$ depends on the measured noise in the event, and on the energy of the LM itself. The noise level is estimated by counting the number of crystals in the event containing more than 5 MeV that have times $t$ more than 125 ns from the trigger time. $E_{\mathrm{rec}}^{\mathrm{basf2}}$ is also corrected already within basf2 for possible bias using simulated events. This bias includes leakage (energy not deposited in the crystals included in the energy sum) and beam backgrounds (energy included in the sum that is not from the signal photon). $E_{\mathrm{rec}}^{\mathrm{basf2}}$ is the estimator for the generated energy of a particle.
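
The noise estimate described above amounts to counting out-of-time crystals. A minimal sketch, assuming that "more than 125 ns from the trigger time" means $|t| > 125$ ns and using a hypothetical function name:

```python
import numpy as np

def count_out_of_time_crystals(e_rec_gev, t_rec_ns):
    """Count crystals with E > 5 MeV and |t| > 125 ns relative to the trigger."""
    e = np.asarray(e_rec_gev, dtype=float)
    t = np.asarray(t_rec_ns, dtype=float)
    return int(np.sum((e > 0.005) & (np.abs(t) > 125.0)))
```

How this count is mapped to the number of crystals $n$ is not specified in the text and is therefore not sketched here.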

The basf2 clustering algorithm also returns a cluster energy $E_{\mathrm{rec,\,raw}}^{\mathrm{basf2}}$ that is not corrected for energy bias. $E_{\mathrm{rec,\,raw}}^{\mathrm{basf2}}$ is the estimator for the deposited energy of a particle.

5.2 Graph Neural Network Architecture

GNN architectures have shown that they are powerful network types to deal with both irregular geometries and varying input sizes. In this work, all crystals of an ROI with an energy deposition above 1 MeV are interpreted as nodes in a graph, which leads to variable input sizes and is thus a good use case for GNNs. The implementation of this GNN is done in PyTorch Geometric [30].

The input features consist of crystal properties and crystal measurements: The global coordinates $\theta$ and $\phi$ of each crystal, the local coordinates $\theta^{\prime}$ and $\phi^{\prime}$ with respect to the ROI center, the crystal mass, and the LM(s) (in one-hot encoding) represent crystal properties. The crystal energy $E^{\mathrm{crystal}}_{\mathrm{rec}}$ in GeV, the time $t^{\mathrm{crystal}}_{\mathrm{rec}}$ in $\mu$s, and the PSD fit type, PSD $\chi^{2}$, and PSD hadronic energy ratio are crystal measurements used as input features. Pre-processing scales the input uniformly before further processing with the GNN: All features are min-max normalized to an interval of $[0,1]$ with the exception of $t^{\mathrm{crystal}}_{\mathrm{rec}}$ and the PSD hadronic energy ratio which are both normalized to the interval $[-1,1]$. The global coordinates and the crystal masses are normalized based on the range of coordinates and masses of all crystals in the detector instead of only the ones in the ROI. Additionally, we average each input feature over all nodes in the ROI and concatenate the averaged input features as additional inputs, thus enabling a global exchange of information.
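
The two pre-processing operations, min-max scaling and the global exchange by appending ROI averages, can be sketched as follows. The function names are hypothetical, and the normalization bounds `lo` and `hi` are inputs because the ranges actually used are not listed in the text.

```python
import numpy as np

def min_max_scale(x, lo, hi, target=(0.0, 1.0)):
    """Linearly map values from [lo, hi] to the target interval."""
    a, b = target
    return a + (np.asarray(x, dtype=float) - lo) * (b - a) / (hi - lo)

def append_global_average(node_features):
    """Concatenate the per-feature ROI average to every node's feature vector."""
    mean = node_features.mean(axis=0, keepdims=True)
    tiled = np.repeat(mean, node_features.shape[0], axis=0)
    return np.concatenate([node_features, tiled], axis=1)
```

For example, `min_max_scale(t, lo, hi, target=(-1.0, 1.0))` would correspond to the treatment of the crystal time, while the default target matches the $[0,1]$ interval used for most features.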

As displayed in Fig. 3, our model is built out of four so-called GravNet [19] blocks whose concatenated outputs are passed through three dense output layers with a final softmax activation function. Each GravNet block features three dense layers at the beginning of the block, the first two with ELU [31] activation functions and the last one with a $\tanh$ activation function. The dense layers feed into a GravNet layer and the overall GravNet block is concluded by a batch normalization layer [32]. The GravNet layer is responsible for the graph building and subsequent message passing between the nodes of the graph. It first translates the input features into two learned representation spaces: one representing spatial information $S$ while the other, denoted $F_{\mathrm{LR}}$, contains the transformed features used for message passing. In the second step, each node is connected to its $k$ nearest neighbors defined by the Euclidean distances in $S$, thus creating an undirected, connected graph. For each node, the input features of connected nodes are then weighted by a Gaussian potential depending on the distance in $S$ and aggregated by summation. The resulting features are concatenated with the GravNet input features and, after batch normalization, passed to the next GravNet block and to the dense output layers.
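
To make the message-passing step concrete, the following NumPy sketch implements a single, untrained GravNet-style aggregation. It is not the PyTorch Geometric layer used in this work. The matrices `w_s` and `w_f` stand in for the learned projections into $S$ and $F_{\mathrm{LR}}$, the sketch weights the $F_{\mathrm{LR}}$ features of the neighbors, and the exact form $\exp(-d^{2})$ of the Gaussian potential is an assumption.

```python
import numpy as np

def gravnet_aggregate(x, w_s, w_f, k):
    """One GravNet-style aggregation step (illustrative, no training).

    x:   (N, D) node features entering the GravNet layer
    w_s: (D, dim_S) projection into the spatial space S (stand-in for learned weights)
    w_f: (D, dim_F) projection into the feature space F_LR (stand-in)
    k:   number of nearest neighbours in S, must be smaller than N
    """
    s = x @ w_s
    f = x @ w_f
    # Squared Euclidean distances in S between all node pairs
    d2 = ((s[:, None, :] - s[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)               # a node is not its own neighbour
    nn = np.argsort(d2, axis=1)[:, :k]         # indices of the k nearest neighbours
    d2_nn = np.take_along_axis(d2, nn, axis=1)
    weights = np.exp(-d2_nn)                   # assumed Gaussian potential
    aggregated = (weights[:, :, None] * f[nn]).sum(axis=1)
    # Concatenate the aggregated messages with the layer input
    return np.concatenate([x, aggregated], axis=1)
```

With the isolated photon settings of Table 1 (dense width 22, $S$ dimension 6, $F_{\mathrm{LR}}$ dimension 16), the output of this sketch has $22 + 16 = 38$ features per node.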

The implementation in the present work follows the concept of fuzzy clustering which refers to the partial assignment of individual crystals to several clustering classes. Consequently, the GNN predicts weights $w_{i}^{\mathrm{X}}$ that indicate the proportion of the reconstructed energy $E^{\mathrm{crystal_{i}}}_{\mathrm{rec}}$ in a crystal $i$ that belongs to a clustering class X. For models used with isolated photons, $\mathrm{X} \in \{\gamma_{1}, \mathrm{background}\}$, for models with overlapping photons $\mathrm{X} \in \{\gamma_{1}, \gamma_{2}, \mathrm{background}\}$. As a loss function, we then use the Mean Squared Error (MSE) between the true and predicted weights summed over all classes and crystals. The training is stopped when there has been no improvement for 15 epochs in the optimization objective. For low beam background models that objective is the MSE loss on the validation data set, whereas the high beam background models employ the more high-level FWHM$_{\mathrm{dep}}$ (Sec. 6) on the validation data set.
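
A small sketch of how such fuzzy weights could be handled is given below. The softmax corresponds to the final activation of the output layers. The normalization of the MSE (here a plain mean over crystals and classes) and the per-class energy sum $\sum_i w_i^{\mathrm{X}} E^{\mathrm{crystal}_i}_{\mathrm{rec}}$ are assumptions for illustration, and the function names are hypothetical.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax: per-crystal weights that sum to one over the classes."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fuzzy_mse_loss(w_pred, w_true):
    """Mean squared error between predicted and true weights (N crystals x C classes)."""
    return float(np.mean((w_pred - w_true) ** 2))

def class_energies(weights, e_rec):
    """Energy assigned to each class: sum over crystals of w_i^X * E_rec^i."""
    return weights.T @ e_rec
```

For an ROI with reconstructed energies `e_rec` and weights of shape `(N, 3)`, `class_energies` returns one energy each for $\gamma_{1}$, $\gamma_{2}$, and background, and the three values add up to the total reconstructed energy in the ROI.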

Figure 3: An illustration of the GNN architecture. Each pair of gray, square brackets represents one GravNet block consisting of dense layers, a GravNet layer and a batch norm layer. The input features describe the feature vector of one node. The global exchange denotes appending the average of each input feature over all nodes in the ROI.
Table 1: Optimized hyperparameters of the isolated photon and overlapping photon GravNet models. The hyperparameters are the result of an optimization of the FWHM$_{\mathrm{dep}}$ on the respective high background validation data set.
| Hyperparameter | Isolated Photon Models | Overlapping Photon Models |
| --- | --- | --- |
| Width of the Dense Layers, F$_{\mathrm{IN}}$, F$_{\mathrm{OUT}}$ | 22 | 24 |
| Feature Space Dimension F$_{\mathrm{LR}}$ | 16 | 16 |
| Spatial Information Space Dimension S | 6 | 6 |
| Connected Nearest Neighbors $k$ | 14 | 16 |
| Batch Norm Momentum | 0.01 | 0.4 |
| Stacked GravNet Blocks | 4 | 4 |
| Batch Size | 1024 | 512 |
Figure 4: Comparison of (a) truth energy fractions, (b) energy fractions reconstructed by the GNN, and (c) energy fractions reconstructed by basf2 for an example event with high beam background. Colors indicate the fractions belonging to each photon or background. The marker centers indicate the crystal centers; the marker area is proportional to the truth or reconstructed (GNN, basf2) energy deposition, respectively.

Hyperparameters have been chosen through a hyperparameter optimization using Optuna [33]. The optimization is done with respect to the FWHM$_{\mathrm{dep}}$ (Sec. 6) instead of the loss function. We optimize the two models trained for high beam backgrounds and use the respective hyperparameters also for the corresponding low beam background models. The final hyperparameters for both the isolated photon models and the overlapping photon models are shown in Table 1.

The learning rate, the number of dense layers in each GravNet block, and all dimensions of the output layers have been manually optimized by testing a reasonable range of values. The learning rate is set to $5\times10^{-3}$ and is subject to a decay factor of 0.25 after every five epochs of stagnating validation loss. We did not observe significant over-training and, as a consequence, we do not use dropout layers or other regularization methods but rely on the large data set.
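The interplay of learning-rate decay and early stopping can be sketched as follows. This is an illustrative reading of the numbers quoted in the text (initial rate $5\times10^{-3}$, decay factor 0.25 after five stagnating epochs, stop after 15 epochs without improvement), not the authors' training code. The function name `run_schedule`, the use of a single monitored quantity for both decisions, and the toy objective values are assumptions.

```python
def run_schedule(objective_per_epoch, lr=5e-3, decay=0.25,
                 decay_patience=5, stop_patience=15):
    """Return the learning rate used after each epoch until early stopping.

    objective_per_epoch: validation objective per epoch (lower is better).
    Assumption: the same quantity drives both the decay and the stopping.
    """
    best = float("inf")
    stale = 0            # epochs since the last improvement
    history = []
    for value in objective_per_epoch:
        if value < best:
            best, stale = value, 0
        else:
            stale += 1
            if stale % decay_patience == 0:
                lr *= decay          # decay after every five stagnating epochs
        history.append(lr)
        if stale >= stop_patience:   # stop after 15 epochs without improvement
            break
    return history

# Toy objective: two improving epochs followed by a plateau.
history = run_schedule([1.0, 0.9] + [0.95] * 20)
```

In this toy run the rate is reduced three times during the plateau before the training stops.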

The GNN algorithm yields the weights $w_i^{\mathrm{X}}$ per crystal for all crystals in the ROI with an energy deposition above 1 MeV. In order to reconstruct the total cluster energy $E_{\mathrm{rec}}^{\mathrm{GNN}}$ associated with a certain particle, we then sum over all specific weights multiplied by the reconstructed energies per crystal, $E_{\mathrm{rec}}^{\mathrm{GNN}} = \sum_i w_i^{\mathrm{X}} E^{\mathrm{crystal}_i}_{\mathrm{rec}}$.
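A minimal sketch of this weighted sum is given below. The function name `cluster_energies`, the array layout, the choice of GeV as unit, and the example numbers are assumptions for illustration; only the 1 MeV threshold and the sum itself are taken from the text.

```python
import numpy as np

def cluster_energies(weights, e_crystal, threshold=1e-3):
    """Sum w_i^X * E_rec^crystal_i over crystals for every class X.

    weights:   array of shape (n_crystals, n_classes)
    e_crystal: reconstructed crystal energies in GeV, shape (n_crystals,)
    threshold: crystals at or below this energy (1 MeV) are ignored
    """
    weights = np.asarray(weights, dtype=float)
    e_crystal = np.asarray(e_crystal, dtype=float)
    used = e_crystal > threshold
    return weights[used].T @ e_crystal[used]

# Hypothetical ROI: four crystals, classes (gamma_1, background).
weights = np.array([[0.9, 0.1], [0.5, 0.5], [1.0, 0.0], [0.0, 1.0]])
e_crystal = np.array([0.2, 0.1, 0.0005, 0.05])
e_gamma1, e_background = cluster_energies(weights, e_crystal)
```

If the weights of each crystal sum to one, the class energies add up to the total energy of all crystals above threshold.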

Figure 4 shows how the GNN and the basf2 algorithms behave in clustering a typical case of overlapping photons.

6 Metrics

For performance evaluation, the reconstructed energy of a particle is compared with two different truth targets: the total deposited truth energy $E_{\mathrm{dep}}$ per photon in the ROI, and the generated truth energy $E_{\mathrm{gen}}$ per photon. This results in two variants of relative reconstruction errors. The reconstruction error on the deposited energy

$$\eta_{\text{dep}}^{\text{basf2}} = \frac{E_{\mathrm{rec,\,raw}}^{\mathrm{basf2}} - E_{\mathrm{dep}}}{E_{\mathrm{dep}}} \quad\text{and}\quad \eta_{\text{dep}}^{\text{GNN}} = \frac{E_{\mathrm{rec}}^{\mathrm{GNN}} - E_{\mathrm{dep}}}{E_{\mathrm{dep}}} \qquad (2)$$

gives access to the energy resolution ignoring leakage and other detector effects. It is a direct evaluation of the clustering performance of an algorithm.

On the other hand, the reconstruction error on the generated energy

$$\eta_{\text{gen}}^{\text{basf2}} = \frac{E_{\mathrm{rec}}^{\mathrm{basf2}} - E_{\mathrm{gen}}}{E_{\mathrm{gen}}} \quad\text{and}\quad \eta_{\text{gen}}^{\text{GNN}} = \frac{E_{\mathrm{rec}}^{\mathrm{GNN}} - E_{\mathrm{gen}}}{E_{\mathrm{gen}}} \qquad (3)$$

factors in all detector and physics effects and quantifies how much of the improvements to the underlying clustering carry over to downstream physics object reconstruction.

Figure 5: Example distribution of the relative reconstruction error $\eta_{\text{gen}}$ of the generated energy and illustration of the bias correction, the FWHM, and the tail ranges.

Evaluating both algorithms on a large number of simulated photons yields peaking distributions in both reconstruction errors $\eta_{\text{dep}}$ and $\eta_{\text{gen}}$. Both distributions are potentially biased because of energy leakage and the presence of beam backgrounds (see Sec. 5.1). We perform a binned fit using a double-sided Crystal Ball function [34, 35] as probability density function (pdf) with the kafe2 [36] framework. We shift all reconstruction error distributions independently by a multiplicative factor to correct the difference between the fitted peak position and zero (Fig. 5). Since $\eta_{\text{dep}}$ and $\eta_{\text{gen}}$ are asymmetric distributions, we repeat this procedure until the difference between the fitted peak position and zero is less than 0.002. This procedure usually converges within two or three iterations.
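The iterative correction can be illustrated with a short sketch. Several simplifications are made and should be read as assumptions: the double-sided Crystal Ball fit with kafe2 is replaced by a simple stand-in (a parabola fitted to the log-counts around the highest histogram bin), the multiplicative factor is applied to the reconstructed energy, and the toy data, function names, and binning are hypothetical. Only the stopping criterion of 0.002 is taken from the text.

```python
import numpy as np

def peak_position(eta, bins=200, lo=-0.5, hi=0.5, half_window=4):
    """Peak estimate from a parabola fitted to the log-counts around the
    highest histogram bin (a local Gaussian approximation).
    Stand-in for the double-sided Crystal Ball fit used in the text."""
    counts, edges = np.histogram(eta, bins=bins, range=(lo, hi))
    centers = 0.5 * (edges[:-1] + edges[1:])
    i_max = int(np.argmax(counts))
    first = max(i_max - half_window, 0)
    last = min(i_max + half_window + 1, bins)
    x = centers[first:last] - centers[i_max]   # centered for numerical stability
    y = counts[first:last]
    filled = y > 0
    a, b, _ = np.polyfit(x[filled], np.log(y[filled]), 2)
    if a >= 0:                                 # no concave peak found
        return float(centers[i_max])
    return float(centers[i_max] - b / (2.0 * a))

def correct_bias(e_rec, e_true, tol=0.002, max_iter=10):
    """Rescale e_rec until the peak of eta = scale * e_rec / e_true - 1
    lies within `tol` of zero. Returns (scale, peak, number of iterations)."""
    ratio = np.asarray(e_rec, dtype=float) / np.asarray(e_true, dtype=float)
    scale = 1.0
    for n_iter in range(1, max_iter + 1):
        peak = peak_position(scale * ratio - 1.0)
        if abs(peak) < tol:
            return scale, peak, n_iter
        scale /= 1.0 + peak
    raise RuntimeError("bias correction did not converge")

# Toy data (hypothetical): Gaussian core biased low plus a leakage-like left tail.
rng = np.random.default_rng(seed=1)
n = 200_000
e_true = rng.uniform(0.1, 1.5, size=n)
leak = rng.exponential(0.05, size=n) * (rng.random(n) < 0.2)
e_rec = e_true * (0.97 * (1.0 + 0.03 * rng.standard_normal(n)) - leak)
scale, peak, n_iter = correct_bias(e_rec, e_true)
```

Because the peak estimate is made on a fixed binning and the distribution is asymmetric, a single rescaling does not necessarily place the re-estimated peak exactly at zero, which is why the loop repeats until the tolerance is met.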

We then determine the full width at half maximum (FWHM) of the final shifted distributions in $\eta_{\text{dep}}$ and $\eta_{\text{gen}}$, yielding $\text{FWHM}_{\text{dep}}$ and $\text{FWHM}_{\text{gen}}$, respectively. The uncertainty on the FWHM is calculated from the uncertainties of the fit parameters. In addition to the FWHM, we determine the tails of the reconstruction error distribution. The left and right tails $T_{\text{L,R}}$ are calculated as the 95th percentile when ranking the unbinned events on the respective side of the peak position, as given by the fit parameters, in ascending order ($T_{\text{R}}$) and descending order ($T_{\text{L}}$), respectively. Propagating the uncertainty on the peak position as given by the fit yields the uncertainty on $T_{\text{L,R}}$.
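Both quantities can be sketched numerically as follows. The helper names are hypothetical, the FWHM is evaluated on a sampled curve rather than from fit parameters, and the tail lengths are taken as distances from the peak, which is an assumed reading of the definition above.

```python
import numpy as np

def fwhm_from_curve(x, y):
    """Full width at half maximum of a single-peaked curve y(x) sampled on an
    increasing grid, using linear interpolation at both half-maximum crossings."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    half = 0.5 * y.max()
    above = np.flatnonzero(y >= half)
    i_lo, i_hi = above[0], above[-1]
    if i_lo == 0 or i_hi == len(y) - 1:
        raise ValueError("curve does not drop below half maximum inside the grid")
    x_left = np.interp(half, [y[i_lo - 1], y[i_lo]], [x[i_lo - 1], x[i_lo]])
    x_right = np.interp(half, [y[i_hi + 1], y[i_hi]], [x[i_hi + 1], x[i_hi]])
    return float(x_right - x_left)

def tail_lengths(eta, peak, percentile=95.0):
    """Distances from the peak that contain `percentile` percent of the
    unbinned events on the left (T_L) and right (T_R) side of the peak."""
    eta = np.asarray(eta, dtype=float)
    t_left = np.percentile(peak - eta[eta < peak], percentile)
    t_right = np.percentile(eta[eta > peak] - peak, percentile)
    return float(t_left), float(t_right)

# Check on a Gaussian shape, for which FWHM / 2.355 recovers sigma.
sigma = 0.03
grid = np.linspace(-0.3, 0.3, 6001)
fwhm = fwhm_from_curve(grid, np.exp(-0.5 * (grid / sigma) ** 2))
resolution = fwhm / 2.355

# Check on evenly spaced toy events around a peak at zero.
t_left, t_right = tail_lengths(np.linspace(-1.0, 1.0, 2001), peak=0.0)
```

The Gaussian check also shows why the FWHM is divided by 2.355 when it is quoted as a resolution: for a Gaussian, this ratio equals the standard deviation.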

7 Results

The first sections of the results focus on detailed studies of isolated clusters. Section 7.4 then introduces overlapping clusters and their effects on the performance. Figure 6 shows examples for the distributions of both reconstruction errors $\eta_{\text{dep}}$ and $\eta_{\text{gen}}$, as well as the fit results for events with low beam background. Figure 7 shows the equivalent distributions for events with high beam background.

The $\eta_{\text{gen}}$ distributions are wider because the reconstruction error includes the effects of leakage, which result in missing energy with respect to the generated photon energy. This only affects the left-side tails.

Figure 6: Distributions of the relative reconstruction errors (a) $\eta_{\text{dep}}$ of the deposited energy and (b) $\eta_{\text{gen}}$ of the generated energy for isolated clusters for low beam backgrounds. The first bin contains all underflow entries; the last bin contains all overflow entries.
Figure 7: Distributions of the relative reconstruction errors (a) $\eta_{\text{dep}}$ of the deposited energy and (b) $\eta_{\text{gen}}$ of the generated energy for isolated clusters for high beam backgrounds. The first bin contains all underflow entries; the last bin contains all overflow entries.

In the following subsections, we compare the performance of the GNN and the basf2 reconstruction algorithms for different detector regions for low and high beam backgrounds by evaluating the energy resolution $\text{FWHM}_{\text{gen}}/2.355$ and the tail parameters. We then analyze the GNN in more detail by testing the input variable dependencies and the robustness against differences in beam background levels between training and evaluation.

7.1 Energy resolution and energy tails

The three detector regions barrel, forward endcap, and backward endcap described in Sec. 3 differ in crystal geometry, levels of background, and amount of passive material before and in between crystals. The following section studies the variations in the energy reconstruction performance that arise as a direct result of these differences.

Figure 8: Resolution $\text{FWHM}_{\text{gen}}/2.355$ of the GNN and basf2 as a function of the simulated photon energy $E_{\mathrm{gen}}$ for both endcaps and the barrel for (a) low and (b) high beam background. Each color is associated with one detector region; the light color indicates basf2, the dark color the GNN. The bands indicate the uncertainty of the fits, see text for details. The fit parameters are summarized in Tab. 2.

In order to assess the energy dependence of the resolution and tail parameters, we simulate test data sets of photons at various fixed energies. The FWHM for each simulated data set is then determined according to Sec. 6. Plotting the resolutions $\text{FWHM}_{\text{gen}}/2.355$ over the generated photon energies $E_{\mathrm{gen}}$ reveals a characteristic relationship that is parameterized by the function $a/E_{\mathrm{gen}} \oplus b/\sqrt{E_{\mathrm{gen}}} \oplus c$, where $\oplus$ indicates addition in quadrature.

Both the GNN and the baseline algorithm perform differently with regard to the energy resolution in the three detector parts, as can be seen in Fig. 8(a) for low beam background and in Fig. 8(b) for high beam background. Table 2 reports the parameters of the fitted parameterization of the resolution. We attribute these differences to the large spread of both shape and size of crystals in the endcaps, the asymmetric distribution of beam backgrounds, and the different amount of passive material in front of the different detector regions.

Overall, the energy resolution of the GNN algorithm is significantly better than that of the baseline algorithm for all photon energies. The GNN energy resolution is better by more than 30 % for photon energies below $500\,\mathrm{MeV}$, which is the energy range of more than 90 % of all photons in $B$-meson decay chains. The higher the beam background, the larger the difference between the GNN and the baseline algorithm. The difference between the two algorithms decreases with energy because the relative contribution of beam backgrounds to the photon energy resolution decreases.
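To make the parameterization concrete, the following sketch evaluates $a/E_{\mathrm{gen}} \oplus b/\sqrt{E_{\mathrm{gen}}} \oplus c$ with the barrel parameters for high beam background from Tab. 2. It assumes that $E_{\mathrm{gen}}$ enters in GeV; the function name, the evaluation energy of 0.1 GeV, and the way the relative gain is defined here are choices made for illustration only.

```python
import math

def resolution(e_gen, a, b, c):
    """Relative resolution a/E (+) b/sqrt(E) (+) c, added in quadrature.
    Assumption: e_gen is given in GeV."""
    return math.sqrt((a / e_gen) ** 2 + (b / math.sqrt(e_gen)) ** 2 + c ** 2)

# Barrel, high beam background, parameters from Table 2 (quoted in units of 1e-2).
gnn = dict(a=1.25e-2, b=2.39e-2, c=0.75e-2)
basf2 = dict(a=1.88e-2, b=3.11e-2, c=0.31e-2)

e_gen = 0.1  # GeV, example energy chosen for illustration
res_gnn = resolution(e_gen, **gnn)
res_basf2 = resolution(e_gen, **basf2)

# Fractional reduction of the resolution relative to the baseline value.
relative_gain = 1.0 - res_gnn / res_basf2
```

At low energies the $a/E_{\mathrm{gen}}$ term dominates, so the smaller value of $a$ for the GNN accounts for most of the difference between the two algorithms in this example.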

The shape of the left-side tails is dominated by passive material and is hence expected to be different in the different detector regions. The left-side tails are almost independent of beam backgrounds, as can be seen by comparing Fig. 9(a) for low beam background and Fig. 9(c) for high beam background. The GNN and the baseline algorithm both show the smallest tail length for the barrel region, with decreasing tail lengths for increasing energy. As expected, the left-side tails are largest in the backward endcap, which has the highest ratio of passive to active material. The right-side tails mostly originate from beam background being wrongly added to photon clusters. The GNN produces shorter tails than the baseline algorithm for all energies and for both low and high beam backgrounds, with the performance difference increasing for lower energies and higher beam backgrounds.

Table 2: Fit results ($a/E_{\mathrm{gen}} \oplus b/\sqrt{E_{\mathrm{gen}}} \oplus c$) of the fits shown in Fig. 8.
| Region | Algorithm | Low Beam Background: a ($\times 10^{-2}$) | b ($\times 10^{-2}$) | c ($\times 10^{-2}$) | High Beam Background: a ($\times 10^{-2}$) | b ($\times 10^{-2}$) | c ($\times 10^{-2}$) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Barrel | GNN | 0.23±0.02 | 1.32±0.02 | 1.00±0.01 | 1.25±0.02 | 2.39±0.02 | 0.75±0.03 |
| Barrel | basf2 | 0.35±0.02 | 1.54±0.02 | 0.91±0.02 | 1.88±0.02 | 3.11±0.03 | 0.31±0.10 |
| Forward | GNN | 0.00 +0.14 | 1.11±0.01 | 1.49±0.00 | 0.61±0.03 | 2.23±0.02 | 1.20±0.02 |
| Forward | basf2 | 0.00 +0.37 | 1.51±0.01 | 1.38±0.01 | 1.11±0.03 | 2.92±0.03 | 0.84±0.03 |
| Backward | GNN | 0.50±0.02 | 1.69±0.03 | 1.59±0.02 | 2.18±0.03 | 2.51±0.05 | 2.28±0.02 |
| Backward | basf2 | 0.78±0.03 | 2.12±0.04 | 1.50±0.03 | 2.72±0.05 | 4.64±0.05 | 0.91±0.08 |
Figure 9: 95 % left and right tail lengths $T_L$ and $T_R$ of $\eta_{\text{gen}}$ for the GNN and basf2 as a function of the simulated photon energy $E_{\mathrm{gen}}$ for both endcaps and the barrel: (a) $T_L$ and (b) $T_R$ for low beam background, (c) $T_L$ and (d) $T_R$ for high beam background. Each color is associated with one detector region.

7.2 Beam Background Robustness

The beam background levels change continuously during detector operations. Ideally, reconstruction algorithms at Belle II are insensitive to such changes. The basf2 baseline algorithm achieves robustness against increasing beam backgrounds by adaptively including fewer crystals in the energy sum calculation. Since our GNN is trained with a large number of events with event-by-event fluctuations of beam backgrounds, we expect robustness against varying beam backgrounds if the GNN generalizes well enough. We test the robustness of our GNN by comparing GNNs trained and tested on the same beam background with GNNs trained on one beam background and tested on the other (Fig. 10, parameterization in Tab. 3). While the GNNs trained on the same beam background achieve a better resolution than the ones trained on the other beam background, the GNN still outperforms the baseline algorithm even when it is trained on the other beam background. This demonstrates a promising generalization with respect to different levels of beam backgrounds.

Figure 10: Resolution $\text{FWHM}_{\text{gen}}/2.355$ as a function of the simulated photon energy $E_{\mathrm{gen}}$ for the GNNs trained with low beam background (LBB GNN) and high beam background (HBB GNN) in the barrel. The color is associated with the evaluation on either beam background; the dark color indicates the model trained with the beam background identical to the evaluation, and the light color indicates the model trained with the respective other beam background. The bands indicate the uncertainty of the fits, see text for details. The fit parameters are summarized in Tab. 3. The resolution of the basf2 algorithm is shown for comparison.
Table 3: Fit results ($a/E_{\mathrm{gen}} \oplus b/\sqrt{E_{\mathrm{gen}}} \oplus c$) of the fits shown in Fig. 10 for the GNN trained with low beam background (LBB GNN) and high beam background (HBB GNN). The values for the LBB GNN inferred on low beam background test samples and for the HBB GNN inferred on high beam background test samples are identical to the ones reported in Tab. 2.
| Region | Algorithm | Low Beam Background: a ($\times 10^{-2}$) | b ($\times 10^{-2}$) | c ($\times 10^{-2}$) | High Beam Background: a ($\times 10^{-2}$) | b ($\times 10^{-2}$) | c ($\times 10^{-2}$) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Barrel | LBB GNN | 0.23±0.02 | 1.32±0.02 | 1.00±0.01 | 1.59±0.02 | 2.27±0.03 | 1.32±0.02 |
| Barrel | HBB GNN | 0.28±0.02 | 1.58±0.01 | 0.85±0.02 | 1.25±0.02 | 2.39±0.02 | 0.75±0.03 |

7.3 Input Parameter Dependency

As discussed in Sec. 3, multiple input features are available for the GNN, while the basf2 algorithm uses crystal position and energy only. This section presents a study of the influence of the input features on the FWHM. For that, the architecture described in Sec. 5.2 is trained on isolated photon events with low or high beam backgrounds using different combinations of input features. The 200 000 events from the respective validation data set, as described in Sec. 4, are used for inference. The data set covers an energy range of $0.1 < E_{\mathrm{gen}} < 1.5\,\text{GeV}$ and the full detector range $17^{\circ} < \theta_{\mathrm{gen}} < 150^{\circ}$ and $0^{\circ} < \phi_{\mathrm{gen}} < 360^{\circ}$, each sampled from a uniform distribution. FWHM$_{\mathrm{gen}}$ and FWHM$_{\mathrm{dep}}$ are calculated as described in Sec. 6. All GNNs use the global crystal coordinates, the LM position, and the crystal mass as input features. A comparison of the FWHM for the different additional input features is shown in Tab. 4. The results show that even for the minimal set of input variables, the GNN's FWHM$_{\mathrm{dep}}$ is smaller than that of basf2 in both beam background scenarios, and its FWHM$_{\mathrm{gen}}$ is smaller at high beam background; at low beam background, FWHM$_{\mathrm{gen}}$ falls below the basf2 value only once time information is included.
Adding local coordinates leads to small improvements, and using time information brings a significant improvement in the GNN performance. PSD information has almost no effect on the FWHM. Since the main purpose of the PSD information is to differentiate electromagnetic and hadronic interactions per crystal, this is expected. In anticipation of future extensions of the GNN to hadronic interactions, the PSD information is kept throughout this work.

Table 4: Comparison of the performances of GNN models with different additional input features, and the performance of the basf2 baseline. Shown are the $\mathrm{FWHM_{dep}}$ and $\mathrm{FWHM_{gen}}$ (see Sec. 6) for 200 000 events in the validation data sets (see Sec. 4) with low and high beam background. The data sets cover an energy range of $0.1 < E_{\mathrm{gen}} < 1.5\,\text{GeV}$ and the full detector range $17^{\circ} < \theta_{\mathrm{gen}} < 150^{\circ}$ and $0^{\circ} < \phi_{\mathrm{gen}} < 360^{\circ}$, each sampled from a uniform distribution. The uncertainties of the FWHM in each column are correlated since they use the same simulated events. The input features are described in detail in Sec. 3.
| Input Features | Low Beam Background: $\mathrm{FWHM_{dep}}$ ($\times 10^{-2}$) | Low Beam Background: $\mathrm{FWHM_{gen}}$ ($\times 10^{-2}$) | High Beam Background: $\mathrm{FWHM_{dep}}$ ($\times 10^{-2}$) | High Beam Background: $\mathrm{FWHM_{gen}}$ ($\times 10^{-2}$) |
| --- | --- | --- | --- | --- |
| Energy | 2.17±0.01 | 5.25±0.02 | 5.05±0.03 | 8.08±0.04 |
| Energy, local coordinates | 2.11±0.02 | 5.19±0.02 | 5.04±0.04 | 8.04±0.04 |
| Energy, local coordinates, PSD | 2.19±0.01 | 5.20±0.02 | 5.06±0.03 | 8.07±0.04 |
| Energy, local coordinates, time | 1.72±0.01 | 4.85±0.02 | 4.52±0.03 | 7.63±0.03 |
| Energy, local coordinates, time, PSD | 1.72±0.01 | 4.85±0.02 | 4.51±0.03 | 7.62±0.03 |
| basf2 | 2.32±0.02 | 5.13±0.02 | 6.73±0.05 | 8.97±0.07 |

7.4 Overlapping Photons

When discussing overlapping photon events, it is important to note that the FWHM of the photon energy distribution not only depends on its own properties but also on the properties of the second photon present. To account for that, the evaluation is split into energy bins of [0.1, 0.2], [0.2, 0.5], [0.5, 1.0], and [1.0, 1.5] $\mathrm{GeV}$ for both photons. We report the FWHM of the first photon for different simulated energies of the second photon for low beam backgrounds (Tab. 5) and high beam backgrounds (Tab. 6).

The GNN provides a better FWHM for all combinations, but the improvement is most significant if the photon has low energy. For low beam backgrounds, the GNN improves the FWHM by up to 20 % for photons with simulated energies $0.1 < E_{\mathrm{gen}} < 0.2\,\mathrm{GeV}$. For high beam backgrounds, the GNN improves the FWHM by more than 35 % for photons with simulated energies $0.1 < E_{\mathrm{gen}} < 0.2\,\mathrm{GeV}$.
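The improvement percentages quoted in Tab. 5 appear to be consistent with $(\text{FWHM}^{\text{basf2}} - \text{FWHM}^{\text{GNN}})/\text{FWHM}^{\text{GNN}}$, that is, normalized to the GNN value rather than to the baseline. This definition is inferred from the numbers and is not stated in the text. The sketch below checks this reading against the values listed for $E_\gamma^{(1)}$ in [0.1, 0.2] GeV at low beam background; the function name is illustrative.

```python
# FWHM_gen x 10^2 for E_gamma^(1) in [0.1, 0.2] GeV, low beam background (Table 5).
fwhm_gnn = [11.04, 11.98, 11.94, 13.25]
fwhm_basf2 = [12.72, 13.93, 14.32, 15.16]
quoted_percent = [15.2, 16.3, 20.0, 14.4]

def improvement_percent(baseline, new):
    """Improvement of `new` over `baseline`, relative to the `new` value.
    Assumed definition, inferred from the tabulated numbers."""
    return 100.0 * (baseline - new) / new

computed = [improvement_percent(b, g) for b, g in zip(fwhm_basf2, fwhm_gnn)]

# Largest difference between computed and quoted percentages.
max_deviation = max(abs(c - q) for c, q in zip(computed, quoted_percent))
```

The remaining small differences are compatible with the tabulated FWHM values having been rounded to two decimal places before the comparison.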

The result shows that the significant performance improvement observed for isolated photons can also be achieved for the more complicated overlapping photon signatures.

8 Conclusion and Outlook

Figure 11: Comparison of (a) truth energy fractions, (b) energy fractions reconstructed by the GNN (GravNet), and (c) energy fractions reconstructed by basf2 for one example event with only one local maximum. Colors indicate the fractions belonging to each photon or background. The marker centers indicate the crystal centers; the marker area is proportional to the reconstructed energy in each crystal.

In this work, we have presented a complete study of a GNN-based fuzzy clustering algorithm for the Belle II electromagnetic calorimeter. We used a realistic full detector simulation and simulated beam backgrounds for low and high luminosity conditions of Belle II. The GNN algorithm has been compared to the currently used basf2 baseline algorithm. We find that the resolution improves by more than 30 % for high beam backgrounds, and that the right-side tails of the reconstruction errors, which are caused by beam background, are reduced. Such significant improvements in photon reconstruction performance directly improve the physics reach of Belle II for almost all final states with photons, and also for analyses that use missing energy information [21]. We also trained different GNNs to separate energy depositions of overlapping photon clusters. The improvement of the energy resolution is up to 30 % for the low energy photon in asymmetric photon pairs. Any improvement in overlapping photon reconstruction has direct implications for the reconstruction of boosted $\pi^0$ mesons or axion-like particles with couplings to photons [37].

While the basf2 algorithm strictly reconstructs one cluster for each LM, the GNN algorithm only uses the LMs to center the ROI. The GNN algorithm can therefore in principle also be used to reconstruct overlapping photons that only produced one LM (Fig. 11). The extension of the GNN algorithm to such overlapping signatures as well as to charged particles and neutral hadrons will be the focus of follow-up work. Future work is also going to address robustness against varying beam backgrounds explicitly, for example by introducing features that are directly sensitive to beam-background levels.

This is the first application of a GNN-based clustering algorithm at Belle II for a realistic detector geometry and realistic, high beam backgrounds. It is also the first time that an algorithm has been shown to improve the performance of the photon reconstruction at Belle II by explicitly including timing information at the clustering level.

Table 5: $\mathrm{FWHM_{gen}} \times 10^{2}$ of one photon with energy $E_{\gamma}^{(1)}$ as a function of the energy of the second photon $E_{\gamma}^{(2)}$ for low beam background for the full detector (barrel and endcaps combined). The uncertainties of the FWHM for the two algorithms are correlated for each energy interval since they use the same simulated events. The improvement over the basf2 baseline algorithm is stated in percent for each energy interval.

| $E_{\gamma}^{(1)}$ (GeV) | Algorithm | $E_{\gamma}^{(2)}$ (GeV): [0.1, 0.2] | [0.2, 0.5] | [0.5, 1.0] | [1.0, 1.5] |
|---|---|---|---|---|---|
| [0.1, 0.2] | GNN | 11.04 ± 0.79 | 11.98 ± 0.40 | 11.94 ± 0.31 | 13.25 ± 0.34 |
| | basf2 | 12.72 ± 0.80 | 13.93 ± 0.55 | 14.32 ± 0.41 | 15.16 ± 0.48 |
| | Improvement | 15.2 % | 16.3 % | 20.0 % | 14.4 % |
| [0.2, 0.5] | GNN | 7.38 ± 0.18 | 7.57 ± 0.12 | 8.23 ± 0.09 | 8.38 ± 0.12 |
| | basf2 | 8.48 ± 0.22 | 8.30 ± 0.14 | 8.84 ± 0.12 | 8.96 ± 0.12 |
| | Improvement | 14.9 % | 9.7 % | 7.5 % | 7.0 % |
| [0.5, 1.0] | GNN | 5.22 ± 0.08 | 5.43 ± 0.05 | 5.69 ± 0.04 | 5.89 ± 0.04 |
| | basf2 | 5.58 ± 0.10 | 5.71 ± 0.06 | 5.85 ± 0.05 | 6.17 ± 0.05 |
| | Improvement | 6.7 % | 5.1 % | 2.8 % | 4.9 % |
| [1.0, 1.5] | GNN | 4.24 ± 0.06 | 4.43 ± 0.04 | 4.67 ± 0.03 | 4.77 ± 0.03 |
| | basf2 | 4.55 ± 0.07 | 4.58 ± 0.04 | 4.74 ± 0.04 | 4.85 ± 0.04 |
| | Improvement | 7.3 % | 3.4 % | 1.4 % | 1.8 % |
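The improvement values in Tables 5 and 6 can be cross-checked from the quoted FWHM values. The sketch below assumes that the improvement is the difference between the basf2 and GNN values relative to the GNN value. This definition is inferred from the tabulated numbers and is not quoted from the text, and the function name `relative_improvement` is hypothetical.

```python
def relative_improvement(fwhm_gnn, fwhm_basf2):
    """Improvement in percent, assumed to be (basf2 - GNN) / GNN."""
    return 100.0 * (fwhm_basf2 - fwhm_gnn) / fwhm_gnn


# (GNN, basf2) central values of FWHM_gen x 10^2 from the first row block
# of Table 5, i.e. E_gamma^(1) in [0.1, 0.2] GeV at low beam background.
row = [(11.04, 12.72), (11.98, 13.93), (11.94, 14.32), (13.25, 15.16)]
improvements = [round(relative_improvement(gnn, basf2), 1) for gnn, basf2 in row]
```

With the rounded table entries as input, this yields 15.2, 16.3, 19.9 and 14.4 percent, compared to the tabulated 15.2, 16.3, 20.0 and 14.4 percent. The small difference in the third column is presumably due to the rounding of the quoted FWHM values.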

Table 6: $\mathrm{FWHM_{gen}} \times 10^{2}$ of one photon with energy $E_{\gamma}^{(1)}$ as a function of the energy of the second photon $E_{\gamma}^{(2)}$ for high beam background for the full detector (barrel and endcaps combined). The uncertainties of the FWHM for the two algorithms are correlated for each energy interval since they use the same simulated events. The improvement over the basf2 baseline algorithm is stated in percent for each energy interval.

| $E_{\gamma}^{(1)}$ (GeV) | Algorithm | $E_{\gamma}^{(2)}$ (GeV): [0.1, 0.2] | [0.2, 0.5] | [0.5, 1.0] | [1.0, 1.5] |
|---|---|---|---|---|---|
| [0.1, 0.2] | GNN | 24.77 ± 0.83 | 24.10 ± 0.76 | 24.02 ± 0.60 | 24.72 ± 0.63 |
| | basf2 | 33.12 ± 1.08 | 32.82 ± 1.38 | 31.28 ± 0.79 | 32.42 ± 0.88 |
| | Improvement | 33.7 % | 36.2 % | 30.3 % | 31.1 % |
| [0.2, 0.5] | GNN | 13.16 ± 0.30 | 13.96 ± 0.20 | 14.17 ± 0.16 | 14.17 ± 0.16 |
| | basf2 | 17.73 ± 0.47 | 17.56 ± 0.31 | 17.62 ± 0.24 | 16.88 ± 0.23 |
| | Improvement | 34.8 % | 25.8 % | 24.3 % | 19.1 % |
| [0.5, 1.0] | GNN | 8.07 ± 0.12 | 8.56 ± 0.08 | 8.71 ± 0.06 | 8.84 ± 0.06 |
| | basf2 | 10.53 ± 0.19 | 10.77 ± 0.12 | 10.75 ± 0.09 | 10.73 ± 0.08 |
| | Improvement | 30.6 % | 25.8 % | 23.4 % | 21.4 % |
| [1.0, 1.5] | GNN | 6.05 ± 0.08 | 6.33 ± 0.05 | 6.42 ± 0.04 | 6.54 ± 0.04 |
| | basf2 | 7.52 ± 0.12 | 7.56 ± 0.07 | 7.60 ± 0.06 | 7.68 ± 0.06 |
| | Improvement | 24.2 % | 19.6 % | 18.3 % | 17.4 % |

Data Availability Statement The datasets generated and analysed during the current study are the property of the Belle II collaboration and are not publicly available. The instructions and code to replicate the studies in this paper are available at [38, 39].

Acknowledgments The authors would like to thank the Belle II collaboration for useful discussions and suggestions on how to improve this work. The authors would like to thank Jan Kieseler for helpful discussions.

The training of the models was performed on the TOpAS GPU cluster at the Steinbuch Centre for Computing (SCC) at KIT. This work is funded by Helmholtz (HGF) Young Investigators Group VH-NG-1303 and BMBF ErUM-Pro 05H23VKKBA. I. Haide is supported by the Landesgraduiertenförderung Baden-Württemberg.

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.

References

  • [1] A. Natochii et al. Beam Background Expectations for Belle II at SuperKEKB, 03 2022. arxiv:2203.05731.
  • [2] J. C. Dunn. A Fuzzy Relative of the ISODATA Process and Its Use in Detecting Compact Well-Separated Clusters. Journal of Cybernetics, 3(3):32–57, 1973.
  • [3] N. V. Canudas et al. Graph Clustering: A Graph-Based Clustering Algorithm for the Electromagnetic Calorimeter in LHCb. The European Physical Journal C, 83, 02 2023.
  • [4] D. Valsecchi. Deep Learning Techniques for Energy Clustering in The CMS ECAL. Journal of Physics: Conference Series, 2438(1):012077, 02 2023.
  • [5] P. Simkina. Machine Learning Techniques for Calorimetry. Instruments, 6:47, 09 2022.
  • [6] D. T. Belayneh et al. Calorimetry With Deep Learning: Particle Simulation and Reconstruction for Collider Physics. The European Physical Journal C, 80, 2019.
  • [7] A. Boldyrev, V. Chekalina, and F. Ratnikov. Machine Learning Approach to Boosting Neutral Particles Identification in the LHCb Calorimeter. J. Phys. Conf. Ser., 1525(1):012096, 04 2020.
  • [8] A. N. Charan. Particle Identification with the Belle II Calorimeter Using Machine Learning. J. Phys. Conf. Ser., 2438(1):012111, 2023. arxiv:2301.11654.
  • [9] M. Paganini et al. CaloGAN: Simulating 3D High Energy Particle Showers in Multilayer Electromagnetic Calorimeters With Generative Adversarial Networks. Phys. Rev. D, 97(1):014021, 2018. arxiv:1712.10321.
  • [10] E. Buhmann et al. Getting High: High Fidelity Simulation of High Granularity Calorimeters with High Speed. Comput. Softw. Big Sci., 5(1):13, 2021. arxiv:2005.05334.
  • [11] Deep generative models for fast photon shower simulation in ATLAS. 10 2022. arxiv:2210.06204.
  • [12] S. Bhattacharya et al. GNN-Based End-To-End Reconstruction in the CMS Phase 2 High-Granularity Calorimeter. J. Phys. Conf. Ser., 2438:012090, 02 2023.
  • [13] G. Grasseau et al. A Deep Neural Network Method for Analyzing the CMS High Granularity Calorimeter (HGCAL) events. EPJ Web of Conferences, 245:02003, 01 2020.
  • [14] A. Novosel et al. Identification of Light Leptons and Pions in the Electromagnetic Calorimeter of Belle II. In 11th International Workshop on Ring Imaging Cherenkov Detectors, 01 2023. arxiv:2301.05074.
  • [15] J. Shlomi, P. Battaglia, and J.-R. Vlimant. Graph Neural Networks in Particle Physics. Machine Learning: Science and Technology, 2(2):021001, 01 2021.
  • [16] Javier Duarte and Jean-Roch Vlimant. Graph Neural Networks for Particle Tracking and Reconstruction, 12 2020. arxiv:2012.01249.
  • [17] Gage DeZoort, Peter W. Battaglia, Catherine Biscarat, and Jean-Roch Vlimant. Graph neural networks at the Large Hadron Collider. Nature Rev. Phys., 5(5):281–303, 2023.
  • [18] Y. Wang et al. Dynamic Graph CNN for Learning on Point Clouds. ACM Trans. Graph., 38(5), 10 2019.
  • [19] S. R. Qasim et al. Learning Representations of Irregular Particle-Detector Geometry With Distance-Weighted Graph Networks. Eur. Phys. J. C, 79(7):608, 2019. arxiv:1902.07987.
  • [20] HEP ML Community. A Living Review of Machine Learning for Particle Physics. https://iml-wg.github.io/HEPML-LivingReview/.
  • [21] E. Kou et al. The Belle II Physics Book. PTEP, 2019(12):123C01, 2019. arxiv:1808.10567.
  • [22] T. Abe et al. Belle II Technical Design Report. Technical report, Belle-II, 11 2010. arxiv:1011.0352.
  • [23] H. Ikeda. Development of the CsI(Tl) Calorimeter for the Measurement of CP Violation at KEK B-Factory. PhD thesis, Nara Women’s University, 1999.
  • [24] V. Aulchenko et al. Time and Energy Reconstruction at the Electromagnetic Calorimeter of the Belle II Detector. Journal of Instrumentation, 12(08):C08001–C08001, 08 2017.
  • [25] S. Longo et al. CsI(Tl) Pulse Shape Discrimination With the Belle II Electromagnetic Calorimeter as a Novel Method to Improve Particle Identification at Electron–Positron Colliders. Nucl. Instrum. Meth. A, 982:164562, 2020.
  • [26] S. Agostinelli et al. GEANT4: A Simulation Toolkit. Nucl. Instrum. Meth. A, 506:250–303, 2003.
  • [27] T. Kuhr et al. The Belle II Core Software. Computing and Software for Big Science, 3(1), 2019.
  • [28] Belle II Collaboration. Belle II Analysis Software Framework (basf2). https://doi.org/10.5281/zenodo.5574115.
  • [29] Z. J. Liptak et al. Measurements of Beam Backgrounds in SuperKEKB Phase 2. Nucl. Instrum. Meth. A, 1040:167168, 2022. arxiv:2112.14537.
  • [30] M. Fey and J. E. Lenssen. Fast Graph Representation Learning with PyTorch Geometric. In ICLR Workshop on Representation Learning on Graphs and Manifolds, 2019.
  • [31] D.-A. Clevert et al. Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs). In 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2-4, 2016, Conference Track Proceedings, 2016.
  • [32] S. Ioffe and C. Szegedy. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. In Francis Bach and David Blei, editors, Proceedings of the 32nd International Conference on Machine Learning, volume 37, pages 448–456, Lille, France, 07 2015. PMLR.
  • [33] T. Akiba et al. Optuna: A Next-Generation Hyperparameter Optimization Framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2019.
  • [34] J. Gaiser. Charmonium Spectroscopy from Radiative Decays of the $J/\psi$ and $\psi^{\prime}$. PhD thesis, Stanford University, 1982.
  • [35] T. Skwarnicki. A study of the radiative cascade transitions between the Upsilon-Prime and Upsilon resonances. PhD thesis, Cracow, INP, 1986.
  • [36] J. Gäßler et al. kafe2 – a Modern Tool for Model Fitting in Physics Lab Courses. arXiv:2210.12768.
  • [37] F. Abudinén et al. Search for Axion-Like Particles produced in $e^{+}e^{-}$ Collisions at Belle II. Phys. Rev. Lett., 125(16):161806, 2020. arxiv:2007.13071.
  • [38] F. Wemmer et al. Photon Reconstruction in the Belle II Calorimeter Using Graph Neural Networks. https://github.com/JonasEppelt/gnn_photon_clustering_in_belleII_ecl, 2023.
  • [39] F. Wemmer et al. Photon Reconstruction in the Belle II Calorimeter Using Graph Neural Networks. https://zenodo.org/record/8409638.