-
Electrostatics on Branching Processes
Authors:
Christopher D. Sinclair
Abstract:
We introduce a random probability measure on the profinite completion of the random tree of a branching process and introduce the canonical and grand canonical ensembles of random repelling particles on this random profinite completion at inverse temperature $β> 0$. We think of this as a random spatial process of particles in a random tree, and we introduce the notion of the {\em mean} canonical and grand canonical partition functions where in this context `mean' means averaged over the random environment. We give a recursion for these mean partition functions and demonstrate that in certain instances, determined by the law for the branching process, these partition functions as a function of $β$ have algebraic properties which generalize those that appear in the non-random and $p$-adic environments.
Submitted 8 July, 2024;
originally announced July 2024.
-
T3: Transparent Tracking & Triggering for Fine-grained Overlap of Compute & Collectives
Authors:
Suchita Pati,
Shaizeen Aga,
Mahzabeen Islam,
Nuwan Jayasena,
Matthew D. Sinclair
Abstract:
Large Language Models increasingly rely on distributed techniques for their training and inference. These techniques require communication across devices which can reduce scaling efficiency as the number of devices increases. While some distributed techniques can overlap, and thus hide, this communication with independent computations, techniques such as Tensor Parallelism (TP) inherently serialize communication with model execution. One approach to hide this serialized communication is to interleave it with the producer operation (of the communicated data) in a fine-grained manner. However, this fine-grained interleaving of communication and computation in software can be difficult. Furthermore, as with any concurrent execution, it requires compute and memory resources to be shared between computation and communication, causing resource contention that reduces overlapping efficacy.
To overcome these challenges, we propose T3 which applies hardware-software co-design to transparently overlap serialized communication while minimizing resource contention with compute. T3 transparently fuses producer operations with the subsequent communication via a simple configuration of the producer's output address space and requires minor software changes. At the hardware level, T3 adds a lightweight track and trigger mechanism to orchestrate the producer's compute and communication. It further uses compute-enhanced memories for communication's attendant compute. As a result, T3 reduces resource contention and efficiently overlaps serialized communication with computation. For important Transformer models like T-NLG, T3 speeds up communication-heavy sublayers by 30% geomean (max 47%) and reduces data movement by 22% geomean (max 36%). Furthermore, T3's benefits persist as models scale: geomean 29% for sublayers in $\sim$500-billion parameter models, PaLM and MT-NLG.
Submitted 29 January, 2024;
originally announced January 2024.
-
Lattice QED in an external magnetic field: Evidence for dynamical chiral symmetry breaking
Authors:
J. B. Kogut,
D. K. Sinclair
Abstract:
We simulate QED in a strong constant homogeneous external magnetic field on a Euclidean space-time lattice using the Rational Hybrid Monte Carlo method, developed for simulating lattice QCD. Our primary goal is to measure the chiral condensate in the limit when the input electron mass $m$ is zero. We observe a non-zero value, indicating that the external magnetic field catalyzes chiral symmetry breaking as predicted by approximate truncated Schwinger-Dyson methods. Such behaviour is associated with dominance by the lowest Landau level which causes the effective dimensional reduction from $3+1$ dimensions to $1+1$ dimensions for charged particles (electrons and positrons) where the attractive forces of QED can produce chiral symmetry breaking with a dynamical electron mass and associated chiral condensate. Since our lattice simulations use bare (lattice) parameters, while the Schwinger-Dyson analyses work with renormalized quantities, direct numerical comparison will require renormalization of our lattice results.
Submitted 6 November, 2023;
originally announced November 2023.
-
Towards Emotion-Based Synthetic Consciousness: Using LLMs to Estimate Emotion Probability Vectors
Authors:
David Sinclair,
Willem Pye
Abstract:
This paper shows how LLMs (Large Language Models) may be used to estimate a summary of the emotional state associated with a piece of text. The summary of emotional state is a dictionary of words used to describe emotion together with the probability of each word appearing after a prompt comprising the original text and an emotion-eliciting tail. Through emotion analysis of Amazon product reviews we demonstrate that emotion descriptors can be mapped into a PCA-type space. It was hoped that text descriptions of actions to improve a current text-described state could also be elicited through a tail prompt. Experiments seemed to indicate that this is not straightforward to make work. This failure puts our hoped-for selection of action, by choosing the best predicted outcome through comparison of emotional responses, out of reach for the moment.
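The emotion summary described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tail wording, the word list, and the logit values are all hypothetical, no real LLM is called, and renormalising over only the emotion words is an assumption made here for illustration.

```python
import math


def build_prompt(text, tail):
    """Join the original text with an emotion-eliciting tail (tail wording is hypothetical)."""
    return f"{text} {tail}"


def emotion_probability_vector(next_token_logits, emotion_words):
    """Map next-token logits to a probability dictionary over the chosen emotion words.

    next_token_logits: dict of word -> logit, standing in for an LLM's output
    after the prompt. Assumption: probabilities are renormalised over the
    emotion words only (a softmax restricted to that vocabulary).
    """
    selected = {w: next_token_logits[w] for w in emotion_words if w in next_token_logits}
    peak = max(selected.values())  # subtract the maximum for numerical stability
    weights = {w: math.exp(v - peak) for w, v in selected.items()}
    total = sum(weights.values())
    return {w: weight / total for w, weight in weights.items()}


# Hypothetical usage with made-up logits in place of a real model call.
prompt = build_prompt("The kettle broke after two days.", "Reading this makes me feel")
fake_logits = {"happy": 0.0, "angry": 2.0, "table": 5.0}
vector = emotion_probability_vector(fake_logits, ["happy", "angry"])
```

In practice the logits would come from a model evaluated on `prompt`; the dictionary above only stands in for that step.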
Submitted 9 October, 2023;
originally announced October 2023.
-
Fifty Years of ISCA: A data-driven retrospective on key trends
Authors:
Gaurang Upasani,
Matthew D. Sinclair,
Adrian Sampson,
Parthasarathy Ranganathan,
David Patterson,
Shaan Shah,
Nidhi Parthasarathy,
Rutwik Jain
Abstract:
Computer Architecture, broadly, involves optimizing hardware and software for current and future processing systems. Although there are several other top venues to publish Computer Architecture research, including ASPLOS, HPCA, and MICRO, ISCA (the International Symposium on Computer Architecture) is one of the oldest, longest running, and most prestigious venues for publishing Computer Architecture research. Since 1973, except for 1975, ISCA has been organized annually. Accordingly, this year will be the 50th year of ISCA. Thus, we set out to analyze the past 50 years of ISCA to understand who and what has been driving and innovating computing systems thus far. Our analysis identifies several interesting trends that reflect how ISCA, and Computer Architecture in general, has grown and evolved in the past 50 years, including minicomputers, general-purpose uniprocessor CPUs, multiprocessor and multi-core CPUs, general-purpose GPUs, and accelerators.
Submitted 18 November, 2023; v1 submitted 6 June, 2023;
originally announced June 2023.
-
Integrating Per-Stream Stat Tracking into Accel-Sim
Authors:
Shichen Qiao,
Xin Su,
Matthew D. Sinclair
Abstract:
Accel-Sim is a widely used computer architecture simulator that models the behavior of modern NVIDIA GPUs in great detail. However, although Accel-Sim and the underlying GPGPU-Sim model many of the features of real GPUs, thus far it has not been able to track statistics separately per stream. Instead, Accel-Sim combines statistics (e.g., cycles and cache hits/misses) across all simultaneously running streams. This can prevent users from properly identifying the behavior of specific kernels and streams and potentially lead to incorrect conclusions. Thus, in this work we extend Accel-Sim's and GPGPU-Sim's statistic tracking support to track per-stream statistics. To validate this support, we designed a series of multi-stream microbenchmarks and checked their reported per-kernel, per-stream counts.
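The difference between combined and per-stream statistics can be sketched in a few lines of Python. This is an illustration of the idea only and not Accel-Sim or GPGPU-Sim code; the class name, method names, and counter names are hypothetical.

```python
from collections import defaultdict


class StreamStats:
    """Counters keyed by stream id, with an aggregate view for comparison."""

    def __init__(self):
        # stream id -> counter name -> value
        self._per_stream = defaultdict(lambda: defaultdict(int))

    def record(self, stream_id, name, amount=1):
        """Add `amount` to counter `name` for one stream."""
        self._per_stream[stream_id][name] += amount

    def per_stream(self, stream_id):
        """Counters attributed to a single stream (empty dict if unseen)."""
        return dict(self._per_stream.get(stream_id, {}))

    def aggregate(self):
        """Counters summed across all streams, i.e. the combined view."""
        total = defaultdict(int)
        for counters in self._per_stream.values():
            for name, value in counters.items():
                total[name] += value
        return dict(total)


# Two concurrently running streams with different cache behaviour.
stats = StreamStats()
stats.record(0, "l1_hits", 90)
stats.record(0, "l1_misses", 10)
stats.record(1, "l1_hits", 20)
stats.record(1, "l1_misses", 80)
```

The aggregate view reports 110 hits and 90 misses, which hides that stream 1 misses far more often than stream 0; the per-stream view keeps that distinction.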
Submitted 4 September, 2023; v1 submitted 21 April, 2023;
originally announced April 2023.
-
Generative Adversarial Networks for Scintillation Signal Simulation in EXO-200
Authors:
S. Li,
I. Ostrovskiy,
Z. Li,
L. Yang,
S. Al Kharusi,
G. Anton,
I. Badhrees,
P. S. Barbeau,
D. Beck,
V. Belov,
T. Bhatta,
M. Breidenbach,
T. Brunner,
G. F. Cao,
W. R. Cen,
C. Chambers,
B. Cleveland,
M. Coon,
A. Craycraft,
T. Daniels,
L. Darroch,
S. J. Daugherty,
J. Davis,
S. Delaquis,
A. Der Mesrobian-Kabakian
, et al. (65 additional authors not shown)
Abstract:
Generative Adversarial Networks trained on samples of simulated or actual events have been proposed as a way of generating large simulated datasets at a reduced computational cost. In this work, a novel approach to perform the simulation of photodetector signals from the time projection chamber of the EXO-200 experiment is demonstrated. The method is based on a Wasserstein Generative Adversarial Network - a deep learning technique allowing for implicit non-parametric estimation of the population distribution for a given set of objects. Our network is trained on real calibration data using raw scintillation waveforms as input. We find that it is able to produce high-quality simulated waveforms an order of magnitude faster than the traditional simulation approach and, importantly, generalize from the training sample and discern salient high-level features of the data. In particular, the network correctly deduces position dependency of scintillation light response in the detector and correctly recognizes dead photodetector channels. The network output is then integrated into the EXO-200 analysis framework to show that the standard EXO-200 reconstruction routine processes the simulated waveforms to produce energy distributions comparable to that of real waveforms. Finally, the remaining discrepancies and potential ways to improve the approach further are highlighted.
Submitted 8 May, 2023; v1 submitted 11 March, 2023;
originally announced March 2023.
-
Search for Two-neutrino Double-Beta Decay of $^{136}\rm Xe$ to the $0^+_1$ excited state of $^{136}\rm Ba$ with the Complete EXO-200 Dataset
Authors:
EXO-200 Collaboration,
S. Al Kharusi,
G. Anton,
I. Badhrees,
P. S. Barbeau,
D. Beck,
V. Belov,
T. Bhatta,
M. Breidenbach,
T. Brunner,
G. F. Cao,
W. R. Cen,
C. Chambers,
B. Cleveland,
M. Coon,
A. Craycraft,
T. Daniels,
L. Darroch,
S. J. Daugherty,
J. Davis,
S. Delaquis,
A. Der Mesrobian-Kabakian,
R. DeVoe,
J. Dilling
, et al. (83 additional authors not shown)
Abstract:
A new search for two-neutrino double-beta ($2νββ$) decay of $^{136}\rm Xe$ to the $0^+_1$ excited state of $^{136}\rm Ba$ is performed with the full EXO-200 dataset. A deep learning-based convolutional neural network is used to discriminate signal from background events. Signal detection efficiency is increased relative to previous searches by EXO-200 by more than a factor of two. With the addition of the Phase II dataset taken with an upgraded detector, the median 90$\%$ confidence level half-life sensitivity of $2νββ$ decay to the $0^+_1$ state of $^{136}\rm Ba$ is $2.9 \times 10^{24}~\rm yr$ using a total $^{136}\rm Xe$ exposure of $234.1~\rm kg~yr$. No statistically significant evidence for $2νββ$ decay to the $0^+_1$ state is observed, leading to a lower limit of $T^{2ν}_{1/2}(0^+ \rightarrow 0^+_1) > 1.4\times10^{24}~\rm yr$ at 90$\%$ confidence level, improved by 70$\%$ relative to the current world's best constraint.
Submitted 16 October, 2023; v1 submitted 2 March, 2023;
originally announced March 2023.
-
Computation vs. Communication Scaling for Future Transformers on Future Hardware
Authors:
Suchita Pati,
Shaizeen Aga,
Mahzabeen Islam,
Nuwan Jayasena,
Matthew D. Sinclair
Abstract:
Scaling neural network models has delivered dramatic quality gains across ML problems. However, this scaling has increased the reliance on efficient distributed training techniques. Accordingly, as with other distributed computing scenarios, it is important to understand how compute and communication will scale relative to one another as models scale and hardware evolves. A careful study which answers this question can better guide the design of future systems which can efficiently train future large models.
Accordingly, this work provides a comprehensive multi-axial (algorithmic, empirical, hardware evolution) analysis of compute vs. communication (Comp-vs.-Comm) scaling for future Transformer models on future hardware. First, our algorithmic analysis shows that compute generally enjoys an edge over communication as models scale. However, since memory capacity scales slower than compute, these trends are being stressed. Next, we quantify this edge by empirically studying how Comp-vs.-Comm scales for future models on future hardware. To avoid profiling numerous Transformer models across many setups, we extract execution regions and project costs using operator models. This allows a spectrum (hundreds) of future model/hardware scenarios to be accurately studied ($<$15% error), and reduces profiling costs by 2100$\times$. Our experiments show that communication will be a significant portion (40-75%) of runtime as models and hardware evolve. Moreover, communication which is hidden by overlapped computation in today's models often cannot be hidden in future, larger models. Overall, this work highlights the increasingly large role communication will play as models scale and discusses techniques and upcoming technologies that can help address it.
Submitted 2 May, 2023; v1 submitted 6 February, 2023;
originally announced February 2023.
-
Patch DCT vs LeNet
Authors:
David Sinclair
Abstract:
This paper compares the performance of a NN taking the output of a DCT (Discrete Cosine Transform) of an image patch with LeNet for classifying MNIST handwritten digits. The basis functions underlying the DCT bear a passing resemblance to some of the learned basis functions of the Visual Transformer but are an order of magnitude faster to apply.
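As a rough illustration of the kind of input such a network might receive, the following Python sketch computes an orthonormal 2D DCT-II of each non-overlapping patch of an image and flattens the coefficients into one feature list. The patch size, the non-overlapping tiling, and the function names are assumptions made here; the abstract does not specify them, and this is not the author's code.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n nested list; row k is the k-th basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)])
    return rows


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def patch_dct(patch):
    """2D DCT-II of a square patch, computed separably as C * patch * C^T."""
    c = dct_matrix(len(patch))
    return matmul(matmul(c, patch), transpose(c))


def image_to_dct_features(image, patch_size):
    """Tile the image into non-overlapping patches (an assumption) and concatenate their DCTs."""
    features = []
    for top in range(0, len(image) - patch_size + 1, patch_size):
        for left in range(0, len(image[0]) - patch_size + 1, patch_size):
            patch = [row[left:left + patch_size] for row in image[top:top + patch_size]]
            for coeff_row in patch_dct(patch):
                features.extend(coeff_row)
    return features


# A 28 x 28 image (the MNIST size) tiled into 7 x 7 patches gives 16 * 49 = 784 features.
blank_image = [[0.0] * 28 for _ in range(28)]
features = image_to_dct_features(blank_image, patch_size=7)
```

Because the basis is fixed rather than learned, the transform needs no training; only the classifier that consumes `features` would be trained.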
Submitted 4 November, 2022;
originally announced November 2022.
-
Chiral Symmetry Breaking in QED induced by an External Magnetic Field
Authors:
D. K. Sinclair,
J. B. Kogut
Abstract:
We simulate Lattice QED in a constant external magnetic field using the RHMC algorithm. We seek evidence for chiral symmetry breaking predicted by truncated Schwinger-Dyson methods. Since the predicted values of the dynamical electron mass and chiral condensate at the physical fine structure constant are too small to be measured, we simulate at a larger value $α=1/5$. This requires using electron masses as low as $m=0.001$ to extrapolate to $m=0$. At a large magnetic field, the electrons occupy the lowest Landau level which has a small profile in the plane orthogonal to the magnetic field, so that we are able to use a lattice with small extent in these 2 directions. If chiral symmetry is unbroken at $m=0$ the chiral condensate is dominated by large momenta and should be insensitive to the lattice extent in the direction of the magnetic field and the time direction. When chiral symmetry is broken at $m=0$, the chiral condensate should be sensitive to the lattice size in these directions as $m \rightarrow 0$. We search for this behaviour by increasing the lattice extent in these 2 directions. Preliminary simulations show strong dependence of the chiral condensate on the lattice extent in these 2 directions for the smallest masses, and these increased condensates appear to be approaching a non-zero limit as $m \rightarrow 0$.
Submitted 19 October, 2022;
originally announced October 2022.
-
A Saccaded Visual Transformer for General Object Spotting
Authors:
Willem. T. Pye,
David. A. Sinclair
Abstract:
This paper presents the novel combination of a visual transformer style patch classifier with saccaded local attention. A novel optimisation paradigm for training object models is also presented: rather than the optimisation function minimising class membership probability error, the network is trained to estimate the normalised distance to the centroid of labelled objects. This approach builds a degree of translational invariance directly into the model and allows fast saccaded search with gradient ascent to find object centroids. The resulting saccaded visual transformer is demonstrated on human faces.
Submitted 17 October, 2022;
originally announced October 2022.
-
Not All GPUs Are Created Equal: Characterizing Variability in Large-Scale, Accelerator-Rich Systems
Authors:
Prasoon Sinha,
Akhil Guliani,
Rutwik Jain,
Brandon Tran,
Matthew D. Sinclair,
Shivaram Venkataraman
Abstract:
Scientists are increasingly exploring and utilizing the massive parallelism of general-purpose accelerators such as GPUs for scientific breakthroughs. As a result, datacenters, hyperscalers, national computing centers, and supercomputers have procured hardware to support this evolving application paradigm. These systems contain hundreds to tens of thousands of accelerators, enabling peta- and exa-scale levels of compute for scientific workloads. Recent work demonstrated that power management (PM) can impact application performance in CPU-based HPC systems, even when machines have the same architecture and SKU (stock keeping unit). This variation occurs due to manufacturing variability and the chip's PM. However, while modern HPC systems widely employ accelerators such as GPUs, it is unclear how much this variability affects applications. Accordingly, we seek to characterize the extent of variation due to GPU PM in modern HPC and supercomputing systems. We study a variety of applications that stress different GPU components on five large-scale computing centers with modern GPUs: Oak Ridge's Summit, Sandia's Vortex, TACC's Frontera and Longhorn, and Livermore's Corona. These clusters use a variety of cooling methods and GPU vendors. In total, we collect over 18,800 hours of data across more than 90% of the GPUs in these clusters. Regardless of the application, cluster, GPU vendor, and cooling method, our results show significant variation: 8% (max 22%) average performance variation even though the GPU architecture and vendor SKU are identical within each cluster, with outliers up to 1.5X slower than the median GPU. These results highlight the difficulty in efficiently using existing GPU clusters for modern HPC and scientific workloads, and the need to embrace variability in future accelerator-based systems.
Submitted 8 November, 2022; v1 submitted 23 August, 2022;
originally announced August 2022.
-
Rocket Lab Mission to Venus
Authors:
Richard French,
Christophe Mandy,
Richard Hunter,
Ehson Mosleh,
Doug Sinclair,
Peter Beck,
Sara Seager,
Janusz J. Petkowski,
Christopher E. Carr,
David H. Grinspoon,
Darrel Baumgardner
Abstract:
Regular, low-cost Decadal-class science missions to planetary destinations will be enabled by high-ΔV small spacecraft, such as the high-energy Photon, and small launch vehicles, such as Electron, to support expanding opportunities for scientists and to increase the rate of science return. The Rocket Lab mission to Venus is a small direct entry probe planned for baseline launch in May 2023 with accommodation for a single ~1 kg instrument. A backup launch window is available in January 2025. The probe mission will spend about 5 min in the Venus cloud layers at 48-60 km altitude above the surface and collect in situ measurements. We have chosen a low-mass, low-cost autofluorescing nephelometer to search for organic molecules in the cloud particles and constrain the particle composition.
Submitted 16 August, 2022;
originally announced August 2022.
-
Search for MeV Electron Recoils from Dark Matter in EXO-200
Authors:
EXO-200 Collaboration,
S. Al Kharusi,
G. Anton,
I. Badhrees,
P. S. Barbeau,
D. Beck,
V. Belov,
T. Bhatta,
M. Breidenbach,
T. Brunner,
G. F. Cao,
W. R. Cen,
C. Chambers,
B. Cleveland,
M. Coon,
A. Craycraft,
T. Daniels,
L. Darroch,
S. J. Daugherty,
J. Davis,
S. Delaquis,
A. Der Mesrobian-Kabakian,
R. DeVoe,
J. Dilling
, et al. (83 additional authors not shown)
Abstract:
We present a search for electron-recoil signatures from the charged-current absorption of fermionic dark matter using the EXO-200 detector. We report an average electron recoil background rate of $6.8 \times 10^{-4}\, \mathrm{cts}\,\mathrm{kg}^{-1}\mathrm{yr}^{-1}\mathrm{keV}^{-1}$ above $4\,\mathrm{MeV}$ and find no statistically significant excess over our background projection. Using a total ${}^{136}\mathrm{Xe}$ exposure of $234.1\,\mathrm{kg}\,\mathrm{yr}$ we exclude new parameter space for the charged-current absorption cross-section for dark matter masses between $m_χ= 2.6\,\mathrm{MeV} - 11.6\,\mathrm{MeV}$ with a minimum of $6\times 10^{-51}\,\mathrm{cm}^2$ at $8.3\,\mathrm{MeV}$ at the $90\%$ confidence level.
Submitted 20 February, 2023; v1 submitted 2 July, 2022;
originally announced July 2022.
-
A Search for Electron Neutrino Transitions to Sterile States in the BEST Experiment
Authors:
V. V. Barinov,
B. T. Cleveland,
S. N. Danshin,
H. Ejiri,
S. R. Elliott,
D. Frekers,
V. N. Gavrin,
V. V. Gorbachev,
D. S. Gorbunov,
W. C. Haxton,
T. V. Ibragimova,
I. Kim,
Yu. P. Kozlova,
L. V. Kravchuk,
V. V. Kuzminov,
B. K. Lubsandorzhiev,
Yu. M. Malyshkin,
R. Massarczyk,
V. A. Matveev,
I. N. Mirmov,
J. S. Nico,
A. L. Petelin,
R. G. H. Robertson,
D. Sinclair,
A. A. Shikhin
, et al. (5 additional authors not shown)
Abstract:
The Baksan Experiment on Sterile Transitions (BEST) probes the gallium anomaly and its possible connections to oscillations between active and sterile neutrinos. Based on the Gallium-Germanium Neutrino Telescope (GGNT) technology of the SAGE experiment, BEST employs two zones of liquid Ga target to explore neutrino oscillations on the meter scale. Oscillations on this short scale could produce deficits in the $^{71}$Ge production rates within the two zones, as well as a possible rate difference between the zones.
From July 5th to October 13th 2019, the two-zone target was exposed to a primarily monoenergetic, 3.4-MCi $^{51}$Cr neutrino source 10 times for a total of 20 independent $^{71}$Ge extractions from the two Ga targets. The $^{71}$Ge production rates from the neutrino source were measured from July 2019 to March 2020. At the end of these measurements, the counters were filled with $^{71}$Ge doped gas and calibrated during November 2020. In this paper, results from the BEST sterile neutrino oscillation experiment are presented in detail. The ratios of the measured $^{71}$Ge production rates to the predicted rates for the inner and the outer target volumes are calculated from the known neutrino capture cross section. Comparable deficits in the measured ratios relative to predicted values are found for both zones, with the $4σ$ deviations from unity consistent with the previously reported gallium anomaly. If interpreted in the context of neutrino oscillations, the deficits give best fit oscillation parameters of $Δm^2=3.3^{+\infty}_{-2.3}$ eV$^2$ and sin$^2 2θ=0.42^{+0.15}_{-0.17}$, consistent with $ν_e \rightarrow ν_s$ oscillations governed by a surprisingly large mixing angle.
Submitted 6 May, 2022; v1 submitted 18 January, 2022;
originally announced January 2022.
-
Lattice QED in external electromagnetic fields
Authors:
D. K. Sinclair,
J. B. Kogut
Abstract:
We study QED in external electromagnetic fields using methods developed for simulating lattice QCD. Our first project is to simulate QED in a constant (in space and time) external magnetic field on a Euclidean space-time lattice using the Rational Hybrid Monte Carlo (RHMC) method. Observables we measure include the condensate $\langle\barψψ\rangle$ and the effective electron action after integrating out the fermion fields. We look for evidence that the combined effect of the magnetic field and the electron-positron attraction from QED produces a non-zero condensate in the limit of zero electron mass, a non-perturbative effect analogous to spontaneous chiral symmetry breaking. Very preliminary evidence is that such a condensate exists, at least for strong external magnetic fields and unphysically large electric charge. In addition, we are storing field configurations to measure the expected distortions and screenings of the Coulomb field of a charged particle due to the vacuum polarization asymmetries produced by the magnetic field. We hope also to measure the dynamical contribution to the electron mass produced by the same mechanism that produces a finite condensate in the zero input mass limit.
Submitted 2 November, 2021;
originally announced November 2021.
-
Results from the Baksan Experiment on Sterile Transitions (BEST)
Authors:
V. V. Barinov,
B. T. Cleveland,
S. N. Danshin,
H. Ejiri,
S. R. Elliott,
D. Frekers,
V. N. Gavrin,
V. V. Gorbachev,
D. S. Gorbunov,
W. C. Haxton,
T. V. Ibragimova,
I. Kim,
Yu. P. Kozlova,
L. V. Kravchuk,
V. V. Kuzminov,
B. K. Lubsandorzhiev,
Yu. M. Malyshkin,
R. Massarczyk,
V. A. Matveev,
I. N. Mirmov,
J. S. Nico,
A. L. Petelin,
R. G. H. Robertson,
D. Sinclair,
A. A. Shikhin
, et al. (5 additional authors not shown)
Abstract:
The Baksan Experiment on Sterile Transitions (BEST) was designed to investigate the deficit of electron neutrinos, $ν_{e}$, observed in previous gallium-based radiochemical measurements with high-intensity neutrino sources, commonly referred to as the \textit{gallium anomaly}, which could be interpreted as evidence for oscillations between $ν_e$ and sterile neutrino ($ν_s$) states. A 3.414-MCi $^{51}$Cr $ν_e$ source was placed at the center of two nested Ga volumes and measurements were made of the production of $^{71}$Ge through the charged current reaction, $^{71}$Ga($ν_e$,e$^-$)$^{71}$Ge, at two average distances. The measured production rates for the inner and the outer targets respectively are ($54.9^{+2.5}_{-2.4}(\mbox{stat})\pm1.4 (\mbox{syst})$) and ($55.6^{+2.7}_{-2.6}(\mbox{stat})\pm1.4 (\mbox{syst})$) atoms of $^{71}$Ge/d. The ratios ($R$) of the measured rate of $^{71}$Ge production at each distance to the expected rate from the known cross section and experimental efficiencies are $R_{in}=0.79\pm0.05$ and $R_{out}= 0.77\pm0.05$. The ratio of the outer to the inner result is 0.97$\pm$0.07, which is consistent with unity within uncertainty. The rates at each distance were found to be similar, but 20-24\% lower than expected, thus reaffirming the anomaly. These results are consistent with $ν_e \rightarrow ν_s$ oscillations with a relatively large $Δm^2$ ($>$0.5 eV$^2$) and mixing sin$^2 2θ$ ($\approx$0.4).
Submitted 30 March, 2022; v1 submitted 23 September, 2021;
originally announced September 2021.
-
Search for Majoron-emitting modes of $^{136}$Xe double beta decay with the complete EXO-200 dataset
Authors:
S. Al Kharusi,
G. Anton,
I. Badhrees,
P. S. Barbeau,
D. Beck,
V. Belov,
T. Bhatta,
M. Breidenbach,
T. Brunner,
G. F. Cao,
W. R. Cen,
C. Chambers,
B. Cleveland,
M. Coon,
A. Craycraft,
T. Daniels,
L. Darroch,
S. J. Daugherty,
J. Davis,
S. Delaquis,
A. Der Mesrobian-Kabakian,
R. DeVoe,
J. Dilling,
A. Dolgolenko,
M. J. Dolinski
, et al. (81 additional authors not shown)
Abstract:
A search for Majoron-emitting modes of the neutrinoless double-beta decay of $^{136}$Xe is performed with the full EXO-200 dataset. This dataset consists of a total $^{136}$Xe exposure of 234.1 kg$\cdot$yr, and includes data with detector upgrades that have improved the energy threshold relative to previous searches. A lower limit of T$_{1/2}^{\rm{^{136}Xe}}>$4.3$\cdot$10$^{24}$ yr at 90\% C.L. on the half-life of the spectral index $n=1$ Majoron decay was obtained, a factor of 3.6 more stringent than the previous limit from EXO-200, corresponding to a constraint on the Majoron-neutrino coupling constant of $|\langle g_{ee}^{M}\rangle|<(0.4$-$0.9)\cdot10^{-5}$. The lower threshold and the additional data taken resulted in a factor of 8.4 improvement for the $n=7$ mode compared to the previous EXO search. This search provides the most stringent limits to date on the Majoron-emitting decays of $^{136}$Xe with spectral indices $n=1,2,3,$ and 7.
Submitted 17 November, 2021; v1 submitted 3 September, 2021;
originally announced September 2021.
-
First direct detection constraints on Planck-scale mass dark matter with multiple-scatter signatures using the DEAP-3600 detector
Authors:
P. Adhikari,
R. Ajaj,
M. Alpízar-Venegas,
D. J. Auty,
H. Benmansour,
C. E. Bina,
W. Bonivento,
M. G. Boulay,
M. Cadeddu,
B. Cai,
M. Cárdenas-Montes,
S. Cavuoti,
Y. Chen,
B. T. Cleveland,
J. M. Corning,
S. Daugherty,
P. DelGobbo,
P. Di Stefano,
L. Doria,
M. Dunford,
E. Ellingwood,
A. Erlandson,
S. S. Farahani,
N. Fatemighomi,
G. Fiorillo
, et al. (72 additional authors not shown)
Abstract:
Dark matter particles with Planck-scale mass ($\simeq10^{19}\text{GeV}/c^2$) arise in well-motivated theories and could be produced by several cosmological mechanisms. Using a blind analysis of data collected over an 813 d live time with DEAP-3600, a 3.3 t single-phase liquid argon-based dark matter experiment at SNOLAB, a search for supermassive dark matter was performed, looking for multiple-scatter signals. No candidate signal events were observed, leading to the first direct detection constraints on Planck-scale mass dark matter. Leading limits constrain dark matter masses between $8.3\times10^{6}$ and $1.2\times10^{19} \text{GeV}/c^2$, and cross sections for scattering on $^{40}$Ar between $1.0\times10^{-23}$ and $2.4\times10^{-18} \text{cm}^2$. These are used to constrain two composite dark matter models.
Submitted 5 January, 2022; v1 submitted 20 August, 2021;
originally announced August 2021.
-
The EXO-200 detector, part II: Auxiliary Systems
Authors:
N. Ackerman,
J. Albert,
M. Auger,
D. J. Auty,
I. Badhrees,
P. S. Barbeau,
L. Bartoszek,
E. Baussan,
V. Belov,
C. Benitez-Medina,
T. Bhatta,
M. Breidenbach,
T. Brunner,
G. F. Cao,
W. R. Cen,
C. Chambers,
B. Cleveland,
R. Conley,
S. Cook,
M. Coon,
W. Craddock,
A. Craycraft,
W. Cree,
T. Daniels,
L. Darroch
, et al. (135 additional authors not shown)
Abstract:
The EXO-200 experiment searched for neutrinoless double-beta decay of $^{136}$Xe with a single-phase liquid xenon detector. It used an active mass of 110 kg of 80.6%-enriched liquid xenon in an ultra-low background time projection chamber with ionization and scintillation detection and readout. This paper describes the design and performance of the various support systems necessary for detector operation, including cryogenics, xenon handling, and controls. Novel features of the system were driven by the need to protect the thin-walled detector chamber containing the liquid xenon, to achieve high chemical purity of the Xe, and to maintain thermal uniformity across the detector.
Submitted 22 October, 2021; v1 submitted 13 July, 2021;
originally announced July 2021.
-
A Case for Fine-grain Coherence Specialization in Heterogeneous Systems
Authors:
Johnathan Alsop,
Weon Taek Na,
Matthew D. Sinclair,
Samuel Grayson,
Sarita V. Adve
Abstract:
Hardware specialization is becoming a key enabler of energy-efficient performance. Future systems will be increasingly heterogeneous, integrating multiple specialized and programmable accelerators, each with different memory demands. Traditionally, communication between accelerators has been inefficient, typically orchestrated through explicit DMA transfers between different address spaces. More recently, industry has proposed unified coherent memory which enables implicit data movement and more data reuse, but often these interfaces limit the coherence flexibility available to heterogeneous systems. This paper demonstrates the benefits of fine-grained coherence specialization for heterogeneous systems. We propose an architecture that enables low-complexity independent specialization of each individual coherence request in heterogeneous workloads by building upon a simple and flexible baseline coherence interface, Spandex. We then describe how to optimize individual memory requests to improve cache reuse and performance-critical memory latency in emerging heterogeneous workloads. Collectively, our techniques enable significant gains, reducing execution time by up to 61% or network traffic by up to 99% while adding minimal complexity to the Spandex protocol.
Submitted 23 April, 2021;
originally announced April 2021.
-
Demystifying BERT: Implications for Accelerator Design
Authors:
Suchita Pati,
Shaizeen Aga,
Nuwan Jayasena,
Matthew D. Sinclair
Abstract:
Transfer learning in natural language processing (NLP), as realized using models like BERT (Bi-directional Encoder Representation from Transformer), has significantly improved language representation with models that can tackle challenging language problems. Consequently, these applications are driving the requirements of future systems. Thus, we focus on BERT, one of the most popular NLP transfer learning algorithms, to identify how its algorithmic behavior can guide future accelerator design. To this end, we carefully profile BERT training and identify key algorithmic behaviors which are worthy of attention in accelerator design.
We observe that while computations which manifest as matrix multiplication dominate BERT's overall runtime, as in many convolutional neural networks, memory-intensive computations also feature prominently. We characterize these computations, which have received little attention so far. Further, we also identify heterogeneity in compute-intensive BERT computations and discuss software and possible hardware mechanisms to further optimize these computations. Finally, we discuss implications of these behaviors as networks get larger and use distributed training environments, and how techniques such as micro-batching and mixed-precision training scale. Overall, our analysis identifies holistic solutions to optimize systems for BERT-like models.
Submitted 13 April, 2021;
originally announced April 2021.
-
Pulseshape discrimination against low-energy Ar-39 beta decays in liquid argon with 4.5 tonne-years of DEAP-3600 data
Authors:
The DEAP Collaboration,
P. Adhikari,
R. Ajaj,
M. Alpízar-Venegas,
P. -A. Amaudruz,
D. J. Auty,
M. Batygov,
B. Beltran,
H. Benmansour,
C. E. Bina,
J. Bonatt,
W. Bonivento,
M. G. Boulay,
B. Broerman,
J. F. Bueno,
P. M. Burghardt,
A. Butcher,
M. Cadeddu,
B. Cai,
M. Cárdenas-Montes,
S. Cavuoti,
M. Chen,
Y. Chen,
B. T. Cleveland,
J. M. Corning
, et al. (104 additional authors not shown)
Abstract:
The DEAP-3600 detector searches for the scintillation signal from dark matter particles scattering on a 3.3 tonne liquid argon target. The largest background comes from $^{39}$Ar beta decays and is suppressed using pulseshape discrimination (PSD).
We use two types of PSD algorithm: the prompt-fraction, which considers the fraction of the scintillation signal in a narrow and a wide time window around the event peak, and the log-likelihood-ratio, which compares the observed photon arrival times to a signal and a background model. We furthermore use two algorithms to determine the number of photons detected at a given time: (1) simply dividing the charge of each PMT pulse by the charge of a single photoelectron, and (2) a likelihood analysis that considers the probability to detect a certain number of photons at a given time, based on a model for the scintillation pulseshape and for afterpulsing in the light detectors.
The prompt-fraction performs approximately as well as the log-likelihood-ratio PSD algorithm if the photon detection times are not biased by detector effects. We explain this result using a model for the information carried by scintillation photons as a function of the time when they are detected.
Submitted 6 April, 2021; v1 submitted 22 March, 2021;
originally announced March 2021.
-
A generalised feature for low level vision
Authors:
Dr David Sinclair,
Dr Christopher Town
Abstract:
This paper presents a novel quantised transform (the Sinclair-Town or ST transform for short) that subsumes the roles of edge detector, MSER-style region detector, and corner detector. The transform is similar to the $unsharp$ transform but the difference from the local mean is quantised to 3 values (dark-neutral-light). The transform naturally leads to the definition of an appropriate local scale. A range of methods for extracting shape features from the transformed image are presented. The generalised feature provides a robust basis for establishing correspondence between images. The transform readily admits more complicated kernel behaviour including multi-scale and asymmetric elements to prefer shorter scale or oriented local features.
Submitted 3 February, 2021;
originally announced February 2021.
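The quantisation step described in the abstract above (difference from the local mean reduced to dark, neutral, or light) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not give the kernel, the threshold, or the local-scale definition, so the square box window, the `radius` and `threshold` parameters, and the function names `local_mean` and `ternary_transform` are all hypothetical and are not the authors' implementation.

```python
def local_mean(image, row, col, radius):
    """Mean of the pixels in a square window, clipped at the image border.

    The box window is an assumption; the abstract does not specify a kernel.
    """
    rows, cols = len(image), len(image[0])
    values = [
        image[r][c]
        for r in range(max(0, row - radius), min(rows, row + radius + 1))
        for c in range(max(0, col - radius), min(cols, col + radius + 1))
    ]
    return sum(values) / len(values)


def ternary_transform(image, radius=1, threshold=10.0):
    """Quantise each pixel's difference from its local mean to -1, 0 or +1.

    -1 stands for dark, 0 for neutral, +1 for light. The threshold value
    is a hypothetical choice made for this sketch.
    """
    result = []
    for row in range(len(image)):
        out_row = []
        for col in range(len(image[0])):
            diff = image[row][col] - local_mean(image, row, col, radius)
            if diff > threshold:
                out_row.append(1)    # lighter than its neighbourhood
            elif diff < -threshold:
                out_row.append(-1)   # darker than its neighbourhood
            else:
                out_row.append(0)    # close to the local mean
        result.append(out_row)
    return result


# A flat dark background with one bright pixel in the middle.
image = [
    [10, 10, 10],
    [10, 100, 10],
    [10, 10, 10],
]
labels = ternary_transform(image, radius=1, threshold=10.0)

# A perfectly flat image has no difference from its local mean anywhere.
flat_labels = ternary_transform([[5, 5], [5, 5]])
```

In this toy input the bright centre is labelled light and its neighbours are labelled dark, because the bright pixel raises their local mean; a flat image is labelled neutral everywhere.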
-
Characterizing oxygen atoms in perovskite and pyrochlore oxides using ADF-STEM at a resolution of a few tens of picometers
Authors:
Ali Mostaed,
Brant Walkley,
Monica Ciomaga Hatnean,
Geetha Balakrishnan,
Martin R. Lees,
Richard Beanland,
Derek C. Sinclair,
Ian M. Reaney
Abstract:
We present an aberration-corrected scanning transmission electron microscopy (ac-STEM) analysis of the perovskite (LaFeO3) and pyrochlore (Yb2Ti2O7 and Pr2Zr2O7) oxides and demonstrate that both the shape and contrast of visible atomic columns in annular dark-field (ADF) images are sensitive to the presence of nearby atoms of low atomic number (e.g. oxygen). We show that point defects (e.g. oxygen vacancies), which are invisible - or difficult to observe due to limited sensitivity - in X-ray and neutron diffraction measurements, are the origin of the complex magnetic ground state of pyrochlore oxides. In addition, we present, for the first time, a method by which light atoms can be resolved in quantitative ADF-STEM images. Using this method, we resolved oxygen atoms in perovskite and pyrochlore oxides.
Submitted 21 September, 2020;
originally announced September 2020.
-
SeqPoint: Identifying Representative Iterations of Sequence-based Neural Networks
Authors:
Suchita Pati,
Shaizeen Aga,
Matthew D. Sinclair,
Nuwan Jayasena
Abstract:
The ubiquity of deep neural networks (DNNs) continues to rise, making them a crucial application class for hardware optimizations. However, detailed profiling and characterization of DNN training remains difficult as these applications often run for hours to days on real hardware. Prior works exploit the iterative nature of DNNs to profile a few training iterations. While such a strategy is sound for networks like convolutional neural networks (CNNs), where the nature of the computation is largely input independent, we observe in this work that this approach is sub-optimal for sequence-based neural networks (SQNNs) such as recurrent neural networks (RNNs). The amount and nature of computations in SQNNs can vary for each input, resulting in heterogeneity across iterations. Thus, arbitrarily selecting a few iterations is insufficient to accurately summarize the behavior of the entire training run. To tackle this challenge, we carefully study the factors that impact SQNN training iterations and identify input sequence length as the key determining factor for variations across iterations. We then use this observation to characterize all iterations of an SQNN training run (requiring no profiling or simulation of the application) and select representative iterations, which we term SeqPoints. We analyze two state-of-the-art SQNNs, DeepSpeech2 and Google's Neural Machine Translation (GNMT), and show that SeqPoints can represent their entire training runs accurately, resulting in geomean errors of only 0.11% and 0.53%, respectively, when projecting overall runtime and 0.13% and 1.50% when projecting speedups due to architectural changes. This high accuracy is achieved while reducing the time needed for profiling by 345x and 214x for the two networks compared to full training runs. As a result, SeqPoint can enable analysis of SQNN training runs in mere minutes instead of hours or days.
Submitted 20 July, 2020;
originally announced July 2020.
-
Search for $hep$ solar neutrinos and the diffuse supernova neutrino background using all three phases of the Sudbury Neutrino Observatory
Authors:
B. Aharmim,
S. N. Ahmed,
A. E. Anthony,
N. Barros,
E. W. Beier,
A. Bellerive,
B. Beltran,
M. Bergevin,
S. D. Biller,
E. Blucher,
R. Bonventre,
K. Boudjemline,
M. G. Boulay,
B. Cai,
E. J. Callaghan,
J. Caravaca,
Y. D. Chan,
D. Chauhan,
M. Chen,
B. T. Cleveland,
G. A. Cox,
X. Dai,
H. Deng,
F. B. Descamps,
J. A. Detwiler
, et al. (107 additional authors not shown)
Abstract:
A search has been performed for neutrinos from two sources, the $hep$ reaction in the solar $pp$ fusion chain and the $ν_e$ component of the diffuse supernova neutrino background (DSNB), using the full dataset of the Sudbury Neutrino Observatory with a total exposure of 2.47 kton-years after fiducialization. The $hep$ search is performed using both a single-bin counting analysis and a likelihood fit. We find a best-fit flux that is compatible with solar model predictions while remaining consistent with zero flux, and set a one-sided upper limit of $Φ_{hep} < 30\times10^{3}~\mathrm{cm}^{-2}~\mathrm{s}^{-1}$ [90% credible interval (CI)]. No events are observed in the DSNB search region, and we set an improved upper bound on the $ν_e$ component of the DSNB flux of $Φ^\mathrm{DSNB}_{ν_e} < 19~\textrm{cm}^{-2}~\textrm{s}^{-1}$ (90% CI) in the energy range $22.9 < E_ν< 36.9$~MeV.
Submitted 12 November, 2020; v1 submitted 15 July, 2020;
originally announced July 2020.
-
The gem5 Simulator: Version 20.0+
Authors:
Jason Lowe-Power,
Abdul Mutaal Ahmad,
Ayaz Akram,
Mohammad Alian,
Rico Amslinger,
Matteo Andreozzi,
Adrià Armejach,
Nils Asmussen,
Brad Beckmann,
Srikant Bharadwaj,
Gabe Black,
Gedare Bloom,
Bobby R. Bruce,
Daniel Rodrigues Carvalho,
Jeronimo Castrillon,
Lizhong Chen,
Nicolas Derumigny,
Stephan Diestelhorst,
Wendy Elsasser,
Carlos Escuin,
Marjan Fariborz,
Amin Farmahini-Farahani,
Pouya Fotouhi,
Ryan Gambord,
Jayneel Gandhi
, et al. (53 additional authors not shown)
Abstract:
The open-source and community-supported gem5 simulator is one of the most popular tools for computer architecture research. This simulation infrastructure allows researchers to model modern computer hardware at the cycle level, and it has enough fidelity to boot unmodified Linux-based operating systems and run full applications for multiple architectures including x86, Arm, and RISC-V. The gem5 simulator has been under active development over the last nine years since the original gem5 release. In this time, there have been over 7500 commits to the codebase from over 250 unique contributors which have improved the simulator by adding new features, fixing bugs, and increasing the code quality. In this paper, we give an overview of gem5's usage and features, describe the current state of the gem5 simulator, and enumerate the major changes since the initial release of gem5. We also discuss how the gem5 simulator has transitioned to a formal governance model to enable continued improvement and community support for the next 20 years of computer architecture research.
Submitted 29 September, 2020; v1 submitted 6 July, 2020;
originally announced July 2020.
-
Constraints on dark matter-nucleon effective couplings in the presence of kinematically distinct halo substructures using the DEAP-3600 detector
Authors:
P. Adhikari,
R. Ajaj,
C. E. Bina,
W. Bonivento,
M. G. Boulay,
M. Cadeddu,
B. Cai,
M. Cárdenas-Montes,
S. Cavuoti,
Y. Chen,
B. T. Cleveland,
J. M. Corning,
S. Daugherty,
P. DelGobbo,
P. Di Stefano,
L. Doria,
M. Dunford,
A. Erlandson,
S. S. Farahani,
N. Fatemighomi,
G. Fiorillo,
D. Gallacher,
E. A. Garcés,
P. García Abia,
S. Garg
, et al. (59 additional authors not shown)
Abstract:
DEAP-3600 is a single-phase liquid argon detector aiming to directly detect Weakly Interacting Massive Particles (WIMPs), located at SNOLAB (Sudbury, Canada). After analyzing data taken during the first year of operation, a null result was used to place an upper bound on the WIMP-nucleon spin-independent, isoscalar cross section. This study reinterprets this result within a Non-Relativistic Effective Field Theory framework, and further examines how various possible substructures in the local dark matter halo may affect these constraints. Such substructures are hinted at by kinematic structures in the local stellar distribution observed by the Gaia satellite and other recent astronomical surveys. These include the Gaia Sausage (or Enceladus), as well as a number of distinct streams identified in recent studies. Limits are presented for the coupling strength of the effective contact interaction operators $\mathcal{O}_1$, $\mathcal{O}_3$, $\mathcal{O}_5$, $\mathcal{O}_8$, and $\mathcal{O}_{11}$, considering isoscalar, isovector, and xenonphobic scenarios, as well as the specific operators corresponding to millicharge, magnetic dipole, electric dipole, and anapole interactions. The effects of halo substructures on each of these operators are explored as well, showing that the $\mathcal{O}_5$ and $\mathcal{O}_8$ operators are particularly sensitive to the velocity distribution, even at dark matter masses above 100 GeV/$c^2$.
Submitted 5 January, 2022; v1 submitted 29 May, 2020;
originally announced May 2020.
-
Specializing Coherence, Consistency, and Push/Pull for GPU Graph Analytics
Authors:
Giordano Salvador,
Wesley H. Darvin,
Muhammad Huzaifa,
Johnathan Alsop,
Matthew D. Sinclair,
Sarita V. Adve
Abstract:
This work provides the first study to explore the interaction of update propagation with and without fine-grained synchronization (push vs. pull), emerging coherence protocols (GPU vs. DeNovo coherence), and software-centric consistency models (DRF0, DRF1, and DRFrlx) for graph workloads on emerging integrated GPU-CPU systems with native unified shared memory. We study 6 graph applications with 6 graph inputs for a total of 36 workloads running on 12 system (hardware+software) configurations reflecting the above design space of update propagation, coherence, and memory consistency. We make three key contributions. First, we show that there is no single best system configuration for all workloads, motivating systems with flexible coherence and consistency support. Second, we develop a model to accurately predict the best system configuration -- this model can be used by software designers to decide on push vs. pull and the consistency model and by flexible hardware to invoke the appropriate coherence and consistency configuration for the given workload. Third, we show that the design dimensions explored here are inter-dependent, reinforcing the need for software-hardware co-design in the above design dimensions. For example, software designers deciding on push vs. pull must consider the consistency model supported by hardware -- in some cases, push may be better if hardware supports DRFrlx while pull may be better if hardware does not support DRFrlx.
Submitted 25 February, 2020; v1 submitted 19 February, 2020;
originally announced February 2020.
-
Non-Archimedean Electrostatics
Authors:
Christopher D. Sinclair
Abstract:
We introduce ensembles of repelling charged particles restricted to a ball in a non-archimedean field (such as the $p$-adic numbers) with interaction energy between pairs of particles proportional to the logarithm of the ($p$-adic) distance between them. In the {\em canonical ensemble}, a system of $N$ particles is put in contact with a heat bath at fixed inverse temperature $β$ and energy is allowed to flow between the system and the heat bath. Using standard axioms of statistical physics, the relative density of states is given by the $β$ power of the ($p$-adic) absolute value of the Vandermonde determinant in the locations of the particles. The partition function is the normalizing constant (as a function of $β$) of this ensemble, and we identify a recursion that allows this to be computed explicitly in finite time. Probabilities of interest, including the probabilities that specified subsets will have a prescribed occupation number of particles, and the conditional distribution of particles within a subset given a prescribed occupation number, are given explicitly in terms of the partition function. We then turn to the {\em grand canonical ensemble} where both the energy and number of particles are variable. We compute similar probabilities to those in the canonical ensemble and show how these probabilities can be given in terms of the canonical and grand canonical partition functions. Finally, we briefly consider the multi-component ensemble where particles are allowed to take different integer charges, and we connect basic properties of this ensemble to the canonical and grand canonical ensembles.
Submitted 17 February, 2020;
originally announced February 2020.
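The relative density of states named in the abstract above, the $β$ power of the $p$-adic absolute value of the Vandermonde determinant, can be evaluated directly for a small configuration. The sketch below is illustrative only: it restricts particle locations to ordinary integers (a subset of the $p$-adic integers), and the function names `p_adic_abs` and `relative_density` are hypothetical. It does not implement the paper's recursion for the partition function.

```python
from fractions import Fraction
from itertools import combinations


def p_adic_abs(n, p):
    """p-adic absolute value of an integer n: p**(-v), where p**v exactly divides n."""
    if n == 0:
        return Fraction(0)
    n = abs(n)
    valuation = 0
    while n % p == 0:
        n //= p
        valuation += 1
    return Fraction(1, p ** valuation)


def relative_density(points, p, beta):
    """|prod_{i<j} (x_j - x_i)|_p ** beta for integer particle locations.

    The p-adic absolute value is multiplicative, so the absolute value of
    the Vandermonde determinant is the product over pairs of |x_j - x_i|_p.
    """
    vandermonde_abs = Fraction(1)
    for x_i, x_j in combinations(points, 2):
        vandermonde_abs *= p_adic_abs(x_j - x_i, p)
    return vandermonde_abs ** beta


# Four particles at 0, 1, 2, 3 with p = 2 and beta = 2.
# Pairwise differences are 1, 2, 3, 1, 2, 1, whose product is 12, and |12|_2 = 1/4.
density = relative_density([0, 1, 2, 3], p=2, beta=2)

# Two particles at the same location have zero relative density (repulsion).
coincident = relative_density([5, 5], p=2, beta=2)
```

Using `Fraction` keeps the result exact for integer `beta`; a non-integer `beta` would give a floating-point value instead.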
-
Measurement of the Spectral Shape of the beta-decay of 137Xe to the Ground State of 137Cs in EXO-200 and Comparison with Theory
Authors:
S. Al Kharusi,
G. Anton,
I. Badhrees,
P. S. Barbeau,
D. Beck,
V. Belov,
T. Bhatta,
M. Breidenbach,
T. Brunner,
G. F. Cao,
W. R. Cen,
C. Chambers,
B. Cleveland,
M. Coon,
A. Craycraft,
T. Daniels,
L. Darroch,
S. J. Daugherty,
J. Davis,
S. Delaquis,
A. Der Mesrobian-Kabakian,
R. DeVoe,
J. Dilling,
A. Dolgolenko,
M. J. Dolinski
, et al. (83 additional authors not shown)
Abstract:
We report on a comparison between the theoretically predicted and experimentally measured spectra of the first-forbidden non-unique $β$-decay transition $^{137}\textrm{Xe}(7/2^-)\to\,^{137}\textrm{Cs}(7/2^+)$. The experimental data were acquired by the EXO-200 experiment during a deployment of an AmBe neutron source. The ultra-low background environment of EXO-200, together with dedicated source deployment and analysis procedures, allowed for collection of a pure sample of the decays, with an estimated signal-to-background ratio of more than 99-to-1 in the energy range from 1075 to 4175 keV. In addition to providing a rare and accurate measurement of the first-forbidden non-unique $β$-decay shape, this work constitutes a novel test of the calculated electron spectral shapes in the context of the reactor antineutrino anomaly and spectral bump.
Submitted 7 May, 2020; v1 submitted 31 January, 2020;
originally announced February 2020.
-
Applying Complex Langevin to Lattice QCD at finite $μ$
Authors:
D. K. Sinclair,
J. B. Kogut
Abstract:
We continue our simulations of lattice QCD at finite quark-number chemical potential, $μ$, using the complex-Langevin equation (CLE) with gauge-cooling and adaptive updating. The CLE is used because QCD at finite $μ$ has a complex fermion determinant, which prevents use of standard simulation methods. Simulations using the standard lattice action show a transition from hadronic to nuclear matter for $μ< m_π/2$ rather than the expected $μ\approx m_N/3$. This suggests that the CLE is being influenced by the phase-quenched theory, which has a transition at $μ= m_π/2$. We are therefore performing CLE simulations with a new action which includes an irrelevant chiral 4-fermion interaction. This separates the physics at energies of order of the pion mass and smaller from that at energies of the other hadrons. In doing this, it breaks the extended symmetry of the phase-quenched theory over that of the full theory, raising the masses of the extra pion-like excitations consisting of a quark and a conjugate quark, which could otherwise produce such an anomalous transition. Our preliminary CLE simulations using massless quarks, so that $m_π=0$, show no transition at $μ=m_π/2=0$, but do show a transition at an appreciably higher value of $μ$. It remains to be seen if this transition is near to $m_N/3$.
Submitted 24 October, 2019;
originally announced October 2019.
-
Optimizing GPU Cache Policies for MI Workloads
Authors:
Johnathan Alsop,
Matthew D. Sinclair,
Srikant Bharadwaj,
Alexandru Dutu,
Anthony Gutierrez,
Onur Kayiran,
Michael LeBeane,
Sooraj Puthoor,
Xianwei Zhang,
Tsung Tai Yeh,
Bradford M. Beckmann
Abstract:
In recent years, machine intelligence (MI) applications have emerged as a major driver for the computing industry. Optimizing these workloads is important but complicated. As memory demands grow and data movement overheads increasingly limit performance, determining the best GPU caching policy to use for a diverse range of MI workloads represents one important challenge. To study this, we evaluate 17 MI applications and characterize their behaviors using a range of GPU caching strategies. In our evaluations, we find that the choice of caching policy in GPU caches involves multiple performance trade-offs and interactions, and there is no one-size-fits-all GPU caching policy for MI workloads. Based on detailed simulation results, we motivate and evaluate a set of cache optimizations that consistently match the performance of the best static GPU caching policies.
Submitted 30 September, 2019;
originally announced October 2019.
-
Cosmogenic Neutron Production at the Sudbury Neutrino Observatory
Authors:
B. Aharmim,
S. N. Ahmed,
A. E. Anthony,
N. Barros,
E. W. Beier,
A. Bellerive,
B. Beltran,
M. Bergevin,
S. D. Biller,
R. Bonventre,
K. Boudjemline,
M. G. Boulay,
B. Cai,
E. J. Callaghan,
J. Caravaca,
Y. D. Chan,
D. Chauhan,
M. Chen,
B. T. Cleveland,
G. A. Cox,
R. Curley,
X. Dai,
H. Deng,
F. B. Descamps,
J. A. Detwiler
, et al. (106 additional authors not shown)
Abstract:
Neutrons produced in nuclear interactions initiated by cosmic-ray muons present an irreducible background to many rare-event searches, even in detectors located deep underground. Models for the production of these neutrons have been tested against previous experimental data, but the extrapolation to deeper sites is not well understood. Here we report results from an analysis of cosmogenically produced neutrons at the Sudbury Neutrino Observatory. A specific set of observables is presented, which can be used to benchmark the validity of GEANT4 physics models. In addition, the cosmogenic neutron yield, in units of $10^{-4}\;\text{cm}^{2}/\left(\text{g}\cdotμ\right)$, is measured to be $7.28 \pm 0.09\;\text{stat.} ^{+1.59}_{-1.12}\;\text{syst.}$ in pure heavy water and $7.30 \pm 0.07\;\text{stat.} ^{+1.40}_{-1.02}\;\text{syst.}$ in NaCl-loaded heavy water. These results provide unique insights into this potential background source for experiments at SNOLAB.
Submitted 25 September, 2019;
originally announced September 2019.
-
Measurement of the scintillation and ionization response of liquid xenon at MeV energies in the EXO-200 experiment
Authors:
EXO-200 Collaboration,
G. Anton,
I. Badhrees,
P. S. Barbeau,
D. Beck,
V. Belov,
T. Bhatta,
M. Breidenbach,
T. Brunner,
G. F. Cao,
W. R. Cen,
C. Chambers,
B. Cleveland,
M. Coon,
A. Craycraft,
T. Daniels,
L. Darroch,
S. J. Daugherty,
J. Davis,
S. Delaquis,
A. Der Mesrobian-Kabakian,
R. DeVoe,
J. Dilling,
A. Dolgolenko
, et al. (78 additional authors not shown)
Abstract:
Liquid xenon (LXe) is employed in a number of current and future detectors for rare event searches. We use the EXO-200 experimental data to measure the absolute scintillation and ionization yields generated by $γ$ interactions from $^{228}$Th (2615~keV), $^{226}$Ra (1764~keV) and $^{60}$Co (1332~keV and 1173~keV) calibration sources, over a range of electric fields. The $W$-value that defines the recombination-independent energy scale is measured to be $11.5~\pm~0.5$~(syst.)~$\pm~0.1$~(stat.) eV. These data are also used to measure the recombination fluctuations in the number of electrons and photons produced by the calibration sources at the MeV-scale, which deviate from extrapolations of lower-energy data. Additionally, a semi-empirical model for the energy resolution of the detector is developed, which is used to constrain the recombination efficiency, i.e., the fraction of recombined electrons that result in the emission of a detectable photon. Detailed measurements of the absolute charge and light yields for MeV-scale electron recoils are important for predicting the performance of future neutrinoless double beta decay detectors.
Submitted 15 June, 2020; v1 submitted 12 August, 2019;
originally announced August 2019.
-
Search for Neutrinoless Double-Beta Decay with the Complete EXO-200 Dataset
Authors:
G. Anton,
I. Badhrees,
P. S. Barbeau,
D. Beck,
V. Belov,
T. Bhatta,
M. Breidenbach,
T. Brunner,
G. F. Cao,
W. R. Cen,
C. Chambers,
B. Cleveland,
M. Coon,
A. Craycraft,
T. Daniels,
M. Danilov,
L. Darroch,
S. J. Daugherty,
J. Davis,
S. Delaquis,
A. Der Mesrobian-Kabakian,
R. DeVoe,
J. Dilling,
A. Dolgolenko,
M. J. Dolinski
, et al. (77 additional authors not shown)
Abstract:
A search for neutrinoless double-beta decay ($0νββ$) in $^{136}$Xe is performed with the full EXO-200 dataset using a deep neural network to discriminate between $0νββ$ and background events. Relative to previous analyses, the signal detection efficiency has been raised from 80.8% to 96.4$\pm$3.0% and the energy resolution of the detector at the Q-value of $^{136}$Xe $0νββ$ has been improved from $σ/E=1.23\%$ to $1.15\pm0.02\%$ with the upgraded detector. Accounting for the new data, the median 90% confidence level $0νββ$ half-life sensitivity for this analysis is $5.0 \cdot 10^{25}$ yr with a total $^{136}$Xe exposure of 234.1 kg$\cdot$yr. No statistically significant evidence for $0νββ$ is observed, leading to a lower limit on the $0νββ$ half-life of $3.5\cdot10^{25}$ yr at the 90% confidence level.
Submitted 18 October, 2019; v1 submitted 6 June, 2019;
originally announced June 2019.
-
Measurement of neutron production in atmospheric neutrino interactions at the Sudbury Neutrino Observatory
Authors:
SNO Collaboration,
B. Aharmim,
S. N. Ahmed,
A. E. Anthony,
N. Barros,
E. W. Beier,
A. Bellerive,
B. Beltran,
M. Bergevin,
S. D. Biller,
R. Bonventre,
K. Boudjemline,
M. G. Boulay,
B. Cai,
E. J. Callaghan,
J. Caravaca,
Y. D. Chan,
D. Chauhan,
M. Chen,
B. T. Cleveland,
G. A. Cox,
X. Dai,
H. Deng,
F. B. Descamps,
J. A. Detwiler
, et al. (107 additional authors not shown)
Abstract:
Neutron production in GeV-scale neutrino interactions is a poorly studied process. We have measured the neutron multiplicities in atmospheric neutrino interactions in the Sudbury Neutrino Observatory experiment and compared them to the prediction of a Monte Carlo simulation using GENIE and a minimally modified version of GEANT4. We analyzed 837 days of exposure corresponding to Phase I, using pure heavy water, and Phase II, using a mixture of Cl in heavy water. Neutrons produced in atmospheric neutrino interactions were identified with an efficiency of $15.3\%$ and $44.3\%$, for Phase I and II respectively. The neutron production is measured as a function of the visible energy of the neutrino interaction and, for charged current quasi-elastic interaction candidates, also as a function of the neutrino energy. This study is also performed classifying the complete sample into two pairs of event categories: charged current quasi-elastic and non charged current quasi-elastic, and $ν_μ$ and $ν_e$. Results show good overall agreement between data and Monte Carlo for both phases, with some small tension with a statistical significance below $2σ$ for some intermediate energies.
Submitted 19 June, 2019; v1 submitted 1 April, 2019;
originally announced April 2019.
-
Applying Complex Langevin Simulations to Lattice QCD at Finite Density
Authors:
J. B. Kogut,
D. K. Sinclair
Abstract:
We study the use of the complex-Langevin equation (CLE) to simulate lattice QCD at a finite chemical potential ($μ$) for quark-number, which has a complex fermion determinant that prevents the use of standard simulation methods based on importance sampling. Recent enhancements to the CLE specific to lattice QCD inhibit runaway solutions which had foiled earlier attempts to use it for such simulations. However, it is not guaranteed to produce correct results. Our goal is to determine under what conditions the CLE yields correct values for the observables of interest. Zero temperature simulations indicate that for moderate couplings, good agreement with expected results is obtained for small $μ$ and for $μ$ large enough to reach saturation, and that this agreement improves as we go to weaker coupling. For intermediate $μ$ values these simulations do not produce the correct physics. We compare our results with those of the phase-quenched approximation. Since there are indications that correct results might be obtained if the CLE trajectories remain close to the $SU(3)$ manifold, we study how the distance from this manifold depends on the quark mass and on the coupling. We find that this distance decreases with decreasing quark mass and as the coupling decreases, i.e. as the simulations approach the continuum limit.
Submitted 21 August, 2019; v1 submitted 6 March, 2019;
originally announced March 2019.
-
Constraints on Neutrino Lifetime from the Sudbury Neutrino Observatory
Authors:
SNO Collaboration,
B. Aharmim,
S. N. Ahmed,
A. E. Anthony,
N. Barros,
E. W. Beier,
A. Bellerive,
B. Beltran,
M. Bergevin,
S. D. Biller,
R. Bonventre,
K. Boudjemline,
M. G. Boulay,
B. Cai,
E. J. Callaghan,
J. Caravaca,
Y. D. Chan,
D. Chauhan,
M. Chen,
B. T. Cleveland,
G. A. Cox,
X. Dai,
H. Deng,
F. B. Descamps,
J. A. Detwiler
, et al. (106 additional authors not shown)
Abstract:
The long baseline between the Earth and the Sun makes solar neutrinos an excellent test beam for exploring possible neutrino decay. The signature of such decay would be an energy-dependent distortion of the traditional survival probability which can be fit for using well-developed and high precision analysis methods. Here a model including neutrino decay is fit to all three phases of $^8$B solar neutrino data taken by the Sudbury Neutrino Observatory. This fit constrains the lifetime of neutrino mass state $ν_2$ to be ${>8.08\times10^{-5}}$ s/eV at $90\%$ confidence. An analysis combining this SNO result with those from other solar neutrino experiments results in a combined limit for the lifetime of mass state $ν_2$ of ${>1.04\times10^{-3}}$ s/eV at $99\%$ confidence.
Submitted 3 December, 2018;
originally announced December 2018.
-
Analyzing Machine Learning Workloads Using a Detailed GPU Simulator
Authors:
Jonathan Lew,
Deval Shah,
Suchita Pati,
Shaylin Cattell,
Mengchi Zhang,
Amruth Sandhupatla,
Christopher Ng,
Negar Goli,
Matthew D. Sinclair,
Timothy G. Rogers,
Tor Aamodt
Abstract:
Most deep neural networks deployed today are trained using GPUs via high-level frameworks such as TensorFlow and PyTorch. This paper describes changes we made to the GPGPU-Sim simulator to enable it to run PyTorch by running PTX kernels included in NVIDIA's cuDNN library. We use the resulting modified simulator, which has been made available publicly with this paper, to study some simple deep learning workloads. With our changes to GPGPU-Sim's functional simulation model, we find that GPGPU-Sim's performance model running a cuDNN-enabled implementation of LeNet for MNIST reports results within 30% of real hardware. Using GPGPU-Sim's AerialVision performance analysis tool we observe that cuDNN API calls contain many varying phases and appear to include potentially inefficient microarchitecture behaviour such as DRAM partition bank camping, at least when executed on GPGPU-Sim's current performance model.
Submitted 26 January, 2019; v1 submitted 18 November, 2018;
originally announced November 2018.
-
Tests of Lorentz invariance at the Sudbury Neutrino Observatory
Authors:
SNO Collaboration,
B. Aharmim,
S. N. Ahmed,
A. E. Anthony,
N. Barros,
E. W. Beier,
A. Bellerive,
B. Beltran,
M. Bergevin,
S. D. Biller,
E. Blucher,
R. Bonventre,
K. Boudjemline,
M. G. Boulay,
B. Cai,
E. J. Callaghan,
J. Caravaca,
Y. D. Chan,
D. Chauhan,
M. Chen,
B. T. Cleveland,
G. A. Cox,
X. Dai,
H. Deng,
F. B. Descamps
, et al. (109 additional authors not shown)
Abstract:
Experimental tests of Lorentz symmetry in systems of all types are critical for ensuring that the basic assumptions of physics are well-founded. Data from all phases of the Sudbury Neutrino Observatory, a kiloton-scale heavy water Cherenkov detector, are analyzed for possible violations of Lorentz symmetry in the neutrino sector. Such violations would appear as one of eight possible signal types in the detector: six seasonal variations in the solar electron neutrino survival probability differing in energy and time dependence, and two shape changes to the oscillated solar neutrino energy spectrum. No evidence for such signals is observed, and limits on the size of such effects are established in the framework of the Standard Model Extension, including 40 limits on previously unconstrained operators and improved limits on 15 additional operators. This makes limits on all minimal, Dirac-type Lorentz violating operators in the neutrino sector available for the first time.
Submitted 3 January, 2019; v1 submitted 31 October, 2018;
originally announced November 2018.
-
Complex Langevin for Lattice QCD
Authors:
D. K. Sinclair,
J. B. Kogut
Abstract:
We simulate lattice QCD at finite quark-number chemical potential, $μ$, using the complex-Langevin equation (CLE) with gauge-cooling and adaptive updating to prevent instabilities. The CLE is used because QCD at finite $μ$ has a complex fermion determinant which precludes the use of standard simulation methods based on importance sampling. Since, even when CLE simulations converge, they are not guaranteed to produce correct results except under very stringent conditions, which lattice QCD at finite $μ$ does not obey, we need extensive testing to determine under what conditions it produces reliable results. We performed simulations at $β=6/g^2=5.6$ and $β=5.7$, both at $m=0.025$. For small $μ$ and $μ$ large enough to produce saturation, measured observables appear to be approaching their correct values as the coupling is decreased. However, for intermediate $μ$ values, these simulations predict a transition from hadronic to nuclear matter at a $μ$ which is far too small. Since there is evidence that for CLE simulations to produce correct results the trajectories should remain close to the $SU(3)$ manifold (at least for small $μ$), we explore the parameter space to see where this is true. We find that the distance from this manifold decreases as the coupling decreases and as the quark mass (in lattice units) decreases, i.e. as we approach the continuum limit. This indicates that we need to simulate at smaller couplings and quark masses (requiring larger lattices) to see if these can produce the correct physics.
Submitted 28 October, 2018;
originally announced October 2018.
-
Study of Silicon Photomultiplier Performance in External Electric Fields
Authors:
X. L. Sun,
T. Tolba,
G. F. Cao,
P. Lv,
L. J. Wen,
A. Odian,
F. Vachon,
A. Alamre,
J. B. Albert,
G. Anton,
I. J. Arnquist,
I. Badhrees,
P. S. Barbeau,
D. Beck,
V. Belov,
T. Bhatta,
F. Bourque,
J. P. Brodsky,
E. Brown,
T. Brunner,
A. Burenkov,
L. Cao,
W. R. Cen,
C. Chambers,
S. A. Charlebois
, et al. (127 additional authors not shown)
Abstract:
We report on the performance of silicon photomultiplier (SiPM) light sensors operating in electric fields of strength up to 30 kV/cm and at a temperature of 149 K, relative to their performance in the absence of an external electric field. The SiPM devices used in this study show stable gain, photon detection efficiency, and rates of correlated pulses, when exposed to external fields, within the estimated uncertainties. No observable physical damage to the bulk or surface of the devices was caused by the exposure.
Submitted 9 July, 2018;
originally announced July 2018.
-
Imaging individual barium atoms in solid xenon for barium tagging in nEXO
Authors:
C. Chambers,
T. Walton,
D. Fairbank,
A. Craycraft,
D. R. Yahne,
J. Todd,
A. Iverson,
W. Fairbank,
A. Alamare,
J. B. Albert,
G. Anton,
I. J. Arnquist,
I. Badhrees,
P. S. Barbeau,
D. Beck,
V. Belov,
T. Bhatta,
F. Bourque,
J. P. Brodsky,
E. Brown,
T. Brunner,
A. Burenkov,
G. F. Cao,
L. Cao,
W. R. Cen
, et al. (126 additional authors not shown)
Abstract:
The search for neutrinoless double beta decay probes the fundamental properties of neutrinos, including whether or not the neutrino and antineutrino are distinct. Double beta detectors are large and expensive, so background reduction is essential for extracting the highest sensitivity. The identification, or 'tagging', of the $^{136}$Ba daughter atom from double beta decay of $^{136}$Xe provides a technique for eliminating backgrounds in the nEXO neutrinoless double beta decay experiment. The tagging scheme studied in this work utilizes a cryogenic probe to trap the barium atom in solid xenon, where the barium atom is tagged via fluorescence imaging in the solid xenon matrix. Here we demonstrate imaging and counting of individual atoms of barium in solid xenon by scanning a focused laser across a solid xenon matrix deposited on a sapphire window. When the laser sits on an individual atom, the fluorescence persists for $\sim$30~s before dropping abruptly to the background level, a clear confirmation of one-atom imaging. No barium fluorescence persists following evaporation of a barium deposit to a limit of $\leq$0.16\%. This is the first time that single atoms have been imaged in a solid noble element. It establishes the basic principle of a barium tagging technique for nEXO.
Submitted 12 December, 2018; v1 submitted 27 June, 2018;
originally announced June 2018.
-
The reciprocal Mahler ensembles of random polynomials
Authors:
Christopher D. Sinclair,
Maxim L. Yattselev
Abstract:
We consider the roots of uniformly chosen complex and real reciprocal polynomials of degree $N$ whose Mahler measure is bounded by a constant. After a change of variables this reduces to a generalization of Ginibre's complex and real ensembles of random matrices where the weight function (on the eigenvalues of the matrices) is replaced by the exponentiated equilibrium potential of the interval $[-2,2]$ on the real axis in the complex plane. In the complex (real) case the random roots form a determinantal (Pfaffian) point process, and in both cases the empirical measure on roots converges weakly to the arcsine distribution supported on $[-2,2]$. Outside this region the kernels converge without scaling, implying among other things that there is a positive expected number of outliers away from $[-2,2]$. These kernels, as well as the scaling limits for the kernels in the bulk $(-2,2)$ and at the endpoints $\{-2,2\}$ are presented. These kernels appear to be new, and we compare their behavior with related kernels which arise from the (non-reciprocal) Mahler measure ensemble of random polynomials as well as the classical Sine and Bessel kernels.
Submitted 7 June, 2018;
originally announced June 2018.
-
VUV-sensitive Silicon Photomultipliers for Xenon Scintillation Light Detection in nEXO
Authors:
A. Jamil,
T. Ziegler,
P. Hufschmidt,
G. Li,
L. Lupin-Jimenez,
T. Michel,
I. Ostrovskiy,
F. Retière,
J. Schneider,
M. Wagenpfeil,
J. B. Albert,
G. Anton,
I. J. Arnquist,
I. Badhrees,
P. Barbeau,
D. Beck,
V. Belov,
J. P. Brodsky,
E. Brown,
T. Brunner,
A. Burenkov,
G. F. Cao,
L. Cao,
W. R. Cen,
C. Chambers
, et al. (118 additional authors not shown)
Abstract:
Future tonne-scale liquefied noble gas detectors depend on efficient light detection in the VUV range. In recent years Silicon Photomultipliers (SiPMs) have emerged as a valid alternative to standard photomultiplier tubes or large area avalanche photodiodes. The next generation double beta decay experiment, nEXO, with a 5 tonne liquid xenon time projection chamber, will use SiPMs for detecting the $178\,\text{nm}$ xenon scintillation light, in order to achieve an energy resolution of $σ/ Q_{ββ} = 1\, \%$. This paper presents recent measurements of the VUV-HD generation SiPMs from Fondazione Bruno Kessler in two complementary setups. It includes measurements of the photon detection efficiency with gaseous xenon scintillation light in a vacuum setup and dark measurements in a dry nitrogen gas setup. We report improved photon detection efficiency at $175\,\text{nm}$ compared to previous-generation devices, which would meet the criteria of nEXO. Furthermore, we present the projected nEXO detector light collection and energy resolution that could be achieved by using these SiPMs.
Submitted 13 March, 2019; v1 submitted 6 June, 2018;
originally announced June 2018.
-
Deep Neural Networks for Energy and Position Reconstruction in EXO-200
Authors:
S. Delaquis,
M. J. Jewell,
I. Ostrovskiy,
M. Weber,
T. Ziegler,
J. Dalmasson,
L. J. Kaufman,
T. Richards,
J. B. Albert,
G. Anton,
I. Badhrees,
P. S. Barbeau,
R. Bayerlein,
D. Beck,
V. Belov,
M. Breidenbach,
T. Brunner,
G. F. Cao,
W. R. Cen,
C. Chambers,
B. Cleveland,
M. Coon,
A. Craycraft,
W. Cree,
T. Daniels
, et al. (69 additional authors not shown)
Abstract:
We apply deep neural networks (DNN) to data from the EXO-200 experiment. In the studied cases, the DNN is able to reconstruct the relevant parameters - total energy and position - directly from raw digitized waveforms, with minimal exceptions. For the first time, the developed algorithms are evaluated on real detector calibration data. The accuracy of reconstruction either reaches or exceeds what was achieved by the conventional approaches developed by EXO-200 over the course of the experiment. Most existing DNN approaches to event reconstruction and classification in particle physics are trained on Monte Carlo simulated events. Such algorithms are inherently limited by the accuracy of the simulation. We describe a unique approach that, in an experiment such as EXO-200, allows certain reconstruction and analysis tasks to be performed successfully by training the network on waveforms from experimental data, either reducing or eliminating the reliance on the Monte Carlo.
Submitted 30 August, 2018; v1 submitted 25 April, 2018;
originally announced April 2018.
-
Complex Langevin Simulations of QCD at Finite Density -- Progress Report
Authors:
D. K. Sinclair,
J. B. Kogut
Abstract:
We simulate lattice QCD at finite quark-number chemical potential to study nuclear matter, using the complex Langevin equation (CLE). The CLE is used because the fermion determinant is complex so that standard methods relying on importance sampling fail. Adaptive methods and gauge-cooling are used to prevent runaway solutions. Even then, the CLE is not guaranteed to give correct results. We are therefore performing extensive testing to determine under what, if any, conditions we can achieve reliable results. Our earlier simulations at $β=6/g^2=5.6$, $m=0.025$ on a $12^4$ lattice reproduced the expected phase structure but failed in the details. Our current simulations at $β=5.7$ on a $16^4$ lattice fail in similar ways while showing some improvement. We are therefore moving to even weaker couplings to see if the CLE might produce the correct results in the continuum (weak-coupling) limit, or, if it still fails, whether it might reproduce the results of the phase-quenched theory. We also discuss action (and other dynamics) modifications which might improve the performance of the CLE.
Submitted 23 October, 2017;
originally announced October 2017.