Search | arXiv e-print repository

Gemma 2: Improving Open Language Models at a Practical Size

Authors: Gemma Team, Morgane Riviere, Shreya Pathak, Pier Giuseppe Sessa, Cassidy Hardin, Surya Bhupatiraju, Léonard Hussenot, Thomas Mesnard, Bobak Shahriari, Alexandre Ramé, Johan Ferret, Peter Liu, Pouya Tafti, Abe Friesen, Michelle Casbon, Sabela Ramos, Ravin Kumar, Charline Le Lan, Sammy Jerome, Anton Tsitsulin, Nino Vieillard, Piotr Stanczyk, Sertan Girgin, Nikola Momchev, Matt Hoffman, et al. (172 additional authors not shown)

Abstract: In this work, we introduce Gemma 2, a new addition to the Gemma family of lightweight, state-of-the-art open models, ranging in scale from 2 billion to 27 billion parameters. In this new version, we apply several known technical modifications to the Transformer architecture, such as interleaving local-global attentions (Beltagy et al., 2020a) and group-query attention (Ainslie et al., 2023). We also train the 2B and 9B models with knowledge distillation (Hinton et al., 2015) instead of next token prediction. The resulting models deliver the best performance for their size, and even offer competitive alternatives to models that are 2-3 times bigger. We release all our models to the community.

Submitted 2 August, 2024; v1 submitted 31 July, 2024; originally announced August 2024.

arXiv:2407.15786 [pdf, other]

Concept-Based Interpretable Reinforcement Learning with Limited to No Human Labels

Authors: Zhuorui Ye, Stephanie Milani, Geoffrey J. Gordon, Fei Fang

Abstract: Recent advances in reinforcement learning (RL) have predominantly leveraged neural network-based policies for decision-making, yet these models often lack interpretability, posing challenges for stakeholder comprehension and trust. Concept bottleneck models offer an interpretable alternative by integrating human-understandable concepts into neural networks. However, a significant limitation in prior work is the assumption that human annotations for these concepts are readily available during training, necessitating continuous real-time input from human annotators. To overcome this limitation, we introduce a novel training scheme that enables RL algorithms to efficiently learn a concept-based policy by only querying humans to label a small set of data, or in the extreme case, without any human labels. Our algorithm, LICORICE, involves three main contributions: interleaving concept learning and RL training, using concept ensembles to actively select informative data points for labeling, and decorrelating the concept data with a simple strategy. We show how LICORICE reduces manual labeling efforts to 500 or fewer concept labels in three environments. Finally, we present an initial study to explore how we can use powerful vision-language models to infer concepts from raw visual inputs without explicit labels at minimal cost to performance.

Submitted 22 July, 2024; originally announced July 2024.

Comments: 23 pages, 6 figures, 9 tables

arXiv:2407.09387 [pdf]

Meta-Analysis with Untrusted Data

Authors: Shiva Kaul, Geoffrey J. Gordon

Abstract: [See paper for full abstract] Meta-analysis is a crucial tool for answering scientific questions. It is usually conducted on a relatively small amount of ``trusted'' data -- ideally from randomized, controlled trials -- which allow causal effects to be reliably estimated with minimal assumptions. We show how to answer causal questions much more precisely by making two changes. First, we incorporate untrusted data drawn from large observational databases, related scientific literature and practical experience -- without sacrificing rigor or introducing strong assumptions. Second, we train richer models capable of handling heterogeneous trials, addressing a long-standing challenge in meta-analysis. Our approach is based on conformal prediction, which fundamentally produces rigorous prediction intervals, but doesn't handle indirect observations: in meta-analysis, we observe only noisy effects due to the limited number of participants in each trial. To handle noise, we develop a simple, efficient version of fully-conformal kernel ridge regression, based on a novel condition called idiocentricity. We introduce noise-correcting terms in the residuals and analyze their interaction with a ``variance shaving'' technique. In multiple experiments on healthcare datasets, our algorithms deliver tighter, sounder intervals than traditional ones. This paper charts a new course for meta-analysis and evidence-based medicine, where heterogeneity and untrusted data are embraced for more nuanced and precise predictions.

Submitted 12 July, 2024; originally announced July 2024.

Comments: Full-length version of conference submission

arXiv:2406.15730 [pdf, other]

Proton discrimination in CLYC for fast neutron spectroscopy

Authors: J. A. Brown, B. L. Goldblum, J. M. Gordon, T. A. Laplace, T. S. Nagel, A. Venkatraman

Abstract: The Cs$_2$LiYCl$_6$:Ce (CLYC) elpasolite scintillator is known for its response to fast and thermal neutrons along with good $γがんま$-ray energy resolution. While the $^{35}$Cl($n,p$) reaction has been identified as a potential means for CLYC-based fast neutron spectroscopy in the absence of time-of-flight (TOF), previous efforts to functionalize CLYC as a fast neutron spectrometer have been thwarted by the inability to isolate proton interactions from $^{6}$Li($n,αあるふぁ$) and $^{35}$Cl($n,αあるふぁ$) signals. This work introduces a new approach to particle discrimination in CLYC for fission spectrum neutrons using a multi-gate charge integration algorithm that provides excellent separation between protons and heavier charged particles. Neutron TOF data were collected using a $^{252}$Cf source, an array of EJ-309 organic liquid scintillators, and a $^6$Li-enriched CLYC scintillator outfitted with fast electronics. Modal waveforms were constructed corresponding to the different reaction channels, revealing significant differences in the pulse characteristics of protons and heavier charged particles at ultrafast, fast, and intermediate time scales. These findings informed the design of a pulse shape discrimination algorithm, which was validated using the TOF data. This study also proposes an iterative subtraction method to mitigate contributions from confounding reaction channels in proton and heavier charged particle pulse height spectra, opening the door for CLYC-based fast neutron and $γがんま$-ray spectroscopy while preserving sensitivity to thermal neutron capture signals.

Submitted 22 June, 2024; originally announced June 2024.

Comments: 9 pages, 6 figures

arXiv:2405.03147 [pdf]

doi 10.1007/s10278-024-01100-2

Data Format Standardization and DICOM Integration for Hyperpolarized 13C MRI

Authors: Ernesto Diaz, Renuka Sriram, Jeremy W. Gordon, Avantika Sinha, Xiaoxi Liu, Sule Sahin, Jason Crane, Marram P Olson, Hsin-Yu Chen, Jenna Bernard, Daniel B. Vigneron, Zhen Jane Wang, Duan Xu, Peder E. Z. Larson

Abstract: Hyperpolarized (HP) 13C MRI has shown promise as a valuable modality for in vivo measurements of metabolism and is currently in human trials at 15 research sites worldwide. With this growth it is important to adopt standardized data storage practices as it will allow sites to meaningfully compare data. In this paper we (1) describe data that we believe should be stored and (2) demonstrate pipelines and methods that utilize the Digital Imaging and Communications in Medicine (DICOM) standard. This includes proposing a minimum set of information that is specific to HP 13C MRI studies. We then show where the majority of these can be fit into existing DICOM Attributes, primarily via the "Contrast/Bolus" module. We also demonstrate pipelines for utilizing DICOM for HP 13C MRI. DICOM is the most common standard for clinical medical image storage and provides the flexibility to accommodate the unique aspects of HP 13C MRI, including the HP agent information but also spectroscopic and metabolite dimensions. The pipelines shown include creating DICOM objects for studies on human and animal imaging systems with various pulse sequences. We also show a Python-based method to efficiently modify DICOM objects to incorporate the unique HP 13C MRI information that is not captured by existing pipelines. Moreover, we propose best practices for HP 13C MRI data storage that will support future multi-site trials, research studies and technical developments of this imaging technique.

Submitted 5 May, 2024; originally announced May 2024.

arXiv:2404.12344 [pdf, other]

Modeling nonlinear scales with COLA: preparing for LSST-Y1

Authors: Jonathan Gordon, Bernardo F. de Aguiar, João Rebouças, Guilherme Brando, Felipe Falciano, Vivian Miranda, Kazuya Koyama, Hans A. Winther

Abstract: Year 1 results of the Legacy Survey of Space and Time (LSST) will provide tighter constraints on small-scale cosmology, beyond the validity of linear perturbation theory. This heightens the demand for a computationally affordable prescription that can accurately capture nonlinearities in beyond-$Λらむだ$CDM models. The COmoving Lagrangian Acceleration (COLA) method, a cost-effective \textit{N}-body technique, has been proposed as a viable alternative to high-resolution \textit{N}-body simulations for training emulators of the nonlinear matter power spectrum. In this study, we evaluate this approach by employing COLA emulators to conduct a cosmic shear analysis with LSST-Y1 simulated data across three different nonlinear scale cuts. We use the $w$CDM model, for which the \textsc{EuclidEmulator2} (\textsc{ee2}) exists as a benchmark, having been trained with high-resolution \textit{N}-body simulations. We primarily utilize COLA simulations with mass resolution $M_{\rm part}\approx 8 \times 10^{10} ~h^{-1} M_{\odot}$ and force resolution $\ell_{\rm force}=0.5 ~h^{-1}$Mpc, though we also test refined settings with $M_{\rm part}\approx 1 \times 10^{10} ~h^{-1}M_{\odot}$ and force resolution $\ell_{\rm force}=0.17 ~h^{-1}$Mpc. We find the performance of the COLA emulators is sensitive to the placement of high-resolution \textit{N}-body reference samples inside the prior, which only ensure agreement in their local vicinity. However, the COLA emulators pass stringent criteria in goodness-of-fit and parameter bias throughout the prior, when $Λらむだ$CDM predictions of \textsc{ee2} are computed alongside every COLA emulator prediction, suggesting a promising approach for extended models.

Submitted 18 April, 2024; originally announced April 2024.

Comments: 18 pages, 14 figures, 8 tables

arXiv:2404.06487 [pdf, other]

Uncovering Tidal Treasures: Automated Classification of Faint Tidal Features in DECaLS Data

Authors: Alexander J. Gordon, Annette M. N. Ferguson, Robert G. Mann

Abstract: Tidal features are a key observable prediction of the hierarchical model of galaxy formation and contain a wealth of information about the properties and history of a galaxy. Modern wide-field surveys such as LSST and Euclid will revolutionise the study of tidal features. However, the volume of data will far surpass the capacity to inspect each galaxy to identify the feature visually, thereby motivating an urgent need to develop automated detection methods. This paper presents a visual classification of $\sim$2,000 galaxies from the DECaLS survey into different tidal feature categories: arms, streams, shells, and diffuse. Using these labels, we trained a Convolutional Neural Network (CNN) to reproduce the assigned visual classifications. Overall our network performed well and retrieved a median $81.1^{+5.8}_{-6.5}$, $65.7^{+5.0}_{-8.4}$, $91.3^{+6.0}_{-5.9}$, and $82.3^{+1.4}_{-7.9}$ per cent of the actual instances of arm, stream, shell, and diffuse features respectively for just 20 per cent contamination. We verified that the network was classifying the images correctly by using a Gradient-weighted Class Activation Mapping analysis to highlight important regions on the images for a given classification. This is the first demonstration of using CNNs to classify tidal features into sub-categories, and it will pave the way for the identification of different categories of tidal features in the vast samples of galaxies that forthcoming wide-field surveys will deliver.

Submitted 9 April, 2024; originally announced April 2024.

Comments: 17 pages, 10 figures, submitted to MNRAS

arXiv:2404.02973 [pdf, other]

Scaling Laws for Galaxy Images

Authors: Mike Walmsley, Micah Bowles, Anna M. M. Scaife, Jason Shingirai Makechemu, Alexander J. Gordon, Annette M. N. Ferguson, Robert G. Mann, James Pearson, Jürgen J. Popp, Jo Bovy, Josh Speagle, Hugh Dickinson, Lucy Fortson, Tobias Géron, Sandor Kruk, Chris J. Lintott, Kameswara Mantha, Devina Mohan, David O'Ryan, Inigo V. Slijepevic

Abstract: We present the first systematic investigation of supervised scaling laws outside of an ImageNet-like context - on images of galaxies. We use 840k galaxy images and over 100M annotations by Galaxy Zoo volunteers, comparable in scale to ImageNet-1K. We find that adding annotated galaxy images provides a power law improvement in performance across all architectures and all tasks, while adding trainable parameters is effective only for some (typically more subjectively challenging) tasks. We then compare the downstream performance of finetuned models pretrained on either ImageNet-12k alone vs. additionally pretrained on our galaxy images. We achieve an average relative error rate reduction of 31% across 5 downstream tasks of scientific interest. Our finetuned models are more label-efficient and, unlike their ImageNet-12k-pretrained equivalents, often achieve linear transfer performance equal to that of end-to-end finetuning. We find relatively modest additional downstream benefits from scaling model size, implying that scaling alone is not sufficient to address our domain gap, and suggest that practitioners with qualitatively different images might benefit more from in-domain adaption followed by targeted downstream labelling.

Submitted 3 April, 2024; originally announced April 2024.

Comments: 10+6 pages, 12 figures. Appendix C2 based on arxiv:2206.11927. Code, demos, documentation at https://github.com/mwalmsley/zoobot

arXiv:2401.05546 [pdf, other]

Static and fluctuating zigzag order, and possible signatures of Kitaev physics, in torque measurements of $αあるふぁ$-RuCl${_3}$

Authors: Shaun Froude-Powers, Subin Kim, Jacob Gordon, Hae-Young Kee, Young-June Kim, Stephen R. Julian

Abstract: We have measured magnetic torque on a T${_N}$ = 7 K single crystal of $αあるふぁ$-RuCl${_3}$, as a function of the field angle in the ab-plane, focusing on temperatures between 2 and 20 K and fields from 0 to 9 T. We find a rich spectrum of signals, many of which can be classified by their angular periodicity. The sample shows an oscillation with a period of 180$^{\circ}$ (i.e. two-fold periodicity) which we argue is due to residual strain within the crystal, rather than being intrinsic. In addition, within the magnetically ordered zigzag phase there is a 60$^{\circ}$ period (i.e. six-fold) sawtooth pattern, which can be explained by reorientation of the zigzag domains as the crystal rotates in the applied field. Suppressing the zigzag order with an applied field above ${\sim}$ 8 T at low temperature, a six-fold sinusoidal signal remains, suggesting that there is fluctuating zigzag order in the putative field-induced quantum spin liquid state. Finally, our key finding is a sharp, step-like feature that appears at low temperature for fields just above the zigzag phase boundary, at the so-called B2-axes. This is similar to theoretically predicted behaviour for a state with Ising topological order, which is expected for a Kitaev spin liquid in an applied magnetic field.

Submitted 10 January, 2024; originally announced January 2024.

arXiv:2312.01591 [pdf, ps, other]

Integrability and singularities of Harish-Chandra characters

Authors: Itay Glazer, Julia Gordon, Yotam I. Hendel

Abstract: Let $G$ be a reductive group over a local field $F$ of characteristic $0$. By Harish-Chandra's regularity theorem, every global character $Θしーた_πぱい$ of an irreducible, admissible representation $πぱい$ of $G$ is given by a locally integrable function $θしーた_πぱい$ on $G$. It is a natural question whether $θしーた_πぱい$ has better integrability properties, namely, whether it is locally $L^{1+εいぷしろん}$-integrable for some $εいぷしろん>0$. It follows from Harish-Chandra's work that the answer is positive, and this gives rise to a new singularity invariant of representations $εいぷしろん_{\star}(πぱい):=\sup\left\{ εいぷしろん:θしーた_πぱい\in L_{\mathrm{Loc}}^{1+εいぷしろん}(G)\right\} $, which we explore in this paper. We provide a lower bound on $εいぷしろん_{\star}(πぱい)$ for any $G$, and determine $εいぷしろん_{\star}(πぱい)$ in the case of a $p$-adic $\mathrm{GL}_{n}$. This is done by studying integrability properties of the Fourier transforms $\widehat{ξくしー_{\mathcal{O}}}$ of stable Richardson nilpotent orbital integrals $ξくしー_{\mathcal{O}}$. We express $εいぷしろん_{\star}(\widehat{ξくしー_{\mathcal{O}}})$ as the log-canonical threshold of a suitable relative Weyl discriminant, and use a resolution of singularities algorithm coming from the theory of hyperplane arrangements, to compute it in terms of the partition associated with the orbit. As an application, we obtain bounds on the multiplicities of $K$-types in irreducible representations of $G$ in the $p$-adic case, where $K$ is an open compact subgroup.

Submitted 3 December, 2023; originally announced December 2023.

Comments: 31 pages. Comments are welcome!

MSC Class: 20G05; 14B05; 20G05 (Primary) 14N20; 17B08; 22E30; 22E35; 22E46; 32S22; 43A30 (Secondary)

arXiv:2309.04040 [pdf]

doi 10.1002/mrm.29875

Current Methods for Hyperpolarized [1-13C]pyruvate MRI Human Studies

Authors: Peder EZ Larson, Jenna ML Bernard, James A Bankson, Nikolaj Bøgh, Robert A Bok, Albert P. Chen, Charles H Cunningham, Jeremy Gordon, Jan-Bernd Hövener, Christoffer Laustsen, Dirk Mayer, Mary A McLean, Franz Schilling, James Slater, Jean-Luc Vanderheyden, Cornelius von Morze, Daniel B Vigneron, Duan Xu, the HP 13C MRI Consensus Group

Abstract: MRI with hyperpolarized (HP) 13C agents, also known as HP 13C MRI, can measure processes such as localized metabolism that is altered in numerous cancers, liver, heart, kidney diseases, and more. It has been translated into human studies during the past 10 years, with recent rapid growth in studies largely based on increasing availability of hyperpolarized agent preparation methods suitable for use in humans. This paper aims to capture the current successful practices for HP MRI human studies with [1-13C]pyruvate - by far the most commonly used agent, which sits at a key metabolic junction in glycolysis. The paper is divided into four major topic areas: (1) HP 13C-pyruvate preparation, (2) MRI system setup and calibrations, (3) data acquisition and image reconstruction, and (4) data analysis and quantification. In each area, we identified the key components for a successful study, summarized both published studies and current practices, and discussed evidence gaps, strengths, and limitations. This paper is the output of the HP 13C MRI Consensus Group as well as the ISMRM Hyperpolarized Media MR and Hyperpolarized Methods & Equipment study groups. It further aims to provide a comprehensive reference for future consensus building as the field continues to advance human studies with this metabolic imaging modality.

Submitted 22 November, 2023; v1 submitted 7 September, 2023; originally announced September 2023.

Comments: Accepted at Magnetic Resonance in Medicine

arXiv:2304.04869 [pdf, other]

doi 10.1088/1538-3873/acd1b5

The James Webb Space Telescope Mission

Authors: Jonathan P. Gardner, John C. Mather, Randy Abbott, James S. Abell, Mark Abernathy, Faith E. Abney, John G. Abraham, Roberto Abraham, Yasin M. Abul-Huda, Scott Acton, Cynthia K. Adams, Evan Adams, David S. Adler, Maarten Adriaensen, Jonathan Albert Aguilar, Mansoor Ahmed, Nasif S. Ahmed, Tanjira Ahmed, Rüdeger Albat, Loïc Albert, Stacey Alberts, David Aldridge, Mary Marsha Allen, Shaune S. Allen, Martin Altenburg, et al. (983 additional authors not shown)

Abstract: Twenty-six years ago a small committee report, building on earlier studies, expounded a compelling and poetic vision for the future of astronomy, calling for an infrared-optimized space telescope with an aperture of at least $4m$. With the support of their governments in the US, Europe, and Canada, 20,000 people realized that vision as the $6.5m$ James Webb Space Telescope. A generation of astronomers will celebrate their accomplishments for the life of the mission, potentially as long as 20 years, and beyond. This report and the scientific discoveries that follow are extended thank-you notes to the 20,000 team members. The telescope is working perfectly, with much better image quality than expected. In this and accompanying papers, we give a brief history, describe the observatory, outline its objectives and current observing program, and discuss the inventions and people who made it possible. We cite detailed reports on the design and the measured performance on orbit.

Submitted 10 April, 2023; originally announced April 2023.

Comments: Accepted by PASP for the special issue on The James Webb Space Telescope Overview, 29 pages, 4 figures

arXiv:2303.08774 [pdf, other]

GPT-4 Technical Report

Authors: OpenAI, Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, Red Avila, Igor Babuschkin, Suchir Balaji, Valerie Balcom, Paul Baltescu, Haiming Bao, Mohammad Bavarian, Jeff Belgum, Irwan Bello, Jake Berdine, Gabriel Bernadett-Shapiro, Christopher Berner, Lenny Bogdonoff, Oleg Boiko, et al. (256 additional authors not shown)

Abstract: We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based model pre-trained to predict the next token in a document. The post-training alignment process results in improved performance on measures of factuality and adherence to desired behavior. A core component of this project was developing infrastructure and optimization methods that behave predictably across a wide range of scales. This allowed us to accurately predict some aspects of GPT-4's performance based on models trained with no more than 1/1,000th the compute of GPT-4.

Submitted 4 March, 2024; v1 submitted 15 March, 2023; originally announced March 2023.

Comments: 100 pages; updated authors list; fixed author names and added citation

arXiv:2302.07333 [pdf, other]

doi 10.1088/1475-7516/2024/02/042

Early dark energy constraints with late-time expansion marginalization

Authors: João Rebouças, Jonathan Gordon, Diogo H. F. de Souza, Kunhao Zhong, Vivian Miranda, Rogerio Rosenfeld, Tim Eifler, Elisabeth Krause

Abstract: Early dark energy (EDE) is an extension to the $Λらむだ$CDM model, proposed to reduce the tension between the measurements of the Hubble constant $H_0$ from the cosmic microwave background (CMB) and from the local cosmic distance ladder. However, this model increases the $S_8$ tension between CMB and large scale structure measurements. Analyses of galaxy clustering and lensing correlation functions report a decreased preference for EDE and its effect on the Hubble tension. Smooth dark energy models affect growth of structure through the background expansion. In this work, we study the inclusion of a general, smooth late-time dark energy modification in combination with EDE and obtain constraints on EDE marginalized over the late-time expansion. We assess the impact on the $S_8$ and Hubble tensions. In order to generalize the late expansion, we use a late dark energy fluid model with a piecewise constant equation of state $w(z)$ over 3, 5 and 10 redshift bins in the window $z \in [0,3]$. We show that, when analyzing ACT and Planck CMB data combined with Pantheon supernovae, BAO from 6dF, SDSS and BOSS, Planck 2018 CMB lensing and Dark Energy Survey cosmic shear and clustering data, the inclusion of a general smooth dark energy modification at late times has no significant effect on $S_8$ and EDE parameter constraints. Using the aforementioned datasets, the EDE fraction constraint with late-time expansion marginalization is $f_\mathrm{EDE} = 0.067^{+0.019}_{-0.027}$ using 3 redshift bins, with similar results for 5 and 10 redshift bins. This work shows that in order to solve simultaneously the Hubble and $S_8$ tensions, one needs a mechanism for increasing the clustering of matter at late times different from a simple change in the background evolution of late dark energy. [Abridged]

Submitted 11 March, 2024; v1 submitted 14 February, 2023; originally announced February 2023.

Comments: 22 pages, 9 figures

Journal ref: JCAP02(2024)042

arXiv:2209.06221 [pdf, other]

doi 10.1103/PhysRevResearch.5.L012027

Field Induced Chiral Soliton Phase in the Kitaev Spin Chain

Authors: Erik S. Sørensen, Jacob Gordon, Jonathon Riddell, Tianyi Wang, Hae-Young Kee

Abstract: The bond-dependent Ising interaction present in the Kitaev model has attracted considerable attention. The appearance of an unexpected intermediate phase under a magnetic field is particularly intriguing, and one may wonder if a similar phase occurs in the Kitaev spin chain with alternating $x$- and $y$-bond Ising interactions. Previous studies have focused on a transverse field, $h_z$, and reported a direct transition to the polarized state. Here, we investigate phases with arbitrary angle of two longitudinal fields, $h_x$ and $h_y$. For a magnetic field applied along the diagonal, $h_x$=$h_y$, the chain remains gapless up to a critical field $h^{c_1}_{xy}$. Surprisingly, above $h^{c_1}_{xy}$ it enters an unusual intermediate phase before reaching the polarized state at $h^{c_2}_{xy}$. This phase is characterized by a staggered vector chirality and for periodic boundary conditions, a two-fold degeneracy with a finite gap. For open boundary systems the ground-state exhibits a single soliton, lowering the energy, and gapless excitations. However, the corresponding anti-soliton raises the energy sufficiently that a gap appears for soliton and anti-soliton pairs in periodic systems. An intuitive variational picture is developed describing the soliton phase.

Submitted 13 September, 2022; originally announced September 2022.

Comments: 6 pages, 5 figures

Journal ref: Phys. Rev. Research 5, L012027 (2023)

arXiv:2206.08896 [pdf, other]

Evolution through Large Models

Authors: Joel Lehman, Jonathan Gordon, Shawn Jain, Kamal Ndousse, Cathy Yeh, Kenneth O. Stanley

Abstract: This paper pursues the insight that large language models (LLMs) trained to generate code can vastly improve the effectiveness of mutation operators applied to programs in genetic programming (GP). Because such LLMs benefit from training data that includes sequential changes and modifications, they can approximate likely changes that humans would make. To highlight the breadth of implications of such evolution through large models (ELM), in the main experiment ELM combined with MAP-Elites generates hundreds of thousands of functional examples of Python programs that output working ambulating robots in the Sodarace domain, which the original LLM had never seen in pre-training. These examples then help to bootstrap training a new conditional language model that can output the right walker for a particular terrain. The ability to bootstrap new models that can output appropriate artifacts for a given context in a domain where zero training data was previously available carries implications for open-endedness, deep learning, and reinforcement learning. These implications are explored here in depth in the hope of inspiring new directions of research now opened up by ELM.

Submitted 17 June, 2022; originally announced June 2022.

arXiv:2205.02391 [pdf, ps, other]

Orbital integrals and normalizations of measures

Authors: Julia Gordon, with appendix by Matthew Koster

Abstract: This note provides an informal introduction, with examples, to some technical aspects of the re-normalization of measures on orbital integrals used in the work of Langlands, Frenkel-Langlands-Ngô, and Altug on Beyond Endoscopy. In particular, we survey different relevant measures on algebraic tori and explain the connection with the Tamagawa numbers. We work out the example of $\mathrm{GL}_2$ in complete detail. The Appendix by Matthew Koster illustrates, for the Lie algebras $\mathfrak{sl}_2$ and $\mathfrak{so}_3$, the relation between the so-called geometric measure on the orbits and Kirillov's measure on co-adjoint orbits in the linear dual of the Lie algebra.

Submitted 4 May, 2022; originally announced May 2022.

MSC Class: 22B02; 11E02; 11F02

arXiv:2112.07069 [pdf, other]

doi 10.1103/PhysRevD.105.042001

Analysis of a Tau Neutrino Origin for the Near-Horizon Air Shower Events Observed by the Fourth Flight of the Antarctic Impulsive Transient Antenna (ANITA)

Authors: R. Prechelt, S. A. Wissel, A. Romero-Wolf, C. Burch, P. W. Gorham, P. Allison, J. Alvarez-Muñiz, O. Banerjee, L. Batten, J. J. Beatty, K. Belov, D. Z. Besson, W. R. Binns, V. Bugaev, P. Cao, W. Carvalho Jr., C. H. Chen, P. Chen, Y. Chen, J. M. Clem, A. Connolly, L. Cremonesi, B. Dailey, C. Deaconu, P. F. Dowkontt, et al. (43 additional authors not shown)

Abstract: We study in detail the sensitivity of the Antarctic Impulsive Transient Antenna (ANITA) to possible $\nu_\tau$ point source fluxes detected via $\tau$-lepton-induced air showers. This investigation is framed around the observation of four upward-going extensive air shower events very close to the horizon seen in ANITA-IV. We find that these four upgoing events are not observationally inconsistent with $\tau$-induced EASs from Earth-skimming $\nu_\tau$, both in their spectral properties as well as in their observed locations on the sky. These four events, as well as the overall diffuse and point source exposure to Earth-skimming $\nu_\tau$, are also compared against published ultrahigh-energy neutrino limits from the Pierre Auger Observatory. While none of these four events occurred at sky locations simultaneously visible by Auger, the implied fluence necessary for ANITA to observe these events is in strong tension with limits set by Auger across a wide range of energies and is additionally in tension with ANITA's Askaryan in-ice neutrino channel above $10^{19}$ eV. We conclude by discussing some of the technical challenges with simulating and analyzing these near horizon events and the potential for future observatories to observe similar events.

Submitted 13 December, 2021; originally announced December 2021.

Comments: 19 pages, 22 figures, will be published in Physical Review D (PRD)

arXiv:2111.11457 [pdf, other]

doi 10.1103/PhysRevResearch.4.013205

Insights into the Anisotropic Spin-$S$ Kitaev Chain

Authors: Jacob S. Gordon, Hae-Young Kee

Abstract: Recently there has been a renewed interest in properties of the higher-spin Kitaev models, especially their low-dimensional analogues with additional interactions. These quasi-1D systems exhibit rich phase diagrams with symmetry-protected topological phases, Luttinger liquids, hidden order, and higher-rank magnetism. However, the nature of the pure spin-$S$ Kitaev chains is not yet fully understood. Earlier works found a unique ground state with short-ranged correlations for $S = 1$, and an intriguing double-peak structure in the heat capacity associated with an entropy plateau. To understand the low-energy excitations and thermodynamics for general $S$ we study the anisotropic spin-$S$ Kitaev chain. Starting from the dimerized limit we derive an effective low-energy Hamiltonian at finite anisotropy. For half-integer spins we find a trivial effective model, reflecting a non-local symmetry protecting the degeneracy, while for integer $S$ we find interactions among the flux degrees of freedom that select a unique ground state. The effective model for integer spins is used to predict the low-energy excitations and thermodynamics, and we make a comparison with the semiclassical limit through linear spin wave theory. Finally, we speculate on the nature of the isotropic limit.

Submitted 22 November, 2021; originally announced November 2021.

Comments: 12 pages, 4 figures

Journal ref: Phys. Rev. Research 4, 013205 (2022)

arXiv:2109.14146 [pdf, other]

doi 10.1103/PhysRevLett.129.021801

First Leptophobic Dark Matter Search from Coherent CAPTAIN-Mills

Authors: A. A. Aguilar-Arevalo, D. S. M. Alves, S. Biedron, J. Boissevain, M. Borrego, M. Chavez-Estrada, A. Chavez, J. M. Conrad, R. L. Cooper, A. Diaz, J. R. Distel, J. C. D'Olivo, E. Dunton, B. Dutta, A. Elliott, D. Evans, D. Fields, J. Greenwood, M. Gold, J. Gordon, E. Guarincerri, E. C. Huang, N. Kamp, C. Kelsey, K. Knickerbocker, et al. (26 additional authors not shown)

Abstract: We report the first results of a search for leptophobic dark matter (DM) from the Coherent CAPTAIN-Mills (CCM) liquid argon (LAr) detector. An engineering run with 120 photomultiplier tubes (PMTs) and $17.9 \times 10^{20}$ protons-on-target (POT) was performed in Fall 2019 to study the characteristics of the CCM detector. The operation of this 10-ton detector was strictly light-based with a threshold of 50 keV and used coherent elastic scattering off argon nuclei to detect DM. Despite only 1.5 months of accumulated luminosity, contaminated LAr, and non-optimized shielding, CCM's first engineering run already achieved sensitivity to previously unexplored parameter space of light dark matter (LDM) models with a baryonic vector portal. With an expected background of 115,005 events, we observe 115,005+16.5 events which is compatible with background expectations. For a benchmark mediator-to-dark matter mass ratio of $m_{V_B}/m_{\chi} = 2.1$, DM masses within the range $9\,\text{MeV} \lesssim m_{\chi} \lesssim 50\,\text{MeV}$ have been excluded at 90% C.L. in the leptophobic model after applying the Feldman-Cousins test statistic. CCM's upgraded run with 200 PMTs, filtered LAr, improved shielding, and ten times more POT will be able to exclude the remaining thermal relic density parameter space of this model, as well as probe new parameter space of other leptophobic DM models.

Submitted 19 May, 2022; v1 submitted 28 September, 2021; originally announced September 2021.

Report number: LA-UR-21-28552

Journal ref: Physical Review Letters Vol. 129, No. 2 (2022)

arXiv:2107.09595 [pdf, other]

Optimal control and comprehensive cost-effectiveness analysis for COVID-19

Authors: Joshua Kiddy Kwasi Asamoah, Eric Okyere, Afeez Abidemi, Stephen E. Moore, Gui-Quan Sun, Zhen Jin, Edward Acheampong, Joseph Frank Gordon

Abstract: Cost-effectiveness analysis is a mode of determining both the cost and economic health outcomes of one or more control interventions. In this work, we have formulated a non-autonomous nonlinear deterministic model to study the control of COVID-19 to unravel the cost and economic health outcomes for the autonomous nonlinear model proposed for the Kingdom of Saudi Arabia. The optimal control model captures four time-dependent control functions: $u_1$, practising physical or social distancing protocols; $u_2$, practising personal hygiene by cleaning contaminated surfaces with alcohol-based detergents; $u_3$, practising proper and safety measures by exposed, asymptomatic and symptomatic infected individuals; $u_4$, fumigating schools in all levels of education, sports facilities, commercial areas and religious worship centres. We proved the existence of the proposed optimal control model. The optimality system associated with the non-autonomous epidemic model is derived using Pontryagin's maximum principle. We have performed numerical simulations to investigate extensive cost-effectiveness analysis for fourteen optimal control strategies. Comparing the control strategies, we noticed that Strategy 1 (practising physical or social distancing protocols) is the most cost-saving and most effective control intervention in Saudi Arabia in the absence of vaccination. However, in terms of the infection averted, we saw that strategy 6, strategy 11, strategy 12, and strategy 14 are just as good in controlling COVID-19.

Submitted 20 July, 2021; originally announced July 2021.

arXiv:2107.05752 [pdf, other]

doi 10.1103/PhysRevC.104.014609

Simultaneous measurement of organic scintillator response to carbon and proton recoils

Authors: T. A. Laplace, B. L. Goldblum, J. J. Manfredi, J. A. Brown, D. L. Bleuel, C. A. Brand, G. Gabella, J. Gordon, E. Brubaker

Abstract: Background: Organic scintillators are widely used for neutron detection in both basic nuclear physics and applications. While the proton light yield of organic scintillators has been extensively studied, measurements of the light yield from neutron interactions with carbon nuclei are scarce. Purpose: Demonstrate a new approach for the simultaneous measurement of the proton and carbon light yield of organic scintillators. Provide new carbon light yield data for the EJ-309 liquid and EJ-204 plastic organic scintillators. Method: A 33 MeV $^{2}$H$^{+}$ beam from the 88-Inch Cyclotron at Lawrence Berkeley National Laboratory was impinged upon a 3-mm-thick Be target to produce a high-flux, broad-spectrum neutron beam. The double time-of-flight technique was extended to simultaneously measure the proton and carbon light yield of the organic scintillators, wherein the light output associated with the recoil particle was determined using $np$ and $n$C elastic scattering kinematics. Results: The proton and carbon light yield relations of the EJ-309 liquid and EJ-204 plastic organic scintillators were measured over a recoil energy range of approximately 0.3 to 1 MeV and 2 to 5 MeV, respectively for EJ-309, and 0.2 to 0.5 MeV and 1 to 4 MeV, respectively for EJ-204. Conclusions: These data provide new insight into the ionization quenching effect in organic scintillators and key input for simulation of the response of organic scintillators for both basic science and a broad range of applications.

Submitted 12 July, 2021; originally announced July 2021.

Comments: 11 pages, 9 figures

Journal ref: Phys. Rev. C 104, 014609, Published 9 July 2021

arXiv:2107.05725 [pdf, ps, other]

doi 10.1109/TNS.2019.2959979

Proton light yield of fast plastic scintillators for neutron imaging

Authors: J. J. Manfredi, B. L. Goldblum, T. A. Laplace, G. Gabella, J. Gordon, A. O'Brien, S. Chowdhury, J. A. Brown, E. Brubaker

Abstract: Plastic organic scintillators have been tailored in composition to achieve ultra-fast temporal response, thereby enabling the design and development of fast neutron detection systems with high timing resolution. Eljen Technology's plastic organic scintillators -- EJ-230, EJ-232, and EJ-232Q -- are prospective candidates for use in emerging neutron imaging systems, where fast timing is paramount. To support the neutron response characterization of these materials, the relative proton light yields of EJ-230, EJ-232, and EJ-232Q were measured at the 88-Inch Cyclotron at Lawrence Berkeley National Laboratory. Using a broad-spectrum neutron source and a double time-of-flight technique, the proton light yield relations were obtained over a proton recoil energy range of approximately 300 keV to 4 MeV. The EJ-230, EJ-232, and EJ-232Q scintillators exhibited similar proton light yield relations to each other as well as to other plastic scintillators with the same polymer base material. A comparison of the relative proton light yield of different sized cylindrical EJ-232 and EJ-232Q scintillators also revealed consistent results. This work provides key input data for the realistic computational modeling of neutron detection technologies employing these materials, thereby supporting new capabilities in near-field radionuclide detection for national security applications.

Submitted 12 July, 2021; originally announced July 2021.

Comments: 10 pages, 8 figures

Journal ref: in IEEE Transactions on Nuclear Science, vol. 67, no. 2, pp. 434-442, Feb. 2020

arXiv:2107.00648 [pdf, other]

Deep Orthogonal Fusion: Multimodal Prognostic Biomarker Discovery Integrating Radiology, Pathology, Genomic, and Clinical Data

Authors: Nathaniel Braman, Jacob W. H. Gordon, Emery T. Goossens, Caleb Willis, Martin C. Stumpe, Jagadish Venkataraman

Abstract: Clinical decision-making in oncology involves multimodal data such as radiology scans, molecular profiling, histopathology slides, and clinical factors. Despite the importance of these modalities individually, no deep learning framework to date has combined them all to predict patient prognosis. Here, we predict the overall survival (OS) of glioma patients from diverse multimodal data with a Deep Orthogonal Fusion (DOF) model. The model learns to combine information from multiparametric MRI exams, biopsy-based modalities (such as H&E slide images and/or DNA sequencing), and clinical variables into a comprehensive multimodal risk score. Prognostic embeddings from each modality are learned and combined via attention-gated tensor fusion. To maximize the information gleaned from each modality, we introduce a multimodal orthogonalization (MMO) loss term that increases model performance by incentivizing constituent embeddings to be more complementary. DOF predicts OS in glioma patients with a median C-index of 0.788 +/- 0.067, significantly outperforming (p=0.023) the best performing unimodal model with a median C-index of 0.718 +/- 0.064. The prognostic model significantly stratifies glioma patients by OS within clinical subsets, adding further granularity to prognostic clinical grading and molecular subtyping.

Submitted 1 July, 2021; originally announced July 2021.

Comments: Accepted for presentation at MICCAI 2021

arXiv:2106.06660 [pdf, other]

Least Squares Optimal Density Compensation for the Gridding Non-uniform Discrete Fourier Transform

Authors: Nicholas Dwork, Daniel O'Connor, Ethan M. I. Johnson, Corey A. Baron, Jeremy W. Gordon, John M. Pauly, Peder E. Z. Larson

Abstract: The Gridding algorithm has shown great utility for reconstructing images from non-uniformly spaced samples in the Fourier domain in several imaging modalities. Due to the non-uniform spacing, some correction for the variable density of the samples must be made. Existing methods for generating density compensation values are either sub-optimal or only consider a finite set of points (a set of measure 0) in the optimization. This manuscript presents the first density compensation algorithm for a general trajectory that takes into account the point spread function over a set of non-zero measure. We show that the images reconstructed with Gridding using the density compensation values of this method are of superior quality when compared to density compensation weights determined in other ways. Results are shown with a numerical phantom and with magnetic resonance images of the abdomen and the knee.

Submitted 16 June, 2021; v1 submitted 11 June, 2021; originally announced June 2021.

arXiv:2105.14020 [pdf, other]

doi 10.1103/PhysRevD.106.012001

First Dark Matter Search Results From Coherent CAPTAIN-Mills

Authors: A. A. Aguilar-Arevalo, S. Biedron, J. Boissevain, M. Borrego, M. Chavez-Estrada, A. Chavez, J. M. Conrad, R. L. Cooper, A. Diaz, J. R. Distel, J. D'Olivo, E. Dunton, B. Dutta, A. Elliott, D. Evans, D. Fields, J. Greenwood, M. Gold, J. Gordon, E. D. Guarincerri, E. C. Huang, N. Kamp, C. Kelsey, K. Knickerbocker, R. Lake, et al. (25 additional authors not shown)

Abstract: This paper describes the operation of the Coherent CAPTAIN-Mills (CCM) detector located at the Lujan Neutron Science Center (LANSCE) at Los Alamos National Laboratory (LANL). CCM is a 10-ton liquid argon (LAr) detector located 20 meters from a high flux neutron/neutrino source and is designed to search for sterile neutrinos ($\nu_s$) and light dark matter (LDM). An engineering run was performed in Fall 2019 to study the characteristics of the CCM120 detector by searching for coherent scattering signals consistent with $\nu_s$'s and LDM resulting from $\pi^+$ and $\pi^0$ decays in the tungsten target. New parameter space in a leptophobic dark matter model was excluded for DM masses between $\sim2.0$ and 30 MeV. The lessons learned from this run have guided the development and construction of the new CCM200 detector that will begin operations in 2021 and significantly improve on these searches.

Submitted 19 May, 2022; v1 submitted 28 May, 2021; originally announced May 2021.

Report number: LA-UR-21-24983

Journal ref: Physical Review D Vol. 106, No. 1 (2022)

arXiv:2105.05100 [pdf, other]

doi 10.3847/1538-4357/ac03ae

Galaxy Properties at the Faint End of the HI Mass Function

Authors: Kristen B. W. McQuinn, Anjana K. Telidevara, Jackson Fuson, Elizabeth A. K. Adams, John M. Cannon, Evan D. Skillman, Andrew E. Dolphin, Martha P. Haynes, Katherine L. Rhode, John J. Salzer, Riccardo Giovanelli, Alex J. R. Gordon

Abstract: The Survey of HI in Extremely Low-mass Dwarfs (SHIELD) includes a volumetrically complete sample of 82 gas-rich dwarfs with M_HI~<10^7.2 Msun selected from the ALFALFA survey. We are obtaining extensive follow-up observations of the SHIELD galaxies to study their gas, stellar, and chemical content, and to better understand galaxy evolution at the faint end of the HI mass function. Here, we investigate the properties of 30 SHIELD galaxies using Hubble Space Telescope imaging of their resolved stars and Westerbork Synthesis Radio Telescope observations of their neutral hydrogen. We measure tip of the red giant branch (TRGB) distances, star formation activity, and gas properties. The TRGB distances are up to 4x greater than estimates from flow models, highlighting the importance of velocity-independent distance indicators in the nearby universe. The SHIELD galaxies are in under-dense regions, with 23% located in voids; one galaxy appears paired with a more massive dwarf. We quantify galaxy properties at low masses including stellar and HI masses, SFRs, sSFRs, SFEs, birthrate parameters, and gas fractions. The lowest mass systems lie below the mass thresholds where stellar mass assembly is predicted to be impacted by reionization. Even so, we find the star formation properties follow the same trends as higher mass gas-rich systems, albeit with a different normalization.
The HI disks are small (<r> < 0.7 kpc) making it difficult to measure the HI rotation using standard techniques; we develop a new methodology and report the velocity extent, and its associated spatial extent, with robust uncertainties.

Submitted 25 May, 2021; v1 submitted 11 May, 2021; originally announced May 2021.

Comments: 43 pages, 32 figures, 6 tables

arXiv:2104.02114 [pdf, other]

doi 10.1016/j.solmat.2021.111253

Temperature and intensity dependence of the open-circuit voltage of InGaN/GaN multi-quantum well solar cells

Authors: M. Auf der Maur, G. Moses, J. M. Gordon, X. Huang, Y. Zhao, E. A. Katz

Abstract: We have analyzed the temperature and intensity dependence of the open-circuit voltage of InGaN/GaN multi-quantum well solar cells up to 725 K and more than 1000 suns. We show that the simple ABC model routinely used to analyze the measured quantum efficiency data of InGaN/GaN LEDs can accurately reproduce the temperature and intensity dependence of the measured open-circuit voltage if a temperature-dependent Shockley-Read-Hall lifetime is used and device heating is taken into account.

Submitted 5 April, 2021; originally announced April 2021.

arXiv:2103.07966 [pdf, other]

Active Dynamical Prospection: Modeling Mental Simulation as Particle Filtering for Sensorimotor Control during Pathfinding

Authors: Jeremy Gordon, John Chuang

Abstract: What do humans do when confronted with a common challenge: we know where we want to go but we are not yet sure the best way to get there, or even if we can. This is the problem posed to agents during spatial navigation and pathfinding, and its solution may give us clues about the more abstract domain of planning in general. In this work, we model pathfinding behavior in a continuous, explicitly exploratory paradigm. In our task, participants (and agents) must coordinate both visual exploration and navigation within a partially observable environment. Our contribution has three primary components: 1) an analysis of behavioral data from 81 human participants in a novel pathfinding paradigm conducted as an online experiment, 2) a proposal to model prospective mental simulation during navigation as particle filtering, and 3) an instantiation of this proposal in a computational agent. We show that our model, Active Dynamical Prospection, demonstrates similar patterns of map solution rate, path selection, and trial duration, as well as attentional behavior (at both aggregate and individual levels) when compared with data from human participants. We also find that both distal attention and delay prior to first move (both potential correlates of prospective simulation) are predictive of task performance.

Submitted 14 March, 2021; originally announced March 2021.

Comments: 8 pages, 8 figures

ACM Class: I.2; I.6

arXiv:2103.02650 [pdf, other]

Successor Feature Sets: Generalizing Successor Representations Across Policies

Authors: Kianté Brantley, Soroush Mehri, Geoffrey J. Gordon

Abstract: Successor-style representations have many advantages for reinforcement learning: for example, they can help an agent generalize from past experience to new goals, and they have been proposed as explanations of behavioral and neural data from human and animal learners. They also form a natural bridge between model-based and model-free RL methods: like the former they make predictions about future experiences, and like the latter they allow efficient prediction of total discounted rewards. However, successor-style representations are not optimized to generalize across policies: typically, we maintain a limited-length list of policies, and share information among them by representation learning or GPI. Successor-style representations also typically make no provision for gathering information or reasoning about latent variables. To address these limitations, we bring together ideas from predictive state representations, belief space value iteration, successor features, and convex analysis: we develop a new, general successor-style representation, together with a Bellman equation that connects multiple sources of information within this representation, including different latent states, policies, and reward functions. The new representation is highly expressive: for example, it lets us efficiently read off an optimal policy for a new reward function, or a policy that imitates a new demonstration.
For this paper, we focus on exact computation of the new representation in small, known environments, since even this restricted setting offers plenty of interesting questions. Our implementation does not scale to large, unknown environments -- nor would we expect it to, since it generalizes POMDP value iteration, which is difficult to scale. However, we believe that future work will allow us to extend our ideas to approximate reasoning in large, unknown environments.

Submitted 15 March, 2021; v1 submitted 3 March, 2021; originally announced March 2021.

arXiv:2102.12013 [pdf, other]

Understanding and Mitigating Accuracy Disparity in Regression

Authors: Jianfeng Chi, Yuan Tian, Geoffrey J. Gordon, Han Zhao

Abstract: With the widespread deployment of large-scale prediction systems in high-stakes domains, e.g., face recognition, criminal justice, etc., disparity in prediction accuracy between different demographic subgroups has called for fundamental understanding on the source of such disparity and algorithmic intervention to mitigate it. In this paper, we study the accuracy disparity problem in regression. We first propose an error decomposition theorem, which decomposes the accuracy disparity into the distance between marginal label distributions and the distance between conditional representations, to help explain why such accuracy disparity appears in practice. Motivated by this error decomposition and the general idea of distribution alignment with statistical distances, we then propose an algorithm to reduce this disparity, and analyze the game-theoretic optima of the proposed objective functions. To corroborate our theoretical findings, we also conduct experiments on five benchmark datasets. The experimental results suggest that our proposed algorithms can effectively mitigate accuracy disparity while maintaining the predictive power of the regression models.

Submitted 12 June, 2021; v1 submitted 23 February, 2021; originally announced February 2021.

Comments: ICML 2021

arXiv:2101.12071 [pdf, other]

doi 10.1063/5.0045601

Determining the Angle-of-Arrival of an Radio-Frequency Source with a Rydberg Atom-Based Sensor

Authors: Amy K. Robinson, Nikunjkumar Prajapati, Damir Senic, Matthew T. Simons, Joshua A. Gordon, Christopher L. Holloway

Abstract: In this work, we demonstrate the use of a Rydberg atom-based sensor for determining the angle-of-arrival of an incident radio-frequency (RF) wave or signal. The technique uses electromagnetically induced transparency in Rydberg atomic vapor in conjunction with a heterodyne Rydberg atom-based mixer. The Rydberg atom mixer measures the phase of the incident RF wave at two different locations inside an atomic vapor cell. The phase difference at these two locations is related to the direction of arrival of the incident RF wave. To demonstrate this approach, we measure phase differences of an incident 19.18 GHz wave at two locations inside a vapor cell filled with cesium atoms for various incident angles. Comparisons of these measurements to both full-wave simulation and to a plane-wave theoretical model show that these atom-based sub-wavelength phase measurements can be used to determine the angle-of-arrival of an RF field.

Submitted 28 January, 2021; originally announced January 2021.

Comments: 4 pages, 5 figures
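
Illustrative note: the abstract above says the phase difference measured at two locations is related to the direction of arrival. A minimal sketch of that relationship follows, assuming the idealized free-space plane-wave relation delta_phi = 2*pi*d*sin(theta)/lambda, with theta measured from the normal to the line joining the two measurement points. The function names and the 2 mm separation are hypothetical and are not taken from the paper, whose own model may differ (it also compares against full-wave simulation).

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def phase_difference(theta_rad, separation_m, freq_hz):
    """Phase difference (rad) between two points for an ideal plane wave.

    theta_rad is measured from the normal to the line joining the points
    (an assumed convention, not one stated in the abstract).
    """
    wavelength = C / freq_hz
    return 2.0 * math.pi * separation_m * math.sin(theta_rad) / wavelength


def angle_of_arrival(delta_phi_rad, separation_m, freq_hz):
    """Invert the plane-wave relation to recover the arrival angle (rad)."""
    wavelength = C / freq_hz
    s = delta_phi_rad * wavelength / (2.0 * math.pi * separation_m)
    if abs(s) > 1.0:
        raise ValueError("phase difference is inconsistent with this separation")
    return math.asin(s)


# Example: 19.18 GHz (from the abstract) and a hypothetical 2 mm separation.
FREQ_HZ = 19.18e9
SEPARATION_M = 2.0e-3
true_angle = math.radians(30.0)
measured_phase = phase_difference(true_angle, SEPARATION_M, FREQ_HZ)
recovered_angle = angle_of_arrival(measured_phase, SEPARATION_M, FREQ_HZ)
```

With a separation below half a wavelength (about 7.8 mm at 19.18 GHz), the phase difference stays within plus or minus pi, so the inversion is unambiguous in this idealized setting.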

arXiv:2101.03606 [pdf, other]

The Gaussian Neural Process

Authors: Wessel P. Bruinsma, James Requeima, Andrew Y. K. Foong, Jonathan Gordon, Richard E. Turner

Abstract: Neural Processes (NPs; Garnelo et al., 2018a,b) are a rich class of models for meta-learning that map data sets directly to predictive stochastic processes. We provide a rigorous analysis of the standard maximum-likelihood objective used to train conditional NPs. Moreover, we propose a new member of the Neural Process family called the Gaussian Neural Process (GNP), which models predictive correlations, incorporates translation equivariance, provides universal approximation guarantees, and demonstrates encouraging performance.

Submitted 10 January, 2021; originally announced January 2021.

Comments: 34 pages; includes supplementary material; to appear in AABI 2020

arXiv:2101.02340 [pdf, ps, other]

doi 10.1109/TNS.2020.3041215

Neutron Response of the EJ-254 Boron-Loaded Plastic Scintillator

Authors: Gino Gabella, Bethany L. Goldblum, Thibault A. Laplace, Juan J. Manfredi, Joseph Gordon, Zachary W. Sweger, Edith Bourret

Abstract: Organic scintillators doped with capture agents provide a detectable signal for neutrons over a broad energy range. This work characterizes the fast and slow neutron response of EJ-254, an organic plastic scintillator with 5% natural boron loading by weight. For fast neutrons, the primary mechanism for light generation in organic scintillators is n-p elastic scattering. To study the fast neutron response, the proton light yield of EJ-254 was measured at the 88-Inch Cyclotron at Lawrence Berkeley National Laboratory. Using a broad-spectrum neutron source and a double time-of-flight technique, the EJ-254 proton light yield was obtained over the energy range of approximately 270 keV to 4.5 MeV and determined to be in agreement with other plastic scintillators comprised of the same polymer base. To isolate the slow neutron response, an AmBe source with polyethylene moderator was made incident on the EJ-254 scintillator surrounded by an array of EJ-309 observation detectors. Events in the EJ-254 target coincident with the signature 477.6 keV $γがんま$ ray (resulting from deexcitation of the residual $^{7}$Li nucleus following boron neutron capture) were identified. Pulse shape discrimination was used to evaluate the temporal differences in the response of EJ-254 scintillation signals arising from $γがんま$-ray and fast/slow neutron interactions. Clear separation between $γがんま$-ray and fast neutron signals was not achieved and the neutron capture feature was observed to overlap both the $γがんま$-ray and fast neutron bands. Taking into account the electron light nonproportionality, the neutron-capture light yield in EJ-254 was determined to be 89.4$\pm$1.1 keVee.

Submitted 6 January, 2021; originally announced January 2021.

Comments: 9 pages, 10 figures

arXiv:2101.00771 [pdf, other]

doi 10.1145/3411764.3445309

Covert Embodied Choice: Decision-Making and the Limits of Privacy Under Biometric Surveillance

Authors: Jeremy Gordon, Max Curran, John Chuang, Coye Cheshire

Abstract: Algorithms engineered to leverage rich behavioral and biometric data to predict individual attributes and actions continue to permeate public and private life. A fundamental risk may emerge from misconceptions about the sensitivity of such data, as well as the agency of individuals to protect their privacy when fine-grained (and possibly involuntary) behavior is tracked. In this work, we examine how individuals adjust their behavior when incentivized to avoid the algorithmic prediction of their intent. We present results from a virtual reality task in which gaze, movement, and other physiological signals are tracked. Participants are asked to decide which card to select without an algorithmic adversary anticipating their choice. We find that while participants use a variety of strategies, data collected remains highly predictive of choice (80% accuracy). Additionally, a significant portion of participants became more predictable despite efforts to obfuscate, possibly indicating mistaken priors about the dynamics of algorithmic prediction.

Submitted 3 January, 2021; originally announced January 2021.

Comments: 12 pages. To be presented at CHI 2021

arXiv:2012.10713 [pdf, other]

Fundamental Limits and Tradeoffs in Invariant Representation Learning

Authors: Han Zhao, Chen Dan, Bryon Aragam, Tommi S. Jaakkola, Geoffrey J. Gordon, Pradeep Ravikumar

Abstract: A wide range of machine learning applications such as privacy-preserving learning, algorithmic fairness, and domain adaptation/generalization among others, involve learning invariant representations of the data that aim to achieve two competing goals: (a) maximize information or accuracy with respect to a target response, and (b) maximize invariance or independence with respect to a set of protected features (e.g., for fairness, privacy, etc.). Despite their wide applicability, theoretical understanding of the optimal tradeoffs -- with respect to accuracy and invariance -- achievable by invariant representations is still severely lacking. In this paper, we provide an information theoretic analysis of such tradeoffs under both classification and regression settings. More precisely, we provide a geometric characterization of the accuracy and invariance achievable by any representation of the data; we term this feasible region the information plane. We provide an inner bound for this feasible region for the classification case, and an exact characterization for the regression case, which allows us to either bound or exactly characterize the Pareto optimal frontier between accuracy and invariance. Although our contributions are mainly theoretical, a key practical application of our results is in certifying the potential sub-optimality of any given representation learning algorithm for either classification or regression tasks. Our results shed new light on the fundamental interplay between accuracy and invariance, and may be useful in guiding the design of future representation learning algorithms.

Submitted 23 November, 2022; v1 submitted 19 December, 2020; originally announced December 2020.

Comments: JMLR camera-ready version

arXiv:2010.11233 [pdf, other]

doi 10.1103/PhysRevResearch.4.013062

Spontaneous Chiral-Spin Ordering in Spin-Orbit Coupled Honeycomb Magnets

Authors: Qiang Luo, P. Peter Stavropoulos, Jacob S. Gordon, Hae-Young Kee

Abstract: Frustrated magnets with highly degenerate ground states are at the heart of hunting exotic states of matter. Recent studies in spin-orbit coupled honeycomb magnets have generated immense interest in bond-dependent interactions, appreciating a symmetric off-diagonal $Γがんま$ interaction which exhibits a macroscopic degeneracy in the classical limit. Here, we study a generic spin model and discover a novel chiral-spin ordering with spontaneously broken time-reversal symmetry near the dominant $Γがんま$ region. The chiral-spin phase is demonstrated to possess a staggered chirality relation in different sublattices, and it exhibits gapless excitations as revealed by the vanishing energy gap and the finite central charge on cylinders. Although there is a vestige of a tiny peak in the corner of the second Brillouin zone, the magnetic order is likely to vanish as the system size increases. Finally, we also attempt to gain insight into the possible topological signature of the chiral-spin phase by calculating the dynamic structure factor and the modular $\mathcal{S}$ matrix.

Submitted 6 January, 2022; v1 submitted 21 October, 2020; originally announced October 2020.

Comments: 12+6 pages, 9+11 figures, 1+2 tables. Accepted by Physical Review Research

Journal ref: Phys. Rev. Research 4, 013062 (2022)

arXiv:2010.04980 [pdf, other]

An Empirical Investigation of Beam-Aware Training in Supertagging

Authors: Renato Negrinho, Matthew R. Gormley, Geoffrey J. Gordon

Abstract: Structured prediction is often approached by training a locally normalized model with maximum likelihood and decoding approximately with beam search. This approach leads to mismatches as, during training, the model is not exposed to its mistakes and does not use beam search. Beam-aware training aims to address these problems, but unfortunately, it is not yet widely used due to a lack of understanding about how it impacts performance, when it is most useful, and whether it is stable. Recently, Negrinho et al. (2018) proposed a meta-algorithm that captures beam-aware training algorithms and suggests new ones, but unfortunately did not provide empirical results. In this paper, we begin an empirical investigation: we train the supertagging model of Vaswani et al. (2016) and a simpler model with instantiations of the meta-algorithm. We explore the influence of various design choices and make recommendations for choosing them. We observe that beam-aware training improves performance for both models, with large improvements for the simpler model which must effectively manage uncertainty during decoding. Our results suggest that a model must be learned with search to maximize its effectiveness.

Submitted 10 October, 2020; originally announced October 2020.

Comments: EMNLP Findings 2020 camera-ready. Code can be found at https://github.com/negrinho/beam_learn_supertagging

arXiv:2010.02869 [pdf, other]

doi 10.1088/1475-7516/2021/04/017

A search for ultrahigh-energy neutrinos associated with astrophysical sources using the third flight of ANITA

Authors: C. Deaconu, L. Batten, P. Allison, O. Banerjee, J. J. Beatty, K. Belov, D. Z. Besson, W. R. Binns, V. Bugaev, P. Cao, C. H. Chen, P. Chen, Y. Chen, J. M. Clem, A. Connolly, L. Cremonesi, B. Dailey, P. F. Dowkontt, B. D. Fox, J. W. H. Gordon, P. W. Gorham, C. Hast, B. Hill, S. Y. Hsu, J. J. Huang , et al. (38 additional authors not shown)

Abstract: The ANtarctic Impulsive Transient Antenna (ANITA) long-duration balloon experiment is sensitive to interactions of ultra high-energy (E > 10^{18} eV) neutrinos in the Antarctic ice sheet. The third flight of ANITA, lasting 22 days, began in December 2014. We develop a methodology to search for energetic neutrinos spatially and temporally coincident with potential source classes in ANITA data. This methodology is applied to several source classes: the TXS 0506+056 blazar and NGC 1068, the first potential TeV neutrino sources identified by IceCube, flaring high-energy blazars reported by the Fermi All-Sky Variability Analysis, gamma-ray bursts, and supernovae. Among searches within the five source classes, one candidate was identified as associated with SN 2015D, although not at a statistically significant level. We proceed to place upper limits on the source classes. We further comment on potential applications of this methodology to more sensitive future instruments.

Submitted 15 March, 2021; v1 submitted 6 October, 2020; originally announced October 2020.

Comments: 23 pages, 7 figures, version accepted to JCAP

arXiv:2009.05453 [pdf]

Enhanced Algal Photosynthetic Photon Efficiency by Pulsed Light

Authors: Yair Zarmi, Jeffrey M. Gordon, Amit Mahulkar, Avinash R. Khopkar, Smita D. Patil, Arun Banerjee, Badari Gade Reddy, Thomas P. Griffin, Ajit Sapre

Abstract: We present experimental results demonstrating that, relative to continuous illumination, an increase of a factor of 3-10 in the photon efficiency of algal photosynthesis is attainable via the judicious application of pulsed light for light intensities of practical interest (e.g., average-to-peak solar photon flux). We also propose a simple model that can account for all the measurements. The model (1) reflects the essential rate-limiting elements in bio-productivity, (2) incorporates the impact of photon arrival-time statistics and (3) accounts for how the enhancement in photon efficiency depends on the timescales of light pulsing and photon flux density. The key is avoiding clogging of the photosynthetic pathway by properly timing the light-dark cycles experienced by algal cells. We show how this can be realized with pulsed light sources, or by producing pulsed-light effects from continuous illumination via turbulent mixing in dense algal cultures in thin photo-bioreactors.

Submitted 11 September, 2020; originally announced September 2020.

Comments: 22 pages, 17 Figures

MSC Class: 97M60

Journal ref: i-Science 23, 101115, May 22, 2020

arXiv:2008.08800 [pdf, other]

doi 10.1002/mrm.28204

A Metabolite Specific 3D Stack-of-Spiral bSSFP Sequence for Improved Lactate Imaging in Hyperpolarized [1-$^{13}$C]Pyruvate Studies on a 3T Clinical Scanner

Authors: Shuyu Tang, Robert Bok, Hecong Qin, Galen Reed, Mark VanCriekinge, Romelyn Delos Santos, William Overall, Juan Santos, Jeremy Gordon, Zhen Jane Wang, Daniel B. Vigneron, Peder E. Z. Larson

Abstract: Purpose: The balanced steady-state free precession sequence has been previously explored to improve the efficient use of non-recoverable hyperpolarized $^{13}$C magnetization, but suffers from poor spectral selectivity and long acquisition time. The purpose of this study was to develop a novel metabolite-specific 3D bSSFP ("MS-3DSSFP") sequence with stack-of-spiral readouts for improved lactate imaging in hyperpolarized [1-$^{13}$C]pyruvate studies on a clinical 3T scanner. Methods: Simulations were performed to evaluate the spectral response of the MS-3DSSFP sequence. Thermal $^{13}$C phantom experiments were performed to validate the MS-3DSSFP sequence. In vivo hyperpolarized [1-$^{13}$C]pyruvate studies were performed to compare the MS-3DSSFP sequence with metabolite specific gradient echo ("MS-GRE") sequences for lactate imaging. Results: Simulations, phantom and in vivo studies demonstrate that the MS-3DSSFP sequence achieved spectrally selective excitation on lactate while minimally perturbing other metabolites. Compared with MS-GRE sequences, the MS-3DSSFP sequence showed approximately a 2.5-fold SNR improvement for lactate imaging in rat kidneys, prostate tumors in a mouse model and human kidneys. Conclusions: Improved lactate imaging using the MS-3DSSFP sequence in hyperpolarized [1-$^{13}$C]pyruvate studies was demonstrated in animals and humans. The MS-3DSSFP sequence could be applied for other clinical applications such as in the brain or adapted for imaging other metabolites such as pyruvate and bicarbonate.

Submitted 20 August, 2020; originally announced August 2020.

Journal ref: Magn Reson Med. 2020; 84: 1113-1125

arXiv:2008.08794 [pdf, other]

doi 10.1002/mrm.27391

A Regional Bolus Tracking and Real-time B$_1$ Calibration Method for Hyperpolarized $^{13}$C MRI

Authors: Shuyu Tang, Eugene Milshteyn, Galen Reed, Jeremy Gordon, Robert Bok, Xucheng Zhu, Zihan Zhu, Daniel B. Vigneron, Peder E. Z. Larson

Abstract: Purpose: Acquisition timing and B$_1$ calibration are two key factors that affect the quality and accuracy of hyperpolarized $^{13}$C MRI. The goal of this project was to develop a new approach using regional bolus tracking to trigger Bloch-Siegert B$_1$ mapping and real-time B$_1$ calibration based on regional B$_1$ measurements, followed by dynamic imaging of hyperpolarized $^{13}$C metabolites in vivo. Methods: The proposed approach was implemented on a system which allows real-time data processing and real-time control on the sequence. Real-time center frequency calibration upon the bolus arrival was also added. The feasibility of applying the proposed framework for in vivo hyperpolarized $^{13}$C imaging was tested on healthy rats, tumor-bearing mice and a healthy volunteer on a clinical 3T scanner following hyperpolarized [1-$^{13}$C]pyruvate injection. Multichannel receive coils were used in the human study. Results: Automatic acquisition timing based on either regional bolus peak or bolus arrival was achieved with the proposed framework. Reduced blurring artifacts in real-time reconstructed images were observed with real-time center frequency calibration. Real-time computed B$_1$ scaling factors agreed with real-time acquired B$_1$ maps. Flip angle correction using B$_1$ maps results in a more consistent quantification of metabolic activity (i.e., pyruvate-to-lactate conversion, k$_{PL}$). Experiment recordings are provided to demonstrate the real-time actions during the experiment. Conclusion: The proposed method was successfully demonstrated on animals and a human volunteer, and is anticipated to improve the efficient use of the hyperpolarized signal as well as the accuracy and robustness of hyperpolarized $^{13}$C imaging.

Submitted 20 August, 2020; originally announced August 2020.

Journal ref: Magn Reson Med. 2019 Feb; 81(2): 839-851

arXiv:2008.05690 [pdf, other]

doi 10.1103/PhysRevLett.126.071103

Unusual Near-horizon Cosmic-ray-like Events Observed by ANITA-IV

Authors: ANITA Collaboration, P. W. Gorham, A. Ludwig, C. Deaconu, P. Cao, P. Allison, O. Banerjee, L. Batten, D. Bhattacharya, J. J. Beatty, K. Belov, W. R. Binns, V. Bugaev, C. H. Chen, P. Chen, Y. Chen, J. M. Clem, L. Cremonesi, B. Dailey, P. F. Dowkontt, B. D. Fox, J. W. H. Gordon, C. Hast, B. Hill, S. Y. Hsu , et al. (35 additional authors not shown)

Abstract: ANITA's fourth long-duration balloon flight in late 2016 detected 29 cosmic-ray (CR)-like events on a background of $0.37^{+0.27}_{-0.17}$ anthropogenic events. CRs are mainly seen in reflection off the Antarctic ice sheets, creating a characteristic phase-inverted waveform polarity. However, four of the below-horizon CR-like events show anomalous non-inverted polarity, a $p = 5.3 \times 10^{-4}$ chance if due to background. All anomalous events are from locations near the horizon; ANITA-IV observed no steeply-upcoming anomalous events similar to the two such events seen in prior flights.

Submitted 19 November, 2020; v1 submitted 13 August, 2020; originally announced August 2020.

Comments: 6 pages, 4 figures, to appear in Physical Review Letters. Supplemental material (reference 17) available from corresponding author

Journal ref: Phys. Rev. Lett. 126, 071103 (2021)

arXiv:2007.07259 [pdf, other]

doi 10.1103/PhysRevX.11.011013

The Heart of Entanglement: Chiral, Nematic, and Incommensurate Phases in the Kitaev-Gamma Ladder in a Field

Authors: Erik S. Sørensen, Andrei Catuneanu, Jacob Gordon, Hae-Young Kee

Abstract: The bond-dependent Kitaev model on the honeycomb lattice with anyonic excitations has recently attracted considerable attention. However, in solid state materials other spin interactions are present, and among such additional interactions, the off-diagonal symmetric Gamma interaction, another type of bond-dependent term, has been particularly challenging to fully understand. A minimal Kitaev-Gamma (KG) model has been investigated by various numerical techniques under a magnetic field, but definite conclusions about field-induced spin liquids remain elusive and one reason may lie in the limited sizes of the two-dimensional geometry it is possible to access numerically. We therefore focus on the KG model defined on a two-leg ladder which is much more amenable to a complete study, and determine the entire phase diagram in the presence of a magnetic field along [111]-direction. Due to the competition between the interactions and the field, an extremely rich phase diagram emerges with fifteen distinct phases. Focusing near the antiferromagnetic Kitaev region, we identify nine different phases solely within this region: several incommensurate magnetically ordered phases, spin nematic, and two chiral phases with enhanced entanglement. Of particular interest is a highly entangled phase with staggered chirality with zero-net flux occurring at intermediate field, which along with its companion phases outline a heart-shaped region of high entanglement, the heart of entanglement. We compare our results for the ladder with a C3 symmetric cluster of the two-dimensional honeycomb lattice, and offer insights into possible spin liquids in the two-dimensional limit.

Submitted 14 July, 2020; originally announced July 2020.

Comments: 23 pages, 28 figures

Journal ref: Phys. Rev. X 11, 011013 (2021)

arXiv:2007.01332 [pdf, other]

Meta-Learning Stationary Stochastic Process Prediction with Convolutional Neural Processes

Authors: Andrew Y. K. Foong, Wessel P. Bruinsma, Jonathan Gordon, Yann Dubois, James Requeima, Richard E. Turner

Abstract: Stationary stochastic processes (SPs) are a key component of many probabilistic models, such as those for off-the-grid spatio-temporal data. They enable the statistical symmetry of underlying physical phenomena to be leveraged, thereby aiding generalization. Prediction in such models can be viewed as a translation equivariant map from observed data sets to predictive SPs, emphasizing the intimate relationship between stationarity and equivariance. Building on this, we propose the Convolutional Neural Process (ConvNP), which endows Neural Processes (NPs) with translation equivariance and extends convolutional conditional NPs to allow for dependencies in the predictive distribution. The latter enables ConvNPs to be deployed in settings which require coherent samples, such as Thompson sampling or conditional image completion. Moreover, we propose a new maximum-likelihood objective to replace the standard ELBO objective in NPs, which conceptually simplifies the framework and empirically improves performance. We demonstrate the strong performance and generalization capabilities of ConvNPs on 1D regression, image completion, and various tasks with real-world spatio-temporal data.

Submitted 20 November, 2020; v1 submitted 2 July, 2020; originally announced July 2020.

Comments: NeurIPS 2020

arXiv:2007.00165 [pdf, other]

Multi-coil Magnetic Resonance Imaging with Compressed Sensing Using Physically Motivated Regularization

Authors: Nicholas Dwork, Ethan M. I. Johnson, Daniel O'Connor, Jeremy W. Gordon, Adam B. Kerr, Corey A. Baron, John M. Pauly, Peder E. Z. Larson

Abstract: With the advent of multi-coil imaging and compressed sensing, a number of model-based reconstruction algorithms have been created. They incorporate a multitude of different regularization functions based on physics, observed phenomenology, and heuristics. Moreover, several iterative methods exist that attempt to simultaneously estimate the sensitivity maps and the image. In this manuscript, we present a generalization of several existing iterative model-based algorithms. We devise a calibrationless instance of this generalization that only incorporates regularization terms based on physics and the accepted compressed sensing phenomenology of sparsity in the wavelet domain. We compare the results of the new amalgamated optimization problem with existing methods on both simulated and real datasets. We show that the images reconstructed using the new method, entitled Multi-coil Compressed Sensing (MCCS), are of higher quality than those of existing methods in all cases studied.

Submitted 2 February, 2023; v1 submitted 30 June, 2020; originally announced July 2020.

arXiv:2006.10801 [pdf, other]

Predictive Complexity Priors

Authors: Eric Nalisnick, Jonathan Gordon, José Miguel Hernández-Lobato

Abstract: Specifying a Bayesian prior is notoriously difficult for complex models such as neural networks. Reasoning about parameters is made challenging by the high-dimensionality and over-parameterization of the space. Priors that seem benign and uninformative can have unintuitive and detrimental effects on a model's predictions. For this reason, we propose predictive complexity priors: a functional prior that is defined by comparing the model's predictions to those of a reference model. Although originally defined on the model outputs, we transfer the prior to the model parameters via a change of variables. The traditional Bayesian workflow can then proceed as usual. We apply our predictive complexity prior to high-dimensional regression, reasoning over neural network depth, and sharing of statistical strength for few-shot learning.

Submitted 21 October, 2020; v1 submitted 18 June, 2020; originally announced June 2020.

Comments: 23 pages

arXiv:2004.13723 [pdf, other]

doi 10.1103/PhysRevResearch.3.013179

Testing Ising Topological Order in $\alpha$-RuCl$_3$ Under In-Plane Magnetic Fields

Authors: Jacob S. Gordon, Hae-Young Kee

Abstract: Material realization of the non-Abelian Kitaev spin liquid phase - an example of Ising topological order (ITO) - has been the subject of intense research in recent years. The $4d$ honeycomb Mott insulator $\alpha$-RuCl$_3$ has emerged as a leading candidate, as it enters a field-induced magnetically disordered state where a half-integer quantized thermal Hall conductivity $\kappa_{xy}$ was reported. Further, a recent report of a sign change in the quantized $\kappa_{xy}$ across a certain crystallographic direction is strong evidence for a topological phase transition between two ITOs with opposite Chern numbers. Although this is a fascinating result, independent verification remains elusive, and one may ask if there is a thermodynamic quantity sensitive to the phase transition. Here we propose that the magnetotropic coefficient $k$ under in-plane magnetic fields would serve such a purpose. We report a singular feature in $k$ that indicates a topological phase transition across the $\hat{b}$-axis where ITO is prohibited by a $C_2$ symmetry. If the transition in $\alpha$-RuCl$_3$ is indeed a direct transition between ITOs, then this feature in $k$ should be observable.

Submitted 28 April, 2020; originally announced April 2020.

Comments: 7 pages, 2 figures, 1 table

Journal ref: Phys. Rev. Research 3, 013179 (2021)

arXiv:2003.05042 [pdf, other]

doi 10.1007/s10334-020-00903-y

Di-chromatic Interpolation of Magnetic Resonance Metabolic Imagery

Authors: Nicholas Dwork, Jeremy W. Gordon, Shuyu Tang, Daniel O'Connor, Esben Sovso Szocska Hansen, Christoffer Laustsen, Peder E. Z. Larson

Abstract: Magnetic resonance imaging with hyperpolarized contrast agents can provide unprecedented \textit{in-vivo} measurements of metabolism, but yields images that are lower resolution than that achieved with proton anatomical imaging. In order to spatially localize the metabolic activity, the metabolic image must be interpolated to the size of the proton image. The most common methods for choosing the unknown values rely exclusively on values of the original un-interpolated image. In this work, we present an alternative method that uses the higher-resolution proton image to provide additional spatial structure. The interpolated image is the result of a convex optimization algorithm which is solved with the Fast Iterative Shrinkage Threshold Algorithm (FISTA). Results are shown with images of hyperpolarized pyruvate, lactate, and bicarbonate using data of the heart and brain from healthy human volunteers, a healthy porcine heart, and a human with prostate cancer.

Submitted 12 February, 2021; v1 submitted 10 March, 2020; originally announced March 2020.

Comments: Magnetic Resonance Materials in Physics, Biology, and Medicine (2021)

Journal ref: Magnetic Resonance Materials in Physics, Biology and Medicine (2021): 1-16

arXiv:2003.03284 [pdf, other]

TaskNorm: Rethinking Batch Normalization for Meta-Learning

Authors: John Bronskill, Jonathan Gordon, James Requeima, Sebastian Nowozin, Richard E. Turner

Abstract: Modern meta-learning approaches for image classification rely on increasingly deep networks to achieve state-of-the-art performance, making batch normalization an essential component of meta-learning pipelines. However, the hierarchical nature of the meta-learning setting presents several challenges that can render conventional batch normalization ineffective, giving rise to the need to rethink normalization in this setting. We evaluate a range of approaches to batch normalization for meta-learning scenarios, and develop a novel approach that we call TaskNorm. Experiments on fourteen datasets demonstrate that the choice of batch normalization has a dramatic effect on both classification accuracy and training time for both gradient based and gradient-free meta-learning approaches. Importantly, TaskNorm is found to consistently improve performance. Finally, we provide a set of best practices for normalization that will allow fair comparison of meta-learning algorithms.

Submitted 28 June, 2020; v1 submitted 6 March, 2020; originally announced March 2020.

Journal ref: Proceedings of Machine Learning and Systems 2020, 4683-4694

Showing 1–50 of 188 results for author: Gordon, J