Search | arXiv e-print repository

Showing 1–47 of 47 results for author: Elsen, E

  1. Measurement of groomed event shape observables in deep-inelastic electron-proton scattering at HERA

    Authors: The H1 collaboration, V. Andreev, M. Arratia, A. Baghdasaryan, A. Baty, K. Begzsuren, A. Bolz, V. Boudry, G. Brandt, D. Britzger, A. Buniatyan, L. Bystritskaya, A. J. Campbell, K. B. Cantun Avila, K. Cerny, V. Chekelian, Z. Chen, J. G. Contreras, J. Cvach, J. B. Dainton, K. Daum, A. Deshpande, C. Diaconu, A. Drees, G. Eckerlin , et al. (123 additional authors not shown)

    Abstract: The H1 Collaboration at HERA reports the first measurement of groomed event shape observables in deep inelastic electron-proton scattering (DIS) at $\sqrt{s}=319$ GeV, using data recorded between the years 2003 and 2007 with an integrated luminosity of $351$ pb$^{-1}$. Event shapes provide incisive probes of perturbative and non-perturbative QCD. Grooming techniques have been used for jet measurem…

    Submitted 1 August, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

    Comments: 32 pages, 17 tables, 7 figures, version as accepted by EPJ C

    Report number: DESY-24-036

    Journal ref: EPJC 84 (2024), 718

  2. arXiv:2403.10109  [pdf, other]

    hep-ex hep-ph nucl-ex

    Measurement of the 1-jettiness event shape observable in deep-inelastic electron-proton scattering at HERA

    Authors: The H1 collaboration, V. Andreev, M. Arratia, A. Baghdasaryan, A. Baty, K. Begzsuren, A. Bolz, V. Boudry, G. Brandt, D. Britzger, A. Buniatyan, L. Bystritskaya, A. J. Campbell, K. B. Cantun Avila, K. Cerny, V. Chekelian, Z. Chen, J. G. Contreras, J. Cvach, J. B. Dainton, K. Daum, A. Deshpande, C. Diaconu, A. Drees, G. Eckerlin , et al. (124 additional authors not shown)

    Abstract: The H1 Collaboration reports the first measurement of the 1-jettiness event shape observable $\tau_1^b$ in neutral-current deep-inelastic electron-proton scattering (DIS). The observable $\tau_1^b$ is equivalent to a thrust observable defined in the Breit frame. The data sample was collected at the HERA $ep$ collider in the years 2003-2007 with center-of-mass energy of $\sqrt{s}=319\,\text{GeV}$, corres…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: 45 pages, 38 tables, 13 figures

    Report number: DESY-24-035

  3. Observation and differential cross section measurement of neutral current DIS events with an empty hemisphere in the Breit frame

    Authors: The H1 collaboration, V. Andreev, M. Arratia, A. Baghdasaryan, A. Baty, K. Begzsuren, A. Bolz, V. Boudry, G. Brandt, D. Britzger, A. Buniatyan, L. Bystritskaya, A. J. Campbell, K. B. Cantun Avila, K. Cerny, V. Chekelian, Z. Chen, J. G. Contreras, J. Cvach, J. B. Dainton, K. Daum, A. Deshpande, C. Diaconu, A. Drees, G. Eckerlin , et al. (124 additional authors not shown)

    Abstract: The Breit frame provides a natural frame to analyze lepton-proton scattering events. In this reference frame, the parton model hard interactions between a quark and an exchanged boson define the coordinate system such that the struck quark is back-scattered along the virtual photon momentum direction. In Quantum Chromodynamics (QCD), higher order perturbative or non-perturbative effects can chang…

    Submitted 1 August, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

    Comments: 13 pages, 5 figures, 2 Tables. This version as accepted for publication

    Report number: DESY-24-034

    Journal ref: EPJC 84 (2024), 720

  4. Unbinned Deep Learning Jet Substructure Measurement in High $Q^2$ ep collisions at HERA

    Authors: The H1 collaboration, V. Andreev, M. Arratia, A. Baghdasaryan, A. Baty, K. Begzsuren, A. Bolz, V. Boudry, G. Brandt, D. Britzger, A. Buniatyan, L. Bystritskaya, A. J. Campbell, K. B. Cantun Avila, K. Cerny, V. Chekelian, Z. Chen, J. G. Contreras, J. Cvach, J. B. Dainton, K. Daum, A. Deshpande, C. Diaconu, A. Drees, G. Eckerlin , et al. (120 additional authors not shown)

    Abstract: The radiation pattern within high energy quark- and gluon-initiated jets (jet substructure) is used extensively as a precision probe of the strong force as well as an environment for optimizing event generators with numerous applications in high energy particle and nuclear physics. Looking at electron-proton collisions is of particular interest as many of the complications present at hadron collid…

    Submitted 14 September, 2023; v1 submitted 23 March, 2023; originally announced March 2023.

    Comments: 25 pages, 10 figures, 8 tables, version accepted by Physics Letters B

    Report number: DESY-23-034

    Journal ref: PLB 844 (2023) 138101

  5. arXiv:2206.10369  [pdf, other]

    cs.LG cs.AI

    The State of Sparse Training in Deep Reinforcement Learning

    Authors: Laura Graesser, Utku Evci, Erich Elsen, Pablo Samuel Castro

    Abstract: The use of sparse neural networks has seen rapid growth in recent years, particularly in computer vision. Their appeal stems largely from the reduced number of parameters required to train and store, as well as in an increase in learning efficiency. Somewhat surprisingly, there have been very few efforts exploring their use in Deep Reinforcement Learning (DRL). In this work we perform a systematic…

    Submitted 17 June, 2022; originally announced June 2022.

    Comments: Proceedings of the 39th International Conference on Machine Learning (ICML'22)

  6. arXiv:2203.15556  [pdf, other]

    cs.CL cs.LG

    Training Compute-Optimal Large Language Models

    Authors: Jordan Hoffmann, Sebastian Borgeaud, Arthur Mensch, Elena Buchatskaya, Trevor Cai, Eliza Rutherford, Diego de Las Casas, Lisa Anne Hendricks, Johannes Welbl, Aidan Clark, Tom Hennigan, Eric Noland, Katie Millican, George van den Driessche, Bogdan Damoc, Aurelia Guy, Simon Osindero, Karen Simonyan, Erich Elsen, Jack W. Rae, Oriol Vinyals, Laurent Sifre

    Abstract: We investigate the optimal model size and number of tokens for training a transformer language model under a given compute budget. We find that current large language models are significantly undertrained, a consequence of the recent focus on scaling language models whilst keeping the amount of training data constant. By training over 400 language models ranging from 70 million to over 16 billion…

    Submitted 29 March, 2022; originally announced March 2022.

  7. arXiv:2203.07622  [pdf, other]

    physics.acc-ph hep-ex hep-ph

    The International Linear Collider: Report to Snowmass 2021

    Authors: Alexander Aryshev, Ties Behnke, Mikael Berggren, James Brau, Nathaniel Craig, Ayres Freitas, Frank Gaede, Spencer Gessner, Stefania Gori, Christophe Grojean, Sven Heinemeyer, Daniel Jeans, Katja Kruger, Benno List, Jenny List, Zhen Liu, Shinichiro Michizono, David W. Miller, Ian Moult, Hitoshi Murayama, Tatsuya Nakada, Emilio Nanni, Mihoko Nojiri, Hasan Padamsee, Maxim Perelstein , et al. (487 additional authors not shown)

    Abstract: The International Linear Collider (ILC) is on the table now as a new global energy-frontier accelerator laboratory taking data in the 2030s. The ILC addresses key questions for our current understanding of particle physics. It is based on a proven accelerator technology. Its experiments will challenge the Standard Model of particle physics and will provide a new window to look beyond it. This docu…

    Submitted 16 January, 2023; v1 submitted 14 March, 2022; originally announced March 2022.

    Comments: 356 pages, Large pdf file (40 MB) submitted to Snowmass 2021; v2 references to Snowmass contributions added, additional authors; v3 references added, some updates, additional authors

    Report number: DESY-22-045, IFT--UAM/CSIC--22-028, KEK Preprint 2021-61, PNNL-SA-160884, SLAC-PUB-17662

  8. arXiv:2202.01169  [pdf, other]

    cs.CL cs.LG

    Unified Scaling Laws for Routed Language Models

    Authors: Aidan Clark, Diego de las Casas, Aurelia Guy, Arthur Mensch, Michela Paganini, Jordan Hoffmann, Bogdan Damoc, Blake Hechtman, Trevor Cai, Sebastian Borgeaud, George van den Driessche, Eliza Rutherford, Tom Hennigan, Matthew Johnson, Katie Millican, Albin Cassirer, Chris Jones, Elena Buchatskaya, David Budden, Laurent Sifre, Simon Osindero, Oriol Vinyals, Jack Rae, Erich Elsen, Koray Kavukcuoglu , et al. (1 additional authors not shown)

    Abstract: The performance of a language model has been shown to be effectively modeled as a power-law in its parameter count. Here we study the scaling behaviors of Routing Networks: architectures that conditionally use only a subset of their parameters while processing an input. For these models, parameter count and computational requirement form two independent axes along which an increase leads to better…

    Submitted 9 February, 2022; v1 submitted 2 February, 2022; originally announced February 2022.

    Comments: Fixing typos and affiliation clarity

  9. arXiv:2112.11446  [pdf, other]

    cs.CL cs.AI

    Scaling Language Models: Methods, Analysis & Insights from Training Gopher

    Authors: Jack W. Rae, Sebastian Borgeaud, Trevor Cai, Katie Millican, Jordan Hoffmann, Francis Song, John Aslanides, Sarah Henderson, Roman Ring, Susannah Young, Eliza Rutherford, Tom Hennigan, Jacob Menick, Albin Cassirer, Richard Powell, George van den Driessche, Lisa Anne Hendricks, Maribeth Rauh, Po-Sen Huang, Amelia Glaese, Johannes Welbl, Sumanth Dathathri, Saffron Huang, Jonathan Uesato, John Mellor , et al. (55 additional authors not shown)

    Abstract: Language modelling provides a step towards intelligent communication systems by harnessing large repositories of written human knowledge to better predict and understand the world. In this paper, we present an analysis of Transformer-based language model performance across a wide range of model scales -- from models with tens of millions of parameters up to a 280 billion parameter model called Gop…

    Submitted 21 January, 2022; v1 submitted 8 December, 2021; originally announced December 2021.

    Comments: 120 pages

  10. arXiv:2112.06749  [pdf, other]

    cs.CL cs.LG

    Step-unrolled Denoising Autoencoders for Text Generation

    Authors: Nikolay Savinov, Junyoung Chung, Mikolaj Binkowski, Erich Elsen, Aaron van den Oord

    Abstract: In this paper we propose a new generative model of text, Step-unrolled Denoising Autoencoder (SUNDAE), that does not rely on autoregressive models. Similarly to denoising diffusion techniques, SUNDAE is repeatedly applied on a sequence of tokens, starting from random inputs and improving them each time until convergence. We present a simple new improvement operator that converges in fewer iteratio…

    Submitted 19 April, 2022; v1 submitted 13 December, 2021; originally announced December 2021.

    Comments: Accepted to ICLR 2022

  11. arXiv:2112.04426  [pdf, other]

    cs.CL cs.LG

    Improving language models by retrieving from trillions of tokens

    Authors: Sebastian Borgeaud, Arthur Mensch, Jordan Hoffmann, Trevor Cai, Eliza Rutherford, Katie Millican, George van den Driessche, Jean-Baptiste Lespiau, Bogdan Damoc, Aidan Clark, Diego de Las Casas, Aurelia Guy, Jacob Menick, Roman Ring, Tom Hennigan, Saffron Huang, Loren Maggiore, Chris Jones, Albin Cassirer, Andy Brock, Michela Paganini, Geoffrey Irving, Oriol Vinyals, Simon Osindero, Karen Simonyan , et al. (3 additional authors not shown)

    Abstract: We enhance auto-regressive language models by conditioning on document chunks retrieved from a large corpus, based on local similarity with preceding tokens. With a $2$ trillion token database, our Retrieval-Enhanced Transformer (RETRO) obtains comparable performance to GPT-3 and Jurassic-1 on the Pile, despite using 25$\times$ fewer parameters. After fine-tuning, RETRO performance translates to d…

    Submitted 7 February, 2022; v1 submitted 8 December, 2021; originally announced December 2021.

    Comments: Fix incorrect reported numbers in Table 14

  12. arXiv:2112.01120  [pdf, other]

    hep-ex hep-ph

    Impact of jet-production data on the next-to-next-to-leading-order determination of HERAPDF2.0 parton distributions

    Authors: H1 and ZEUS Collaborations: I. Abt, R. Aggarwal, V. Andreev, M. Arratia, V. Aushev, A. Baghdasaryan, A. Baty, K. Begzsuren, O. Behnke, A. Belousov, A. Bertolin, I. Bloch, V. Boudry, G. Brandt, I. Brock, N. H. Brook, R. Brugnera, A. Bruni, A. Buniatyan, P. J. Bussey, L. Bystritskaya, A. Caldwell, et al. (212 additional authors not shown)

    Abstract: The HERAPDF2.0 ensemble of parton distribution functions (PDFs) was introduced in 2015. The final stage is presented, a next-to-next-to-leading-order (NNLO) analysis of the HERA data on inclusive deep inelastic $ep$ scattering together with jet data as published by the H1 and ZEUS collaborations. A perturbative QCD fit, simultaneously of $\alpha_s(M_Z^2)$ and the PDFs, was performed with the result…

    Submitted 2 December, 2021; originally announced December 2021.

    Comments: 43 pages, 24 figures, to be submitted to Eur. Phys. J. C

    Report number: DESY-21-206

  13. arXiv:2108.12376  [pdf, other]

    hep-ex hep-ph

    Measurement of lepton-jet correlation in deep-inelastic scattering with the H1 detector using machine learning for unfolding

    Authors: H1 Collaboration, V. Andreev, M. Arratia, A. Baghdasaryan, A. Baty, K. Begzsuren, A. Belousov, A. Bolz, V. Boudry, G. Brandt, D. Britzger, A. Buniatyan, L. Bystritskaya, A. J. Campbell, K. B. Cantun Avila, K. Cerny, V. Chekelian, Z. Chen, J. G. Contreras, L. Cunqueiro Mendez, J. Cvach, J. B. Dainton, K. Daum, A. Deshpande, C. Diaconu , et al. (120 additional authors not shown)

    Abstract: The first measurement of lepton-jet momentum imbalance and azimuthal correlation in lepton-proton scattering at high momentum transfer is presented. These data, taken with the H1 detector at HERA, are corrected for detector effects using an unbinned machine learning algorithm OmniFold, which considers eight observables simultaneously in this first application. The unfolded cross sections are compa…

    Submitted 1 April, 2022; v1 submitted 27 August, 2021; originally announced August 2021.

    Comments: 17 pages, 7 figures, 4 tables, version accepted by PRL

    Report number: DESY 21-130

  14. arXiv:2106.03517  [pdf, other]

    cs.LG stat.ML

    Top-KAST: Top-K Always Sparse Training

    Authors: Siddhant M. Jayakumar, Razvan Pascanu, Jack W. Rae, Simon Osindero, Erich Elsen

    Abstract: Sparse neural networks are becoming increasingly important as the field seeks to improve the performance of existing models by scaling them up, while simultaneously trying to reduce power consumption and computational footprint. Unfortunately, most existing methods for inducing performant sparse models still entail the instantiation of dense parameters, or dense gradients in the backward-pass, dur…

    Submitted 7 June, 2021; originally announced June 2021.

    Journal ref: Advances in Neural Information Processing Systems, 33, 20744-20754

  15. arXiv:2006.15081  [pdf, other]

    cs.LG stat.ML

    On the Generalization Benefit of Noise in Stochastic Gradient Descent

    Authors: Samuel L. Smith, Erich Elsen, Soham De

    Abstract: It has long been argued that minibatch stochastic gradient descent can generalize better than large batch gradient descent in deep neural networks. However recent papers have questioned this claim, arguing that this effect is simply a consequence of suboptimal hyperparameter tuning or insufficient compute budgets when the batch size is large. In this paper, we perform carefully designed experiment…

    Submitted 26 June, 2020; originally announced June 2020.

    Comments: Camera-ready version of ICML 2020

  16. arXiv:2006.10901  [pdf, other]

    cs.LG cs.DC stat.ML

    Sparse GPU Kernels for Deep Learning

    Authors: Trevor Gale, Matei Zaharia, Cliff Young, Erich Elsen

    Abstract: Scientific workloads have traditionally exploited high levels of sparsity to accelerate computation and reduce memory requirements. While deep neural networks can be made sparse, achieving practical speedups on GPUs is difficult because these applications have relatively moderate levels of sparsity that are not sufficient for existing sparse kernels to outperform their dense counterparts. In this…

    Submitted 31 August, 2020; v1 submitted 18 June, 2020; originally announced June 2020.

    Comments: Updated to match camera-ready for SC20

  17. arXiv:2006.07360  [pdf, other]

    cs.LG stat.ML

    AlgebraNets

    Authors: Jordan Hoffmann, Simon Schmitt, Simon Osindero, Karen Simonyan, Erich Elsen

    Abstract: Neural networks have historically been built layerwise from the set of functions in ${f: \mathbb{R}^n \to \mathbb{R}^m }$, i.e. with activations and weights/parameters represented by real numbers, $\mathbb{R}$. Our work considers a richer set of objects for activations and weights, and undertakes a comprehensive study of alternative algebras as number representations by studying their performance…

    Submitted 16 June, 2020; v1 submitted 12 June, 2020; originally announced June 2020.

  18. arXiv:2006.07232  [pdf, other]

    cs.LG cs.NE stat.ML

    A Practical Sparse Approximation for Real Time Recurrent Learning

    Authors: Jacob Menick, Erich Elsen, Utku Evci, Simon Osindero, Karen Simonyan, Alex Graves

    Abstract: Current methods for training recurrent neural networks are based on backpropagation through time, which requires storing a complete history of network states, and prohibits updating the weights `online' (after every timestep). Real Time Recurrent Learning (RTRL) eliminates the need for history storage and allows for online weight updates, but does so at the expense of computational costs that are…

    Submitted 12 June, 2020; originally announced June 2020.

  19. arXiv:2006.03575  [pdf, other]

    cs.SD cs.LG eess.AS

    End-to-End Adversarial Text-to-Speech

    Authors: Jeff Donahue, Sander Dieleman, Mikołaj Bińkowski, Erich Elsen, Karen Simonyan

    Abstract: Modern text-to-speech synthesis pipelines typically involve multiple processing stages, each of which is designed or learnt independently from the rest. In this work, we take on the challenging task of learning to synthesise speech from normalised text or phonemes in an end-to-end manner, resulting in models which operate directly on character or phoneme input sequences and produce raw speech audi…

    Submitted 17 March, 2021; v1 submitted 5 June, 2020; originally announced June 2020.

    Comments: 23 pages. In proceedings of ICLR 2021

  20. arXiv:1911.11134  [pdf, other]

    cs.LG cs.CV stat.ML

    Rigging the Lottery: Making All Tickets Winners

    Authors: Utku Evci, Trevor Gale, Jacob Menick, Pablo Samuel Castro, Erich Elsen

    Abstract: Many applications require sparse neural networks due to space or inference time restrictions. There is a large body of work on training dense networks to yield sparse networks for inference, but this limits the size of the largest trainable sparse model to that of the largest trainable dense model. In this paper we introduce a method to train sparse neural networks with a fixed parameter count and…

    Submitted 23 July, 2021; v1 submitted 25 November, 2019; originally announced November 2019.

    Comments: Published in Proceedings of the 37th International Conference on Machine Learning. Code can be found in github.com/google-research/rigl

    Journal ref: Proceedings of the 37th International Conference on Machine Learning (2020) 471-481

  21. arXiv:1911.09723  [pdf, other]

    cs.CV

    Fast Sparse ConvNets

    Authors: Erich Elsen, Marat Dukhan, Trevor Gale, Karen Simonyan

    Abstract: Historically, the pursuit of efficient inference has been one of the driving forces behind research into new deep learning architectures and building blocks. Some recent examples include: the squeeze-and-excitation module, depthwise separable convolutions in Xception, and the inverted bottleneck in MobileNet v2. Notably, in all of these cases, the resulting building blocks enabled not only higher…

    Submitted 21 November, 2019; originally announced November 2019.

  22. arXiv:1909.11646  [pdf, other]

    cs.SD cs.LG eess.AS

    High Fidelity Speech Synthesis with Adversarial Networks

    Authors: Mikołaj Bińkowski, Jeff Donahue, Sander Dieleman, Aidan Clark, Erich Elsen, Norman Casagrande, Luis C. Cobo, Karen Simonyan

    Abstract: Generative adversarial networks have seen rapid development in recent years and have led to remarkable improvements in generative modelling of images. However, their application in the audio domain has received limited attention, and autoregressive models, such as WaveNet, remain the state of the art in generative modelling of audio signals such as human speech. To address this paucity, we introdu…

    Submitted 26 September, 2019; v1 submitted 25 September, 2019; originally announced September 2019.

  23. arXiv:1906.10732  [pdf, other]

    cs.LG cs.CV stat.ML

    The Difficulty of Training Sparse Neural Networks

    Authors: Utku Evci, Fabian Pedregosa, Aidan Gomez, Erich Elsen

    Abstract: We investigate the difficulties of training sparse neural networks and make new observations about optimization dynamics and the energy landscape within the sparse regime. Recent work of \citep{Gale2019, Liu2018} has shown that sparse ResNet-50 architectures trained on ImageNet-2012 dataset converge to solutions that are significantly worse than those found by pruning. We show that, despite the fa…

    Submitted 7 October, 2020; v1 submitted 25 June, 2019; originally announced June 2019.

    Comments: sparse networks, pruning, energy landscape, sparsity

  24. arXiv:1906.03139  [pdf, other]

    cs.NE cs.LG stat.ML

    Non-Differentiable Supervised Learning with Evolution Strategies and Hybrid Methods

    Authors: Karel Lenc, Erich Elsen, Tom Schaul, Karen Simonyan

    Abstract: In this work we show that Evolution Strategies (ES) are a viable method for learning non-differentiable parameters of large supervised models. ES are black-box optimization algorithms that estimate distributions of model parameters; however they have only been used for relatively small problems so far. We show that it is possible to scale ES to more complex tasks and models with millions of parame…

    Submitted 7 June, 2019; originally announced June 2019.

  25. arXiv:1902.09574  [pdf, other]

    cs.LG stat.ML

    The State of Sparsity in Deep Neural Networks

    Authors: Trevor Gale, Erich Elsen, Sara Hooker

    Abstract: We rigorously evaluate three state-of-the-art techniques for inducing sparsity in deep neural networks on two large-scale learning tasks: Transformer trained on WMT 2014 English-to-German, and ResNet-50 trained on ImageNet. Across thousands of experiments, we demonstrate that complex techniques (Molchanov et al., 2017; Louizos et al., 2017b) shown to yield high compression rates on smaller dataset…

    Submitted 25 February, 2019; originally announced February 2019.

  26. arXiv:1810.12247  [pdf, other]

    cs.SD cs.LG eess.AS stat.ML

    Enabling Factorized Piano Music Modeling and Generation with the MAESTRO Dataset

    Authors: Curtis Hawthorne, Andriy Stasyuk, Adam Roberts, Ian Simon, Cheng-Zhi Anna Huang, Sander Dieleman, Erich Elsen, Jesse Engel, Douglas Eck

    Abstract: Generating musical audio directly with neural networks is notoriously difficult because it requires coherently modeling structure at many different timescales. Fortunately, most music is also highly structured and can be represented as discrete note events played on musical instruments. Herein, we show that by using notes as an intermediate representation, we can train a suite of models capable of…

    Submitted 17 January, 2019; v1 submitted 29 October, 2018; originally announced October 2018.

    Comments: Examples available at https://goo.gl/magenta/maestro-examples

  27. arXiv:1802.08435  [pdf, other]

    cs.SD cs.LG eess.AS

    Efficient Neural Audio Synthesis

    Authors: Nal Kalchbrenner, Erich Elsen, Karen Simonyan, Seb Noury, Norman Casagrande, Edward Lockhart, Florian Stimberg, Aaron van den Oord, Sander Dieleman, Koray Kavukcuoglu

    Abstract: Sequential models achieve state-of-the-art results in audio, visual and textual domains with respect to both estimating the data distribution and generating high-quality samples. Efficient sampling for this class of models has however remained an elusive problem. With a focus on text-to-speech synthesis, we describe a set of general techniques for reducing sampling time while maintaining high outp…

    Submitted 25 June, 2018; v1 submitted 23 February, 2018; originally announced February 2018.

    Comments: 10 pages

  28. arXiv:1711.10433  [pdf, other]

    cs.LG

    Parallel WaveNet: Fast High-Fidelity Speech Synthesis

    Authors: Aaron van den Oord, Yazhe Li, Igor Babuschkin, Karen Simonyan, Oriol Vinyals, Koray Kavukcuoglu, George van den Driessche, Edward Lockhart, Luis C. Cobo, Florian Stimberg, Norman Casagrande, Dominik Grewe, Seb Noury, Sander Dieleman, Erich Elsen, Nal Kalchbrenner, Heiga Zen, Alex Graves, Helen King, Tom Walters, Dan Belov, Demis Hassabis

    Abstract: The recently-developed WaveNet architecture is the current state of the art in realistic speech synthesis, consistently rated as more natural sounding for many different languages than any previous system. However, because WaveNet relies on sequential generation of one audio sample at a time, it is poorly suited to today's massively parallel computers, and therefore hard to deploy in a real-time p…

    Submitted 28 November, 2017; originally announced November 2017.

  29. arXiv:1710.11153  [pdf, other]

    cs.SD cs.LG eess.AS stat.ML

    Onsets and Frames: Dual-Objective Piano Transcription

    Authors: Curtis Hawthorne, Erich Elsen, Jialin Song, Adam Roberts, Ian Simon, Colin Raffel, Jesse Engel, Sageev Oore, Douglas Eck

    Abstract: We advance the state of the art in polyphonic piano music transcription by using a deep convolutional and recurrent neural network which is trained to jointly predict onsets and frames. Our model predicts pitch onset events and then uses those predictions to condition framewise pitch predictions. During inference, we restrict the predictions from the framewise detector by not allowing a new note t…

    Submitted 5 June, 2018; v1 submitted 30 October, 2017; originally announced October 2017.

    Comments: Examples available at https://goo.gl/magenta/onsets-frames-examples

  30. arXiv:1710.03740  [pdf, other]

    cs.AI cs.LG stat.ML

    Mixed Precision Training

    Authors: Paulius Micikevicius, Sharan Narang, Jonah Alben, Gregory Diamos, Erich Elsen, David Garcia, Boris Ginsburg, Michael Houston, Oleksii Kuchaiev, Ganesh Venkatesh, Hao Wu

    Abstract: Deep neural networks have enabled progress in a wide variety of applications. Growing the size of the neural network typically results in improved accuracy. As model sizes grow, the memory and compute requirements for training these models also increase. We introduce a technique to train deep neural networks using half precision floating point numbers. In our technique, weights, activations and g…

    Submitted 15 February, 2018; v1 submitted 10 October, 2017; originally announced October 2017.

    Comments: Published as a conference paper at ICLR 2018

  31. arXiv:1709.07251  [pdf, other]

    hep-ex hep-ph

    Determination of the strong coupling constant $\alpha_s(M_Z)$ in next-to-next-to-leading order QCD using H1 jet cross section measurements

    Authors: H1 collaboration, V. Andreev, A. Baghdasaryan, K. Begzsuren, A. Belousov, V. Bertone, A. Bolz, V. Boudry, G. Brandt, V. Brisson, D. Britzger, A. Buniatyan, A. Bylinkin, L. Bystritskaya, A. J. Campbell, K. B. Cantun Avila, K. Cerny, V. Chekelian, J. G. Contreras, J. Cvach, J. Currie, J. B. Dainton, K. Daum, C. Diaconu, M. Dobre , et al. (123 additional authors not shown)

    Abstract: The strong coupling constant $\alpha_s(M_Z)$ is determined from inclusive jet and dijet cross sections in neutral-current deep-inelastic $ep$ scattering (DIS) measured at HERA by the H1 collaboration using next-to-next-to-leading order (NNLO) QCD predictions. The dependence of the NNLO predictions and of the resulting value of $\alpha_s(M_Z)$ at the $Z$-boson mass $m_Z$ are studied as a function of the choi…

    Submitted 16 June, 2021; v1 submitted 21 September, 2017; originally announced September 2017.

    Comments: 45 pages, 17 figures, with changes discussed in an erratum submitted to EPJ C

    Report number: DESY17-137

  32. Running of the Charm-Quark Mass from HERA Deep-Inelastic Scattering Data

    Authors: A. Gizhko, A. Geiser, S. Moch, I. Abt, O. Behnke, A. Bertolin, J. Blümlein, D. Britzger, R. Brugnera, A. Buniatyan, P. J. Bussey, R. Carlin, A. M. Cooper-Sarkar, K. Daum, S. Dusini, E. Elsen, L. Favart, J. Feltesse, B. Foster, A. Garfagnini, M. Garzelli, J. Gayler, D. Haidt, J. Hladky, A. W. Jung , et al. (25 additional authors not shown)

    Abstract: Combined HERA data on charm production in deep-inelastic scattering have previously been used to determine the charm-quark running mass $m_c(m_c)$ in the MSbar renormalisation scheme. Here, the same data are used as a function of the photon virtuality $Q^2$ to evaluate the charm-quark running mass at different scales to one-loop order, in the context of a next-to-leading order QCD analysis. The sc…

    Submitted 24 May, 2017; originally announced May 2017.

    Comments: 12 pages, 4 figures

    Report number: DESY-17-048

  33. arXiv:1704.05119  [pdf, other]

    cs.LG cs.CL

    Exploring Sparsity in Recurrent Neural Networks

    Authors: Sharan Narang, Erich Elsen, Gregory Diamos, Shubho Sengupta

    Abstract: Recurrent Neural Networks (RNN) are widely used to solve a variety of problems and as the quantity of data and the amount of available compute have increased, so have model sizes. The number of parameters in recent state-of-the-art networks makes them hard to deploy, especially on mobile phones and embedded devices. The challenge is due to both the size of the model and the time it takes to evalua…

    Submitted 6 November, 2017; v1 submitted 17 April, 2017; originally announced April 2017.

    Comments: Published as a conference paper at ICLR 2017

  34. arXiv:1607.04381  [pdf, other]

    cs.CV

    DSD: Dense-Sparse-Dense Training for Deep Neural Networks

    Authors: Song Han, Jeff Pool, Sharan Narang, Huizi Mao, Enhao Gong, Shijian Tang, Erich Elsen, Peter Vajda, Manohar Paluri, John Tran, Bryan Catanzaro, William J. Dally

    Abstract: Modern deep neural networks have a large number of parameters, making them very hard to train. We propose DSD, a dense-sparse-dense training flow, for regularizing deep neural networks and achieving better optimization performance. In the first D (Dense) step, we train a dense network to learn connection weights and importance. In the S (Sparse) step, we regularize the network by pruning the unimp…

    Submitted 21 February, 2017; v1 submitted 15 July, 2016; originally announced July 2016.

    Comments: Published as a conference paper at ICLR 2017

  35. arXiv:1512.02595  [pdf, other]

    cs.CL

    Deep Speech 2: End-to-End Speech Recognition in English and Mandarin

    Authors: Dario Amodei, Rishita Anubhai, Eric Battenberg, Carl Case, Jared Casper, Bryan Catanzaro, Jingdong Chen, Mike Chrzanowski, Adam Coates, Greg Diamos, Erich Elsen, Jesse Engel, Linxi Fan, Christopher Fougner, Tony Han, Awni Hannun, Billy Jun, Patrick LeGresley, Libby Lin, Sharan Narang, Andrew Ng, Sherjil Ozair, Ryan Prenger, Jonathan Raiman, Sanjeev Satheesh , et al. (9 additional authors not shown)

    Abstract: We show that an end-to-end deep learning approach can be used to recognize either English or Mandarin Chinese speech--two vastly different languages. Because it replaces entire pipelines of hand-engineered components with neural networks, end-to-end learning allows us to handle a diverse variety of speech including noisy environments, accents and different languages. Key to our approach is our app…

    Submitted 8 December, 2015; originally announced December 2015.

  36. arXiv:1511.09032  [pdf, other]

    physics.plasm-ph physics.acc-ph

    Path to AWAKE: Evolution of the concept

    Authors: A. Caldwell, E. Adli, L. Amorim, R. Apsimon, T. Argyropoulos, R. Assmann, A. -M. Bachmann, F. Batsch, J. Bauche, V. K. Berglyd Olsen, M. Bernardini, R. Bingham, B. Biskup, T. Bohl, C. Bracco, P. N. Burrows, G. Burt, B. Buttenschon, A. Butterworth, M. Cascella, S. Chattopadhyay, E. Chevallay, S. Cipiccia, H. Damerau, L. Deacon , et al. (96 additional authors not shown)

    Abstract: This report describes the conceptual steps in reaching the design of the AWAKE experiment currently under construction at CERN. We start with an introduction to plasma wakefield acceleration and the motivation for using proton drivers. We then describe the self-modulation instability --- a key to an early realization of the concept. This is then followed by the historical development of the experi…

    Submitted 29 November, 2015; originally announced November 2015.

    Comments: 15 pages, 24 figures, 1 table, 111 references, 121 authors from 36 organizations

  37. The FLASHForward Facility at DESY

    Authors: A. Aschikhin, C. Behrens, S. Bohlen, J. Dale, N. Delbos, L. di Lucchio, E. Elsen, J. -H. Erbe, M. Felber, B. Foster, L. Goldberg, J. Grebenyuk, J. -N. Gruse, B. Hidding, Zhanghu Hu, S. Karstensen, A. Knetsch, O. Kononenko, V. Libov, K. Ludwig, A. R. Maier, A. Martinez de la Ossa, T. Mehrling, C. A. J. Palmer, F. Pannek , et al. (13 additional authors not shown)

    Abstract: The FLASHForward project at DESY is a pioneering plasma-wakefield acceleration experiment that aims to produce, in a few centimetres of ionised hydrogen, beams with energy of order GeV that are of quality sufficient to be used in a free-electron laser. The plasma wave will be driven by high-current density electron beams from the FLASH linear accelerator and will explore both external and internal…

    Submitted 18 August, 2015; v1 submitted 13 August, 2015; originally announced August 2015.

    Comments: 19 pages, 9 figures

    Report number: DESY 15-143

  38. arXiv:1412.5567  [pdf, other]

    cs.CL cs.LG cs.NE

    Deep Speech: Scaling up end-to-end speech recognition

    Authors: Awni Hannun, Carl Case, Jared Casper, Bryan Catanzaro, Greg Diamos, Erich Elsen, Ryan Prenger, Sanjeev Satheesh, Shubho Sengupta, Adam Coates, Andrew Y. Ng

    Abstract: We present a state-of-the-art speech recognition system developed using end-to-end deep learning. Our architecture is significantly simpler than traditional speech systems, which rely on laboriously engineered processing pipelines; these traditional systems also tend to perform poorly when used in noisy environments. In contrast, our system does not need hand-designed components to model backgroun…

    Submitted 19 December, 2014; v1 submitted 17 December, 2014; originally announced December 2014.

  39. arXiv:1306.6353  [pdf]

    physics.acc-ph

    The International Linear Collider Technical Design Report - Volume 3.I: Accelerator R&D in the Technical Design Phase

    Authors: Chris Adolphsen, Maura Barone, Barry Barish, Karsten Buesser, Philip Burrows, John Carwardine, Jeffrey Clark, Hélène Mainaud Durand, Gerry Dugan, Eckhard Elsen, Atsushi Enomoto, Brian Foster, Shigeki Fukuda, Wei Gai, Martin Gastal, Rongli Geng, Camille Ginsburg, Susanna Guiducci, Mike Harrison, Hitoshi Hayano, Keith Kershaw, Kiyoshi Kubo, Victor Kuchler, Benno List, Wanming Liu , et al. (19 additional authors not shown)

    Abstract: The International Linear Collider Technical Design Report (TDR) describes in four volumes the physics case and the design of a 500 GeV centre-of-mass energy linear electron-positron collider based on superconducting radio-frequency technology using Niobium cavities as the accelerating structures. The accelerator can be extended to 1 TeV and also run as a Higgs factory at around 250 GeV and on the…

    Submitted 26 June, 2013; originally announced June 2013.

    Comments: See also http://www.linearcollider.org/ILC/TDR . The full list of signatories is inside the Report

    Report number: ILC-REPORT-2013-040; ANL-HEP-TR-13-20; BNL-100603-2013-IR; IRFU-13-59; CERN-ATS-2013-037; Cockcroft-13-10; CLNS 13/2085; DESY 13-062; FERMILAB TM-2554; IHEP-AC-ILC-2013-001; INFN-13-04/LNF; JAI-2013-001; JINR E9-2013-35; JLAB-R-2013-01; KEK Report 2013-1; KNU/CHEP-ILC-2013-1; LLNL-TR-635539; SLAC-R-1004; ILC-HiGrade-Report-2013-003

  40. arXiv:1306.6328  [pdf]

    physics.acc-ph

    The International Linear Collider Technical Design Report - Volume 3.II: Accelerator Baseline Design

    Authors: Chris Adolphsen, Maura Barone, Barry Barish, Karsten Buesser, Philip Burrows, John Carwardine, Jeffrey Clark, Hélène Mainaud Durand, Gerry Dugan, Eckhard Elsen, Atsushi Enomoto, Brian Foster, Shigeki Fukuda, Wei Gai, Martin Gastal, Rongli Geng, Camille Ginsburg, Susanna Guiducci, Mike Harrison, Hitoshi Hayano, Keith Kershaw, Kiyoshi Kubo, Victor Kuchler, Benno List, Wanming Liu , et al. (19 additional authors not shown)

    Abstract: The International Linear Collider Technical Design Report (TDR) describes in four volumes the physics case and the design of a 500 GeV centre-of-mass energy linear electron-positron collider based on superconducting radio-frequency technology using Niobium cavities as the accelerating structures. The accelerator can be extended to 1 TeV and also run as a Higgs factory at around 250 GeV and on the…

    Submitted 26 June, 2013; originally announced June 2013.

    Comments: See also http://www.linearcollider.org/ILC/TDR . The full list of signatories is inside the Report

    Report number: ILC-REPORT-2013-040; ANL-HEP-TR-13-20; BNL-100603-2013-IR; IRFU-13-59; CERN-ATS-2013-037; Cockcroft-13-10; CLNS 13/2085; DESY 13-062; FERMILAB TM-2554; IHEP-AC-ILC-2013-001; INFN-13-04/LNF; JAI-2013-001; JINR E9-2013-35; JLAB-R-2013-01; KEK Report 2013-1; KNU/CHEP-ILC-2013-1; LLNL-TR-635539; SLAC-R-1004; ILC-HiGrade-Report-2013-003

  41. Present status and first results of the final focus beam line at the KEK Accelerator Test Facility

    Authors: P. Bambade, M. Alabau Pons, J. Amann, D. Angal-Kalinin, R. Apsimon, S. Araki, A. Aryshev, S. Bai, P. Bellomo, D. Bett, G. Blair, B. Bolzon, S. Boogert, G. Boorman, P. N. Burrows, G. Christian, P. Coe, B. Constance, Jean-Pierre Delahaye, L. Deacon, E. Elsen, A. Faus-Golfe, M. Fukuda, J. Gao, N. Geffroy , et al. (69 additional authors not shown)

    Abstract: ATF2 is a final-focus test beam line which aims to focus the low emittance beam from the ATF damping ring to a vertical size of about 37 nm and to demonstrate nanometer level beam stability. Several advanced beam diagnostics and feedback tools are used. In December 2008, construction and installation were completed and beam commissioning started, supported by an international team of Asian, Europe…

    Submitted 5 July, 2012; originally announced July 2012.

    Comments: 10 pp

    Report number: FERMILAB-PUB-10-290-AD

    Journal ref: Phys.Rev.ST Accel.Beams 13 (2010) 042801

  42. Simulation study of fast ion instability in the ILC damping ring and PETRA III

    Authors: G. Xia, K. Ohmi, E. Elsen

    Abstract: The fast ion instability is simulated in different gas pressures and fill patterns for the damping ring of the International Linear Collider (ILC) and PETRA III respectively. Beam size variation due to beta function and dispersion function change is taken into account. Feedback is also applied in the simulation.

    Submitted 3 June, 2008; originally announced June 2008.

    Comments: 11 pages, 2 tables, 16 figures

    Journal ref: Nucl.Instrum.Meth.A593:183-187,2008

  43. arXiv:0709.2248  [pdf, ps, other]

    physics.acc-ph physics.gen-ph

    Update on Ion Studies

    Authors: Guoxing Xia, Eckhard Elsen

    Abstract: The effect of ions has received one of the highest priorities in R&D for the damping rings of the International Linear Collider (ILC). It is detrimental to the performance of the electron damping ring. In this note, an update concerning the ion studies for the ILC damping ring is given. We investigate the gap role and irregular fill pattern in the ring. The ion density reduction in different fills…

    Submitted 14 September, 2007; originally announced September 2007.

    Comments: There are 6 pages, 9 figures and 1 table in this paper. It is for the proceedings of LCWS07 Workshop

    Journal ref: ECONF C0705302:DR002,2007

  44. arXiv:0706.3060  [pdf, ps, other]

    cs.CE cs.DC

    N-Body Simulations on GPUs

    Authors: Erich Elsen, V. Vishal, Mike Houston, Vijay Pande, Pat Hanrahan, Eric Darve

    Abstract: Commercial graphics processors (GPUs) have high compute capacity at very low cost, which makes them attractive for general purpose scientific computing. In this paper we show how graphics processors can be used for N-body simulations to obtain improvements in performance over current generation CPUs. We have developed a highly optimized algorithm for performing the O(N^2) force calculations that…

    Submitted 20 June, 2007; originally announced June 2007.

  45. Introduction to Diffraction and low x Dynamics

    Authors: E. Elsen

    Abstract: An attempt is made to illustrate the relation between low x processes and diffraction. ep scattering provides a unique laboratory, a single hadronic target probed by a point-like lepton, where one can try to understand diffraction in terms of a colourless exchange in QCD. Low x processes eventually involve aspects of QCD which cannot be described perturbatively. The HERA inclusive measurements…

    Submitted 10 January, 2002; originally announced January 2002.

    Comments: 8 pages, 7 figures in eps. Talk given at XXXI International Symposium on Multiparticle Dynamics, Sep. 1-7, 2001, Datong, China. See http://ismd31.ccnu.edu.cn/

  46. A Fast High Resolution Track Trigger for the H1 Experiment

    Authors: A. Baird, E. Elsen, Y. H. Fleming, M. Kolander, S. Kolya, D. Meer, D. Mercer, J. Naumann, P. R. Newman, D. Sankey, A. Schoening, H. -C. Schultz-Coulon, Ch. Wissing

    Abstract: After 2001 the upgraded ep collider HERA will provide about five times higher luminosity for the two experiments H1 and ZEUS. In order to cope with the expected higher event rates, the H1 collaboration is building a track-based trigger system, the Fast Track Trigger (FTT). It will be integrated in the first three levels (L1-L3) of the H1 trigger scheme to provide higher selectivity for events…

    Submitted 6 April, 2001; originally announced April 2001.

    Comments: 6 pages, 7 figures, submitted to TNS

    Journal ref: IEEE Trans.Nucl.Sci.48:1276-1285,2001

  47. arXiv:hep-ph/9610251  [pdf, ps, other]

    hep-ph hep-ex

    Electroweak Physics at HERA: Introduction and Summary

    Authors: R. J. Cashmore, E. Elsen, B. A. Kniehl, H. Spiesberger

    Abstract: A high luminosity upgrade of HERA will allow the measurement of standard model parameters and the neutral current couplings of quarks. These results will have to be consistent with other precision measurements or indicate traces of new physics. The analysis of $W$ production will complement future results of LEP 2 and the Tevatron. We summarize the main results and conclusions obtained by the wo…

    Submitted 4 October, 1996; originally announced October 1996.

    Comments: 12 pages (Latex), 4 figures (Postscript), to appear in the Proceedings of the Workshop on Future Physics at HERA. The complete report by the working group on electroweak physics at HERA is available from http://www.desy.de/~heraws96/proceedings

    Report number: MPI/PhT/96-105