Search | arXiv e-print repository

Showing 1–50 of 75 results for author: Niehues, J

  1. arXiv:2406.16777  [pdf, other]

    cs.CL cs.AI

    Blending LLMs into Cascaded Speech Translation: KIT's Offline Speech Translation System for IWSLT 2024

    Authors: Sai Koneru, Thai-Binh Nguyen, Ngoc-Quan Pham, Danni Liu, Zhaolin Li, Alexander Waibel, Jan Niehues

    Abstract: Large Language Models (LLMs) are currently under exploration for various tasks, including Automatic Speech Recognition (ASR), Machine Translation (MT), and even End-to-End Speech Translation (ST). In this paper, we present KIT's offline submission in the constrained + LLM track by incorporating recently proposed techniques that can be added to any cascaded speech translation. Specifically, we inte…

    Submitted 24 June, 2024; originally announced June 2024.

  2. arXiv:2406.10421  [pdf, other]

    cs.CL

    SciEx: Benchmarking Large Language Models on Scientific Exams with Human Expert Grading and Automatic Grading

    Authors: Tu Anh Dinh, Carlos Mullov, Leonard Bärmann, Zhaolin Li, Danni Liu, Simon Reiß, Jueun Lee, Nathan Lerzer, Fabian Ternava, Jianfeng Gao, Tobias Röddiger, Alexander Waibel, Tamim Asfour, Michael Beigl, Rainer Stiefelhagen, Carsten Dachsbacher, Klemens Böhm, Jan Niehues

    Abstract: With the rapid development of Large Language Models (LLMs), it is crucial to have benchmarks which can evaluate the ability of LLMs on different domains. One common use of LLMs is performing tasks on scientific topics, such as writing algorithms, querying databases or giving mathematical proofs. Inspired by the way university students are evaluated on such tasks, in this paper, we propose SciEx -…

    Submitted 12 July, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

    ACM Class: I.2.7

  3. arXiv:2406.03881  [pdf, other]

    cs.CL

    Evaluating the IWSLT2023 Speech Translation Tasks: Human Annotations, Automatic Metrics, and Segmentation

    Authors: Matthias Sperber, Ondřej Bojar, Barry Haddow, Dávid Javorský, Xutai Ma, Matteo Negri, Jan Niehues, Peter Polák, Elizabeth Salesky, Katsuhito Sudoh, Marco Turchi

    Abstract: Human evaluation is a critical component in machine translation system development and has received much attention in text translation research. However, little prior work exists on the topic of human evaluation for speech translation, which adds additional challenges such as noisy data and segmentation mismatches. We take first steps to fill this gap by conducting a comprehensive human evaluation…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: LREC-COLING2024 publication (with corrections for Table 3)

    Journal ref: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

  4. arXiv:2404.18031  [pdf, other]

    cs.CL

    Quality Estimation with $k$-nearest Neighbors and Automatic Evaluation for Model-specific Quality Estimation

    Authors: Tu Anh Dinh, Tobias Palzer, Jan Niehues

    Abstract: Providing quality scores along with Machine Translation (MT) output, so-called reference-free Quality Estimation (QE), is crucial to inform users about the reliability of the translation. We propose a model-specific, unsupervised QE approach, termed $k$NN-QE, that extracts information from the MT model's training data using $k$-nearest neighbors. Measuring the performance of model-specific QE is n…

    Submitted 27 April, 2024; originally announced April 2024.

    Comments: Accepted to EAMT 2024

    ACM Class: I.2.7

  5. arXiv:2404.05720  [pdf, other]

    cs.CL cs.AI

    Language-Independent Representations Improve Zero-Shot Summarization

    Authors: Vladimir Solovyev, Danni Liu, Jan Niehues

    Abstract: Finetuning pretrained models on downstream generation tasks often leads to catastrophic forgetting in zero-shot conditions. In this work, we focus on summarization and tackle the problem through the lens of language-independent representations. After training on monolingual summarization, we perform zero-shot transfer to new languages or language pairs. We first show naively finetuned models are h…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: NAACL 2024

  6. arXiv:2401.06483  [pdf, other]

    nlin.AO physics.app-ph

    Resonant Solitary States in Complex Networks

    Authors: Jakob Niehues, Serhiy Yanchuk, Rico Berner, Jürgen Kurths, Frank Hellmann, Mehrnaz Anvari

    Abstract: Partially synchronized solitary states occur frequently when a synchronized system of networked oscillators is perturbed locally. Several asymptotic states of different frequencies can coexist at the same node. Here, we reveal the mechanism behind this multistability: additional solitary frequencies arise from the coupling between network modes and the solitary oscillator's frequency, leading to s…

    Submitted 12 July, 2024; v1 submitted 12 January, 2024; originally announced January 2024.

  7. arXiv:2310.14855  [pdf, other]

    cs.CL cs.AI

    Contextual Refinement of Translations: Large Language Models for Sentence and Document-Level Post-Editing

    Authors: Sai Koneru, Miriam Exel, Matthias Huck, Jan Niehues

    Abstract: Large Language Models (LLMs) have demonstrated considerable success in various Natural Language Processing tasks, but they have yet to attain state-of-the-art performance in Neural Machine Translation (NMT). Nevertheless, their significant performance in tasks demanding a broad understanding and contextual processing shows their potential for translation. To exploit these abilities, we investigat…

    Submitted 18 March, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: NAACL 2024

  8. arXiv:2309.12998  [pdf, other]

    cs.CL cs.AI

    Audience-specific Explanations for Machine Translation

    Authors: Renhan Lou, Jan Niehues

    Abstract: In machine translation, a common problem is that the translation of certain words even if translated can cause incomprehension of the target language audience due to different cultural backgrounds. A solution to solve this problem is to add explanations for these words. In a first step, we therefore need to identify these words or phrases. In this work we explore techniques to extract example expl…

    Submitted 22 September, 2023; originally announced September 2023.

  9. arXiv:2309.08565  [pdf, other]

    cs.CL cs.AI

    How Transferable are Attribute Controllers on Pretrained Multilingual Translation Models?

    Authors: Danni Liu, Jan Niehues

    Abstract: Customizing machine translation models to comply with desired attributes (e.g., formality or grammatical gender) is a well-studied topic. However, most current approaches rely on (semi-)supervised data with attribute annotations. This data scarcity bottlenecks democratizing such customization possibilities to a wider range of languages, particularly lower-resource ones. This gap is out of sync wit…

    Submitted 24 January, 2024; v1 submitted 15 September, 2023; originally announced September 2023.

    Comments: EACL 2024

  10. arXiv:2309.04316  [pdf, other]

    cs.RO cs.AI

    Incremental Learning of Humanoid Robot Behavior from Natural Interaction and Large Language Models

    Authors: Leonard Bärmann, Rainer Kartmann, Fabian Peller-Konrad, Jan Niehues, Alex Waibel, Tamim Asfour

    Abstract: Natural-language dialog is key for intuitive human-robot interaction. It can be used not only to express humans' intents, but also to communicate instructions for improvement if a robot does not understand a command correctly. Of great importance is to endow robots with the ability to learn from such interaction experience in an incremental way to allow them to improve their behaviors or avoid mis…

    Submitted 16 May, 2024; v1 submitted 8 September, 2023; originally announced September 2023.

    Comments: This version (v3) adds further quantitative evaluation and many improvements. v2 was presented at the Workshop on Language and Robot Learning (LangRob) at the Conference on Robot Learning (CoRL) 2023. Supplementary video available at https://youtu.be/y5O2mRGtsLM

  11. arXiv:2308.03415  [pdf, other]

    cs.CL cs.AI

    End-to-End Evaluation for Low-Latency Simultaneous Speech Translation

    Authors: Christian Huber, Tu Anh Dinh, Carlos Mullov, Ngoc Quan Pham, Thai Binh Nguyen, Fabian Retkowski, Stefan Constantin, Enes Yavuz Ugan, Danni Liu, Zhaolin Li, Sai Koneru, Jan Niehues, Alexander Waibel

    Abstract: The challenge of low-latency speech translation has recently drawn significant interest in the research community as shown by several publications and shared tasks. Therefore, it is essential to evaluate these different approaches in realistic scenarios. However, currently only specific aspects of the systems are evaluated and often it is not possible to compare different approaches. In this work…

    Submitted 17 July, 2024; v1 submitted 7 August, 2023; originally announced August 2023.

    Comments: Demo paper at EMNLP 2023

  12. arXiv:2306.05320  [pdf, other]

    cs.CL cs.SD

    KIT's Multilingual Speech Translation System for IWSLT 2023

    Authors: Danni Liu, Thai Binh Nguyen, Sai Koneru, Enes Yavuz Ugan, Ngoc-Quan Pham, Tuan-Nam Nguyen, Tu Anh Dinh, Carlos Mullov, Alexander Waibel, Jan Niehues

    Abstract: Many existing speech translation benchmarks focus on native-English speech in high-quality recording conditions, which often do not match the conditions in real-life use-cases. In this paper, we describe our speech translation system for the multilingual track of IWSLT 2023, which evaluates translation quality on scientific conference talks. The test condition features accented input speech and te…

    Submitted 12 July, 2023; v1 submitted 8 June, 2023; originally announced June 2023.

    Comments: IWSLT 2023

  13. arXiv:2305.16935  [pdf, other]

    cs.CL

    Gender Lost In Translation: How Bridging The Gap Between Languages Affects Gender Bias in Zero-Shot Multilingual Translation

    Authors: Lena Cabrera, Jan Niehues

    Abstract: Neural machine translation (NMT) models often suffer from gender biases that harm users and society at large. In this work, we explore how bridging the gap between languages for which parallel data is not available affects gender bias in multilingual NMT, specifically for zero-shot directions. We evaluate translation between grammatical gender languages which requires preserving the inherent gende…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: Accepted at EAMT 2023 (Workshop on Gender-Inclusive Translation Technologies (GITT))

  14. arXiv:2305.07457  [pdf, other]

    cs.CL

    Perturbation-based QE: An Explainable, Unsupervised Word-level Quality Estimation Method for Blackbox Machine Translation

    Authors: Tu Anh Dinh, Jan Niehues

    Abstract: Quality Estimation (QE) is the task of predicting the quality of Machine Translation (MT) system output, without using any gold-standard translation references. State-of-the-art QE models are supervised: they require human-labeled quality of some MT system output on some datasets for training, making them domain-dependent and MT-system-dependent. There has been research on unsupervised QE, which r…

    Submitted 13 July, 2023; v1 submitted 12 May, 2023; originally announced May 2023.

    Comments: Accepted to MT Summit 2023

    ACM Class: I.2.7

  15. arXiv:2305.03873  [pdf, other]

    cs.CL

    Train Global, Tailor Local: Minimalist Multilingual Translation into Endangered Languages

    Authors: Zhong Zhou, Jan Niehues, Alex Waibel

    Abstract: In many humanitarian scenarios, translation into severely low resource languages often does not require a universal translation engine, but a dedicated text-specific translation engine. For example, healthcare records, hygienic procedures, government communication, emergency procedures and religious texts are all limited texts. While generic translation engines for all languages do not exist, tran…

    Submitted 5 May, 2023; originally announced May 2023.

    Comments: In Proceedings of the 6th Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT) of the 17th Conference of the European Chapter of the Association for Computational Linguistic in 2023

  16. arXiv:2301.09617  [pdf, other]

    cs.CV

    Fully transformer-based biomarker prediction from colorectal cancer histology: a large-scale multicentric study

    Authors: Sophia J. Wagner, Daniel Reisenbüchler, Nicholas P. West, Jan Moritz Niehues, Gregory Patrick Veldhuizen, Philip Quirke, Heike I. Grabsch, Piet A. van den Brandt, Gordon G. A. Hutchins, Susan D. Richman, Tanwei Yuan, Rupert Langer, Josien Christina Anna Jenniskens, Kelly Offermans, Wolfram Mueller, Richard Gray, Stephen B. Gruber, Joel K. Greenson, Gad Rennert, Joseph D. Bonner, Daniel Schmolze, Jacqueline A. James, Maurice B. Loughrey, Manuel Salto-Tellez, Hermann Brenner, et al. (6 additional authors not shown)

    Abstract: Background: Deep learning (DL) can extract predictive and prognostic biomarkers from routine pathology slides in colorectal cancer. For example, a DL test for the diagnosis of microsatellite instability (MSI) in CRC has been approved in 2022. Current approaches rely on convolutional neural networks (CNNs). Transformer networks are outperforming CNNs and are replacing them in many applications, but…

    Submitted 1 March, 2023; v1 submitted 23 January, 2023; originally announced January 2023.

    Comments: Updated Figure 2 and Table A.5

  17. Diffusion Probabilistic Models beat GANs on Medical Images

    Authors: Gustav Müller-Franzes, Jan Moritz Niehues, Firas Khader, Soroosh Tayebi Arasteh, Christoph Haarburger, Christiane Kuhl, Tianci Wang, Tianyu Han, Sven Nebelung, Jakob Nikolas Kather, Daniel Truhn

    Abstract: The success of Deep Learning applications critically depends on the quality and scale of the underlying training data. Generative adversarial networks (GANs) can generate arbitrarily large datasets, but diversity and fidelity are limited, which has recently been addressed by denoising diffusion probabilistic models (DDPMs) whose superiority has been demonstrated on natural images. In this study, we…

    Submitted 14 December, 2022; originally announced December 2022.

    Journal ref: Sci Rep 13, 12098 (2023)

  18. arXiv:2211.11703  [pdf, other]

    cs.CL cs.SD eess.AS

    Towards continually learning new languages

    Authors: Ngoc-Quan Pham, Jan Niehues, Alexander Waibel

    Abstract: Multilingual speech recognition with neural networks is often implemented with batch-learning, when all of the languages are available before training. An ability to add new languages after the prior training sessions can be economically beneficial, but the main challenge is catastrophic forgetting. In this work, we combine the qualities of weight factorization and elastic weight consolidation in…

    Submitted 17 July, 2024; v1 submitted 21 November, 2022; originally announced November 2022.

    Comments: Work in progress

  19. arXiv:2211.04939  [pdf, other]

    cs.CL cs.SD eess.AS

    Efficient Speech Translation with Pre-trained Models

    Authors: Zhaolin Li, Jan Niehues

    Abstract: When building state-of-the-art speech translation models, the need for large computational resources is a significant obstacle due to the large training data size and complex models. The availability of pre-trained models is a promising opportunity to build strong speech translation systems efficiently. In a first step, we investigate efficient strategies to build cascaded and end-to-end speech tr…

    Submitted 9 November, 2022; originally announced November 2022.

  20. arXiv:2211.01292  [pdf, other]

    cs.CL cs.AI

    Learning an Artificial Language for Knowledge-Sharing in Multilingual Translation

    Authors: Danni Liu, Jan Niehues

    Abstract: The cornerstone of multilingual neural translation is shared representations across languages. Given the theoretically infinite representation power of neural networks, semantically identical sentences are likely represented differently. While representing sentences in the continuous latent space ensures expressiveness, it introduces the risk of capturing irrelevant features which hinders the l…

    Submitted 18 November, 2022; v1 submitted 2 November, 2022; originally announced November 2022.

    Comments: WMT 2022

  21. arXiv:2205.12304  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    Adaptive multilingual speech recognition with pretrained models

    Authors: Ngoc-Quan Pham, Alex Waibel, Jan Niehues

    Abstract: Multilingual speech recognition with supervised learning has achieved great results as reflected in recent research. With the development of pretraining methods on audio and text data, it is imperative to transfer the knowledge from unsupervised multilingual models to facilitate recognition, especially in many languages with limited data. Our work investigated the effectiveness of using two pretra…

    Submitted 24 May, 2022; originally announced May 2022.

    Comments: Submitted to INTERSPEECH 2022

  22. arXiv:2204.10593  [pdf, other]

    cs.CL cs.SD eess.AS

    LibriS2S: A German-English Speech-to-Speech Translation Corpus

    Authors: Pedro Jeuris, Jan Niehues

    Abstract: Recently, we have seen an increasing interest in the area of speech-to-text translation. This has led to astonishing improvements in this area. In contrast, the activities in the area of speech-to-speech translation are still limited, although it is essential to overcome the language barrier. We believe that one of the limiting factors is the availability of appropriate training data. We address th…

    Submitted 22 April, 2022; originally announced April 2022.

    Comments: Accepted to LREC 2022

  23. arXiv:2204.06028  [pdf, other]

    cs.CL

    CUNI-KIT System for Simultaneous Speech Translation Task at IWSLT 2022

    Authors: Peter Polák, Ngoc-Quan Ngoc, Tuan-Nam Nguyen, Danni Liu, Carlos Mullov, Jan Niehues, Ondřej Bojar, Alexander Waibel

    Abstract: In this paper, we describe our submission to the Simultaneous Speech Translation at IWSLT 2022. We explore strategies to utilize an offline model in a simultaneous setting without the need to modify the original model. In our experiments, we show that our onlinization algorithm is almost on par with the offline setting while being $3\times$ faster than offline in terms of latency on the test set.…

    Submitted 11 May, 2022; v1 submitted 12 April, 2022; originally announced April 2022.

    Comments: Accepted to IWSLT22

  24. arXiv:2203.14835  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    Multilingual Simultaneous Speech Translation

    Authors: Shashank Subramanya, Jan Niehues

    Abstract: Applications designed for simultaneous speech translation during events such as conferences or meetings need to balance quality and lag while displaying translated text to deliver a good user experience. One common approach to building online spoken language translation systems is by leveraging models built for offline speech translation. Based on a technique to adapt end-to-end monolingual models…

    Submitted 29 March, 2022; v1 submitted 28 March, 2022; originally announced March 2022.

    Comments: Submitted to Interspeech 2022

  25. arXiv:2203.11110  [pdf, other]

    hep-ph hep-ex

    Event Generators for High-Energy Physics Experiments

    Authors: J. M. Campbell, M. Diefenthaler, T. J. Hobbs, S. Höche, J. Isaacson, F. Kling, S. Mrenna, J. Reuter, S. Alioli, J. R. Andersen, C. Andreopoulos, A. M. Ankowski, E. C. Aschenauer, A. Ashkenazi, M. D. Baker, J. L. Barrow, M. van Beekveld, G. Bewick, S. Bhattacharya, C. Bierlich, E. Bothmann, P. Bredt, A. Broggio, A. Buckley, A. Butter, et al. (186 additional authors not shown)

    Abstract: We provide an overview of the status of Monte-Carlo event generators for high-energy particle physics. Guided by the experimental needs and requirements, we highlight areas of active development, and opportunities for future improvements. Particular emphasis is given to physics models and algorithms that are employed across a variety of experiments. These common themes in event generator developme…

    Submitted 23 January, 2024; v1 submitted 21 March, 2022; originally announced March 2022.

    Comments: 164 pages, 10 figures, contribution to Snowmass 2021

    Report number: CP3-22-12, DESY-22-042, FERMILAB-PUB-22-116-SCD-T, IPPP/21/51, JLAB-PHY-22-3576, KA-TP-04-2022, LA-UR-22-22126, LU-TP-22-12, MCNET-22-04, OUTP-22-03P, P3H-22-024, PITT-PACC 2207, UCI-TR-2022-02

  26. Tackling data scarcity in speech translation using zero-shot multilingual machine translation techniques

    Authors: Tu Anh Dinh, Danni Liu, Jan Niehues

    Abstract: Recently, end-to-end speech translation (ST) has gained significant attention as it avoids error propagation. However, the approach suffers from data scarcity. It heavily depends on direct ST data and is less efficient in making use of speech transcription and text translation data, which is often more easily available. In the related field of multilingual text translation, several techniques have…

    Submitted 26 January, 2022; originally announced January 2022.

    Comments: 6 pages, 5 figures, accepted to IEEE ICASSP 2022. arXiv admin note: text overlap with arXiv:2107.06010

    ACM Class: I.2.7

    Journal ref: ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022, pp. 6222-6226

  27. arXiv:2201.05700  [pdf, other]

    cs.CL cs.AI

    Cost-Effective Training in Low-Resource Neural Machine Translation

    Authors: Sai Koneru, Danni Liu, Jan Niehues

    Abstract: While Active Learning (AL) techniques are explored in Neural Machine Translation (NMT), only a few works focus on tackling low annotation budgets where a limited number of sentences can get translated. Such situations are especially challenging and can occur for endangered languages with few human annotators or having cost constraints to label large amounts of data. Although AL is shown to be help…

    Submitted 14 January, 2022; originally announced January 2022.

  28. arXiv:2112.01120  [pdf, other]

    hep-ex hep-ph

    Impact of jet-production data on the next-to-next-to-leading-order determination of HERAPDF2.0 parton distributions

    Authors: H1 and ZEUS Collaborations: I. Abt, R. Aggarwal, V. Andreev, M. Arratia, V. Aushev, A. Baghdasaryan, A. Baty, K. Begzsuren, O. Behnke, A. Belousov, A. Bertolin, I. Bloch, V. Boudry, G. Brandt, I. Brock, N. H. Brook, R. Brugnera, A. Bruni, A. Buniatyan, P. J. Bussey, L. Bystritskaya, A. Caldwell, et al. (212 additional authors not shown)

    Abstract: The HERAPDF2.0 ensemble of parton distribution functions (PDFs) was introduced in 2015. The final stage is presented, a next-to-next-to-leading-order (NNLO) analysis of the HERA data on inclusive deep inelastic $ep$ scattering together with jet data as published by the H1 and ZEUS collaborations. A perturbative QCD fit, simultaneously of $\alpha_s(M_Z^2)$ and the PDFs, was performed with the result…

    Submitted 2 December, 2021; originally announced December 2021.

    Comments: 43 pages, 24 figures, to be submitted to Eur. Phys. J. C

    Report number: DESY-21-206

  29. arXiv:2111.04470  [pdf, other]

    cond-mat.stat-mech nlin.AO nlin.PS physics.comp-ph

    Self-organized quantization and oscillations on continuous fixed-energy sandpiles

    Authors: Jakob Niehues, Gorm Gruner Jensen, Jan O. Haerter

    Abstract: Atmospheric self-organization and activator-inhibitor dynamics in biology provide examples of checkerboard-like spatio-temporal organization. We study a simple model for local activation-inhibition processes. Our model, first introduced in the context of atmospheric moisture dynamics, is a continuous-energy and non-Abelian version of the fixed-energy sandpile model. Each lattice site is populated…

    Submitted 4 November, 2021; originally announced November 2021.

    Comments: 13 pages, 7 figures, plus supplement, to be submitted to Physical Review E

  30. arXiv:2103.15877  [pdf, other]

    cs.CL cs.AI

    Unsupervised Machine Translation On Dravidian Languages

    Authors: Sai Koneru, Danni Liu, Jan Niehues

    Abstract: Unsupervised neural machine translation (UNMT) is beneficial especially for low resource languages such as those from the Dravidian family. However, UNMT systems tend to fail in realistic scenarios involving actual low resource languages. Recent works propose to utilize auxiliary parallel data and have achieved state-of-the-art results. In this work, we focus on unsupervised translation between En…

    Submitted 29 March, 2021; originally announced March 2021.

  31. arXiv:2102.06558  [pdf, other]

    cs.CL

    Continuous Learning in Neural Machine Translation using Bilingual Dictionaries

    Authors: Jan Niehues

    Abstract: While recent advances in deep learning led to significant improvements in machine translation, neural machine translation is often still not able to continuously adapt to the environment. For humans, as well as for machine translation, bilingual dictionaries are a promising knowledge source to continuously integrate new knowledge. However, their exploitation poses several challenges: The system ne…

    Submitted 12 February, 2021; originally announced February 2021.

    Comments: 9 pages, EACL 2021

  32. arXiv:2012.15127  [pdf, other]

    cs.CL

    Improving Zero-Shot Translation by Disentangling Positional Information

    Authors: Danni Liu, Jan Niehues, James Cross, Francisco Guzmán, Xian Li

    Abstract: Multilingual neural machine translation has shown the capability of directly translating between language pairs unseen in training, i.e. zero-shot translation. Despite being conceptually attractive, it often suffers from low output quality. The difficulty of generalizing to new translation directions suggests the model representations are highly specific to those language pairs seen in training. W…

    Submitted 30 June, 2021; v1 submitted 30 December, 2020; originally announced December 2020.

    Comments: ACL 2021

  33. arXiv:2007.14491  [pdf, other]

    hep-ex hep-ph nucl-ex nucl-th

    The Large Hadron-Electron Collider at the HL-LHC

    Authors: P. Agostini, H. Aksakal, S. Alekhin, P. P. Allport, N. Andari, K. D. J. Andre, D. Angal-Kalinin, S. Antusch, L. Aperio Bella, L. Apolinario, R. Apsimon, A. Apyan, G. Arduini, V. Ari, A. Armbruster, N. Armesto, B. Auchmann, K. Aulenbacher, G. Azuelos, S. Backovic, I. Bailey, S. Bailey, F. Balli, S. Behera, O. Behnke, et al. (312 additional authors not shown)

    Abstract: The Large Hadron electron Collider (LHeC) is designed to move the field of deep inelastic scattering (DIS) to the energy and intensity frontier of particle physics. Exploiting energy recovery technology, it collides a novel, intense electron beam with a proton or ion beam from the High Luminosity--Large Hadron Collider (HL-LHC). The accelerator and interaction region are designed for concurrent el…

    Submitted 12 April, 2021; v1 submitted 28 July, 2020; originally announced July 2020.

    Comments: 373 pages, many figures, to be published by J. Phys. G

    Report number: CERN-ACC-Note-2020-0002

    Journal ref: J.Phys.G 48 (2021) 11, 110501

  34. arXiv:2005.12143  [pdf, other]

    cs.CL

    Adapting End-to-End Speech Recognition for Readable Subtitles

    Authors: Danni Liu, Jan Niehues, Gerasimos Spanakis

    Abstract: Automatic speech recognition (ASR) systems are primarily evaluated on transcription accuracy. However, in some use cases such as subtitling, verbatim transcription would reduce output readability given limited screen size and reading time. Therefore, this work focuses on ASR with output compression, a task challenging for supervised approaches due to the scarcity of training data. We first investi…

    Submitted 25 May, 2020; originally announced May 2020.

    Comments: IWSLT 2020

  35. arXiv:2005.11185  [pdf, other]

    cs.CL cs.SD eess.AS

    Low-Latency Sequence-to-Sequence Speech Recognition and Translation by Partial Hypothesis Selection

    Authors: Danni Liu, Gerasimos Spanakis, Jan Niehues

    Abstract: Encoder-decoder models provide a generic architecture for sequence-to-sequence tasks such as speech recognition and translation. While offline systems are often evaluated on quality metrics like word error rates (WER) and BLEU, latency is also a crucial factor in many practical use-cases. We propose three latency reduction techniques for chunk-based incremental inference and evaluate their efficie…

    Submitted 13 October, 2020; v1 submitted 22 May, 2020; originally announced May 2020.

    Comments: Interspeech 2020

  36. arXiv:2005.09940  [pdf, other]

    eess.AS cs.CL cs.SD

    Relative Positional Encoding for Speech Recognition and Direct Translation

    Authors: Ngoc-Quan Pham, Thanh-Le Ha, Tuan-Nam Nguyen, Thai-Son Nguyen, Elizabeth Salesky, Sebastian Stueker, Jan Niehues, Alexander Waibel

    Abstract: Transformer models are powerful sequence-to-sequence architectures that are capable of directly mapping speech inputs to transcriptions or translations. However, the mechanism for modeling positions in this model was tailored for text modeling, and thus is less ideal for acoustic inputs. In this work, we adapt the relative position encoding scheme to the Speech Transformer, where the key addition…

    Submitted 20 May, 2020; originally announced May 2020.

    Comments: Submitted to Interspeech 2020

  37. arXiv:2004.03176  [pdf, other]

    cs.CL

    Machine Translation with Unsupervised Length-Constraints

    Authors: Jan Niehues

    Abstract: We have seen significant improvements in machine translation due to the usage of deep learning. While the improvements in translation quality are impressive, the encoder-decoder architecture enables many more possibilities. In this paper, we explore one of these, the generation of constrained translation. We focus on length constraints, which are essential if the translation should be displayed in…

    Submitted 7 April, 2020; originally announced April 2020.

    Comments: 8 pages

  38. arXiv:2003.09891  [pdf, other]

    eess.AS cs.CL cs.SD

    Low Latency ASR for Simultaneous Speech Translation

    Authors: Thai Son Nguyen, Jan Niehues, Eunah Cho, Thanh-Le Ha, Kevin Kilgour, Markus Muller, Matthias Sperber, Sebastian Stueker, Alex Waibel

    Abstract: User studies have shown that reducing the latency of our simultaneous lecture translation system should be the most important goal. We therefore have worked on several techniques for reducing the latency for both components, the automatic speech recognition and the speech translation module. Since the commonly used commitment latency is not appropriate in our case of continuous stream decoding, we…

    Submitted 22 March, 2020; originally announced March 2020.

  39. arXiv:1910.13296  [pdf, other]

    eess.AS cs.CV cs.LG cs.SD

    Improving sequence-to-sequence speech recognition training with on-the-fly data augmentation

    Authors: Thai-Son Nguyen, Sebastian Stueker, Jan Niehues, Alex Waibel

    Abstract: Sequence-to-Sequence (S2S) models recently started to show state-of-the-art performance for automatic speech recognition (ASR). With these large and deep models overfitting remains the largest problem, outweighing performance improvements that can be obtained from better architectures. One solution to the overfitting problem is increasing the amount of available training data and the variety exhib…

    Submitted 3 February, 2020; v1 submitted 29 October, 2019; originally announced October 2019.

    Comments: To appear in ICASSP 2020

  40. arXiv:1910.01859  [pdf, other]

    cs.CL

    Modeling Confidence in Sequence-to-Sequence Models

    Authors: Jan Niehues, Ngoc-Quan Pham

    Abstract: Recently, significant improvements have been achieved in various natural language processing tasks using neural sequence-to-sequence models. While aiming for the best generation quality is important, ultimately it is also necessary to develop models that can assess the quality of their output. In this work, we propose to use the similarity between training and test conditions as a measure for mo…

    Submitted 4 October, 2019; originally announced October 2019.

    Comments: 8 pages; INLG 2019

  41. arXiv:1909.13790  [pdf, other]

    cs.CL cs.SD eess.AS

    Incremental processing of noisy user utterances in the spoken language understanding task

    Authors: Stefan Constantin, Jan Niehues, Alex Waibel

    Abstract: The state-of-the-art neural network architectures make it possible to create spoken language understanding systems with high quality and fast processing time. One major challenge for real-world applications is the high latency of these systems caused by triggered actions with high execution times. If an action can be separated into subactions, the reaction time of the systems can be improved thro…

    Submitted 30 September, 2019; originally announced September 2019.

    Comments: 10 pages, 3 figures, 7 tables, forthcoming in W-NUT 2019

  42. arXiv:1909.02760  [pdf, other]

    hep-ph

    Second-order QCD corrections to event shape distributions in deep inelastic scattering

    Authors: T. Gehrmann, A. Huss, J. Mo, J. Niehues

    Abstract: We compute the next-to-next-to-leading order (NNLO) QCD corrections to event shape distributions and their mean values in deep inelastic lepton-nucleon scattering. The magnitude and shape of the corrections vary considerably between different variables. The corrections reduce the renormalization and factorization scale uncertainty of the predictions. Using a dispersive model to describe non-pert…

    Submitted 6 September, 2019; originally announced September 2019.

    Comments: 15 pages, 15 figures, data for figures included as auxiliary files

    Report number: CERN-TH-2019-140, IPPP/19/68, ZU-TH 39/19

  43. arXiv:1906.08584  [pdf, other]

    cs.CL

    Improving Zero-shot Translation with Language-Independent Constraints

    Authors: Ngoc-Quan Pham, Jan Niehues, Thanh-Le Ha, Alex Waibel

    Abstract: An important concern in training multilingual neural machine translation (NMT) is to translate between language pairs unseen during training, i.e., zero-shot translation. Improving this ability kills two birds with one stone by providing an alternative to pivot translation which also allows us to better understand how the model captures information between languages. In this work, we carried out a…

    Submitted 20 June, 2019; originally announced June 2019.

    Comments: 10 pages version accepted in WMT 2019

  44. Calculations for deep inelastic scattering using fast interpolation grid techniques at NNLO in QCD and the extraction of $α_s$ from HERA data

    Authors: D. Britzger, J. Currie, A. Gehrmann-De Ridder, T. Gehrmann, E. W. N. Glover, C. Gwenlan, A. Huss, T. Morgan, J. Niehues, J. Pires, K. Rabbertz, M. R. Sutton

    Abstract: The extension of interpolation-grid frameworks for perturbative QCD calculations at next-to-next-to-leading order (NNLO) is presented for deep inelastic scattering (DIS) processes. A fast and flexible evaluation of higher-order predictions for any a posteriori choice of parton distribution functions (PDFs) or value of the strong coupling constant is essential in iterative fitting procedures to ext…

    Submitted 27 August, 2021; v1 submitted 12 June, 2019; originally announced June 2019.

    Comments: 13 pages, 6 figures, 2 tables. v2: corrected scale bands in Fig. 4; version to appear in EPJC. v3: changes as discussed in an erratum submitted to EPJ C

    Report number: CERN-TH-2019-079, CFTP/19-020, IPPP/19/44, MPP-2019-114, ZU-TH 29/19

  45. arXiv:1904.13377  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Very Deep Self-Attention Networks for End-to-End Speech Recognition

    Authors: Ngoc-Quan Pham, Thai-Son Nguyen, Jan Niehues, Markus Müller, Sebastian Stüker, Alexander Waibel

    Abstract: Recently, end-to-end sequence-to-sequence models for speech recognition have gained significant interest in the research community. While previous architecture choices revolve around time-delay neural networks (TDNN) and long short-term memory (LSTM) recurrent neural networks, we propose to use self-attention via the Transformer architecture as an alternative. Our analysis shows that deep Transfor…

    Submitted 3 May, 2019; v1 submitted 30 April, 2019; originally announced April 2019.

    Comments: Submitted to INTERSPEECH 2019

  46. arXiv:1904.07209  [pdf, other]

    cs.CL

    Attention-Passing Models for Robust and Data-Efficient End-to-End Speech Translation

    Authors: Matthias Sperber, Graham Neubig, Jan Niehues, Alex Waibel

    Abstract: Speech translation has traditionally been approached through cascaded models consisting of a speech recognizer trained on a corpus of transcribed speech, and a machine translation system trained on parallel texts. Several recent works have shown the feasibility of collapsing the cascade into a single, direct model that can be trained in an end-to-end fashion on a corpus of translated speech. Howev…

    Submitted 15 April, 2019; originally announced April 2019.

    Comments: Authors' final version, accepted at TACL 2019

  47. arXiv:1812.06876  [pdf, other]

    cs.CL

    Multi-task learning to improve natural language understanding

    Authors: Stefan Constantin, Jan Niehues, Alex Waibel

    Abstract: Recent advancements in sequence-to-sequence neural network architectures have led to improved natural language understanding. When building a neural network-based Natural Language Understanding component, one main challenge is to collect enough training data. The generation of a synthetic dataset is an inexpensive and quick way to collect data. Since this data often has less variety than real…

    Submitted 15 February, 2019; v1 submitted 17 December, 2018; originally announced December 2018.

    Comments: 11 pages, 4 figures, 2 tables, forthcoming in IWSDS 2019

  48. Jet production in charged-current deep-inelastic scattering to third order in QCD

    Authors: T. Gehrmann, J. Niehues, A. Huss, A. Vogt, D. M. Walker

    Abstract: The production of jets in charged-current deep-inelastic scattering (CC DIS) probes simultaneously the strong and the electroweak sectors of the Standard Model; its measurement provides important information on the quark flavour structure of the proton. We compute third-order (N3LO) perturbative QCD corrections to this process, fully differential in the jet and lepton kinematics. We observe a subs…

    Submitted 19 March, 2019; v1 submitted 14 December, 2018; originally announced December 2018.

    Comments: 5 pages, 2 figures. v2: matches published version

    Report number: CERN-TH-2018-272, IPPP/18/108, ZU-TH 45/18, LTH 1188

  49. arXiv:1811.03189  [pdf, other]

    cs.CL

    Towards Fluent Translations from Disfluent Speech

    Authors: Elizabeth Salesky, Susanne Burger, Jan Niehues, Alex Waibel

    Abstract: When translating from speech, special consideration for conversational speech phenomena such as disfluencies is necessary. Most machine translation training data consists of well-formed written texts, causing issues when translating spontaneous speech. Previous work has introduced an intermediate step between speech recognition (ASR) and machine translation (MT) to remove disfluencies, making the…

    Submitted 7 November, 2018; originally announced November 2018.

    Comments: To appear at SLT 2018

  50. arXiv:1810.08641  [pdf, other]

    cs.CL

    Optimizing Segmentation Granularity for Neural Machine Translation

    Authors: Elizabeth Salesky, Andrew Runge, Alex Coda, Jan Niehues, Graham Neubig

    Abstract: In neural machine translation (NMT), it has become standard to translate using subword units to allow for an open vocabulary and improve accuracy on infrequent words. Byte-pair encoding (BPE) and its variants are the predominant approach to generating these subwords, as they are unsupervised, resource-free, and empirically effective. However, the granularity of these subword units is a hyperpar…

    Submitted 19 October, 2018; originally announced October 2018.