Search | arXiv e-print repository

Showing 1–10 of 10 results for author: Mason, H

Searching in archive cs.
  1. arXiv:2402.11131  [pdf, other]

    cs.CL cs.AI cs.LG

    Speculative Streaming: Fast LLM Inference without Auxiliary Models

    Authors: Nikhil Bhendawade, Irina Belousova, Qichen Fu, Henry Mason, Mohammad Rastegari, Mahyar Najibi

    Abstract: Speculative decoding is a prominent technique to speed up the inference of a large target language model based on predictions of an auxiliary draft model. While effective, in application-specific settings, it often involves fine-tuning both draft and target models to achieve high acceptance rates. As the number of downstream tasks grows, these draft models add significant complexity to inference s…

    Submitted 16 February, 2024; originally announced February 2024.

  2. arXiv:2402.10425  [pdf]

    eess.IV cs.CV cs.LG

    DABS-LS: Deep Atlas-Based Segmentation Using Regional Level Set Self-Supervision

    Authors: Hannah G. Mason, Jack H. Noble

    Abstract: Cochlear implants (CIs) are neural prosthetics used to treat patients with severe-to-profound hearing loss. Patient-specific modeling of CI stimulation of the auditory nerve fibers (ANFs) can help audiologists improve the CI programming. These models require localization of the ANFs relative to surrounding anatomy and the CI. Localization is challenging because the ANFs are so small they are not di…

    Submitted 15 February, 2024; originally announced February 2024.

  3. arXiv:2312.10359  [pdf, other]

    cs.LG cs.PF

    Conformer-Based Speech Recognition On Extreme Edge-Computing Devices

    Authors: Mingbin Xu, Alex Jin, Sicheng Wang, Mu Su, Tim Ng, Henry Mason, Shiyi Han, Zhihong Lei, Yaqiao Deng, Zhen Huang, Mahesh Krishnamoorthy

    Abstract: With increasingly powerful compute capabilities and resources in today's devices, traditionally compute-intensive automatic speech recognition (ASR) has been moving from the cloud to devices to better protect user privacy. However, it is still challenging to implement on-device ASR on resource-constrained devices, such as smartphones, smart wearables, and other smart home automation devices.…

    Submitted 13 May, 2024; v1 submitted 16 December, 2023; originally announced December 2023.

  4. arXiv:2211.01438  [pdf, other]

    eess.AS cs.CL cs.SD

    Variable Attention Masking for Configurable Transformer Transducer Speech Recognition

    Authors: Pawel Swietojanski, Stefan Braun, Dogan Can, Thiago Fraga da Silva, Arnab Ghoshal, Takaaki Hori, Roger Hsiao, Henry Mason, Erik McDermott, Honza Silovsky, Ruchir Travadi, Xiaodan Zhuang

    Abstract: This work studies the use of attention masking in transformer transducer based speech recognition for building a single configurable model for different deployment scenarios. We present a comprehensive set of experiments comparing fixed masking, where the same attention mask is applied at every frame, with chunked masking, where the attention mask for each frame is determined by chunk boundaries,…

    Submitted 18 April, 2023; v1 submitted 2 November, 2022; originally announced November 2022.

    Comments: To appear in ICASSP 2023

    Journal ref: International Conference on Acoustics, Speech, and Signal Processing, 2023

  5. arXiv:2211.00080  [pdf, other]

    cs.LG eess.SP stat.AP

    Denoising neural networks for magnetic resonance spectroscopy

    Authors: Natalie Klein, Amber J. Day, Harris Mason, Michael W. Malone, Sinead A. Williamson

    Abstract: In many scientific applications, measured time series are corrupted by noise or distortions. Traditional denoising techniques often fail to recover the signal of interest, particularly when the signal-to-noise ratio is low or when certain assumptions on the signal and noise are violated. In this work, we demonstrate that deep learning-based denoising methods can outperform traditional techniques w…

    Submitted 31 October, 2022; originally announced November 2022.

    Comments: 5 pages with appendix

  6. arXiv:2210.12214  [pdf, ps, other]

    cs.SD cs.CL eess.AS

    Optimizing Bilingual Neural Transducer with Synthetic Code-switching Text Generation

    Authors: Thien Nguyen, Nathalie Tran, Liuhui Deng, Thiago Fraga da Silva, Matthew Radzihovsky, Roger Hsiao, Henry Mason, Stefan Braun, Erik McDermott, Dogan Can, Pawel Swietojanski, Lyan Verwimp, Sibel Oyman, Tresi Arvizo, Honza Silovsky, Arnab Ghoshal, Mathieu Martel, Bharat Ram Ambati, Mohamed Ali

    Abstract: Code-switching describes the practice of using more than one language in the same sentence. In this study, we investigate how to optimize a neural transducer based bilingual automatic speech recognition (ASR) model for code-switching speech. Focusing on the scenario where the ASR model is trained without supervised code-switching data, we found that semi-supervised training and synthetic code-swit…

    Submitted 21 October, 2022; originally announced October 2022.

    Comments: 5 pages, 1 figure, submitted to ICASSP 2023, *: equal contributions

  7. arXiv:2108.03138  [pdf]

    eess.IV cs.CV

    Lung Ultrasound Segmentation and Adaptation between COVID-19 and Community-Acquired Pneumonia

    Authors: Harry Mason, Lorenzo Cristoni, Andrew Walden, Roberto Lazzari, Thomas Pulimood, Louis Grandjean, Claudia AM Gandini Wheeler-Kingshott, Yipeng Hu, Zachary MC Baum

    Abstract: Lung ultrasound imaging has been shown to be effective in detecting typical patterns for interstitial pneumonia, as a point-of-care tool for both patients with COVID-19 and other community-acquired pneumonia (CAP). In this work, we focus on the hyperechoic B-line segmentation task. Using deep neural networks, we automatically outline the regions that are indicative of pathology-sensitive artifacts and t…

    Submitted 6 August, 2021; originally announced August 2021.

    Comments: Accepted to MICCAI ASMUS Workshop

  8. arXiv:2102.08503  [pdf, other]

    cs.LG

    Federated Evaluation and Tuning for On-Device Personalization: System Design & Applications

    Authors: Matthias Paulik, Matt Seigel, Henry Mason, Dominic Telaar, Joris Kluivers, Rogier van Dalen, Chi Wai Lau, Luke Carlson, Filip Granqvist, Chris Vandevelde, Sudeep Agarwal, Julien Freudiger, Andrew Byde, Abhishek Bhowmick, Gaurav Kapoor, Si Beaumont, Áine Cahill, Dominic Hughes, Omid Javidbakht, Fei Dong, Rehan Rishi, Stanley Hung

    Abstract: We describe the design of our federated task processing system. Originally, the system was created to support two specific federated tasks: evaluation and tuning of on-device ML systems, primarily for the purpose of personalizing these systems. In recent years, support for an additional federated task has been added: federated learning (FL) of deep neural networks. To our knowledge, only one other…

    Submitted 16 February, 2021; originally announced February 2021.

    Comments: 11 pages, 1 figure

  9. arXiv:2003.00304  [pdf, ps, other]

    cs.CL cs.SD eess.AS stat.ML

    Voice trigger detection from LVCSR hypothesis lattices using bidirectional lattice recurrent neural networks

    Authors: Woojay Jeon, Leo Liu, Henry Mason

    Abstract: We propose a method to reduce false voice triggers of a speech-enabled personal assistant by post-processing the hypothesis lattice of a server-side large-vocabulary continuous speech recognizer (LVCSR) via a neural network. We first discuss how an estimate of the posterior probability of the trigger phrase can be obtained from the hypothesis lattice using known techniques to perform detection, th…

    Submitted 29 February, 2020; originally announced March 2020.

    Comments: Presented at IEEE ICASSP, May 2019

    Journal ref: ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, United Kingdom, 2019, pp. 6356-6360

  10. arXiv:1910.01992  [pdf, other]

    cs.LG cs.CL cs.SD eess.AS stat.ML

    SNDCNN: Self-normalizing deep CNNs with scaled exponential linear units for speech recognition

    Authors: Zhen Huang, Tim Ng, Leo Liu, Henry Mason, Xiaodan Zhuang, Daben Liu

    Abstract: Very deep CNNs achieve state-of-the-art results in both computer vision and speech recognition, but are difficult to train. The most popular way to train very deep CNNs is to use shortcut connections (SC) together with batch normalization (BN). Inspired by Self-Normalizing Neural Networks, we propose the self-normalizing deep CNN (SNDCNN) based acoustic model topology, by removing the SC/BN and r…

    Submitted 23 March, 2020; v1 submitted 4 October, 2019; originally announced October 2019.