Search | arXiv e-print repository

Showing 1–7 of 7 results for author: Ro, J H

Searching in archive cs.
  1. arXiv:2403.10444  [pdf, other]

    cs.LG cs.CL cs.DS cs.IT

    Optimal Block-Level Draft Verification for Accelerating Speculative Decoding

    Authors: Ziteng Sun, Jae Hun Ro, Ahmad Beirami, Ananda Theertha Suresh

    Abstract: Speculative decoding has been shown to be an effective method for lossless acceleration of large language models (LLMs) during inference. In each iteration, the algorithm first uses a smaller model to draft a block of tokens. The tokens are then verified by the large model in parallel, and only a subset of tokens will be kept to guarantee that the final output follows the distribution of the large model…

    Submitted 15 March, 2024; originally announced March 2024.

  2. arXiv:2403.08100  [pdf, other]

    cs.LG cs.CR cs.DC

    Efficient Language Model Architectures for Differentially Private Federated Learning

    Authors: Jae Hun Ro, Srinadh Bhojanapalli, Zheng Xu, Yanxiang Zhang, Ananda Theertha Suresh

    Abstract: Cross-device federated learning (FL) is a technique that trains a model on data distributed across typically millions of edge devices without data leaving the devices. SGD is the standard client optimizer for on-device training in cross-device FL, favored for its memory and computational efficiency. However, in centralized training of neural language models, adaptive optimizers are preferred as th…

    Submitted 12 March, 2024; originally announced March 2024.

  3. arXiv:2310.15141  [pdf, other]

    cs.LG cs.CL cs.DS cs.IT

    SpecTr: Fast Speculative Decoding via Optimal Transport

    Authors: Ziteng Sun, Ananda Theertha Suresh, Jae Hun Ro, Ahmad Beirami, Himanshu Jain, Felix Yu

    Abstract: Autoregressive sampling from large language models has led to state-of-the-art results in several natural language tasks. However, autoregressive sampling generates tokens one at a time, making it slow, and even prohibitive in certain tasks. One way to speed up sampling is $\textit{speculative decoding}$: use a small model to sample a $\textit{draft}$ (block or sequence of tokens), and then score a…

    Submitted 17 January, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: NeurIPS 2023

  4. arXiv:2204.09715  [pdf, other]

    cs.CL cs.LG

    Scaling Language Model Size in Cross-Device Federated Learning

    Authors: Jae Hun Ro, Theresa Breiner, Lara McConnaughey, Mingqing Chen, Ananda Theertha Suresh, Shankar Kumar, Rajiv Mathews

    Abstract: Most studies in cross-device federated learning focus on small models, due to the server-client communication and on-device computation bottlenecks. In this work, we leverage various techniques for mitigating these bottlenecks to train larger language models in cross-device federated learning. With systematic applications of partial model training, quantization, efficient transfer learning, and co…

    Submitted 24 June, 2022; v1 submitted 31 March, 2022; originally announced April 2022.

  5. arXiv:2203.04925  [pdf, other]

    cs.LG cs.DS cs.IT

    Correlated quantization for distributed mean estimation and optimization

    Authors: Ananda Theertha Suresh, Ziteng Sun, Jae Hun Ro, Felix Yu

    Abstract: We study the problem of distributed mean estimation and optimization under communication constraints. We propose a correlated quantization protocol whose leading term in the error guarantee depends on the mean deviation of data points rather than only their absolute range. The design doesn't need any prior knowledge of the concentration property of the dataset, which is required to get such depend…

    Submitted 8 July, 2022; v1 submitted 9 March, 2022; originally announced March 2022.

  6. arXiv:2202.00153  [pdf, other]

    cs.LG

    Transformer-based Models of Text Normalization for Speech Applications

    Authors: Jae Hun Ro, Felix Stahlberg, Ke Wu, Shankar Kumar

    Abstract: Text normalization, or the process of transforming text into a consistent, canonical form, is crucial for speech applications such as text-to-speech synthesis (TTS). In TTS, the system must decide whether to verbalize "1995" as "nineteen ninety five" in "born in 1995" or as "one thousand nine hundred ninety five" in "page 1995". We present an experimental comparison of various Transformer-based se…

    Submitted 31 January, 2022; originally announced February 2022.

  7. arXiv:2108.02117  [pdf, other]

    cs.LG

    FedJAX: Federated learning simulation with JAX

    Authors: Jae Hun Ro, Ananda Theertha Suresh, Ke Wu

    Abstract: Federated learning is a machine learning technique that enables training across decentralized data. Recently, federated learning has become an active area of research due to an increased focus on privacy and security. In light of this, a variety of open source federated learning libraries have been developed and released. We introduce FedJAX, a JAX-based open source library for federated learning…

    Submitted 5 November, 2021; v1 submitted 4 August, 2021; originally announced August 2021.