Search | arXiv e-print repository

Showing 1–1 of 1 results for author: Fineran, B

  1. arXiv:2203.07259 [pdf, other]

    cs.CL cs.LG

    The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for Large Language Models

    Authors: Eldar Kurtic, Daniel Campos, Tuan Nguyen, Elias Frantar, Mark Kurtz, Benjamin Fineran, Michael Goin, Dan Alistarh

    Abstract: Transformer-based language models have become a key building block for natural language processing. While these models are extremely accurate, they can be too large and computationally intensive to run on standard deployments. A variety of compression methods, including distillation, quantization, structured and unstructured pruning are known to decrease model size and increase inference speed, wi…

    Submitted 17 October, 2022; v1 submitted 14 March, 2022; originally announced March 2022.

    Comments: Accepted to EMNLP 2022