Search | arXiv e-print repository

Showing 1–5 of 5 results for author: Alonso, C A

Searching in archive cs.
  1. arXiv:2405.15731  [pdf, other]

    cs.LG cs.AI eess.SY

    Understanding the differences in Foundation Models: Attention, State Space Models, and Recurrent Neural Networks

    Authors: Jerome Sieber, Carmen Amo Alonso, Alexandre Didier, Melanie N. Zeilinger, Antonio Orvieto

    Abstract: Softmax attention is the principal backbone of foundation models for various artificial intelligence applications, yet its quadratic complexity in sequence length can limit its inference throughput in long-context settings. To address this challenge, alternative architectures such as linear attention, State Space Models (SSMs), and Recurrent Neural Networks (RNNs) have been considered as more effi…

    Submitted 3 June, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

  2. arXiv:2405.15454  [pdf, other]

    cs.CL eess.SY

    Linearly Controlled Language Generation with Performative Guarantees

    Authors: Emily Cheng, Marco Baroni, Carmen Amo Alonso

    Abstract: The increasing prevalence of Large Language Models (LMs) in critical applications highlights the need for controlled language generation strategies that are not only computationally efficient but that also enjoy performance guarantees. To achieve this, we use a common model of concept semantics as linearly represented in an LM's latent space. In particular, we take the view that natural language g…

    Submitted 24 May, 2024; originally announced May 2024.

  3. arXiv:2403.16899  [pdf, other]

    eess.SY cs.CL cs.LG

    State Space Models as Foundation Models: A Control Theoretic Overview

    Authors: Carmen Amo Alonso, Jerome Sieber, Melanie N. Zeilinger

    Abstract: In recent years, there has been a growing interest in integrating linear state-space models (SSMs) into deep neural network architectures of foundation models. This is exemplified by the recent success of Mamba, showing better performance than the state-of-the-art Transformer architectures in language tasks. Foundation models, such as GPT-4, aim to encode sequential data into a latent space in orde…

    Submitted 25 March, 2024; originally announced March 2024.

  4. arXiv:2403.10762  [pdf, other]

    cs.RO

    NARRATE: Versatile Language Architecture for Optimal Control in Robotics

    Authors: Seif Ismail, Antonio Arbues, Ryan Cotterell, René Zurbrügg, Carmen Amo Alonso

    Abstract: The impressive capabilities of Large Language Models (LLMs) have led to various efforts to enable robots to be controlled through natural language instructions, opening exciting possibilities for human-robot interaction. The goal is for the motor-control task to be performed accurately, efficiently and safely while also enjoying the flexibility imparted by LLMs to specify and adjust the task throug…

    Submitted 15 March, 2024; originally announced March 2024.

  5. arXiv:2103.14990  [pdf, other]

    cs.DC eess.SY

    Effective GPU Parallelization of Distributed and Localized Model Predictive Control

    Authors: Carmen Amo Alonso, Shih-Hao Tseng

    Abstract: To effectively control large-scale distributed systems online, model predictive control (MPC) has to swiftly solve the underlying high-dimensional optimization. There are multiple techniques applied to accelerate the solving process in the literature, mainly attributed to software-based algorithmic advancements and hardware-assisted computation enhancements. However, those methods focus on arithme…

    Submitted 27 March, 2021; originally announced March 2021.

    Comments: Submitted to 2021 Control and Decision Conference