Search | arXiv e-print repository

Showing 1–23 of 23 results for author: Lomeli, M

  1. arXiv:2409.08239  [pdf, other]

    cs.CL cs.AI

    Source2Synth: Synthetic Data Generation and Curation Grounded in Real Data Sources

    Authors: Alisia Lupidi, Carlos Gemmell, Nicola Cancedda, Jane Dwivedi-Yu, Jason Weston, Jakob Foerster, Roberta Raileanu, Maria Lomeli

    Abstract: Large Language Models still struggle in challenging scenarios that leverage structured data, complex reasoning, or tool usage. In this paper, we propose Source2Synth: a new method that can be used for teaching LLMs new skills without relying on costly human annotations. Source2Synth takes as input a custom data source and produces synthetic data points with intermediate reasoning steps grounded in…

    Submitted 12 September, 2024; originally announced September 2024.

  2. arXiv:2402.14158  [pdf, other]

    cs.CL

    TOOLVERIFIER: Generalization to New Tools via Self-Verification

    Authors: Dheeraj Mekala, Jason Weston, Jack Lanchantin, Roberta Raileanu, Maria Lomeli, Jingbo Shang, Jane Dwivedi-Yu

    Abstract: Teaching language models to use tools is an important milestone towards building general assistants, but remains an open problem. While there has been significant progress on learning to use specific tools via fine-tuning, language models still struggle with learning how to robustly use new tools from only a few demonstrations. In this work we introduce a self-verification method which distinguish…

    Submitted 13 March, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

  3. arXiv:2401.08281  [pdf, other]

    cs.LG cs.CV cs.SE

    The Faiss library

    Authors: Matthijs Douze, Alexandr Guzhva, Chengqi Deng, Jeff Johnson, Gergely Szilvasy, Pierre-Emmanuel Mazaré, Maria Lomeli, Lucas Hosseini, Hervé Jégou

    Abstract: Vector databases typically manage large collections of embedding vectors. Currently, AI applications are growing rapidly, and so is the number of embeddings that need to be stored and indexed. The Faiss library is dedicated to vector similarity search, a core functionality of vector databases. Faiss is a toolkit of indexing methods and related primitives used to search, cluster, compress and trans…

    Submitted 6 September, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

  4. arXiv:2310.10638  [pdf, other]

    cs.CL cs.AI cs.LG

    In-context Pretraining: Language Modeling Beyond Document Boundaries

    Authors: Weijia Shi, Sewon Min, Maria Lomeli, Chunting Zhou, Margaret Li, Gergely Szilvasy, Rich James, Xi Victoria Lin, Noah A. Smith, Luke Zettlemoyer, Scott Yih, Mike Lewis

    Abstract: Large language models (LMs) are currently trained to predict tokens given document prefixes, enabling them to directly perform long-form generation and prompting-style tasks which can be reduced to document completion. Existing pretraining pipelines train LMs by concatenating random sets of short documents to create input contexts but the prior documents provide no signal for predicting the next d…

    Submitted 24 June, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

  5. arXiv:2310.01352  [pdf, other]

    cs.CL cs.AI

    RA-DIT: Retrieval-Augmented Dual Instruction Tuning

    Authors: Xi Victoria Lin, Xilun Chen, Mingda Chen, Weijia Shi, Maria Lomeli, Rich James, Pedro Rodriguez, Jacob Kahn, Gergely Szilvasy, Mike Lewis, Luke Zettlemoyer, Scott Yih

    Abstract: Retrieval-augmented language models (RALMs) improve performance by accessing long-tail and up-to-date knowledge from external data stores, but are challenging to build. Existing approaches require either expensive retrieval-specific modifications to LM pre-training or use post-hoc integration of the data store that leads to suboptimal performance. We introduce Retrieval-Augmented Dual Instruction…

    Submitted 6 May, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

    Comments: v4: ICLR 2024 camera-ready version

  6. arXiv:2302.07842  [pdf, ps, other]

    cs.CL

    Augmented Language Models: a Survey

    Authors: Grégoire Mialon, Roberto Dessì, Maria Lomeli, Christoforos Nalmpantis, Ram Pasunuru, Roberta Raileanu, Baptiste Rozière, Timo Schick, Jane Dwivedi-Yu, Asli Celikyilmaz, Edouard Grave, Yann LeCun, Thomas Scialom

    Abstract: This survey reviews works in which language models (LMs) are augmented with reasoning skills and the ability to use tools. The former is defined as decomposing a potentially complex task into simpler subtasks while the latter consists in calling external modules such as a code interpreter. LMs can leverage these augmentations separately or in combination via heuristics, or learn to do so from demo…

    Submitted 15 February, 2023; originally announced February 2023.

  7. arXiv:2302.04761  [pdf, other]

    cs.CL

    Toolformer: Language Models Can Teach Themselves to Use Tools

    Authors: Timo Schick, Jane Dwivedi-Yu, Roberto Dessì, Roberta Raileanu, Maria Lomeli, Luke Zettlemoyer, Nicola Cancedda, Thomas Scialom

    Abstract: Language models (LMs) exhibit remarkable abilities to solve new tasks from just a few examples or textual instructions, especially at scale. They also, paradoxically, struggle with basic functionality, such as arithmetic or factual lookup, where much simpler and smaller models excel. In this paper, we show that LMs can teach themselves to use external tools via simple APIs and achieve the best of…

    Submitted 9 February, 2023; originally announced February 2023.

  8. arXiv:2209.13331  [pdf, other]

    cs.CL cs.LG

    EditEval: An Instruction-Based Benchmark for Text Improvements

    Authors: Jane Dwivedi-Yu, Timo Schick, Zhengbao Jiang, Maria Lomeli, Patrick Lewis, Gautier Izacard, Edouard Grave, Sebastian Riedel, Fabio Petroni

    Abstract: Evaluation of text generation to date has primarily focused on content created sequentially, rather than improvements on a piece of text. Writing, however, is naturally an iterative and incremental process that requires expertise in different modular skills such as fixing outdated information or making the style more consistent. Even so, comprehensive evaluation of a model's capacity to perform th…

    Submitted 27 September, 2022; originally announced September 2022.

  9. arXiv:2208.03299  [pdf, other]

    cs.CL

    Atlas: Few-shot Learning with Retrieval Augmented Language Models

    Authors: Gautier Izacard, Patrick Lewis, Maria Lomeli, Lucas Hosseini, Fabio Petroni, Timo Schick, Jane Dwivedi-Yu, Armand Joulin, Sebastian Riedel, Edouard Grave

    Abstract: Large language models have shown impressive few-shot results on a wide range of tasks. However, when knowledge is key for such results, as is the case for tasks such as question answering and fact checking, massive parameter counts to store knowledge seem to be needed. Retrieval augmented models are known to excel at knowledge intensive tasks without the need for as many parameters, but it is uncl…

    Submitted 16 November, 2022; v1 submitted 5 August, 2022; originally announced August 2022.

  10. arXiv:2207.06220  [pdf, other]

    cs.IR cs.AI

    Improving Wikipedia Verifiability with AI

    Authors: Fabio Petroni, Samuel Broscheit, Aleksandra Piktus, Patrick Lewis, Gautier Izacard, Lucas Hosseini, Jane Dwivedi-Yu, Maria Lomeli, Timo Schick, Pierre-Emmanuel Mazaré, Armand Joulin, Edouard Grave, Sebastian Riedel

    Abstract: Verifiability is a core content policy of Wikipedia: claims that are likely to be challenged need to be backed by citations. There are millions of articles available online and thousands of new articles are released each month. For this reason, finding relevant sources is a difficult task: many claims do not have any references that support them. Furthermore, even existing citations might not supp…

    Submitted 8 July, 2022; originally announced July 2022.

  11. arXiv:2112.03186  [pdf, other]

    stat.ME q-bio.PE

    Statistical implications of relaxing the homogeneous mixing assumption in time series Susceptible-Infectious-Removed models

    Authors: Luis D. J. Martinez Lomeli, Michelle N. Ngo, Jon Wakefield, Babak Shahbaba, Vladimir N. Minin

    Abstract: Infectious disease epidemiologists routinely fit stochastic epidemic models to time series data to elucidate infectious disease dynamics, evaluate interventions, and forecast epidemic trajectories. To improve computational tractability, many approximate stochastic models have been proposed. In this paper, we focus on one class of such approximations -- time series Susceptible-Infectious-Removed (T…

    Submitted 6 December, 2021; originally announced December 2021.

    Comments: 16 pages of the main text, 6 figures and 3 tables in the main text

  12. arXiv:2004.09065  [pdf, other]

    stat.ME q-bio.QM stat.AP

    Optimal Experimental Design for Mathematical Models of Hematopoiesis

    Authors: Luis Martinez Lomeli, Abdon Iniguez, Babak Shahbaba, John S Lowengrub, Vladimir Minin

    Abstract: The hematopoietic system has a highly regulated and complex structure in which cells are organized to successfully create and maintain new blood cells. Feedback regulation is crucial to tightly control this system, but the specific mechanisms by which control is exerted are not completely understood. In this work, we aim to uncover the underlying mechanisms in hematopoiesis by conducting perturbat…

    Submitted 30 June, 2020; v1 submitted 20 April, 2020; originally announced April 2020.

  13. arXiv:2001.05895  [pdf, other]

    cs.LG stat.ML

    Masking schemes for universal marginalisers

    Authors: Divya Gautam, Maria Lomeli, Kostis Gourgoulias, Daniel H. Thompson, Saurabh Johri

    Abstract: We consider the effect of structure-agnostic and structure-dependent masking schemes when training a universal marginaliser (arXiv:1711.00695) in order to learn conditional distributions of the form $P(x_i |\mathbf x_{\mathbf b})$, where $x_i$ is a given random variable and $\mathbf x_{\mathbf b}$ is some arbitrary subset of all random variables of the generative model of interest. In other words,…

    Submitted 16 January, 2020; originally announced January 2020.

    Comments: To be published in Proceedings of the 2nd Symposium on Advances in Approximate Bayesian Inference, 2019

  14. arXiv:1910.07474  [pdf, other]

    cs.LG cs.AI stat.ML

    Universal Marginaliser for Deep Amortised Inference for Probabilistic Programs

    Authors: Robert Walecki, Kostis Gourgoulias, Adam Baker, Chris Hart, Chris Lucas, Max Zwiessele, Albert Buchard, Maria Lomeli, Yura Perov, Saurabh Johri

    Abstract: Probabilistic programming languages (PPLs) are powerful modelling tools which allow us to formalise our knowledge about the world and reason about its inherent uncertainty. Inference methods used in PPL can be computationally costly due to significant time burden and/or storage requirements; or they can lack theoretical guarantees of convergence and accuracy when applied to large scale graphical mode…

    Submitted 16 October, 2019; originally announced October 2019.

  15. arXiv:1910.05692  [pdf, other]

    stat.CO

    Deep Markov Chain Monte Carlo

    Authors: Babak Shahbaba, Luis Martinez Lomeli, Tian Chen, Shiwei Lan

    Abstract: We propose a new computationally efficient sampling scheme for Bayesian inference involving high dimensional probability distributions. Our method maps the original parameter space into a low-dimensional latent space, explores the latent space to generate samples, and maps these samples back to the original space for inference. While our method can be used in conjunction with any dimension reducti…

    Submitted 13 October, 2019; originally announced October 2019.

  16. arXiv:1811.04727  [pdf, other]

    cs.AI cs.LG stat.ML

    Universal Marginalizer for Amortised Inference and Embedding of Generative Models

    Authors: Robert Walecki, Albert Buchard, Kostis Gourgoulias, Chris Hart, Maria Lomeli, A. K. W. Navarro, Max Zwiessele, Yura Perov, Saurabh Johri

    Abstract: Probabilistic graphical models are powerful tools which allow us to formalise our knowledge about the world and reason about its inherent uncertainty. There exist a considerable number of methods for performing inference in probabilistic graphical models; however, they can be computationally costly due to significant time burden and/or storage requirements; or they lack theoretical guarantees of c…

    Submitted 12 November, 2018; originally announced November 2018.

  17. arXiv:1808.00385  [pdf, ps, other]

    math.CO

    A Note on the Maximum Rectilinear Crossing Number of Spiders

    Authors: Joshua Fallon, Kirsten Hogenson, Lauren Keough, Mario Lomelí, Marcus Schaefer, Pablo Soberón

    Abstract: The maximum rectilinear crossing number of a graph $G$ is the maximum number of crossings in a good straight-line drawing of $G$ in the plane. In a good drawing any two edges intersect in at most one point (counting endpoints), no three edges have an interior point in common, and edges do not contain vertices in their interior. A spider is a subdivision of $K_{1,k}$. We provide both upper and lowe…

    Submitted 20 August, 2021; v1 submitted 1 August, 2018; originally announced August 2018.

    Comments: 8 pages

    MSC Class: 05C10; 05C05; 68R10

  18. arXiv:1807.00400  [pdf, other]

    stat.ML cs.LG

    Antithetic and Monte Carlo kernel estimators for partial rankings

    Authors: Maria Lomeli, Mark Rowland, Arthur Gretton, Zoubin Ghahramani

    Abstract: In the modern age, rankings data is ubiquitous and it is useful for a variety of applications such as recommender systems, multi-object tracking and preference learning. However, most rankings data encountered in the real world is incomplete, which prevents the direct application of existing modelling tools for complete rankings. Our contribution is a novel way to extend kernel methods for complet…

    Submitted 25 July, 2018; v1 submitted 1 July, 2018; originally announced July 2018.

  19. arXiv:1706.03779  [pdf, other]

    stat.ML

    General Latent Feature Models for Heterogeneous Datasets

    Authors: Isabel Valera, Melanie F. Pradier, Maria Lomeli, Zoubin Ghahramani

    Abstract: Latent feature modeling allows capturing the latent structure responsible for generating the observed properties of a set of objects. It is often used to make predictions either for new values of interest or missing information in the original data, as well as to perform data exploratory analysis. However, although there is an extensive literature on latent feature models for homogeneous datasets,…

    Submitted 8 March, 2018; v1 submitted 12 June, 2017; originally announced June 2017.

    Comments: Software library available at https://github.com/ivaleraM/GLFM

  20. arXiv:1702.08781  [pdf, other]

    stat.CO

    General Bayesian inference schemes in infinite mixture models

    Authors: Maria Lomeli

    Abstract: Bayesian statistical models allow us to formalise our knowledge about the world and reason about our uncertainty, but there is a need for better procedures to accurately encode its complexity. One way to do so is through compositional models, which are formed by combining blocks consisting of simpler models. One can increase the complexity of the compositional model by either stacking more blocks…

    Submitted 28 February, 2017; originally announced February 2017.

    Comments: Doctoral dissertation, University College London

  21. arXiv:1509.07376  [pdf, other]

    stat.CO

    A hybrid sampler for Poisson-Kingman mixture models

    Authors: Maria Lomeli, Stefano Favaro, Yee Whye Teh

    Abstract: This paper concerns the introduction of a new Markov Chain Monte Carlo scheme for posterior sampling in Bayesian nonparametric mixture models with priors that belong to the general Poisson-Kingman class. We present a novel compact way of representing the infinite dimensional component of the model such that while explicitly representing this infinite component it has less memory and storage requir…

    Submitted 24 September, 2015; originally announced September 2015.

    Journal ref: NIPS 2015

  22. arXiv:1411.1699   

    math.MG math.CO

    The $2nd-$convex hull of every optimal rectilinear drawing of $K_{n}$ is a triangle

    Authors: J. Leaños, M. Lomeli, M. Ramírez-Ibáñez, L. M. Rivera-Martínez

    Abstract: A rectilinear drawing of a graph $G$ is optimal if it has the smallest number of crossings among all rectilinear drawings of $G$. In this paper it is shown that for $n\geq 8$, the second convex hull of every optimal rectilinear drawing of the complete graph $K_n$ is a triangle.

    Submitted 14 June, 2016; v1 submitted 2 November, 2014; originally announced November 2014.

    Comments: This paper has been withdrawn by the authors due to a crucial error in the proof of the main result

    MSC Class: 05C10; 68R10

  23. A marginal sampler for $σ$-Stable Poisson-Kingman mixture models

    Authors: María Lomelí, Stefano Favaro, Yee Whye Teh

    Abstract: We investigate the class of $σ$-stable Poisson-Kingman random probability measures (RPMs) in the context of Bayesian nonparametric mixture modeling. This is a large class of discrete RPMs which encompasses most of the popular discrete RPMs used in Bayesian nonparametrics, such as the Dirichlet process, Pitman-Yor process, the normalized inverse Gaussian process and the normalized generalized G…

    Submitted 24 September, 2015; v1 submitted 16 July, 2014; originally announced July 2014.

    Comments: New algorithmic performance comparisons were added