Retro-li: Small-Scale Retrieval Augmented Generation Supporting Noisy Similarity Searches and Domain Shift Generalization

Rashiti, Gentiana; Karunaratne, Geethan; Sachan, Mrinmaya; Sebastian, Abu; Rahimi, Abbas

doi:10.3233/FAIA240837

arXiv:2410.00004 (cs)

[Submitted on 12 Sep 2024 (v1), last revised 26 Mar 2025 (this version, v2)]

Abstract: Retrieval augmented generation (RAG) systems such as Retro have been shown to improve language modeling capabilities and reduce toxicity and hallucinations by retrieving from a database of non-parametric memory containing trillions of entries. We introduce Retro-li, which shows that retrieval can also help with a small-scale database, but that it demands more accurate and better neighbors when searching in a smaller, and hence sparser, non-parametric memory. This demand can be met by using a proper semantic similarity search. We further propose, for the first time, adding a regularization to the non-parametric memory: it significantly reduces perplexity when the neighbor search operations are noisy during inference, and it improves generalization when a domain shift occurs. We also show that Retro-li's non-parametric memory can potentially be implemented on analog in-memory computing hardware, which offers O(1) search time but introduces noise when retrieving neighbors, with minimal (<1%) performance loss. Our code is available at: this https URL.
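
To make the notion of a noisy similarity search concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not say how noise is modeled, so the sketch assumes additive Gaussian noise on the stored embeddings, and the names `cosine`, `add_noise`, `top_k`, and the toy `memory` vectors are all hypothetical.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def add_noise(vec, sigma, rng):
    """Assumed noise model: independent Gaussian noise on each coordinate."""
    return [x + rng.gauss(0.0, sigma) for x in vec]


def top_k(query, memory, k, sigma=0.0, rng=None):
    """Return indices of the k stored vectors most similar to the query.

    With sigma > 0, every stored vector is perturbed before scoring,
    which mimics an imprecise (for example analog) search.
    """
    rng = rng or random.Random(0)
    scored = []
    for idx, vec in enumerate(memory):
        candidate = add_noise(vec, sigma, rng) if sigma > 0 else vec
        scored.append((cosine(query, candidate), idx))
    scored.sort(reverse=True)
    return [idx for _, idx in scored[:k]]


# Toy non-parametric memory of four 2-D embeddings (illustrative values only).
memory = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
query = [1.0, 0.05]

exact = top_k(query, memory, k=2)
noisy = top_k(query, memory, k=2, sigma=0.5, rng=random.Random(42))
print("exact neighbors:", exact)
print("noisy neighbors:", noisy)
```

With `sigma=0.0` the search returns the true nearest neighbors; raising `sigma` can change which neighbors come back. That kind of retrieval error is what the abstract's regularization is meant to make the model tolerate.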

Subjects:	Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Cite as:	arXiv:2410.00004 [cs.IR]
	(or arXiv:2410.00004v2 [cs.IR] for this version)
	https://doi.org/10.48550/arXiv.2410.00004
Journal reference:	Published in: Proceedings of the 27th European Conference on Artificial Intelligence, IOS Press, 392, 2024, pp. 2974-2982
Related DOI:	https://doi.org/10.3233/FAIA240837

Submission history

From: Geethan Karunaratne [view email]
[v1] Thu, 12 Sep 2024 23:29:33 UTC (769 KB)
[v2] Wed, 26 Mar 2025 10:27:15 UTC (769 KB)
