StreamingQA: A Benchmark for Adaptation to New Knowledge over Time in Question Answering Models

Authors: Adam Liška, Tomáš Kočiský, Elena Gribovskaya, Tayfun Terzi, Eren Sezener, Devang Agrawal, Cyprien de Masson d'Autume, Tim Scholtes, Manzil Zaheer, Susannah Young, Ellen Gilsenan-McMahon, Sophia Austin, Phil Blunsom, Angeliki Lazaridou

Abstract: Knowledge and language understanding of models evaluated through question answering (QA) has been usually studied on static snapshots of knowledge, like Wikipedia. However, our world is dynamic, evolves over time, and our models' knowledge becomes outdated. To study how semi-parametric QA models and their underlying parametric language models (LMs) adapt to evolving knowledge, we construct a new large-scale dataset, StreamingQA, with human written and generated questions asked on a given date, to be answered from 14 years of time-stamped news articles. We evaluate our models quarterly as they read new articles not seen in pre-training. We show that parametric models can be updated without full retraining, while avoiding catastrophic forgetting. For semi-parametric models, adding new articles into the search space allows for rapid adaptation, however, models with an outdated underlying LM under-perform those with a retrained LM. For questions about higher-frequency named entities, parametric updates are particularly beneficial. In our dynamic world, the StreamingQA dataset enables a more realistic evaluation of QA models, and our experiments highlight several promising directions for future research.

Submitted 23 May, 2022; originally announced May 2022.

arXiv:2102.01951 [pdf, other]

Mind the Gap: Assessing Temporal Generalization in Neural Language Models

Authors: Angeliki Lazaridou, Adhiguna Kuncoro, Elena Gribovskaya, Devang Agrawal, Adam Liska, Tayfun Terzi, Mai Gimenez, Cyprien de Masson d'Autume, Tomas Kocisky, Sebastian Ruder, Dani Yogatama, Kris Cao, Susannah Young, Phil Blunsom

Abstract: Our world is open-ended, non-stationary, and constantly evolving; thus what we talk about and how we talk about it change over time. This inherent dynamic nature of language contrasts with the current static language modelling paradigm, which trains and evaluates models on utterances from overlapping time periods. Despite impressive recent progress, we demonstrate that Transformer-XL language models perform worse in the realistic setup of predicting future utterances from beyond their training period, and that model performance becomes increasingly worse with time. We find that, while increasing model size alone -- a key driver behind recent progress -- does not solve this problem, having models that continually update their knowledge with new information can indeed mitigate this performance degradation over time. Hence, given the compilation of ever-larger language modelling datasets, combined with the growing list of language-model-based NLP applications that require up-to-date factual knowledge about the world, we argue that now is the right time to rethink the static way in which we currently train and evaluate our language models, and develop adaptive language models that can remain up-to-date with respect to our ever-changing and non-stationary world. We publicly release our dynamic, streaming language modelling benchmarks for WMT and arXiv to facilitate language model evaluation that takes temporal dynamics into account.

Submitted 26 October, 2021; v1 submitted 3 February, 2021; originally announced February 2021.

Comments: To appear as a Spotlight at NeurIPS 2021

arXiv:1802.06467 [pdf, other]

Memorize or generalize? Searching for a compositional RNN in a haystack

Authors: Adam Liška, Germán Kruszewski, Marco Baroni

Abstract: Neural networks are very powerful learning systems, but they do not readily generalize from one task to the other. This is partly due to the fact that they do not learn in a compositional way, that is, by discovering skills that are shared by different tasks, and recombining them to solve new problems. In this paper, we explore the compositional generalization capabilities of recurrent neural networks (RNNs). We first propose the lookup table composition domain as a simple setup to test compositional behaviour and show that it is theoretically possible for a standard RNN to learn to behave compositionally in this domain when trained with standard gradient descent and provided with additional supervision. We then remove this additional supervision and perform a search over a large number of model initializations to investigate the proportion of RNNs that can still converge to a compositional solution. We discover that a small but non-negligible proportion of RNNs do reach partial compositional solutions even without special architectural constraints. This suggests that a combination of gradient descent and evolutionary strategies directly favouring the minority models that developed more compositional approaches might suffice to lead standard RNNs towards compositional solutions.

Submitted 25 July, 2018; v1 submitted 18 February, 2018; originally announced February 2018.

Comments: AEGAP Workshop (ICML 2018)

arXiv:1712.08041 [pdf, other]

Autism Classification Using Brain Functional Connectivity Dynamics and Machine Learning

Authors: Ravi Tejwani, Adam Liska, Hongyuan You, Jenna Reinen, Payel Das

Abstract: The goal of the present study is to identify autism using machine learning techniques and resting-state brain imaging data, leveraging the temporal variability of the functional connections (FC) as the only information. We estimated and compared the FC variability across brain regions between typical, healthy subjects and autistic population by analyzing brain imaging data from a world-wide multi-site database known as ABIDE (Autism Brain Imaging Data Exchange). Our analysis revealed that patients diagnosed with autism spectrum disorder (ASD) show increased FC variability in several brain regions that are associated with low FC variability in the typical brain. We then used the enhanced FC variability of brain regions as features for training machine learning models for ASD classification and achieved 65% accuracy in identification of ASD versus control subjects within the dataset. We also used node strength estimated from number of functional connections per node averaged over the whole scan as features for ASD classification. The results reveal that the dynamic FC measures outperform or are comparable with the static FC measures in predicting ASD.

Submitted 21 December, 2017; originally announced December 2017.

arXiv:1501.02714 [pdf, other]

From Visual Attributes to Adjectives through Decompositional Distributional Semantics

Authors: Angeliki Lazaridou, Georgiana Dinu, Adam Liska, Marco Baroni

Abstract: As automated image analysis progresses, there is increasing interest in richer linguistic annotation of pictures, with attributes of objects (e.g., furry, brown...) attracting most attention. By building on the recent "zero-shot learning" approach, and paying attention to the linguistic nature of attributes as noun modifiers, and specifically adjectives, we show that it is possible to tag images with attribute-denoting adjectives even when no training data containing the relevant annotation are available. Our approach relies on two key observations. First, objects can be seen as bundles of attributes, typically expressed as adjectival modifiers (a dog is something furry, brown, etc.), and thus a function trained to map visual representations of objects to nominal labels can implicitly learn to map attributes to adjectives. Second, objects and attributes come together in pictures (the same thing is a dog and it is brown). We can thus achieve better attribute (and object) label retrieval by treating images as "visual phrases", and decomposing their linguistic representation into an attribute-denoting adjective and an object-denoting noun. Our approach performs comparably to a method exploiting manual attribute annotation, it outperforms various competitive alternatives in both attribute and object annotation, and it automatically constructs attribute-centric representations that significantly improve performance in supervised object recognition.

Submitted 24 March, 2015; v1 submitted 12 January, 2015; originally announced January 2015.

Comments: accepted at Transactions of the Association for Computational Linguistics (TACL), 3/2015

arXiv:1209.4108 [pdf, ps, other]

doi 10.1088/0004-637X/768/1/85

The Pulsar Search Collaboratory: Discovery and Timing of Five New Pulsars

Authors: R. Rosen, J. Swiggum, M. A. McLaughlin, D. R. Lorimer, M. Yun, S. Heatherly, J. Boyles, R. Lynch, V. I. Kondratiev, S. Scoles, S. M. Ransom, M. L. Moniot, A. Cottrill, M. Weaver, A. Snider, C. Thompson, M. Raycraft, J. Dudenhoefer, L. Allphin, J. Thorley, B. Meadows, G. Marchiny, A. Liska, A. M. O'Dwyer, B. Butler, et al. (46 additional authors not shown)

Abstract: We present the discovery and timing solutions of five new pulsars by students involved in the Pulsar Search Collaboratory (PSC), a NSF-funded joint program between the National Radio Astronomy Observatory and West Virginia University designed to excite and engage high-school students in Science, Technology, Engineering, and Mathematics (STEM) and related fields. We encourage students to pursue STEM fields by apprenticing them within a professional scientific community doing cutting edge research, specifically by teaching them to search for pulsars. The students are analyzing 300 hours of drift-scan survey data taken with the Green Bank Telescope at 350 MHz. These data cover 2876 square degrees of the sky. Over the course of five years, more than 700 students have inspected diagnostic plots through a web-based graphical interface designed for this project. The five pulsars discovered in the data have spin periods ranging from 3.1 ms to 4.8 s. Among the new discoveries are - PSR J1926-1314, a long period, nulling pulsar; PSR J1821+0155, an isolated, partially recycled 33-ms pulsar; and PSR J1400-1438, a millisecond pulsar in a 9.5-day orbit whose companion is likely a white dwarf star.

Submitted 14 February, 2013; v1 submitted 18 September, 2012; originally announced September 2012.

Journal ref: 2013 ApJ, 768, 85

arXiv:1206.2895 [pdf, ps, other]

doi 10.1088/0004-637X/759/2/127

Discovery of Five New Pulsars in Archival Data

Authors: Mitchell B. Mickaliger, Duncan R. Lorimer, Jason Boyles, Maura A. McLaughlin, Adam Collins, Logan Hough, Nathan Tehrani, Craig Tenney, April Liska, Joseph Swiggum

Abstract: Reprocessing of the Parkes Multibeam Pulsar Survey has resulted in the discovery of five previously unknown pulsars and several as-yet-unconfirmed candidates. PSR J0922-52 has a period of 9.68 ms and a DM of 122.4 pc cm^-3. PSR J1147-66 has a period of 3.72 ms and a DM of 133.8 pc cm^-3. PSR J1227-6208 has a period of 34.53 ms, a DM of 362.6 pc cm^-3, is in a 6.7 day binary orbit, and was independently detected in an ongoing high-resolution Parkes survey by Thornton et al. and also in independent processing by Einstein@Home volunteers. PSR J1546-59 has a period of 7.80 ms and a DM of 168.3 pc cm^-3. PSR J1725-3853 is an isolated 4.79-ms pulsar with a DM of 158.2 pc cm^-3. These pulsars were likely missed in earlier processing efforts due to their high DMs and short periods and the large number of candidates that needed to be looked through. These discoveries suggest that further pulsars are awaiting discovery in the multibeam survey data.

Submitted 11 May, 2013; v1 submitted 13 June, 2012; originally announced June 2012.

Comments: 12 pages, 2 figures, 2 tables, accepted to ApJ

Journal ref: The Astrophysical Journal, 759:127, 2012

Showing 1–7 of 7 results for author: Liska, A