Search | arXiv e-print repository

Showing 1–50 of 94 results for author: Durrett, G

Searching in archive cs.
  1. arXiv:2406.01563  [pdf, other]

    cs.CL

    LoFiT: Localized Fine-tuning on LLM Representations

    Authors: Fangcong Yin, Xi Ye, Greg Durrett

    Abstract: Recent work in interpretability shows that large language models (LLMs) can be adapted for new tasks in a learning-free way: it is possible to intervene on LLM representations to elicit desired behaviors for alignment. For instance, adding certain bias vectors to the outputs of certain attention heads is reported to boost the truthfulness of models. In this work, we show that localized fine-tuning…

    Submitted 3 June, 2024; originally announced June 2024.

  2. arXiv:2405.10040  [pdf, other]

    cs.CL cs.AI cs.LG

    SynthesizRR: Generating Diverse Datasets with Retrieval Augmentation

    Authors: Abhishek Divekar, Greg Durrett

    Abstract: Large language models (LLMs) are versatile and can address many tasks, but for computational efficiency, it is often desirable to distill their capabilities into smaller student models. One way to do this for classification tasks is via dataset synthesis, which can be accomplished by generating examples of each label from the LLM. Prior approaches to synthesis use few-shot prompting, which relies…

    Submitted 16 May, 2024; originally announced May 2024.

  3. arXiv:2405.01511  [pdf, other]

    cs.CL

    D2PO: Discriminator-Guided DPO with Response Evaluation Models

    Authors: Prasann Singhal, Nathan Lambert, Scott Niekum, Tanya Goyal, Greg Durrett

    Abstract: Varied approaches for aligning language models have been proposed, including supervised fine-tuning, RLHF, and direct optimization methods such as DPO. Although DPO has rapidly gained popularity due to its straightforward training process and competitive results, there is an open question of whether there remain practical advantages of using a discriminator, like a reward model, to evaluate respon…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: 20 pages, 12 figures

  4. arXiv:2404.10917  [pdf, other]

    cs.CL

    Which questions should I answer? Salience Prediction of Inquisitive Questions

    Authors: Yating Wu, Ritika Mangla, Alexandros G. Dimakis, Greg Durrett, Junyi Jessy Li

    Abstract: Inquisitive questions -- open-ended, curiosity-driven questions people ask as they read -- are an integral part of discourse processing (Kehler and Rohde, 2017; Onea, 2016) and comprehension (Prince, 2004). Recent work in NLP has taken advantage of question generation capabilities of LLMs to enhance a wide range of applications. But the space of inquisitive questions is vast: many questions can be…

    Submitted 16 April, 2024; originally announced April 2024.

  5. arXiv:2404.10774  [pdf, other]

    cs.CL cs.AI

    MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents

    Authors: Liyan Tang, Philippe Laban, Greg Durrett

    Abstract: Recognizing if LLM output can be grounded in evidence is central to many tasks in NLP: retrieval-augmented generation, summarization, document-grounded dialogue, and more. Current approaches to this kind of "fact-checking" are based on verifying each piece of a model generation against potential evidence using an LLM. However, this process can be very computationally expensive, requiring many call…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: LLM-AggreFact benchmark, MiniCheck models, data generation code at https://github.com/Liyan06/MiniCheck

  6. arXiv:2312.04510  [pdf, other]

    cs.CL cs.LG

    A Block Metropolis-Hastings Sampler for Controllable Energy-based Text Generation

    Authors: Jarad Forristal, Niloofar Mireshghallah, Greg Durrett, Taylor Berg-Kirkpatrick

    Abstract: Recent work has shown that energy-based language modeling is an effective framework for controllable text generation because it enables flexible integration of arbitrary discriminators. However, because energy-based LMs are globally normalized, approximate techniques like Metropolis-Hastings (MH) are required for inference. Past work has largely explored simple proposal distributions that modify a…

    Submitted 7 December, 2023; originally announced December 2023.

  7. arXiv:2310.16049  [pdf, other]

    cs.CL

    MuSR: Testing the Limits of Chain-of-thought with Multistep Soft Reasoning

    Authors: Zayne Sprague, Xi Ye, Kaj Bostrom, Swarat Chaudhuri, Greg Durrett

    Abstract: While large language models (LLMs) equipped with techniques like chain-of-thought prompting have demonstrated impressive capabilities, they still fall short in their ability to reason robustly in complex settings. However, evaluating LLM reasoning is challenging because system capabilities continue to grow while benchmark datasets for tasks like logical deduction have remained static. We introduce…

    Submitted 23 March, 2024; v1 submitted 24 October, 2023; originally announced October 2023.

    Journal ref: ICLR 2024 (Spotlight)

  8. arXiv:2310.14520  [pdf, other]

    cs.CL

    QUDEVAL: The Evaluation of Questions Under Discussion Discourse Parsing

    Authors: Yating Wu, Ritika Mangla, Greg Durrett, Junyi Jessy Li

    Abstract: Questions Under Discussion (QUD) is a versatile linguistic framework in which discourse progresses as continuously asking questions and answering them. Automatic parsing of a discourse to produce a QUD structure thus entails a complex question generation task: given a document and an answer sentence, generate a question that satisfies linguistic constraints of QUD and can be grounded in an anchor…

    Submitted 1 November, 2023; v1 submitted 22 October, 2023; originally announced October 2023.

    Comments: Camera Ready for EMNLP Main Conference

  9. arXiv:2310.03716  [pdf, other]

    cs.CL cs.LG

    A Long Way to Go: Investigating Length Correlations in RLHF

    Authors: Prasann Singhal, Tanya Goyal, Jiacheng Xu, Greg Durrett

    Abstract: Great successes have been reported using Reinforcement Learning from Human Feedback (RLHF) to align large language models. Open-source preference datasets and reward models have enabled wider experimentation beyond generic chat settings, particularly to make systems more "helpful" for tasks like web question answering, summarization, and multi-turn dialogue. When optimizing for helpfulness, RLHF h…

    Submitted 5 October, 2023; originally announced October 2023.

    Comments: 20 pages, 12 figures

  10. arXiv:2309.08873  [pdf, other]

    cs.CL

    X-PARADE: Cross-Lingual Textual Entailment and Information Divergence across Paragraphs

    Authors: Juan Diego Rodriguez, Katrin Erk, Greg Durrett

    Abstract: Understanding when two pieces of text convey the same information is a goal touching many subproblems in NLP, including textual entailment and fact-checking. This problem becomes more complex when those two pieces of text are in different languages. Here, we introduce X-PARADE (Cross-lingual Paragraph-level Analysis of Divergences and Entailments), the first cross-lingual dataset of paragraph-leve…

    Submitted 15 April, 2024; v1 submitted 16 September, 2023; originally announced September 2023.

    Comments: To be published in NAACL 2024

  11. arXiv:2307.02472  [pdf, other]

    cs.CL cs.AI

    Deductive Additivity for Planning of Natural Language Proofs

    Authors: Zayne Sprague, Kaj Bostrom, Swarat Chaudhuri, Greg Durrett

    Abstract: Current natural language systems designed for multi-step claim validation typically operate in two phases: retrieve a set of relevant premise statements using heuristics (planning), then generate novel conclusions from those statements using a large language model (deduction). The planning step often requires expensive Transformer operations and does not scale to arbitrary numbers of premise state…

    Submitted 5 July, 2023; v1 submitted 5 July, 2023; originally announced July 2023.

  12. arXiv:2306.09306  [pdf, other]

    cs.CL

    Propagating Knowledge Updates to LMs Through Distillation

    Authors: Shankar Padmanabhan, Yasumasa Onoe, Michael J. Q. Zhang, Greg Durrett, Eunsol Choi

    Abstract: Modern language models have the capacity to store and use immense amounts of knowledge about real-world entities, but it remains unclear how to update such knowledge stored in model parameters. While prior methods for updating knowledge in LMs successfully inject atomic facts, updated LMs fail to make inferences based on injected facts. In this work, we demonstrate that a context distillation-base…

    Submitted 30 October, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

    Comments: NeurIPS 2023 Camera Ready

  13. arXiv:2306.00947  [pdf, other]

    cs.CL cs.AI cs.LG

    EEL: Efficiently Encoding Lattices for Reranking

    Authors: Prasann Singhal, Jiacheng Xu, Xi Ye, Greg Durrett

    Abstract: Standard decoding approaches for conditional text generation tasks typically search for an output hypothesis with high model probability, but this may not yield the best hypothesis according to human judgments of quality. Reranking to optimize for "downstream" metrics can better optimize for quality, but many metrics of interest are computed with pre-trained language models, which are slow to appl…

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: ACL 2023 (16 pages), code available at https://github.com/PrasannS/eel-reranking

  14. arXiv:2305.19339  [pdf, other]

    cs.CL cs.AI

    Less Likely Brainstorming: Using Language Models to Generate Alternative Hypotheses

    Authors: Liyan Tang, Yifan Peng, Yanshan Wang, Ying Ding, Greg Durrett, Justin F. Rousseau

    Abstract: A human decision-maker benefits the most from an AI assistant that corrects for their biases. For problems such as generating interpretation of a radiology report given findings, a system predicting only highly likely outcomes may be less useful, where such outcomes are already obvious to the user. To alleviate biases in human decision-making, it is worth considering a broad differential diagnosis…

    Submitted 30 May, 2023; originally announced May 2023.

    Comments: Accepted to ACL (Findings) 2023

  15. arXiv:2305.18584  [pdf, other]

    cs.SE cs.LG cs.PL

    Coeditor: Leveraging Contextual Changes for Multi-round Code Auto-editing

    Authors: Jiayi Wei, Greg Durrett, Isil Dillig

    Abstract: Developers often dedicate significant time to maintaining and refactoring existing code. However, most prior work on generative models for code focuses solely on creating new code, overlooking the distinctive needs of editing existing code. In this work, we explore a multi-round code auto-editing setting, aiming to predict edits to a code region based on recent changes within the same codebase. Ou…

    Submitted 28 April, 2024; v1 submitted 29 May, 2023; originally announced May 2023.

    Comments: The Twelfth International Conference on Learning Representations (2024)

  16. arXiv:2305.14847  [pdf, other]

    cs.CL

    Drafting Event Schemas using Language Models

    Authors: Anisha Gunjal, Greg Durrett

    Abstract: Past work has studied event prediction and event language modeling, sometimes mediated through structured representations of knowledge in the form of event schemas. Such schemas can lead to explainable predictions and forecasting of unseen events given incomplete information. In this work, we look at the process of creating such schemas to describe complex events. We use large language models (LLM…

    Submitted 24 May, 2023; originally announced May 2023.

  17. arXiv:2305.14770  [pdf, other]

    cs.CL

    Using Natural Language Explanations to Rescale Human Judgments

    Authors: Manya Wadhwa, Jifan Chen, Junyi Jessy Li, Greg Durrett

    Abstract: The rise of large language models (LLMs) has brought a critical need for high-quality human-labeled data, particularly for processes like human feedback and evaluation. A common practice is to label data via consensus annotation over crowdworker judgments. However, annotators' judgments for subjective tasks can differ in many ways: they may have different qualitative judgments about an example, an…

    Submitted 14 November, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: Data available at https://github.com/ManyaWadhwa/explanation_based_rescaling

  18. arXiv:2305.11859  [pdf, other]

    cs.CL

    Complex Claim Verification with Evidence Retrieved in the Wild

    Authors: Jifan Chen, Grace Kim, Aniruddh Sriram, Greg Durrett, Eunsol Choi

    Abstract: Evidence retrieval is a core part of automatic fact-checking. Prior work makes simplifying assumptions in retrieval that depart from real-world use cases: either no access to evidence, access to evidence curated by a human fact-checker, or access to evidence available long after the claim has been made. In this work, we present the first fully automated pipeline to check real-world claims by retri…

    Submitted 19 May, 2023; originally announced May 2023.

  19. arXiv:2305.10401  [pdf, other]

    cs.PL

    Data Extraction via Semantic Regular Expression Synthesis

    Authors: Qiaochu Chen, Arko Banerjee, Çağatay Demiralp, Greg Durrett, Isil Dillig

    Abstract: Many data extraction tasks of practical relevance require not only syntactic pattern matching but also semantic reasoning about the content of the underlying text. While regular expressions are very well suited for tasks that require only syntactic pattern matching, they fall short for data extraction tasks that involve both a syntactic and semantic component. To address this issue, we introduce s…

    Submitted 24 August, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

  20. arXiv:2305.09656  [pdf, other]

    cs.CL cs.AI

    SatLM: Satisfiability-Aided Language Models Using Declarative Prompting

    Authors: Xi Ye, Qiaochu Chen, Isil Dillig, Greg Durrett

    Abstract: Prior work has combined chain-of-thought prompting in large language models (LLMs) with programmatic representations to perform effective and transparent reasoning. While such an approach works well for tasks that only require forward reasoning (e.g., straightforward arithmetic), it is less effective for constraint solving problems that require more sophisticated planning and search. In this paper…

    Submitted 11 October, 2023; v1 submitted 16 May, 2023; originally announced May 2023.

    Comments: NeurIPS 2023

  21. arXiv:2305.01651  [pdf, other]

    cs.CL

    Can LMs Learn New Entities from Descriptions? Challenges in Propagating Injected Knowledge

    Authors: Yasumasa Onoe, Michael J. Q. Zhang, Shankar Padmanabhan, Greg Durrett, Eunsol Choi

    Abstract: Pre-trained language models (LMs) are used for knowledge intensive tasks like question answering, but their knowledge gets continuously outdated as the world changes. Prior work has studied targeted updates to LMs, injecting individual facts and evaluating whether the model learns these facts while not changing predictions on other contexts. We take a step forward and study LMs' abilities to make…

    Submitted 2 May, 2023; originally announced May 2023.

    Comments: ACL 2023

  22. arXiv:2303.09564  [pdf, other]

    cs.SE cs.LG cs.PL

    TypeT5: Seq2seq Type Inference using Static Analysis

    Authors: Jiayi Wei, Greg Durrett, Isil Dillig

    Abstract: There has been growing interest in automatically predicting missing type annotations in programs written in Python and JavaScript. While prior methods have achieved impressive accuracy when predicting the most common types, they often perform poorly on rare or complex types. In this paper, we present a new type inference method that treats type prediction as a code infilling task by leveraging Cod…

    Submitted 16 March, 2023; originally announced March 2023.

    Comments: Published as a conference paper at ICLR 2023

  23. arXiv:2303.01432  [pdf, other]

    cs.CL

    WiCE: Real-World Entailment for Claims in Wikipedia

    Authors: Ryo Kamoi, Tanya Goyal, Juan Diego Rodriguez, Greg Durrett

    Abstract: Textual entailment models are increasingly applied in settings like fact-checking, presupposition verification in question answering, or summary evaluation. However, these represent a significant domain shift from existing entailment datasets, and models underperform as a result. We propose WiCE, a new fine-grained textual entailment dataset built on natural claim and evidence pairs extracted from…

    Submitted 22 October, 2023; v1 submitted 2 March, 2023; originally announced March 2023.

    Comments: EMNLP 2023

  24. arXiv:2302.07139  [pdf, other]

    cs.CL

    Modeling Complex Event Scenarios via Simple Entity-focused Questions

    Authors: Mahnaz Koupaee, Greg Durrett, Nathanael Chambers, Niranjan Balasubramanian

    Abstract: Event scenarios are often complex and involve multiple event sequences connected through different entity participants. Exploring such complex scenarios requires an ability to branch through different sequences, something that is difficult to achieve with standard event language modeling. To address this, we propose a question-guided generation framework that models events in complex scenarios as…

    Submitted 14 February, 2023; originally announced February 2023.

    Comments: To be published in proceedings of EACL 2023

  25. arXiv:2302.04813  [pdf, other]

    cs.CL

    Explanation Selection Using Unlabeled Data for Chain-of-Thought Prompting

    Authors: Xi Ye, Greg Durrett

    Abstract: Recent work has shown how to prompt large language models with explanations to obtain strong performance on textual reasoning tasks, i.e., the chain-of-thought paradigm. However, subtly different explanations can yield widely varying downstream task accuracy. Explanations that have not been "tuned" for a task, such as off-the-shelf explanations written by nonexperts, may lead to mediocre performan…

    Submitted 18 October, 2023; v1 submitted 9 February, 2023; originally announced February 2023.

    Comments: EMNLP 2023

  26. arXiv:2211.15914  [pdf, other]

    cs.CL

    Prompted Opinion Summarization with GPT-3.5

    Authors: Adithya Bhaskar, Alexander R. Fabbri, Greg Durrett

    Abstract: Large language models have shown impressive performance across a wide variety of tasks, including text summarization. In this paper, we show that this strong performance extends to opinion summarization. We explore several pipeline methods for applying GPT-3.5 to summarize a large collection of user reviews in a prompted fashion. To handle arbitrarily large numbers of user reviews, we explore recu…

    Submitted 23 May, 2023; v1 submitted 28 November, 2022; originally announced November 2022.

    Comments: Accepted to ACL (Findings) 2023

  27. arXiv:2211.13892  [pdf, other]

    cs.CL

    Complementary Explanations for Effective In-Context Learning

    Authors: Xi Ye, Srinivasan Iyer, Asli Celikyilmaz, Ves Stoyanov, Greg Durrett, Ramakanth Pasunuru

    Abstract: Large language models (LLMs) have exhibited remarkable capabilities in learning from explanations in prompts, but there has been limited understanding of exactly how these explanations function or why they are effective. This work aims to better understand the mechanisms by which explanations are used for in-context learning. We first study the impact of two different factors on the performance of…

    Submitted 12 June, 2023; v1 submitted 24 November, 2022; originally announced November 2022.

    Comments: ACL Findings 2023 Camera-Ready

  28. arXiv:2211.00614  [pdf, other]

    cs.CL

    Natural Language Deduction with Incomplete Information

    Authors: Zayne Sprague, Kaj Bostrom, Swarat Chaudhuri, Greg Durrett

    Abstract: A growing body of work studies how to answer a question or verify a claim by generating a natural language "proof": a chain of deductive inferences yielding the answer based on a set of premises. However, these methods can only make sound deductions when they follow from evidence that is given. We propose a new system that can handle the underspecified setting where not all premises are stated at…

    Submitted 1 November, 2022; originally announced November 2022.

    Comments: Conference of EMNLP 2022

  29. arXiv:2210.06748  [pdf, other]

    cs.CL

    Shortcomings of Question Answering Based Factuality Frameworks for Error Localization

    Authors: Ryo Kamoi, Tanya Goyal, Greg Durrett

    Abstract: Despite recent progress in abstractive summarization, models often generate summaries with factual errors. Numerous approaches to detect these errors have been proposed, the most popular of which are question answering (QA)-based factuality metrics. These have been shown to work well at predicting summary-level factuality and have potential to localize errors within summaries, but this latter capa…

    Submitted 11 February, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

    Comments: EACL 2023

  30. arXiv:2210.06725  [pdf, other]

    cs.CL

    Assessing Out-of-Domain Language Model Performance from Few Examples

    Authors: Prasann Singhal, Jarad Forristal, Xi Ye, Greg Durrett

    Abstract: While pretrained language models have exhibited impressive generalization capabilities, they still behave unpredictably under certain domain shifts. In particular, a model may learn a reasoning process on in-domain training data that does not hold for out-of-domain test data. We address the task of predicting out-of-domain (OOD) performance in a few-shot fashion: given a few target-domain examples…

    Submitted 13 October, 2022; originally announced October 2022.

  31. arXiv:2210.05905  [pdf, other]

    cs.CL

    Discourse Analysis via Questions and Answers: Parsing Dependency Structures of Questions Under Discussion

    Authors: Wei-Jen Ko, Yating Wu, Cutter Dalton, Dananjay Srinivas, Greg Durrett, Junyi Jessy Li

    Abstract: Automatic discourse processing is bottlenecked by data: current discourse formalisms pose highly demanding annotation tasks involving large taxonomies of discourse relations, making them inaccessible to lay annotators. This work instead adopts the linguistic framework of Questions Under Discussion (QUD) for discourse analysis and seeks to derive QUD structures automatically. QUD views each sentenc…

    Submitted 12 May, 2023; v1 submitted 11 October, 2022; originally announced October 2022.

    Comments: Findings of ACL 2023

  32. arXiv:2209.12356  [pdf, other]

    cs.CL

    News Summarization and Evaluation in the Era of GPT-3

    Authors: Tanya Goyal, Junyi Jessy Li, Greg Durrett

    Abstract: The recent success of prompting large language models like GPT-3 has led to a paradigm shift in NLP research. In this paper, we study its impact on text summarization, focusing on the classic benchmark domain of news summarization. First, we investigate how GPT-3 compares against fine-tuned models trained on large summarization datasets. We show that not only do humans overwhelmingly prefer GPT-3…

    Submitted 23 May, 2023; v1 submitted 25 September, 2022; originally announced September 2022.

    Comments: All data shared at: https://tagoyal.github.io/zeroshot-news-annotations.html

  33. arXiv:2209.01081  [pdf, other]

    cs.PL

    Type-Directed Synthesis of Visualizations from Natural Language Queries

    Authors: Qiaochu Chen, Shankara Pailoor, Celeste Barnaby, Abby Criswell, Chenglong Wang, Greg Durrett, Isil Dillig

    Abstract: We propose a new technique based on program synthesis for automatically generating visualizations from natural language queries. Our method parses the natural language query into a refinement type specification using the intents-and-slots paradigm and leverages type-directed synthesis to generate a set of visualization programs that are most likely to meet the user's intent. Our refinement type sy…

    Submitted 2 September, 2022; originally announced September 2022.

    Comments: 39 pages

  34. arXiv:2205.12854  [pdf, other]

    cs.CL cs.AI

    Understanding Factual Errors in Summarization: Errors, Summarizers, Datasets, Error Detectors

    Authors: Liyan Tang, Tanya Goyal, Alexander R. Fabbri, Philippe Laban, Jiacheng Xu, Semih Yavuz, Wojciech Kryściński, Justin F. Rousseau, Greg Durrett

    Abstract: The propensity of abstractive summarization models to make factual errors has been studied extensively, including design of metrics to detect factual errors and annotation of errors in current systems' outputs. However, the ever-evolving nature of summarization systems, metrics, and annotated benchmarks makes factuality evaluation a moving target, and drawing clear comparisons among metrics has be…

    Submitted 25 May, 2023; v1 submitted 25 May, 2022; originally announced May 2022.

    Comments: Accepted to ACL 2023

  35. arXiv:2205.09641  [pdf, other]

    cs.CL

    SNaC: Coherence Error Detection for Narrative Summarization

    Authors: Tanya Goyal, Junyi Jessy Li, Greg Durrett

    Abstract: Progress in summarizing long texts is inhibited by the lack of appropriate evaluation frameworks. When a long summary must be produced to appropriately cover the facets of that text, that summary needs to present a coherent narrative to be understandable by a reader, but current automatic and human evaluation methods fail to identify gaps in coherence. In this work, we introduce SNaC, a narrative…

    Submitted 28 October, 2022; v1 submitted 19 May, 2022; originally announced May 2022.

    Comments: EMNLP 2022

  36. arXiv:2205.06938  [pdf, other]

    cs.CL

    Generating Literal and Implied Subquestions to Fact-check Complex Claims

    Authors: Jifan Chen, Aniruddh Sriram, Eunsol Choi, Greg Durrett

    Abstract: Verifying complex political claims is a challenging task, especially when politicians use various tactics to subtly misrepresent the facts. Automatic fact-checking systems fall short here, and their predictions like "half-true" are not very useful in isolation, since we have no idea which parts of the claim are true and which are not. In this work, we focus on decomposing a complex claim into a co…

    Submitted 31 October, 2022; v1 submitted 13 May, 2022; originally announced May 2022.

    Journal ref: EMNLP 2022

  37. arXiv:2205.03401  [pdf, other]

    cs.CL

    The Unreliability of Explanations in Few-shot Prompting for Textual Reasoning

    Authors: Xi Ye, Greg Durrett

    Abstract: Does prompting a large language model (LLM) like GPT-3 with explanations improve in-context learning? We study this question on two NLP tasks that involve reasoning over text, namely question answering and natural language inference. We test the performance of four LLMs on three textual reasoning datasets using prompts that include explanations in multiple different styles. For these tasks, we fin…

    Submitted 12 October, 2022; v1 submitted 6 May, 2022; originally announced May 2022.

    Comments: NeurIPS 2022

  38. arXiv:2205.02832  [pdf, other]

    cs.CL

    Entity Cloze By Date: What LMs Know About Unseen Entities

    Authors: Yasumasa Onoe, Michael J. Q. Zhang, Eunsol Choi, Greg Durrett

    Abstract: Language models (LMs) are typically trained once on a large-scale corpus and used for years without being updated. However, in a dynamic world, new entities constantly arise. We propose a framework to analyze what LMs can infer about new entities that did not exist when the LMs were pretrained. We derive a dataset of entities indexed by their origination date and paired with their English Wikipedi…

    Submitted 5 May, 2022; originally announced May 2022.

    Comments: NAACL 2022 Findings

  39. arXiv:2203.09452  [pdf, other]

    cs.PL

    Automated Transpilation of Imperative to Functional Code using Neural-Guided Program Synthesis (Extended Version)

    Authors: Benjamin Mariano, Yanju Chen, Yu Feng, Greg Durrett, Isil Dillig

    Abstract: While many mainstream languages such as Java, Python, and C# increasingly incorporate functional APIs to simplify programming and improve parallelization/performance, there are no effective techniques that can be used to automatically translate existing imperative code to functional variants using these APIs. Motivated by this problem, this paper presents a transpilation approach based on inductiv…

    Submitted 18 March, 2022; v1 submitted 17 March, 2022; originally announced March 2022.

    Comments: Fixed some incorrectly rendered citations

  40. arXiv:2201.06028  [pdf, other]

    cs.CL cs.AI

    Natural Language Deduction through Search over Statement Compositions

    Authors: Kaj Bostrom, Zayne Sprague, Swarat Chaudhuri, Greg Durrett

    Abstract: In settings from fact-checking to question answering, we frequently want to know whether a collection of evidence (premises) entails a hypothesis. Existing methods primarily focus on the end-to-end discriminative version of this task, but less work has treated the generative version in which a model searches over the space of statements entailed by the premises to constructively derive the hypothe…

    Submitted 28 October, 2022; v1 submitted 16 January, 2022; originally announced January 2022.

    Comments: Findings of EMNLP 2022

    ACM Class: I.2.3; I.2.7

  41. arXiv:2112.07660  [pdf, other]

    cs.CL

    Massive-scale Decoding for Text Generation using Lattices

    Authors: Jiacheng Xu, Siddhartha Reddy Jonnalagadda, Greg Durrett

    Abstract: Conditional neural text generation models generate high-quality outputs, but often concentrate around a mode when what we really want is a diverse set of options. We present a search algorithm to construct lattices encoding a massive number of generation options. First, we restructure decoding as a best-first search, which explores the space differently than beam search and improves efficiency by…

    Submitted 3 May, 2022; v1 submitted 14 December, 2021; originally announced December 2021.

    Comments: NAACL 2022, see https://github.com/jiacheng-xu/lattice-generation for code

  42. arXiv:2111.00701  [pdf, other]

    cs.CL

    Discourse Comprehension: A Question Answering Framework to Represent Sentence Connections

    Authors: Wei-Jen Ko, Cutter Dalton, Mark Simmons, Eliza Fisher, Greg Durrett, Junyi Jessy Li

    Abstract: While there has been substantial progress in text comprehension through simple factoid question answering, more holistic comprehension of a discourse still presents a major challenge (Dunietz et al., 2020). Someone critically reflecting on a text as they read it will pose curiosity-driven, often open-ended questions, which reflect deep understanding of the content and require complex reasoning to…

    Submitted 17 October, 2022; v1 submitted 1 November, 2021; originally announced November 2021.

    Comments: EMNLP 2022 Camera Ready

  43. arXiv:2110.08370  [pdf, other]

    cs.CL

    Training Dynamics for Text Summarization Models

    Authors: Tanya Goyal, Jiacheng Xu, Junyi Jessy Li, Greg Durrett

    Abstract: Pre-trained language models (e.g. BART) have shown impressive results when fine-tuned on large summarization datasets. However, little is understood about this fine-tuning process, including what knowledge is retained from pre-training time or how content selection and generation strategies are learnt across iterations. In this work, we analyze the training dynamics for generation models, focusing…

    Submitted 15 March, 2022; v1 submitted 15 October, 2021; originally announced October 2021.

    Comments: ACL 2022 Findings

  44. arXiv:2110.08296  [pdf, other]

    cs.CL

    ASPECTNEWS: Aspect-Oriented Summarization of News Documents

    Authors: Ojas Ahuja, Jiacheng Xu, Akshay Gupta, Kevin Horecka, Greg Durrett

    Abstract: Generic summaries try to cover an entire document and query-based summaries try to answer document-specific questions. But real users' needs often fall in between these extremes and correspond to aspects, high-level topics discussed among similar types of documents. In this paper, we collect a dataset of realistic aspect-oriented summaries, AspectNews, which covers different subtopics about articl…

    Submitted 15 March, 2022; v1 submitted 15 October, 2021; originally announced October 2021.

    Comments: ACL 2022

  45. arXiv:2110.07837  [pdf, other]

    cs.CL cs.LG

    Cross-Lingual Fine-Grained Entity Typing

    Authors: Nila Selvaraj, Yasumasa Onoe, Greg Durrett

    Abstract: The growth of cross-lingual pre-trained models has enabled NLP tools to rapidly generalize to new languages. While these models have been applied to tasks involving entities, their ability to explicitly predict typological features of these entities across languages has not been established. In this paper, we present a unified cross-lingual fine-grained entity typing model capable of handling over…

    Submitted 14 October, 2021; originally announced October 2021.

  46. arXiv:2110.07686  [pdf, other]

    cs.CL cs.AI

    Making Document-Level Information Extraction Right for the Right Reasons

    Authors: Liyan Tang, Dhruv Rajan, Suyash Mohan, Abhijeet Pradhan, R. Nick Bryan, Greg Durrett

    Abstract: Document-level models for information extraction tasks like slot-filling are flexible: they can be applied to settings where information is not necessarily localized in a single sentence. For example, key features of a diagnosis in a radiology report may not be explicitly stated in one place, but nevertheless can be inferred from parts of the report's text. However, these models can easily learn s…

    Submitted 18 May, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

    Comments: 9 pages (15 with references and appendix), 3 figures

  47. arXiv:2110.07586  [pdf, other]

    cs.CL

    Can Explanations Be Useful for Calibrating Black Box Models?

    Authors: Xi Ye, Greg Durrett

    Abstract: NLP practitioners often want to take existing trained models and apply them to data from new domains. While fine-tuning or few-shot learning can be used to adapt a base model, there is no single recipe for making these techniques work; moreover, one may not have access to the original model weights if it is deployed as a black box. We study how to improve a black box model's performance on a new d…

    Submitted 14 March, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

    Comments: ACL 2022

  48. arXiv:2109.01653  [pdf, other]

    cs.CL cs.AI

    CREAK: A Dataset for Commonsense Reasoning over Entity Knowledge

    Authors: Yasumasa Onoe, Michael J. Q. Zhang, Eunsol Choi, Greg Durrett

    Abstract: Most benchmark datasets targeting commonsense reasoning focus on everyday scenarios: physical knowledge like knowing that you could fill a cup under a waterfall [Talmor et al., 2019], social knowledge like bumping into someone is awkward [Sap et al., 2019], and other generic situations. However, there is a rich space of commonsense inferences anchored to knowledge about specific entities: for exam…

    Submitted 3 September, 2021; originally announced September 2021.

  49. arXiv:2106.01518  [pdf, other]

    cs.CL

    Dissecting Generation Modes for Abstractive Summarization Models via Ablation and Attribution

    Authors: Jiacheng Xu, Greg Durrett

    Abstract: Despite the prominence of neural abstractive summarization models, we know little about how they actually form summaries and how to understand where their decisions come from. We propose a two-step method to interpret summarization model decisions. We first analyze the model's behavior by ablating the full model to categorize each decoder decision into one of several generation modes: roughly, is…

    Submitted 2 June, 2021; originally announced June 2021.

    Comments: ACL 2021; 16 pages

  50. arXiv:2104.08825  [pdf, other]

    cs.CL cs.AI

    Flexible Generation of Natural Language Deductions

    Authors: Kaj Bostrom, Xinyu Zhao, Swarat Chaudhuri, Greg Durrett

    Abstract: An interpretable system for open-domain reasoning needs to express its reasoning process in a transparent form. Natural language is an attractive representation for this purpose -- it is both highly expressive and easy for humans to understand. However, manipulating natural language statements in logically consistent ways is hard: models must cope with variation in how meaning is expressed while r…

    Submitted 9 September, 2021; v1 submitted 18 April, 2021; originally announced April 2021.

    Comments: Accepted to EMNLP 2021 (long paper). 9 pages (13 with references and appendix), 8 figures

    ACM Class: I.2.7; I.2.3