Search | arXiv e-print repository

Showing 1–50 of 126 results for author: Kirk, H

  1. arXiv:2406.06196  [pdf, other]

    cs.CL

    LINGOLY: A Benchmark of Olympiad-Level Linguistic Reasoning Puzzles in Low-Resource and Extinct Languages

    Authors: Andrew M. Bean, Simi Hellsten, Harry Mayne, Jabez Magomere, Ethan A. Chi, Ryan Chi, Scott A. Hale, Hannah Rose Kirk

    Abstract: In this paper, we present the LingOly benchmark, a novel benchmark for advanced reasoning abilities in large language models. Using challenging Linguistic Olympiad puzzles, we evaluate (i) capabilities for in-context identification and generalisation of linguistic patterns in very low-resource or extinct languages, and (ii) abilities to follow complex task instructions. The LingOly benchmark cover…

    Submitted 11 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: 9 pages, 5 figures, 16 pages supplemental materials

  2. arXiv:2405.13058  [pdf, other]

    cs.SE cs.AI cs.CY cs.LG

    The AI Community Building the Future? A Quantitative Analysis of Development Activity on Hugging Face Hub

    Authors: Cailean Osborne, Jennifer Ding, Hannah Rose Kirk

    Abstract: Open model developers have emerged as key actors in the political economy of artificial intelligence (AI), but we still have a limited understanding of collaborative practices in the open AI ecosystem. This paper responds to this gap with a three-part quantitative analysis of development activity on the Hugging Face (HF) Hub, a popular platform for building, sharing, and demonstrating models. Firs…

    Submitted 5 June, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

    Comments: 27 pages, 5 figures, 9 tables

    ACM Class: K.4.1

  3. arXiv:2404.16019  [pdf, other]

    cs.CL

    The PRISM Alignment Project: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models

    Authors: Hannah Rose Kirk, Alexander Whitefield, Paul Röttger, Andrew Bean, Katerina Margatina, Juan Ciro, Rafael Mosquera, Max Bartolo, Adina Williams, He He, Bertie Vidgen, Scott A. Hale

    Abstract: Human feedback plays a central role in the alignment of Large Language Models (LLMs). However, open questions remain about the methods (how), domains (where), people (who) and objectives (to what end) of human feedback collection. To navigate these questions, we introduce PRISM, a new dataset which maps the sociodemographics and stated preferences of 1,500 diverse participants from 75 countries, t…

    Submitted 24 April, 2024; originally announced April 2024.

  4. arXiv:2404.13520  [pdf, other]

    astro-ph.GA

    An ALMA search for substructure and fragmentation in starless cores in Orion B North

    Authors: Samuel Fielder, Helen Kirk, Michael Dunham, Stella Offner

    Abstract: We present Atacama Large Millimeter/submillimeter Array (ALMA) Cycle 3 observations of 73 starless and protostellar cores in the Orion B North molecular cloud. We detect a total of 34 continuum sources at 106 GHz, and after comparisons with other data, 4 of these sources appear to be starless. Three of the four sources are located near groupings of protostellar sources, while one source is an isol…

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: 38 pages, 18 figures, accepted for publication in ApJ

  5. arXiv:2404.12241  [pdf, other]

    cs.CL cs.AI

    Introducing v0.5 of the AI Safety Benchmark from MLCommons

    Authors: Bertie Vidgen, Adarsh Agrawal, Ahmed M. Ahmed, Victor Akinwande, Namir Al-Nuaimi, Najla Alfaraj, Elie Alhajjar, Lora Aroyo, Trupti Bavalatti, Max Bartolo, Borhane Blili-Hamelin, Kurt Bollacker, Rishi Bomassani, Marisa Ferrara Boston, Siméon Campos, Kal Chakra, Canyu Chen, Cody Coleman, Zacharie Delpierre Coudert, Leon Derczynski, Debojyoti Dutta, Ian Eisenberg, James Ezick, Heather Frase, Brian Fuller, et al. (75 additional authors not shown)

    Abstract: This paper introduces v0.5 of the AI Safety Benchmark, which has been created by the MLCommons AI Safety Working Group. The AI Safety Benchmark has been designed to assess the safety risks of AI systems that use chat-tuned language models. We introduce a principled approach to specifying and constructing the benchmark, which for v0.5 covers only a single use case (an adult chatting to a general-pu…

    Submitted 13 May, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

  6. arXiv:2403.12075  [pdf, other]

    cs.CY cs.AI cs.CR cs.CV cs.LG

    Adversarial Nibbler: An Open Red-Teaming Method for Identifying Diverse Harms in Text-to-Image Generation

    Authors: Jessica Quaye, Alicia Parrish, Oana Inel, Charvi Rastogi, Hannah Rose Kirk, Minsuk Kahng, Erin van Liemt, Max Bartolo, Jess Tsang, Justin White, Nathan Clement, Rafael Mosquera, Juan Ciro, Vijay Janapa Reddi, Lora Aroyo

    Abstract: With the rise of text-to-image (T2I) generative AI models reaching wide audiences, it is critical to evaluate model robustness against non-obvious attacks to mitigate the generation of offensive images. By focusing on "implicitly adversarial" prompts (those that trigger T2I models to generate unsafe images for non-obvious reasons), we isolate a set of difficult safety issues that human creativit…

    Submitted 13 May, 2024; v1 submitted 14 February, 2024; originally announced March 2024.

    Comments: 10 pages, 6 figures

  7. arXiv:2402.16786  [pdf, other]

    cs.CL cs.AI

    Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models

    Authors: Paul Röttger, Valentin Hofmann, Valentina Pyatkin, Musashi Hinck, Hannah Rose Kirk, Hinrich Schütze, Dirk Hovy

    Abstract: Much recent work seeks to evaluate values and opinions in large language models (LLMs) using multiple-choice surveys and questionnaires. Most of this work is motivated by concerns around real-world LLM applications. For example, politically-biased LLMs may subtly influence society when they are used by millions of people. Such real-world concerns, however, stand in stark contrast to the artificial…

    Submitted 5 June, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

    Comments: Accepted at ACL 2024 (Main Conference)

  8. arXiv:2401.12295  [pdf, other]

    cs.CL

    Cheap Learning: Maximising Performance of Language Models for Social Data Science Using Minimal Data

    Authors: Leonardo Castro-Gonzalez, Yi-Ling Chung, Hannah Rose Kirk, John Francis, Angus R. Williams, Pica Johansson, Jonathan Bright

    Abstract: The field of machine learning has recently made significant progress in reducing the requirements for labelled training data when building new models. These 'cheaper' learning techniques hold significant potential for the social sciences, where development of large labelled training datasets is often a significant practical impediment to the use of machine learning for analytical tasks. In this ar…

    Submitted 22 January, 2024; originally announced January 2024.

    Comments: 39 pages, 10 figures, 6 tables

    ACM Class: I.2.7; J.4

  9. arXiv:2311.08370  [pdf, other]

    cs.CL

    SimpleSafetyTests: a Test Suite for Identifying Critical Safety Risks in Large Language Models

    Authors: Bertie Vidgen, Nino Scherrer, Hannah Rose Kirk, Rebecca Qian, Anand Kannappan, Scott A. Hale, Paul Röttger

    Abstract: The past year has seen rapid acceleration in the development of large language models (LLMs). However, without proper steering and safeguards, LLMs will readily follow malicious instructions, provide unsafe advice, and generate toxic content. We introduce SimpleSafetyTests (SST) as a new test suite for rapidly and systematically identifying such critical safety risks. The test suite comprises 100…

    Submitted 16 February, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

  10. arXiv:2310.17006  [pdf, ps, other]

    eess.SP

    Mode Selection and Target Classification in Cognitive Radar Networks

    Authors: William W. Howard, Samuel R. Shebert, Benjamin H. Kirk, R. Michael Buehrer

    Abstract: Cognitive Radar Networks were proposed by Simon Haykin in 2006 to address problems with large legacy radar implementations - primarily, single-point vulnerabilities and lack of adaptability. This work proposes to leverage the adaptability of cognitive radar networks to trade between active radar observation, which uses high power and risks interception, and passive signal parameter estimation, whi…

    Submitted 25 October, 2023; originally announced October 2023.

    Comments: 6 pages, 5 figures

  11. arXiv:2310.07629  [pdf, other]

    cs.CL cs.CY

    The Past, Present and Better Future of Feedback Learning in Large Language Models for Subjective Human Preferences and Values

    Authors: Hannah Rose Kirk, Andrew M. Bean, Bertie Vidgen, Paul Röttger, Scott A. Hale

    Abstract: Human feedback is increasingly used to steer the behaviours of Large Language Models (LLMs). However, it is unclear how to collect and incorporate feedback in a way that is efficient, effective and unbiased, especially for highly subjective human preferences and values. In this paper, we survey existing approaches for learning from human feedback, drawing on 95 papers primarily from the ACL and ar…

    Submitted 11 October, 2023; originally announced October 2023.

    Comments: Accepted for the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP, Main)

  12. arXiv:2310.02457  [pdf, other]

    cs.CL cs.CY

    The Empty Signifier Problem: Towards Clearer Paradigms for Operationalising "Alignment" in Large Language Models

    Authors: Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale

    Abstract: In this paper, we address the concept of "alignment" in large language models (LLMs) through the lens of post-structuralist socio-political theory, specifically examining its parallels to empty signifiers. To establish a shared vocabulary around how abstract concepts of alignment are operationalised in empirical datasets, we propose a framework that demarcates: 1) which dimensions of model behavio…

    Submitted 15 November, 2023; v1 submitted 3 October, 2023; originally announced October 2023.

    Comments: Socially Responsible Language Modelling Research (SoLaR) @ NeurIPS 2023

  13. arXiv:2309.08573  [pdf, other]

    cs.CL cs.CY

    Casteist but Not Racist? Quantifying Disparities in Large Language Model Bias between India and the West

    Authors: Khyati Khandelwal, Manuel Tonneau, Andrew M. Bean, Hannah Rose Kirk, Scott A. Hale

    Abstract: Large Language Models (LLMs), now used daily by millions of users, can encode societal biases, exposing their users to representational harms. A large body of scholarship on LLM bias exists but it predominantly adopts a Western-centric frame and attends comparatively less to bias levels and potential harms in the Global South. In this paper, we quantify stereotypical bias in popular LLMs according…

    Submitted 15 September, 2023; originally announced September 2023.

  14. arXiv:2308.01263  [pdf, other]

    cs.CL cs.AI

    XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models

    Authors: Paul Röttger, Hannah Rose Kirk, Bertie Vidgen, Giuseppe Attanasio, Federico Bianchi, Dirk Hovy

    Abstract: Without proper safeguards, large language models will readily follow malicious instructions and generate toxic content. This risk motivates safety efforts such as red-teaming and large-scale feedback learning, which aim to make models both helpful and harmless. However, there is a tension between these two objectives, since harmlessness requires models to refuse to comply with unsafe prompts, and…

    Submitted 1 April, 2024; v1 submitted 2 August, 2023; originally announced August 2023.

    Comments: Accepted at NAACL 2024 (Main Conference)

  15. arXiv:2307.16811  [pdf, other]

    cs.CL cs.CY

    DoDo Learning: DOmain-DemOgraphic Transfer in Language Models for Detecting Abuse Targeted at Public Figures

    Authors: Angus R. Williams, Hannah Rose Kirk, Liam Burke, Yi-Ling Chung, Ivan Debono, Pica Johansson, Francesca Stevens, Jonathan Bright, Scott A. Hale

    Abstract: Public figures receive a disproportionate amount of abuse on social media, impacting their active participation in public life. Automated systems can identify abuse at scale but labelling training data is expensive, complex and potentially harmful. So, it is desirable that systems are efficient and generalisable, handling both shared and specific aspects of online abuse. We explore the dynamics of…

    Submitted 25 April, 2024; v1 submitted 31 July, 2023; originally announced July 2023.

    Comments: 15 pages, 7 figures, 4 tables

  16. arXiv:2307.13022  [pdf, other]

    astro-ph.GA astro-ph.SR

    Alignment of dense molecular core morphology and velocity gradients with ambient magnetic fields

    Authors: A. Pandhi, R. K. Friesen, L. Fissel, J. E. Pineda, P. Caselli, M. C-Y. Chen, J. Di Francesco, A. Ginsburg, H. Kirk, P. C. Myers, S. S. R. Offner, A. Punanova, F. Quan, E. Redaelli, E. Rosolowsky, S. Scibelli, Y. M. Seo, Y. Shirley

    Abstract: Studies of dense core morphologies and their orientations with respect to gas flows and the local magnetic field have been limited to only a small sample of cores with spectroscopic data. Leveraging the Green Bank Ammonia Survey alongside existing sub-millimeter continuum observations and Planck dust polarization, we produce a cross-matched catalogue of 399 dense cores with estimates of core morph…

    Submitted 24 July, 2023; originally announced July 2023.

    Comments: 33 pages, 28 figures, accepted to MNRAS

  17. arXiv:2306.12424  [pdf, other]

    cs.CV cs.CL

    VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution

    Authors: Siobhan Mackenzie Hall, Fernanda Gonçalves Abrantes, Hanwen Zhu, Grace Sodunke, Aleksandar Shtedritski, Hannah Rose Kirk

    Abstract: We introduce VisoGender, a novel dataset for benchmarking gender bias in vision-language models. We focus on occupation-related biases within a hegemonic system of binary gender, inspired by Winograd and Winogender schemas, where each image is associated with a caption containing a pronoun relationship of subjects and objects in the scene. VisoGender is balanced by gender representation in profess…

    Submitted 12 December, 2023; v1 submitted 21 June, 2023; originally announced June 2023.

    Comments: NeurIPS Datasets and Benchmarks 2023. Data and code available at https://github.com/oxai/visogender

  18. arXiv:2305.15407  [pdf, other]

    cs.CV

    Balancing the Picture: Debiasing Vision-Language Datasets with Synthetic Contrast Sets

    Authors: Brandon Smith, Miguel Farinha, Siobhan Mackenzie Hall, Hannah Rose Kirk, Aleksandar Shtedritski, Max Bain

    Abstract: Vision-language models are growing in popularity and public visibility to generate, edit, and caption images at scale; but their outputs can perpetuate and amplify societal biases learned during pre-training on uncurated image-text pairs from the internet. Although debiasing methods have been proposed, we argue that these measurements of model bias lack validity due to dataset bias. We demonstrate…

    Submitted 24 May, 2023; originally announced May 2023.

    Comments: Github: https://github.com/oxai/debias-gensynth

  19. arXiv:2305.14384  [pdf, other]

    cs.LG cs.AI cs.CR cs.CV

    Adversarial Nibbler: A Data-Centric Challenge for Improving the Safety of Text-to-Image Models

    Authors: Alicia Parrish, Hannah Rose Kirk, Jessica Quaye, Charvi Rastogi, Max Bartolo, Oana Inel, Juan Ciro, Rafael Mosquera, Addison Howard, Will Cukierski, D. Sculley, Vijay Janapa Reddi, Lora Aroyo

    Abstract: The generative AI revolution in recent years has been spurred by an expansion in compute power and data quantity, which together enable extensive pre-training of powerful text-to-image (T2I) models. With their greater capabilities to generate realistic and creative content, these T2I models like DALL-E, MidJourney, Imagen or Stable Diffusion are reaching ever wider audiences. Any unsafe behaviors…

    Submitted 22 May, 2023; originally announced May 2023.

    MSC Class: 14J68 (Primary)

  20. arXiv:2303.18190  [pdf, other]

    cs.CL

    Assessing Language Model Deployment with Risk Cards

    Authors: Leon Derczynski, Hannah Rose Kirk, Vidhisha Balachandran, Sachin Kumar, Yulia Tsvetkov, M. R. Leiser, Saif Mohammad

    Abstract: This paper introduces RiskCards, a framework for structured assessment and documentation of risks associated with an application of language models. As with all language, text generated by language models can be harmful, or used to bring about harm. Automating language generation adds both an element of scale and also more subtle or emergent undesirable tendencies to the generated text. Prior work…

    Submitted 31 March, 2023; originally announced March 2023.

  21. arXiv:2303.05453  [pdf, ps, other]

    cs.CL cs.CY

    Personalisation within bounds: A risk taxonomy and policy framework for the alignment of large language models with personalised feedback

    Authors: Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale

    Abstract: Large language models (LLMs) are used to generate content for a wide range of tasks, and are set to reach a growing audience in coming years due to integration in product interfaces like ChatGPT or search engines like Bing. This intensifies the need to ensure that models are aligned with human preferences and do not produce unsafe, inaccurate or toxic outputs. While alignment techniques like reinf…

    Submitted 9 March, 2023; originally announced March 2023.

    Comments: 19 pages, 1 table

  22. arXiv:2303.04222  [pdf, other]

    cs.CL cs.CY

    SemEval-2023 Task 10: Explainable Detection of Online Sexism

    Authors: Hannah Rose Kirk, Wenjie Yin, Bertie Vidgen, Paul Röttger

    Abstract: Online sexism is a widespread and harmful phenomenon. Automated tools can assist the detection of sexism at scale. Binary detection, however, disregards the diversity of sexist content, and fails to provide clear explanations for why something is sexist. To address this issue, we introduce SemEval Task 10 on the Explainable Detection of Online Sexism (EDOS). We make three main contributions: i) a…

    Submitted 8 May, 2023; v1 submitted 7 March, 2023; originally announced March 2023.

    Comments: SemEval-2023 Task 10 (ACL 2023)

  23. Auditing large language models: a three-layered approach

    Authors: Jakob Mökander, Jonas Schuett, Hannah Rose Kirk, Luciano Floridi

    Abstract: Large language models (LLMs) represent a major advance in artificial intelligence (AI) research. However, the widespread use of LLMs is also coupled with significant ethical and social challenges. Previous research has pointed towards auditing as a promising governance mechanism to help ensure that AI systems are designed and deployed in ways that are ethical, legal, and technically robust. Howeve…

    Submitted 27 June, 2023; v1 submitted 16 February, 2023; originally announced February 2023.

    Comments: 22 pages, 2 figures. AI Ethics (2023)

    ACM Class: K.4; K.6

  24. arXiv:2302.03749  [pdf, other]

    eess.SP

    Open Set Wireless Signal Classification: Augmenting Deep Learning with Expert Feature Classifiers

    Authors: Samuel R. Shebert, Benjamin H. Kirk, R. Michael Buehrer

    Abstract: In shared spectrum with multiple radio access technologies, wireless standard classification is vital for applications such as dynamic spectrum access (DSA) and wideband spectrum monitoring. However, interfering signals and the presence of unknown classes of signals can diminish classification accuracy. To reduce interference, signals can be isolated in time, frequency, and space, but the isolatio…

    Submitted 7 February, 2023; originally announced February 2023.

  25. Velocity-Coherent Substructure in TMC-1: Inflow and Fragmentation

    Authors: Simon E. T. Smith, Rachel Friesen, Antoine Marchal, Jaime E. Pineda, Paola Caselli, Michael Chun-Yuan Chen, Spandan Choudhury, James Di Francesco, Adam Ginsburg, Helen Kirk, Chris Matzner, Anna Punanova, Samantha Scibelli, Yancy Shirley

    Abstract: Filamentary structures have been found nearly ubiquitously in molecular clouds and yet their formation and evolution is still poorly understood. We examine a segment of Taurus Molecular Cloud 1 (TMC-1) that appears as a single, narrow filament in continuum emission from dust. We use the Regularized Optimization for Hyper-Spectral Analysis (ROHSA), a Gaussian decomposition algorithm which enforces…

    Submitted 6 February, 2023; v1 submitted 18 November, 2022; originally announced November 2022.

    Comments: 17 pages, 8 figures; Accepted for publication to MNRAS

  26. arXiv:2209.10193  [pdf, other]

    cs.CL

    Is More Data Better? Re-thinking the Importance of Efficiency in Abusive Language Detection with Transformers-Based Active Learning

    Authors: Hannah Rose Kirk, Bertie Vidgen, Scott A. Hale

    Abstract: Annotating abusive language is expensive, logistically complex and creates a risk of psychological harm. However, most machine learning research has prioritized maximizing effectiveness (i.e., F1 or accuracy score) rather than data efficiency (i.e., minimizing the amount of data that is annotated). In this paper, we use simulated experiments over two datasets at varying percentages of abuse to dem…

    Submitted 21 September, 2022; originally announced September 2022.

    Comments: Third Workshop on Threat, Aggression and Cyberbullying (COLING 2022)

  27. arXiv:2207.10062  [pdf, other]

    cs.LG

    DataPerf: Benchmarks for Data-Centric AI Development

    Authors: Mark Mazumder, Colby Banbury, Xiaozhe Yao, Bojan Karlaš, William Gaviria Rojas, Sudnya Diamos, Greg Diamos, Lynn He, Alicia Parrish, Hannah Rose Kirk, Jessica Quaye, Charvi Rastogi, Douwe Kiela, David Jurado, David Kanter, Rafael Mosquera, Juan Ciro, Lora Aroyo, Bilge Acun, Lingjiao Chen, Mehul Smriti Raje, Max Bartolo, Sabri Eyuboglu, Amirata Ghorbani, Emmett Goodman, et al. (20 additional authors not shown)

    Abstract: Machine learning research has long focused on models rather than datasets, and prominent datasets are used for common ML tasks without regard to the breadth, difficulty, and faithfulness of the underlying problems. Neglecting the fundamental importance of data has given rise to inaccuracy, bias, and fragility in real-world applications, and research is hindered by saturation across existing datase…

    Submitted 13 October, 2023; v1 submitted 20 July, 2022; originally announced July 2022.

    Comments: NeurIPS 2023 Datasets and Benchmarks Track

  28. arXiv:2205.11374  [pdf, other]

    cs.CL cs.AI

    Looking for a Handsome Carpenter! Debiasing GPT-3 Job Advertisements

    Authors: Conrad Borchers, Dalia Sara Gala, Benjamin Gilburt, Eduard Oravkin, Wilfried Bounsi, Yuki M. Asano, Hannah Rose Kirk

    Abstract: The growing capability and availability of generative language models has enabled a wide range of new downstream tasks. Academic research has identified, quantified and mitigated biases present in language models but is rarely tailored to downstream tasks where wider impact on individuals and society can be felt. In this work, we leverage one popular generative language model, GPT-3, with the goal…

    Submitted 23 May, 2022; originally announced May 2022.

    Comments: Accepted for the 4th Workshop on Gender Bias in Natural Language Processing at NAACL 2022

  29. arXiv:2204.14256  [pdf, other]

    cs.CL

    Handling and Presenting Harmful Text in NLP Research

    Authors: Hannah Rose Kirk, Abeba Birhane, Bertie Vidgen, Leon Derczynski

    Abstract: Text data can pose a risk of harm. However, the risks are not fully understood, and how to handle, present, and discuss harmful text in a safe way remains an unresolved issue in the NLP community. We provide an analytical framework categorising harms on three axes: (1) the harm type (e.g., misinformation, hate speech or racial stereotypes); (2) whether a harm is \textit{sought} as a feature of the…

    Submitted 24 February, 2023; v1 submitted 29 April, 2022; originally announced April 2022.

    Comments: in Findings of EMNLP 2022

  30. arXiv:2203.11933  [pdf, other]

    cs.LG cs.CL cs.CV cs.CY

    A Prompt Array Keeps the Bias Away: Debiasing Vision-Language Models with Adversarial Learning

    Authors: Hugo Berg, Siobhan Mackenzie Hall, Yash Bhalgat, Wonsuk Yang, Hannah Rose Kirk, Aleksandar Shtedritski, Max Bain

    Abstract: Vision-language models can encode societal biases and stereotypes, but there are challenges to measuring and mitigating these multimodal harms due to lacking measurement robustness and feature degradation. To address these challenges, we investigate bias measures and apply ranking metrics for image-text representations. We then investigate debiasing methods and show that prepending learned embeddi…

    Submitted 25 October, 2022; v1 submitted 22 March, 2022; originally announced March 2022.

    Comments: 17 pages, 4 figures, 7 tables. For code and trained token embeddings, see https://github.com/oxai/debias-vision-lang; Changed to use ACL layout, added joint training with comparison figure, corrected spelling and formatting errors; This paper is accepted for publication at AACL 2022, the official version of record is in the ACL Anthology

  31. arXiv:2108.05921  [pdf, other]

    cs.CL cs.CY

    Hatemoji: A Test Suite and Adversarially-Generated Dataset for Benchmarking and Detecting Emoji-based Hate

    Authors: Hannah Rose Kirk, Bertram Vidgen, Paul Röttger, Tristan Thrush, Scott A. Hale

    Abstract: Detecting online hate is a complex task, and low-performing models have harmful consequences when used for sensitive applications such as content moderation. Emoji-based hate is an emerging challenge for automated detection. We present HatemojiCheck, a test suite of 3,930 short-form statements that allows us to evaluate performance on hateful language expressed with emoji. Using the test suite, we…

    Submitted 6 May, 2022; v1 submitted 12 August, 2021; originally announced August 2021.

    Journal ref: 2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2022)

  32. arXiv:2108.05367  [pdf, other]

    astro-ph.GA astro-ph.SR

    Are massive dense clumps truly sub-virial? A new analysis using Gould Belt ammonia data

    Authors: Ayushi Singh, Christopher D. Matzner, Rachel K. Friesen, Peter G. Martin, Jaime E. Pineda, Erik W. Rosolowsky, Felipe Alves, Ana Chacón-Tanarro, Hope How-Huan Chen, Michael Chun-Yuan Chen, Spandan Choudhury, James Di Francesco, Jared Keown, Helen Kirk, Anna Punanova, Youngmin Seo, Yancy Shirley, Adam Ginsburg, Stella S. R. Offner, Héctor G. Arce, Paola Caselli, Alyssa A. Goodman, Philip C. Myers, Elena Redaelli

    Abstract: Dynamical studies of dense structures within molecular clouds often conclude that the most massive clumps contain too little kinetic energy for virial equilibrium, unless they are magnetized to an unexpected degree. This raises questions about how such a state might arise, and how it might persist long enough to represent the population of massive clumps. In an effort to re-examine the origins of…

    Submitted 11 August, 2021; originally announced August 2021.

    Comments: Submitted to ApJ

  33. arXiv:2107.10750  [pdf, other]

    astro-ph.SR astro-ph.GA

    The JCMT Transient Survey: Four Year Summary of Monitoring the Submillimeter Variability of Protostars

    Authors: Yong-Hee Lee, Doug Johnstone, Jeong-Eun Lee, Gregory Herczeg, Steve Mairs, Carlos Contreras-Peña, Jennifer Hatchell, Tim Naylor, Graham S. Bell, Tyler L. Bourke, Colton Broughton, Logan Francis, Aashish Gupta, Daniel Harsono, Sheng-Yuan Liu, Geumsook Park, Spencer Plovie, Gerald H. Moriarty-Schieven, Aleks Scholz, Tanvi Sharma, Paula Stella Teixeira, Yao-Te Wang, Yuri Aikawa, Geoffrey C. Bower, Huei-Ru Vivien Chen, et al. (27 additional authors not shown)

    Abstract: We present the four-year survey results of monthly submillimeter monitoring of eight nearby ($< 500 $pc) star-forming regions by the JCMT Transient Survey. We apply the Lomb-Scargle Periodogram technique to search for and characterize variability on 295 submillimeter peaks brighter than 0.14 Jy beam$^{-1}$, including 22 disk sources (Class II), 83 protostars (Class 0/I), and 190 starless sources.…

    Submitted 22 July, 2021; originally announced July 2021.

    Comments: Accepted for publication in the Astrophysical Journal

  34. arXiv:2107.04313  [pdf, other]

    cs.CV

    Memes in the Wild: Assessing the Generalizability of the Hateful Memes Challenge Dataset

    Authors: Hannah Rose Kirk, Yennie Jun, Paulius Rauba, Gal Wachtel, Ruining Li, Xingjian Bai, Noah Broestl, Martin Doff-Sotta, Aleksandar Shtedritski, Yuki M. Asano

    Abstract: Hateful memes pose a unique challenge for current machine learning systems because their message is derived from both text- and visual-modalities. To this effect, Facebook released the Hateful Memes Challenge, a dataset of memes with pre-extracted text captions, but it is unclear whether these synthetic examples generalize to 'memes in the wild'. In this paper, we collect hateful and non-hateful m…

    Submitted 9 July, 2021; originally announced July 2021.

    Comments: Accepted paper at ACL WOAH 2021

  35. The JCMT Gould Belt Survey: radiative heating by OB stars

    Authors: Damian Rumble, Jennifer Hatchell, Helen Kirk, Kate Pattle

    Abstract: Radiative feedback can influence subsequent star formation. We quantify the heating from OB stars in the local star-forming regions in the JCMT Gould Belt survey. Dust temperatures are calculated from 450/850 micron flux ratios from SCUBA-2 observations at the JCMT assuming a fixed dust opacity spectral index $\beta=1.8$. Mean dust temperatures are calculated for each submillimetre clump along with pr…

    Submitted 7 May, 2021; originally announced May 2021.

    Comments: 9 pages, 4 figures. MNRAS accepted

  36. Transition from Coherent Cores to Surrounding Cloud in L1688

    Authors: Spandan Choudhury, Jaime E. Pineda, Paola Caselli, Stella S. R. Offner, Erik Rosolowsky, Rachel K. Friesen, Elena Redaelli, Ana Chacón-Tanarro, Yancy Shirley, Anna Punanova, Helen Kirk

    Abstract: Stars form in cold dense cores showing subsonic velocity dispersions. The parental molecular clouds display higher temperatures and supersonic velocity dispersions. The transition from core to cloud has been observed in velocity dispersion, but temperature and abundance variations are unknown. We aim to study the transition from cores to ambient cloud in temperature and velocity dispersion using a…

    Submitted 12 February, 2021; originally announced February 2021.

    Comments: 37 pages, 33 figures, 1 table. Accepted for publication in A&A

    Journal ref: A&A 648, A114 (2021)

  37. arXiv:2102.04130  [pdf, other]

    cs.CL cs.AI

    Bias Out-of-the-Box: An Empirical Analysis of Intersectional Occupational Biases in Popular Generative Language Models

    Authors: Hannah Kirk, Yennie Jun, Haider Iqbal, Elias Benussi, Filippo Volpin, Frederic A. Dreyer, Aleksandar Shtedritski, Yuki M. Asano

    Abstract: The capabilities of natural language models trained on large-scale data have increased immensely over the past few years. Open source libraries such as HuggingFace have made these models easily available and accessible. While prior research has identified biases in large language models, this paper considers biases contained in the most popular versions of these models when applied 'out-of-the-box…

    Submitted 27 October, 2021; v1 submitted 8 February, 2021; originally announced February 2021.

    Comments: Accepted to NeurIPS 2021. Code and data at https://github.com/oxai/intersectional_gpt2

  38. Ubiquitous $\rm NH_3$ supersonic component in L1688 coherent cores

    Authors: Spandan Choudhury, Jaime E. Pineda, Paola Caselli, Adam Ginsburg, Stella S. R. Offner, Erik Rosolowsky, Rachel K. Friesen, Felipe O. Alves, Ana Chacón-Tanarro, Anna Punanova, Elena Redaelli, Helen Kirk, Philip C. Myers, Peter G. Martin, Yancy Shirley, Michael Chun-Yuan Chen, Alyssa A. Goodman, James Di Francesco

    Abstract: Context : Star formation takes place in cold dense cores in molecular clouds. Earlier observations have found that dense cores exhibit subsonic non-thermal velocity dispersions. In contrast, CO observations show that the ambient large-scale cloud is warmer and has supersonic velocity dispersions. Aims : We aim to study the ammonia ($\rm NH_3$) molecular line profiles with exquisite sensitivity tow… ▽ More

    Submitted 20 July, 2020; v1 submitted 14 July, 2020; originally announced July 2020.

    Comments: Accepted for publication in Astronomy & Astrophysics on 06/07/2020. 15 pages, 16 figures, 1 table. Language edits from previous version

    Journal ref: A&A 640, L6 (2020)

  39. arXiv:2003.11033  [pdf, other

    astro-ph.GA astro-ph.SR

    Relative Alignment between Dense Molecular Cores and Ambient Magnetic Field: The Synergy of Numerical Models and Observations

    Authors: Che-Yu Chen, Erica A. Behrens, Jasmin E. Washington, Laura M. Fissel, Rachel K. Friesen, Zhi-Yun Li, Jaime E. Pineda, Adam Ginsburg, Helen Kirk, Samantha Scibelli, Felipe Alves, Elena Redaelli, Paola Caselli, Anna Punanova, James Di Francesco, Erik Rosolowsky, Stella S. R. Offner, Peter G. Martin, Ana Chacón-Tanarro, Hope H. -H. Chen, Michael C. -Y. Chen, Jared Keown, Youngmin Seo, Yancy Shirley, Hector G. Arce , et al. (4 additional authors not shown)

    Abstract: The role played by magnetic field during star formation is an important topic in astrophysics. We investigate the correlation between the orientation of star-forming cores (as defined by the core major axes) and ambient magnetic field directions in 1) a 3D MHD simulation, 2) synthetic observations generated from the simulation at different viewing angles, and 3) observations of nearby molecular cl… ▽ More

    Submitted 24 March, 2020; originally announced March 2020.

    Comments: 18 pages, 11 figures, accepted for publication in MNRAS

  40. arXiv:1911.10320  [pdf, other

    astro-ph.IM physics.soc-ph

    Opportunities and Outcomes for Postdocs in Canada

    Authors: Henry Ngo, Helen Kirk, Toby Brown, Tyrone E. Woods, Gwendolyn Eadie, Samantha Lawler, Locke Spencer

    Abstract: Currently, postdoctoral fellow (PDF) researchers in Canada face challenges due to the precarious nature of their employment and their overall low compensation and benefits coverage. This report presents three themes, written as statements of need, to support an inclusive and thriving PDF community. These themes are the need for better terms of employment and conditions, the need for access to gran… ▽ More

    Submitted 23 November, 2019; originally announced November 2019.

    Comments: State of the profession white paper submitted to the Canadian Long Range Plan 2020 decadal survey with appendices

    Report number: W064

  41. arXiv:1911.08928  [pdf

    cs.RO cs.CV cs.LG

    A Human Action Descriptor Based on Motion Coordination

    Authors: Pietro Falco, Matteo Saveriano, Eka Gibran Hasany, Nicholas H. Kirk, Dongheui Lee

    Abstract: In this paper, we present a descriptor for human whole-body actions based on motion coordination. We exploit the principle, well known in neuromechanics, that humans move their joints in a coordinated fashion. Our coordination-based descriptor (CODE) is computed by two main steps. The first step is to identify the most informative joints which characterize the motion. The second step enriches the… ▽ More

    Submitted 20 November, 2019; originally announced November 2019.

  42. arXiv:1911.01989  [pdf, other

    astro-ph.SR astro-ph.GA

    The Formation of Stars -- From Filaments to Cores to Protostars and Protoplanetry Disks

    Authors: James Di Francesco, Helen Kirk, Doug Johnstone, Ralph Pudritz, Shantanu Basu, Sarah Sadavoy, Laura Fissel, Lewis Knee, Mehrnoosh Tahani, Rachel Friesen, Simon Coudé, Erik Rosolowsky, Nienke van der Marel, Michel Fich, Christine Wilson, Chris Matzner, Ruobing Dong, Brenda Matthews, Gerald Schieven

    Abstract: Star formation involves the flow of gas and dust within molecular clouds into protostars and young stellar objects (YSOs) due to gravity. Along the way, these flows are shaped significantly by many other mechanisms, including pressure, turbulent motions, magnetic fields, stellar feedback, jets, and angular momentum. How all these mechanisms interact nonlinearly with each other on various length sc… ▽ More

    Submitted 9 November, 2019; v1 submitted 5 November, 2019; originally announced November 2019.

    Comments: 11 pages, a contributed white paper prepared for Canada's 2020 Long Range Plan decadal process

  43. The Next Generation Very Large Array

    Authors: James Di Francesco, Dean Chalmers, Nolan Denman, Laura Fissel, Rachel Friesen, Bryan Gaensler, Julie Hlavacek-Larrondo, Helen Kirk, Brenda Matthews, Christopher O'Dea, Tim Robishaw, Erik Rosolowsky, Michael Rupen, Sarah Sadavoy, Samar Safi-Harb, Greg Sivakoff, Mehrnoosh Tahani, Nienke van der Marel, Jacob White, Christine Wilson

    Abstract: The next generation Very Large Array (ngVLA) is a transformational radio observatory being designed by the U.S. National Radio Astronomy Observatory (NRAO). It will provide order of magnitude improvements in sensitivity, resolution, and uv coverage over the current Jansky Very Large Array (VLA) at ~1.2-50 GHz and extend the frequency range up to 70-115 GHz. This document is a white paper written b… ▽ More

    Submitted 4 November, 2019; originally announced November 2019.

    Comments: 11 pages; a contributed white paper for Canada's 2020 Long Range Plan decadal process

  44. Development Plans for the Atacama Large Millimeter/submillimeter Array (ALMA)

    Authors: Christine Wilson, Scott Chapman, Ruobing Dong, James di Francesco, Laura Fissel, Doug Johnstone, Helen Kirk, Brenda Matthews, Brian McNamara, Erik Rosolowsky, Michael Rupen, Sarah Sadavoy, Douglas Scott, Nienke van der Marel

    Abstract: (abridged) The Atacama Large Millimeter/submillimeter Array (ALMA) was the top-ranked priority for a new ground-based facility in the 2000 Canadian Long Range Plan. Ten years later, at the time of LRP2010, ALMA construction was well underway, with first science observations anticipated for 2011. In the past 8 years, ALMA has proved itself to be a high-impact, high-demand observatory, with record n… ▽ More

    Submitted 9 October, 2019; originally announced October 2019.

    Comments: White paper E004 submitted to the Canadian Long Range Plan 2020

  45. arXiv:1908.10514  [pdf, other

    astro-ph.GA astro-ph.SR

    KFPA Examinations of Young STellar Object Natal Environments (KEYSTONE): Hierarchical Ammonia Structures in Galactic Giant Molecular Clouds

    Authors: Jared Keown, James Di Francesco, Erik Rosolowsky, Ayushi Singh, Charles Figura, Helen Kirk, L. D. Anderson, Michael Chun-Yuan Chen, Davide Elia, Rachel Friesen, Adam Ginsburg, A. Marston, Stefano Pezzuto, Eugenio Schisano, Sylvain Bontemps, Paola Caselli, Hong-Li Liu, Steven Longmore, Frederique Motte, Philip C. Myers, Stella S. R. Offner, Patricio Sanhueza, Nicola Schneider, Ian Stephens, James Urquhart , et al. (1 additional authors not shown)

    Abstract: We present initial results from the K-band focal plane array Examinations of Young STellar Object Natal Environments (KEYSTONE) survey, a large project on the 100-m Green Bank Telescope mapping ammonia emission across eleven giant molecular clouds at distances of $0.9-3.0$ kpc (Cygnus X North, Cygnus X South, M16, M17, MonR1, MonR2, NGC2264, NGC7538, Rosette, W3, and W48). This data release includ… ▽ More

    Submitted 29 August, 2019; v1 submitted 27 August, 2019; originally announced August 2019.

    Comments: Accepted for publication in ApJ

  46. The Green Bank Ammonia Survey: A Virial Analysis of Gould Belt Clouds in Data Release 1

    Authors: Ronan Kerr, Helen Kirk, James Di Francesco, Jared Keown, Mike Chen, Erik Rosolowsky, Stella S. R. Offner, Rachel Friesen, Jaime E. Pineda, Yancy Shirley, Elena Redaelli, Paola Caselli, Anna Punanova, Youngmin Seo, Felipe Alves, Ana Chacón-Tanarro, Hope How-Huan Chen

    Abstract: We perform a virial analysis of starless dense cores in three nearby star-forming regions: L1688 in Ophiuchus, NGC 1333 in Perseus, and B18 in Taurus. Our analysis takes advantage of comprehensive kinematic information for the dense gas in all of these regions made publicly available through the Green Bank Ammonia Survey Data Release 1, which is used to estimate internal support against collapse. We… ▽ More

    Submitted 8 March, 2019; originally announced March 2019.

    Comments: 35 pages, 8 tables, and 14 figures consisting of 16 .pdf files. Accepted for publication in the Astrophysical Journal

  47. arXiv:1812.00556  [pdf, ps, other

    astro-ph.GA astro-ph.SR

    Catalogue of High Protostellar Surface Density Regions in Nearby Embedded Clusters

    Authors: Juan Li, Philip C. Myers, Helen Kirk, Robert A. Gutermuth, Michael M. Dunham, Riwaj Pokhrel

    Abstract: We analyze high-quality stellar catalogs for 24 young and nearby (within 1 kpc) embedded clusters and present a catalogue of 32 groups which have a high concentration of protostars. The median effective radius of these groups is 0.17 pc. The median protostellar and pre-main sequence star surface densities are 46 M_{\odot} pc^{-2} and 11 M_{\odot} pc^{-2}, respectively. We estimate the age of these… ▽ More

    Submitted 3 December, 2018; originally announced December 2018.

    Comments: 29 pages, 24 figures, to be published in ApJ

  48. Droplets I: Pressure-Dominated Sub-0.1 pc Coherent Structures in L1688 and B18

    Authors: Hope How-Huan Chen, Jaime E. Pineda, Alyssa A. Goodman, Andreas Burkert, Stella S. R. Offner, Rachel K. Friesen, Philip C. Myers, Felipe Alves, Hector G. Arce, Paola Caselli, Ana Chacon-Tanarro, Michael Chun-Yuan Chen, James Di Francesco, Adam Ginsburg, Jared Keown, Helen Kirk, Peter G. Martin, Christopher Matzner, Anna Punanova, Elena Redaelli, Erik Rosolowsky, Samantha Scibelli, Young Min Seo, Yancy Shirley, Ayushi Singh

    Abstract: We present the observation and analysis of newly discovered coherent structures in the L1688 region of Ophiuchus and the B18 region of Taurus. Using data from the Green Bank Ammonia Survey (GAS), we identify regions of high density and near-constant, almost-thermal, velocity dispersion. Eighteen coherent structures are revealed, twelve in L1688 and six in B18, each of which shows a sharp "transiti… ▽ More

    Submitted 15 May, 2019; v1 submitted 26 September, 2018; originally announced September 2018.

    Comments: Accepted by ApJ in April, 2019

    Journal ref: 2019ApJ...877...93C

  49. arXiv:1808.07952  [pdf, ps, other

    astro-ph.GA astro-ph.IM

    The JCMT Gould Belt Survey: SCUBA-2 Data-Reduction Methods and Gaussian Source Recovery Analysis

    Authors: Helen Kirk, Jennifer Hatchell, Doug Johnstone, David Berry, Tim Jenness, Jane Buckle, Steve Mairs, Erik Rosolowsky, James Di Francesco, Sarah Sadavoy, Malcolm Currie, Hannah Broekhoven-Fiene, Joseph C. Mottram, Kate Pattle, Brenda Matthews, Lewis B. G. Knee, Gerald Moriarty-Schieven, Ana Duarte-Cabral, Sam Tisi, Derek Ward-Thompson

    Abstract: The JCMT Gould Belt Survey was one of the first Legacy Surveys with the James Clerk Maxwell Telescope in Hawaii, mapping 47 square degrees of nearby (< 500 pc) molecular clouds in both dust continuum emission at 850 $μ$m and 450 $μ$m, as well as a more-limited area in lines of various CO isotopologues. While molecular clouds and the material that forms stars have structures on many size scales, th… ▽ More

    Submitted 23 August, 2018; originally announced August 2018.

    Comments: Accepted for publication in ApJS

  50. arXiv:1806.01847  [pdf, other

    astro-ph.GA astro-ph.SR

    Dense gas kinematics and a narrow filament in the Orion A OMC1 region using NH3

    Authors: Kristina Monsch, Jaime E. Pineda, Hauyu Baobab Liu, Catherine Zucker, Hope How-Huan Chen, Kate Pattle, Stella S. R. Offner, James Di Francesco, Adam Ginsburg, Barbara Ercolano, Héctor G. Arce, Rachel Friesen, Helen Kirk, Paola Caselli, Alyssa A. Goodman

    Abstract: We present combined observations of the NH3 (J,K) = (1,1) and (2,2) inversion transitions towards OMC1 in Orion A obtained by the Karl G. Jansky Very Large Array (VLA) and the 100 m Robert C. Byrd Green Bank Telescope (GBT). With an angular resolution of 6" (0.01 pc), these observations reveal with unprecedented detail the complex filamentary structure extending north of the active Orion BN/KL reg… ▽ More

    Submitted 18 December, 2018; v1 submitted 5 June, 2018; originally announced June 2018.

    Comments: Accepted for publication in ApJ. The combined data cubes of the NH3 (1,1) and (2,2) transitions as well as the resulting parameter maps are provided as FITS-files on Harvard Dataverse (https://doi.org/10.7910/DVN/QLD7TC); updated references

    Journal ref: ApJ 861, 77 (2018)