Trust and Terror: Hazards in Text Reveal Negatively Biased Credulity and Partisan Negativity Bias

Keith Burghardt1, Daniel M.T. Fessler2,3,4, Chyna Tang2, Anne Pisor5, Kristina Lerman1
Abstract

Socio-linguistic indicators of text, such as emotion or sentiment, are often extracted using neural networks in order to better understand features of social media. One indicator that is often overlooked, however, is the presence of hazards within text. Recent psychological research suggests that statements about hazards are more believable than statements about benefits (a property known as negatively biased credulity), and that political liberals and conservatives differ in how often they share hazards. Here, we develop a new model to detect information concerning hazards, trained on a new collection of annotated X posts, as well as urban legends annotated in previous work. We show that not only does this model perform well (outperforming, e.g., zero-shot human annotator proxies, such as GPT-4) but that the hazard information it extracts is not strongly correlated with other indicators, namely moral outrage, sentiment, emotions, and threat words. (That said, consonant with expectations, hazard information does correlate positively with such emotions as fear, and negatively with emotions like joy.) We then apply this model to three datasets: X posts about COVID-19, X posts about the 2023 Hamas-Israel war, and a new expanded collection of urban legends. From these data, we uncover words associated with hazards unique to each dataset as well as differences in this language between groups of users, such as conservatives and liberals, which informs what these groups perceive as hazards. We further show that information about hazards peaks in frequency after major hazard events, and therefore acts as an automated indicator of such events. Finally, we find that information about hazards is especially prevalent in urban legends, which is consistent with previous work that finds that reports of hazards are more likely to be both believed and transmitted.

Figure 1: Detecting hazards. Data used to train a hazard model are either X posts or a previously annotated set of urban legends (Fessler, Pisor, and Navarrete 2014). Qualtrics surveys are sent to workers on Cloud Research to each annotate ten X posts such that there are typically three (and at least 2) annotators per post. We take the majority decision to annotate the gold standard label for each text. Data are then fed through a multi-lingual text embedding and trained on a support vector machine (Hearst et al. 1998) (alternative and more sophisticated models show poorer performance).

Introduction

Humans evolved to pay attention to negative information that could signal potential dangers or hazards in their environment. Evolutionary psychologists suggest that this “negativity bias” evolved as an adaptive mechanism to enhance survival by promptly detecting and responding to hazards (Öhman and Mineka 2001). Researchers have demonstrated that negative information has a stronger influence on human attention, cognition, and emotion compared to positive information. Negative events, emotions, and feedback tend to have more potent and lasting effects than their positive counterparts (Baumeister et al. 2001; Rozin and Royzman 2001). Additionally, people tend to have greater credulity towards hazard information compared to benefit information, even when the information is equivalent in truth value (Fessler, Pisor, and Navarrete 2014), a phenomenon that has been dubbed “negatively biased credulity.” Recent work has identified an ideological asymmetry, with social conservatives more likely to pay attention to and believe information about hazards than their social liberal counterparts  (Fessler 2019). For example, language highlighting threats has a long history of being used to rally public support for political measures and causes (Choi et al. 2022).

The same psychological mechanisms also operate in the realm of online information sharing. Researchers found that negatively-valenced information, which conveys outrage or sadness, spreads quickly and widely online (Ferrara and Yang 2015; Brady et al. 2017), and is a factor in the spread of misinformation (Vosoughi, Roy, and Aral 2018; Ecker et al. 2022), suggesting that it could be a potent tool for information operations aimed at influencing public opinion online. However, to date, researchers have not used machine learning to analyze the presence of information concerning hazards, which can be a potent indicator of negative information (Choi et al. 2022).

We address this gap by presenting a multi-lingual transformer-based model to recognize information concerning hazards expressed in text messages on X (formerly Twitter) and within hundreds of urban legends. Analysis of X posts allows us to understand how major events affect users’ propensity to use hazard language, and how different users employ such language. Urban legend analysis, meanwhile, allows us to better test for the hypothesized impact of negatively-biased credulity: that negative events are more likely to be believed (Fessler, Pisor, and Navarrete 2014; Donavan, Mowen, and Chakraborty 1999). Urban legends are broadly believed by the sharers (Donavan, Mowen, and Chakraborty 1999), so a high frequency of information concerning hazards would support prior findings. We validate the model on human-annotated data, and show substantially higher performance over baselines.

The model is highly scalable, enabling us to quantify the likelihood that information concerning a hazard (hereafter, for simplicity, “hazards”) is present in millions of posts. We use the model to quantify the amount of hazards present in diverse multi-lingual conversations on X about topics such as the COVID-19 pandemic and the Hamas-Israel war. We show that, while hazards are somewhat correlated with negative emotions like anger and fear, our identification of hazards goes beyond such indices, capturing additional salient information about harms. Further, binning COVID-19 posts by the political orientation of the user, we find, in agreement with expectations (Hibbing, Smith, and Alford 2014; Lilienfeld and Latzman 2014), that social conservatives discuss hazards at a greater rate than do social liberals, even when split across various concerns (Rao et al. 2023). Moreover, these data reveal differences in the language liberals and conservatives associate with hazards – differences that a dictionary approach to identifying hazards would not necessarily capture. When posts about the Hamas-Israel war are split by whether accounts are associated with information operations (coordinated accounts), we see notable differences in how these accounts describe threats (emphasizing harms in Gaza and not in Israel) while also using hazard posts to promote petitions that try to end the war. These results show that identifying hazards can reveal salient features, potentially unmasking information operation tactics. Finally, when we apply these data to urban legends, we find that hazards are very common, which scales up prior work (Fessler, Pisor, and Navarrete 2014), and provides evidence for (albeit not proof of) the postulated aggregate effects of negatively-biased credulity in a population.

To summarize, our contributions are as follows:

  • We curate a new dataset to classify hazards at scale.

  • We develop a new model to extract hazards from both social media posts and urban legends.

  • We apply this model to newly collected online data, and contrast hazards to alternative indicators of affect.

  • We show distinct ways hazards are shared within different datasets, even as their overall prevalence in urban legends confirms the psychological theory of negatively-biased credulity.

Code, data, and annotations are available in the following repository: https://github.com/KeithBurghardt/Hazards.

Overall, these results demonstrate the need for, and utility of, a new tool to detect information concerning hazards. When applied at scale, this tool can help replicate and validate prior findings in psychology, and may prove useful in revealing influence operations.

Related Work

Linguistic indicators in text

Detecting indicators from text has a long history. One of the earliest methods to do this was The General Inquirer (Stone, Dunphy, and Smith 1966), but more recently text indicators have been analyzed with LIWC (Pennebaker, Francis, and Booth 2001). These dictionary-based approaches have been superseded by rule-and-word approaches, such as VADER (Hutto and Gilbert 2014), which have themselves been superseded by text embedding approaches, including SeerNet (Duppada, Jain, and Hiray 2018), a moral outrage classifier (Brady et al. 2021), and DeepMoji (Felbo et al. 2017). The advantages of these approaches are that embeddings can learn how semantically similar content has similar indicator values (e.g., “I am sad” and “This day was awful” would be viewed as sad statements even when the two sentences do not have any words in common). These approaches, however, were based on GRUs (Dey and Salem 2017) or LSTMs (Hochreiter and Schmidhuber 1997), while more recently transformer-based models (Vaswani et al. 2017), such as BERT (Devlin et al. 2018) or Sentence-BERT (Reimers and Gurevych 2019), have become popular for uses such as detecting hate speech (Davani, Díaz, and Prabhakaran 2022), or emotions (Acheampong, Nunoo-Mensah, and Chen 2021). Unlike prior approaches, transformer models account for context. Finally, these methods have then been improved upon with SpanEmo (Alhuzali and Ananiadou 2021) or Demux (Chochlakis et al. 2023), that account for correlations between labels, while recent research has experimented with Large Language Models (LLMs) to similarly detect emotion (Peng et al. 2024).

In contrast to these approaches, we develop a transformer-based tool to extract hazards from social media and urban legends, a previously under-studied indicator. We apply a range of models to Sentence-BERT embeddings and compare these against LLMs (GPT-3.5 and GPT-4 (Achiam et al. 2023)) to detect hazards in text.

Negatively biased credulity

One key use of a hazard detection model is to assess negatively biased credulity and negativity bias. Negatively-biased credulity is the idea that, over the course of human history, the costs of erroneously believing false information about danger will, on average, have been less than the costs of erroneously rejecting true information about danger, and, as a consequence, information concerning hazards is viewed as more believable than many other types of information (Fessler, Pisor, and Navarrete 2014; Fessler, Pisor, and Holbrook 2017b; Fessler 2019; Samore et al. 2018; Forgas 2019). This builds on earlier investigations of the role of credulity in recipients’ susceptibility to manipulation (Little 2017; Kartik, Ottaviani, and Squintani 2007). Likewise, explorations of negatively-biased credulity partially overlap with previous work examining how negative content (including content unrelated to hazards) is more likely to be shared (Martel, Pennycook, and Rand 2020; Youngblood et al. 2023; Ferrara and Yang 2015). For example, moral-emotional language spreads most for partisan discussions (Brady et al. 2017), negative sentiment posts spread faster (Ferrara and Yang 2015), and moral outrage makes posts more viral (Brady et al. 2021). In contrast to this prior research, however, not all negative information confers similar survival advantage. Specifically, recognizing threats swiftly (Öhman and Mineka 2001) is more critical for survival than recognizing sadness or anger. While negative emotions may reduce belief in a false claim (Phillips et al. 2024), discussion of hazards seems to increase it (Fessler 2019). As a metric of content, hazards are therefore distinct from emotions or other features of language.

Credulity toward hazard information may differ by ideology as well. As we mention in the Introduction, belief in a dangerous world differs by political ideology (Clifton and Kerry ; Holbrook et al. 2022; Federico, Hunt, and Ergun 2009), consistent with other differences across ideologies (Rao, Morstatter, and Lerman 2022). Social conservatives also display greater negativity bias (Hibbing, Smith, and Alford 2014), and pay more attention to threats (Lilienfeld and Latzman 2014) than do social liberals. Previous work in psychology indicates that hazard content is treated with greater credulity by social conservatives than by social liberals (Samore et al. 2018; Fessler, Pisor, and Holbrook 2017a).

We expand on previous research about negativity bias and negatively biased credulity by analyzing hazards at scale in social media posts and urban legends. The large datasets provide enough statistical power for us to assess how hazards vary after major events, how they relate to other indicators, and how diverse groups discuss hazards. This allows us to replicate at scale research on ideological differences in negativity, and evidence of negatively biased credulity in urban legends. Moreover, we can develop new hypotheses about how hazards are harnessed when attempting to influence social media users.

Research Methods

All data collected and analyzed were considered exempt by the lead author’s institutional review board, where all modeling, data collection, and data analysis was conducted. All annotations were non-human subject research. In addition, all data were anonymized prior to analysis or annotation to minimize privacy risks.

Hazards Benchmark: Data and Model

We curate a ground truth dataset for training models to recognize hazards. This process involves collecting and annotating posts for the presence of hazards. These ground truth data are then used to train a language model to classify hazard posts. The methods used to collect data, annotate data, and train hazard detection models are shown in Fig. 1.

Ground truth data

To create the benchmark X post dataset, we first extract 1,338 X posts containing at least one word from the Threat Dictionary (Choi et al. 2022), shown in Fig. 1, left panel. Data are collected via X’s Academic API between March 2006 (when X was founded) and late 2022 in order to have a representative sample of data. We then randomize the order of these posts and recruit Cloud Research annotators to label these posts for hazards via a Qualtrics survey. For each post, workers answer, “Does the tweet describe a hazard (something that could impose harm or other costs on the author of the tweet or on others)?”. Workers are paid $2 for each assignment, in which they annotate 10 random posts (on average this took 11 minutes to complete, meaning workers were paid $12 an hour). To account for workers who do not meaningfully complete the assignment, we add an easy question mid-way through the survey and remove any workers who did not complete it. We also remove annotations that are unfinished or take less than 200 seconds to complete (this was an arbitrary cutoff to better ensure that the annotations were not completed carelessly). As all data are annotated before 2023, we believe the prevalence of workers annotating with LLMs (Veselovsky, Ribeiro, and West 2023) was minimal. All posts annotated by fewer than two crowd workers are also discarded, resulting in a dataset with 1,131 posts for training, validation, and testing. In addition, we use Python’s demoji library (https://pypi.org/project/demoji/) to convert all emojis to words in order to reduce artefacts in the embeddings. These collected data all follow the FAIR principles: Findable (these data are directly available via the repository link at the end of the Introduction section, and contain a unique identifier, with all metadata described), Accessible (the link is accessible to anyone), Interoperable (metadata use a formal and broadly applicable language), and Reusable (all data are described in detail). In the repository, we also include a datasheet for the dataset (Gebru et al. 2021).
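
The gold-standard labeling rule can be illustrated with a minimal sketch. It assumes each post carries a list of yes/no answers, one per annotator; the function names and example post identifiers are hypothetical. The text does not say how exact ties (possible with two annotators) were resolved, so the sketch simply drops them, which is an assumption.

```python
from collections import Counter
from typing import Dict, List, Optional

MIN_ANNOTATORS = 2  # from the text: posts with fewer than two annotators are discarded


def gold_label(votes: List[bool]) -> Optional[bool]:
    """Return the majority label, or None if the post should be discarded.

    Assumption: an exact tie is returned as None, because the text does
    not state how ties were resolved.
    """
    if len(votes) < MIN_ANNOTATORS:
        return None
    counts = Counter(votes)
    if counts[True] == counts[False]:
        return None
    return counts[True] > counts[False]


def build_gold_standard(annotations: Dict[str, List[bool]]) -> Dict[str, bool]:
    """Map post id -> majority label, dropping posts without a usable label."""
    gold = {}
    for post_id, votes in annotations.items():
        label = gold_label(votes)
        if label is not None:
            gold[post_id] = label
    return gold


# Hypothetical annotations: post id -> one yes/no answer per annotator.
example = {
    "post_a": [True, True, False],  # majority says hazard
    "post_b": [False, False],       # both annotators say not a hazard
    "post_c": [True],               # too few annotators, discarded
}
gold = build_gold_standard(example)
```

With the typical three annotators per post a tie cannot occur, so the tie rule only matters for posts that ended up with exactly two valid annotations.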

We also use data from (Fessler, Pisor, and Navarrete 2014) (from the website SNOPES) to train a model to detect hazards in urban legends. While these data contain 278 legends in total, there are only 221 unique legend types. For example, SNOPES lists three different examples of the story “Bill Gates Wants to Share His Fortune With You”. We keep the first example listed in “Urban Legend Data_ 2022 Snopes, Encyclopedia & 2014.xlsx” located within our data repository.

Model training

We use a 90% training-10% testing split for X posts. If validation is needed (such as for early stopping) we split these data into 81% training, 9% validation, and 10% testing. We split urban legends data into 86% (190 legends) for training and 14% (31 legends) for testing. When validation is needed, these data are split 76%, 10%, and 14% for training, validation, and testing respectively. We randomize these splits 1000 times to come up with a mean and variance in our performance metrics.
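
The repeated random splitting can be sketched as follows. This is an illustration under stated assumptions rather than the code used for the paper: split sizes are obtained by rounding the stated fractions, and `placeholder_score` is a hypothetical stand-in for fitting a model on the training indices and scoring it on the test indices.

```python
import random
from statistics import mean, variance
from typing import Callable, List, Tuple

Split = Tuple[List[int], List[int], List[int]]


def split_indices(n: int, test_frac: float, val_frac: float,
                  rng: random.Random) -> Split:
    """Shuffle range(n) and cut it into train / validation / test index lists."""
    idx = list(range(n))
    rng.shuffle(idx)
    n_test = round(n * test_frac)
    n_val = round(n * val_frac)
    test = idx[:n_test]
    val = idx[n_test:n_test + n_val]
    train = idx[n_test + n_val:]
    return train, val, test


def repeated_evaluation(n: int, test_frac: float, val_frac: float,
                        score_fn: Callable[[List[int], List[int], List[int]], float],
                        repeats: int = 1000, seed: int = 0) -> Tuple[float, float]:
    """Mean and sample variance of score_fn over `repeats` random splits."""
    rng = random.Random(seed)
    scores = [score_fn(*split_indices(n, test_frac, val_frac, rng))
              for _ in range(repeats)]
    return mean(scores), variance(scores)


def placeholder_score(train: List[int], val: List[int], test: List[int]) -> float:
    # Hypothetical stand-in: a real run would fit on `train` (using `val` for
    # early stopping if needed) and return the ROC-AUC measured on `test`.
    return len(test) / (len(train) + len(val) + len(test))


# 221 unique legends with a 14% test share, as in the text.
legend_train, legend_val, legend_test = split_indices(221, 0.14, 0.0, random.Random(0))
avg, var = repeated_evaluation(221, 0.14, 0.0, placeholder_score, repeats=1000)
```

Under the rounding assumption, 221 legends with a 14% test share give the 190/31 split quoted above.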

We use the stsb-xlm-r-multilingual sentence-transformers model from huggingface (https://huggingface.co/sentence-transformers/stsb-xlm-r-multilingual) to embed all text, shown in Fig. 1, right panel. This model is used to ensure multi-lingual data common in social media text can be embedded. We then apply several supervised models to these embeddings, including XGBoost (Chen and Guestrin 2016) (via https://xgboost.readthedocs.io/), neural networks (Abadi et al. 2015), random forest (Ho 1995), and support vector machines (SVMs) (Hearst et al. 1998); the latter two are trained via scikit-learn (Pedregosa et al. 2011). While XGBoost, random forest, and SVM are trained with default hyperparameters, the neural network contains 4 fully-connected layers, with 256 nodes, 85 nodes, 28 nodes, and 1 node, respectively, with ReLU activation and dropout layers. Alternative models performed similarly. We also experimented with augmenting the human-annotated posts with 5K GPT-3.5-annotated posts that contained words from the Threat Dictionary (Choi et al. 2022), and 5K posts collected at random (containing popular English keywords, namely any of the top 100 lemmas within a large corpus (https://www.wordfrequency.info/samples.asp) that X does not consider stop words). This augmentation did not significantly change the performance of the model (which we show in the Results section). All models are trained on an NVIDIA Tesla K80 GPU with 12GB of VRAM; we predict text on a GeForce RTX 2080 GPU with 8GB VRAM. As a comparison, we also use GPT-3.5 and GPT-4 (Achiam et al. 2023) with chain-of-thought prompting (Wei et al. 2022) via the prompt, “Does the tweet [story] describe a hazard (something that could impose harm or other costs on the author of the tweet or on others)? Please answer ‘yes’ or ‘no’ and explain your thought process.” All text for which the response contains “yes” is labeled “hazard” and all other text is labeled “not hazard” (therefore rare ambiguous situations where the LLMs do not know if a hazard exists are labeled “not hazard”). We validate this technique on 100 posts at random and 100 legends at random, and we find this technique extracts every post that the GPT models consider a hazard.
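
The rule for turning a free-text LLM answer into a binary label can be sketched as below. The text only states that responses with “yes” are labeled “hazard”; matching “yes” as a case-insensitive whole word is an assumption, and the function name is hypothetical.

```python
import re

# Assumption: "yes" is matched as a whole word, ignoring case, so that words
# such as "eyes" do not trigger a positive label.
_YES = re.compile(r"\byes\b", flags=re.IGNORECASE)


def label_from_response(response: str) -> str:
    """Label an LLM answer as 'hazard' if it contains 'yes', else 'not hazard'."""
    return "hazard" if _YES.search(response) else "not hazard"


# Hypothetical responses for illustration.
examples = [
    "Yes. The post warns readers about contaminated water.",
    "No, the post is a birthday greeting and mentions no harm.",
    "It is unclear whether any harm is implied.",
]
labels = [label_from_response(r) for r in examples]
```

A rule this simple would mislabel an answer such as “No, I would not say yes”, which is one reason a manual spot check of the kind described above is useful.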

Comparison to alternative text indicators

We compare our hazard detection model to alternative text indicators of affect: moral outrage (Brady et al. 2021), sentiment (VADER) (Hutto and Gilbert 2014), emotion detection (Demux) (Chochlakis et al. 2023), and threat words (Choi et al. 2022). These are state-of-the-art methods to detect each indicator. The code for these methods is released under an MIT license (VADER and Demux) and a Creative Commons Attribution-NonCommercial-ShareAlike 2.0 license (moral outrage), and was adapted as needed to run on a GeForce RTX 2080 GPU with 8GB VRAM. All model outputs we analyze are continuous (such as confidence values for emotions) with the exception of the Threat Dictionary (Choi et al. 2022), for which we indicate whether a threat word is (1) or is not (0) present in a post or legend.
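
The binary threat-word indicator can be sketched as follows. The word list is a placeholder for illustration only and is not the actual Threat Dictionary; the lowercase regular-expression tokenizer is likewise an assumption.

```python
import re
from typing import Set

# Placeholder vocabulary for illustration only; NOT the actual Threat Dictionary.
EXAMPLE_THREAT_WORDS: Set[str] = {"danger", "attack", "threat"}

# Assumption: tokens are runs of lowercase letters and apostrophes.
_TOKEN = re.compile(r"[a-z']+")


def has_threat_word(text: str, vocabulary: Set[str]) -> int:
    """Return 1 if any token of `text` is in `vocabulary`, else 0."""
    tokens = _TOKEN.findall(text.lower())
    return int(any(token in vocabulary for token in tokens))


indicator = [
    has_threat_word("Danger ahead, stay away!", EXAMPLE_THREAT_WORDS),
    has_threat_word("What a lovely day", EXAMPLE_THREAT_WORDS),
]
```

Because matching is done on whole tokens, a word such as “endangered” does not count as containing “danger” in this sketch.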

X Data

Hamas-Israel War

Our analysis uses a corpus of 3.6M posts about the 2023 Hamas attack on Israel and Israel’s subsequent invasion of Gaza from 1.3M accounts. The posts were collected by querying X with a set of keywords related to the war: e.g., “Israel”, “Hamas”, “Gaza”, etc. These data are multilingual, with approximately 93% of posts in English, 6.5% in Arabic, and a small proportion in other languages. We further split by a common metric of coordination (Burghardt et al. 2023; Luceri et al. 2023), in which accounts are coordinated if they post near duplicate sequences of hashtags (which is strongly associated with near-duplicate messages). This extracted 9.4K inauthentic accounts that posted 61K posts. We share the coordinated network in the repository, but, because X’s terms of service do not allow us to share post text, and because we have removed all text IDs and user IDs, we do not share additional data. Because these data are public and anonymized, we did not need to obtain consent from users to extract these data. We show the frequency of posts over time for coordinated and non-coordinated accounts in Supplementary Materials Fig 13.
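
The idea behind the coordination metric, linking accounts that post near-duplicate hashtag sequences, can be sketched in a simplified form. The cited works define their own similarity measure and thresholds; the Jaccard overlap and the 0.8 threshold used here are assumptions chosen only for illustration, as are the account names and posts.

```python
import re
from itertools import combinations
from typing import Dict, List, Set, Tuple

_HASHTAG = re.compile(r"#\w+")


def hashtag_sequence(post: str) -> Tuple[str, ...]:
    """Ordered, lower-cased hashtags of one post."""
    return tuple(tag.lower() for tag in _HASHTAG.findall(post))


def account_signature(posts: List[str]) -> Set[Tuple[str, ...]]:
    """Set of non-empty hashtag sequences that an account has posted."""
    return {seq for seq in map(hashtag_sequence, posts) if seq}


def jaccard(a: Set, b: Set) -> float:
    """Size of the intersection divided by size of the union (0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def coordinated_pairs(accounts: Dict[str, List[str]],
                      threshold: float = 0.8) -> List[Tuple[str, str]]:
    """Pairs of accounts whose hashtag-sequence sets overlap at or above threshold."""
    signatures = {name: account_signature(posts) for name, posts in accounts.items()}
    return [(u, v) for u, v in combinations(sorted(signatures), 2)
            if jaccard(signatures[u], signatures[v]) >= threshold]


# Hypothetical accounts and posts.
accounts = {
    "acct_1": ["Sign now #ceasefire #petition", "News #gaza #ceasefire"],
    "acct_2": ["Please share #Ceasefire #Petition", "Update #gaza #ceasefire"],
    "acct_3": ["Thoughts on today #news"],
}
pairs = coordinated_pairs(accounts)
```

The pairwise comparison is quadratic in the number of accounts, so a corpus with 1.3M accounts would need an indexed or approximate similarity search rather than this direct loop.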

COVID-19

We use a public corpus of posts about the COVID-19 pandemic (Chen, Lerman, and Ferrara 2020) posted between January 21, 2020 and December 31, 2020. These posts include one or more terms related to the pandemic, such as “coronavirus”, “pandemic”, and “outbreak”, among others. We use Carmen (Dredze et al. 2013), a geo-location tool for X data, to link posts to locations within the United States. Carmen relies on metadata in posts, such as “place” and “coordinates” objects that encode location information, as well as mentions of locations in a user’s X bio, to infer their location. This approach is applied to filter out users whose home location is not one of the 50 U.S. states. We take a random 1% subsample to reduce the dataset size. This process yields 12.7M posts from 5.7M users. We show the post frequency in Supplementary Materials Fig. 12. Data are then split by keywords related to various concerns created by Rao et al. (2023): education, healthcare, lockdowns, masking, origins, therapeutics, and vaccines. To compare hazards against other text indicators, we took a complete set of posts shared early in the dataset, from January 22 at 12 a.m. GMT until January 26, 21:00:40 GMT. This choice allows the topics in the posts to be more coherent (and not change in time), allowing for a more consistent comparison of what each indicator captures (e.g., there might be a lower correlation if each indicator picked up something different about each topic shared in 2020). We do not believe this choice affects the results significantly given the similarity of correlations with other datasets, shown in the Results section.

Urban Legends

We collect three sets of urban legends: Snopes.com data from 2014 (Fessler, Pisor, and Navarrete 2014), a new Snopes.com dataset collection, and legends from the Encyclopedia of Urban Legends (Brunvand 2002). To collect new Snopes.com data, we had several undergraduate research assistants scrape a summary and example text of all stories under the “fact check” category of Snopes.com as of mid-2022. This category includes several types of legends, such as frauds and scams, history legends, sports legends, and “old wives’ tales”. In total, we find 996 legends. We similarly extract every urban legend described in the Encyclopedia of Urban Legends (255 in total). Because urban legends can be long (up to 1,600 words) we use GPT-3.5 from ca. April 2023 to summarize each urban legend. This keeps the text below 118 words, well within the 128-token limit for the sentence embedding model that we employ. We use these summaries to train our models and then detect hazards. Potential artefacts with this approach include hallucinations, but we generally found that these summaries faithfully represented the longer text. Like the annotated data, these collected urban legends all follow the FAIR principles. In the repository, we also include a datasheet for the dataset (Gebru et al. 2021).

Figure 2: Performance of competing models. (a) Performance of models on human-annotated X posts. We show the ROC-AUC of XGBoost (XGB) (Chen and Guestrin 2016), a neural network (NN) (Abadi et al. 2015), random forest (RF) (Ho 1995), and a support vector machine (SVM) (Hearst et al. 1998) trained on human-annotated posts. We also show the ROC-AUC of a neural network pre-trained on 10K LLM-annotated posts and then trained on human-annotated posts (NN-LLM-pre); GPT-3.5 and GPT-4, meanwhile, are zero-shot predictions. (b) Model performance on urban legends. We show ROC-AUC for the same types of models, except “[Model]-X-only” represents models only trained on human-annotated X posts, while all other supervised models are trained exclusively on urban legends. GPT-3.5 and GPT-4 are zero-shot predictions, as before.

Results

Hazards Model and Validation

Performance of the hazards detection model on benchmark data is shown in Fig. 2. Despite the simplicity of the SVM model, its performance exceeds all others, with an area under the receiver operating characteristic curve (ROC-AUC) of 0.74 (Fig. 2a). This dataset uses posts that contain words from the Threat Dictionary (Choi et al. 2022), making the dataset especially challenging as, despite the posts containing threat words, only a subset show hazards. Notably, this implies that simple lexical models, like those based on the Threat Dictionary, are a poorer tool, as the Threat Dictionary would have an ROC-AUC of exactly 0.5 (all posts would contain at least one threat word). LLMs, namely GPT-3.5 and GPT-4, also perform markedly worse than supervised models.
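
The claim that a constant indicator yields an ROC-AUC of exactly 0.5 can be checked with a short sketch. It uses the pairwise (Mann-Whitney) formulation of ROC-AUC, in which ties between a positive and a negative example count as one half; the labels below are hypothetical.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the fraction of (positive, negative) pairs in which the
    positive example has the higher score, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical gold labels for five posts that all contain a threat word,
# so the dictionary indicator is 1 for every post.
gold = [1, 0, 1, 0, 0]
threat_indicator = [1, 1, 1, 1, 1]
auc_constant = roc_auc(gold, threat_indicator)
```

Every positive-negative pair is tied when the score is constant, so each pair contributes one half and the result is 0.5 regardless of how many posts actually describe hazards.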

We separately trained this model on urban legends in Fig. 2b. If the model is only trained on X posts, we see the models struggle to generalize, thus motivating a separate set of models trained on urban legends. The best model is the SVM model again, with an ROC-AUC of 0.83. In contrast to the X-only supervised models, GPT-4’s performance is comparable to that of the urban legend-trained models, and far exceeds that of the models trained on X posts. GPT-4 therefore generalizes remarkably well in the X post and urban legend domains, although it does not exceed the performance of models trained on each dataset. Other studies have also noted that LLMs’ versatility comes at the expense of performance (Chochlakis et al. 2024).

Figure 3: Spearman correlation between hazards and moral outrage (Brady et al. 2021), sentiment (Hutto and Gilbert 2014), and emotions (Chochlakis et al. 2023). One indicator, the “love” emotion in urban legends, is not statistically significantly correlated with hazards (p-value > 0.05).

Hazards in Real-world Data

Next, we show in Fig. 3 how hazard confidences compare to other linguistic indicators of affect in three datasets: X posts on COVID-19, X posts on the Hamas-Israel war, and urban legends. We then show the words most often seen in high-hazard posts or legends in each dataset, clarifying how hazards are typically described. Focusing on X posts, we show the ideological asymmetry in how hazards are discussed, as well as the asymmetry in how coordinated and authentic accounts share hazards. While the former tests the greater negativity bias of conservatives compared to liberals (Hibbing, Smith, and Alford 2014), the latter offers insight into information operation tactics meant to drive authentic user behavior. Finally, we show the prevalence of hazards in urban legends, which offers further evidence, although not proof, of the postulated aggregate effects of negatively biased credulity (Fessler, Pisor, and Navarrete 2014).

Linguistic Analysis of Hazards

Figure 3 shows post- and story-level correlations between hazard confidences and other text indicators of affect expressed by the post author, including moral outrage (Brady et al. 2021), sentiment (Hutto and Gilbert 2014), emotion confidence (Chochlakis et al. 2023), and threat words (Choi et al. 2022). In all cases, the absolute values of the Spearman correlations are at or below 0.4, suggesting that alternative indicators do not fully capture information about hazards in text. The positive or negative direction of the correlations, however, makes intuitive sense. For example, moral outrage is weakly positively correlated with hazards, perhaps because people share their moral outrage at some hazards, such as those that harm innocent people. Similarly, negative sentiment, as well as most negative emotions and posts containing Threat Dictionary words, show positive correlations with hazards. This is consistent with hazard posts being given a negative framing (cf. example hazard annotations and predictions in Supplementary Materials Table 1).
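
For reference, the Spearman correlation used here is the Pearson correlation of ranks, with tied values sharing their average rank. Ties matter in this setting because the threat-word indicator is binary. The sketch below is a plain-Python illustration with hypothetical inputs, not the analysis code.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant input")
    return cov / sqrt(vx * vy)


# Hypothetical hazard confidences and binary threat-word indicators.
hazard_confidence = [0.9, 0.8, 0.2, 0.1]
threat_word_present = [1, 1, 0, 0]
rho = spearman(hazard_confidence, threat_word_present)
```

The `ValueError` branch reflects the same degenerate case noted for the benchmark data: an indicator that is constant across all posts has no defined rank correlation with anything.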

Figure 4: Word frequency from highest 10% hazard confidence posts and bottom 10% hazard confidence posts for (a) COVID-19 (Chen, Lerman, and Ferrara 2020), (b) the Hamas-Israel war, and (c) urban legends.
Figure 5: High and low-hazard words associated with liberals and conservatives in the COVID-19 dataset.
Figure 6: Hazard confidence for different COVID-19 concerns, split by conservative and liberal accounts. Differences are all significant. Error bars are too small to see, and are therefore removed for clarity.
Figure 7: Hazards over time among COVID-19 posts (Chen, Lerman, and Ferrara 2020), split by liberal and conservative, as well as by type of post. Also shown is the proportion of posts with threat words from the Threat Dictionary (Choi et al. 2022).

To understand what language is associated with hazards, we plot relative word frequency of high-hazard posts (in the top 10% of hazard confidences) and low-hazard posts (in the bottom 10%) in Fig. 4. In Fig. 4a we see that the high-hazard posts about COVID-19 are often associated with dangers (e.g., “sick”, “killing”, “danger”) compared to low-hazard (“excited” or “stayhome”). Similarly, in Fig. 4b, the high-hazard words relate to “weapons”, “crimes”, “abuse”, while low-hazard words include “neutral”. Finally urban legends discuss “dead” or “car” (urban legends often involve dangerous scenarios related to cars), versus “2020” or “virus”. In all cases, we see hazard words are similar to threat words seen in (Choi et al. 2022), yet some words could not easily be captured with the Threat Dictionary, such as “vaccines” or “cdc”. Within the Hamas-Israel war dataset (Fig. 4b), we also notice “civilians” being mentioned, due to posts on the risk of harm to civilians. We show in Fig. 4c, meanwhile, the prevalence of high- and low-confidence hazard words (top 20% and bottom 20% hazard confidence due to the smaller dataset) for the three urban legend datasets combined.
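
The comparison of high- and low-hazard vocabulary can be sketched as follows. The text does not give the exact definition of relative word frequency, so the version below (word count divided by the total word count in the group) and the simple tokenizer are assumptions, and the example posts and confidences are hypothetical.

```python
import re
from collections import Counter
from typing import Dict, List, Sequence, Tuple

# Assumption: words are runs of lowercase letters, digits, and apostrophes.
_WORD = re.compile(r"[a-z0-9']+")


def split_by_confidence(posts: Sequence[str], confidences: Sequence[float],
                        frac: float = 0.10) -> Tuple[List[str], List[str]]:
    """Return (high, low): the top and bottom `frac` of posts by confidence."""
    ranked = [post for _, post in sorted(zip(confidences, posts), key=lambda t: t[0])]
    k = max(1, int(len(ranked) * frac))
    return ranked[-k:], ranked[:k]


def relative_frequencies(posts: Sequence[str]) -> Dict[str, float]:
    """Share of all word occurrences in `posts` accounted for by each word."""
    counts = Counter(w for post in posts for w in _WORD.findall(post.lower()))
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


# Hypothetical posts with hypothetical model confidences.
posts = [
    "this virus is killing people",
    "excited to stay home",
    "danger danger stay away",
    "lovely quiet day",
]
confidences = [0.9, 0.1, 0.8, 0.2]

# frac=0.5 only because the toy example has four posts; the text uses 10%
# for X posts and 20% for the smaller urban legend data.
high, low = split_by_confidence(posts, confidences, frac=0.5)
high_freq = relative_frequencies(high)
low_freq = relative_frequencies(low)
```

Words that are frequent in `high_freq` but rare or absent in `low_freq` are the candidates for hazard-associated language of the kind plotted in Fig. 4.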

Partisan Asymmetry in Hazard Discussions

The reason we see “vaccines” or “cdc” as words associated with hazards is better illustrated in Fig. 5, where we plot words associated with high-confidence and low-confidence hazard posts disaggregated by ideology, as identified using the technique developed in (Rao et al. 2023). We notice that, consistent with commonsensical observation of public discourse at the time, conservatives and liberals each discuss vaccines in hazard posts, but, when reviewing the posts, conservatives often discuss them in the context of vaccines being harmful, while liberals discuss them in the context of protecting from a harmful virus. We also see protests associated with high-confidence hazard posts among conservative users, possibly because conservative users associate the protests that occurred in 2020 with hazards (e.g., looting). We also find that the mean hazard confidences are slightly higher for conservatives even when data are split by individual concerns, as shown in Fig. 6. The only exception is for COVID-19 origins. To better understand ideological differences in hazards discussions, we plot mean hazard confidence over time among conservative and liberal users in Fig. 7. This plot shows that hazard prediction confidence is higher for conservatives almost every day in which posts are collected. Thus, overall we see conservatives discuss hazards at a greater rate, consistent with conservatives having greater negativity bias (Hibbing, Smith, and Alford 2014; Samore et al. 2018; Fessler, Pisor, and Holbrook 2017a).

That being said, we notice that both liberals and conservatives show similar shifts in hazard confidence over time. In all cases, hazards appear to be highest at the very beginning of the COVID-19 dataset, shown in Fig. 7, with a rapid decline starting in March. When we calculate mean daily hazard level versus CDC weekly case count (https://covid19.who.int/WHO-COVID-19-global-data.csv) the Spearman correlation is -0.65 (p-value < 0.001), and the correlation with death count is -0.49 (p-value < 0.001), indicating, counter-intuitively, an increasingly muted response to the COVID-19 hazard, which is also seen with negative emotions (Guo et al. 2022). The authors of (Guo et al. 2022) suggest that this may be due to a reduction in uncertainty. We also notice a strong spike in hazards just after the George Floyd killing, consistent with a greater discussion about police brutality (by liberals) and property destruction by protesters (by conservatives). Finally, we see a notable increase in hazard confidence before, and a sharp decline after, the U.S. presidential election. Due to the sharp partisan divide in the U.S., we presume that both parties were anticipating some harm associated with the election depending on which candidate won. Once again, the reduction in hazard language appears to be associated with reduction in uncertainty about the outcome. Complementing these results, we show the proportion of posts with threat words (Choi et al. 2022). We show that results are similar, although notably, threat posts increase in September, and do not rebound after the election (both in contrast to hazard posts).
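To make the correlation analysis concrete, the sketch below averages hazard confidences per day and computes a Spearman correlation against an external count series. It is an illustration only: the function names and toy numbers are hypothetical, the text does not specify how daily hazard levels were aligned with weekly counts, and no p-value is computed here (in practice a library routine such as `scipy.stats.spearmanr` returns both the coefficient and a p-value).

```python
from collections import defaultdict


def daily_means(records):
    """Average hazard confidence per day from (day, confidence) pairs."""
    totals = defaultdict(lambda: [0.0, 0])
    for day, conf in records:
        totals[day][0] += conf
        totals[day][1] += 1
    return {day: s / n for day, (s, n) in totals.items()}


def average_ranks(values):
    """1-based ranks; tied values receive the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy usage: made-up posts and made-up counts for the same days.
toy_records = [("d1", 0.8), ("d1", 0.6), ("d2", 0.5), ("d3", 0.2)]
toy_counts = {"d1": 100, "d2": 250, "d3": 900}
means = daily_means(toy_records)
days = sorted(means)
rho = spearman([means[d] for d in days], [toy_counts[d] for d in days])
```

A negative `rho`, as in the toy data, corresponds to the pattern reported in the text: hazard language falling while case counts rise.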

Figure 8: Words associated with high-confidence and low-confidence hazard posts among (a) coordinated and (b) non-coordinated accounts within the Hamas-Israel war dataset.
Figure 9: Hazards and threats over time for the Hamas-Israel war dataset. The plots show mean hazard confidences each day as well as the overall mean proportion of posts with at least one word from the Threat Dictionary (Choi et al. 2022) for (a) non-coordinated authentic accounts and (b) inauthentic coordinated accounts. Vertical lines correspond to the Hamas attack on October 7th, 2023, and the Israel Defence Force (IDF) entering Gaza on October 27th.
Figure 10: Top URLs shared in coordinated accounts for (a) low-confidence hazard posts and (b) high-confidence hazard posts.

Hazards in Information Operations

Next, we explore how hazards are used within coordinated influence campaigns, specifically how they differ within authentic and inauthentic posts in the Hamas-Israel war dataset. To identify potentially inauthentic (coordinated) accounts, we use a hashtag-based heuristic that was shown to be highly accurate in detecting coordinated influence campaigns on X (Burghardt et al. 2023; Luceri et al. 2023). We find in Fig. 8 that likely authentic accounts emphasize the association between hazards and children, civilians, and terror (reflecting hazards surrounding the October 7th Hamas attack on Israel). In contrast, coordinated accounts (Fig. 8b) do not appear to associate the terrorist attack with a hazard, but instead associate hazards with bombs and civilians (reflecting the Israel ground war in Gaza). We plot hazard confidence in posts over time in Fig. 9. Just after the October 7th attack, there is a sudden increase in mean hazard confidence among authentic accounts as they discuss the attack on Israeli civilians. There is no such increase in hazard discussions among coordinated accounts. There is, however, a spike in the number of posts by coordinated accounts, shown in Supplementary Materials Fig. 13, which is why hazard confidence for coordinated account posts becomes much less noisy after the attack. We also see only a minor (or no) change in the proportion of posts containing threat words (Choi et al. 2022) after the October 7th attack. This is reflected in the Spearman correlation between daily mean hazard confidence and daily proportion of posts containing threat words, which is only 0.25 (p-value = 0.047) for coordinated accounts and 0.40 (p-value = 0.001) for authentic accounts. Hazards can therefore be a distinct, and in some ways better, indicator of major hazardous events than the previous state-of-the-art metrics.
In a separate analysis, coordinated account posts seem to show elation (promotion of Hamas, Supplementary Materials Fig. 14, and positive emotions, Supplementary Materials Figs. 15 and 16). The insignificant change in hazards just after October 7th therefore occurs despite the accounts mentioning the attacks, rather than because the accounts ignore the event.

We do see, however, a steady increase in hazard discussions among coordinated accounts as Israel prepared to enter Gaza on October 27th. Reviewing posts, this appears to be because coordinated accounts start to discuss harm to Palestinian civilians. This is further corroborated in Fig. 10, where we compare the URLs shared in high and low-confidence hazard posts from coordinated accounts. The second most popular domain shared within hazard posts is secure.avaaz.org. These posts promote a petition to end the invasion of Gaza, such as “My heart is heavy with the horror in #Palestine & #Israel. There’s a way to help protect the #Gaza kids facing bombardment & free the Israeli kids hostage and Palestinian kids in prison – but we need massive pressure. Join the call #ShieldTheChildren”. These accounts appear to use hazards as a way to promote petitions meant to pressure world leaders to end the war.

Urban Legends

Fig. 11 shows the prevalence of hazards among urban legends in the three separate urban legend datasets from SNOPES and the Urban Legends Encyclopedia (Brunvand 2002). In all cases, hazard confidence is surprisingly high compared to the X posts shown in Figs. 7 and 9. In contrast, only 30% of these urban legends contain at least one word from the Threat Dictionary. Urban legends are often shared because they were believed by the speaker or audience (Donavan, Mowen, and Chakraborty 1999). For this reason, the greater tendency to discuss hazards in an urban legend (e.g., razor blades in an apple during Halloween) is consistent with hazards increasing the believability of stories (Holbrook et al. 2022; Fessler, Pisor, and Navarrete 2014; Fessler 2019).
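The dictionary-based baseline referred to above (the share of texts containing at least one dictionary word) can be sketched as follows. The word list here is a made-up placeholder, not the actual Threat Dictionary, and exact token matching without stemming is an assumption of this sketch; the function names are hypothetical.

```python
import re


def contains_dictionary_word(text, dictionary):
    """True if any token of `text` appears in `dictionary` (exact match)."""
    tokens = set(re.findall(r"[a-z']+", text.lower()))
    return not tokens.isdisjoint(dictionary)


def dictionary_coverage(texts, dictionary):
    """Fraction of texts containing at least one dictionary word."""
    if not texts:
        return 0.0
    hits = sum(contains_dictionary_word(t, dictionary) for t in texts)
    return hits / len(texts)


# Placeholder word list for illustration only; NOT the Threat Dictionary.
placeholder_words = {"attack", "danger"}
toy_texts = ["A danger lurks", "Nothing here", "Attack!", "dangerous curve"]
coverage = dictionary_coverage(toy_texts, placeholder_words)
```

Because matching is exact, “dangerous” does not count as a hit in this sketch. A text can describe a hazard without using any listed word, which is one way a word list can diverge from a trained classifier.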

Figure 11: The distribution of hazard confidences within each urban legends dataset (Fessler, Pisor, and Navarrete 2014; Brunvand 2002).

Discussion

In conclusion, we create a new model to detect hazards in social media posts and urban legends that outperforms simple word-based proxies, such as the Threat Dictionary (Choi et al. 2022), as well as sophisticated LLM-based approaches. When this model is applied to X posts, we discover what different groups consider hazards. This uncovers partisan differences (such as conservatives associating protests with hazards), as well as differences that reveal information operation tactics. Specifically, we find that hazard content shared by coordinated accounts is often related to petitions to stop an ongoing war. Accounts may specifically mention hazards to take advantage of authentic users’ negativity bias, whereby they are more likely to focus on a negative post (Fiske 1980), while the moral outrage over harmed children mentioned in their posts can drive the virality of the post (Brady et al. 2021). All these act as ways to bring more users to the petition webpage in order to pressure international leaders. When we measure the prevalence of hazards over time in both X datasets, we see intuitive peaks in hazards consistent with harmful events, suggesting that hazards are indeed good automated indicators of such events. In contrast, posts annotated by the Threat Dictionary show distinctive patterns over time and correlate poorly with the hazard model predictions. Moreover, while we see a strong change in hazards during the Hamas attack, we see no such strong change for Threat Dictionary-annotated posts, suggesting that the Threat Dictionary may not fully capture all hazardous events.

Our work replicates at scale prior work in psychology by demonstrating a higher use of hazards among conservatives compared to liberals, consistent with greater negativity bias (Hibbing, Smith, and Alford 2014), and more attention towards threats (Lilienfeld and Latzman 2014), among conservatives. We also see evidence consonant with the notion that negatively-biased credulity affects the contents of culture over time, as we find a high prevalence of hazards within urban legends, which are often believed by the speaker (Donavan, Mowen, and Chakraborty 1999). This is consistent with (Fessler, Pisor, and Navarrete 2014), as urban legends that contain hazards may be more believable than urban legends that do not, hence the former should persist at higher rates than the latter. However, we cannot rule out alternative interpretations of these data, such as the possibility that urban legends are especially negative for an entirely different reason. In the future, we anticipate that this model can help test other predictions of negatively biased credulity, including whether conservatives display greater negatively biased credulity (Fessler 2019), a finding that we cannot directly test in the present data. While there are concerns about the reproducibility of science, especially in psychology (Collaboration 2015), our work points to ways that AI can more generally improve reproducibility by testing social science hypotheses within larger representative samples across a range of time periods.

Limitations

While hazards appear to be a useful feature in text, we should caution that the model’s performance could still be improved. The subjectivity of text annotations (Davani, Díaz, and Prabhakaran 2022) and a lack of annotations both reduce the accuracy of many text indicator models (Chochlakis et al. 2023; Brady et al. 2021), a problem likely present in hazard detection. More research is needed to annotate text, especially multi-lingual text, and to improve the model’s generalizability. Notably, we see a significant drop in ROC-AUC when applying the model from X posts to urban legends. LLMs show promise, however, as GPT-4’s zero-shot performance was more consistent across X posts and urban legends. Instruction-tuned LLMs and LLMs based on in-context learning could therefore allow for a better overall model.

Finally, while our work is consistent with prior psychological research, it offers only indirect support for both the existence of negatively biased credulity and the relationship between political orientation and negativity bias. Whether these results replicate in other datasets remains to be seen. At the most fundamental level, more work is needed to directly measure whether hazard texts are more likely to be believed by the author or reader.

Ethical considerations

This work was approved by an IRB. To improve user privacy, we removed all personally identifiable information in the data, such as post IDs and user IDs, prior to our analysis. While the hazard model performs well at scale, it is still possible for the model to make mistakes. Therefore, care must be taken when interpreting the model output, including when applying it to detect whether individual users are sharing information concerning hazards. The confidence of the model does not guarantee that individual users or posts are sharing such content, especially given that sarcasm and in-group language are a challenge for current AI models. So long as users are aware of these limitations, however, the harm to society is minimal, much as with a sentiment or emotion detection model. We therefore share this model widely via the link presented at the top of the paper. We believe that the hazard detection model’s utility to society far outweighs any harmful societal impact.

References

  • Abadi and et al. (2015) Abadi, M.; and et al. 2015. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. Software available from tensorflow.org.
  • Acheampong, Nunoo-Mensah, and Chen (2021) Acheampong, F. A.; Nunoo-Mensah, H.; and Chen, W. 2021. Transformer models for text-based emotion detection: a review of BERT-based approaches. Artificial Intelligence Review, 54(8): 5789–5829.
  • Achiam et al. (2023) Achiam, J.; Adler, S.; Agarwal, S.; Ahmad, L.; Akkaya, I.; Aleman, F. L.; Almeida, D.; Altenschmidt, J.; Altman, S.; Anadkat, S.; et al. 2023. Gpt-4 technical report. arXiv preprint arXiv:2303.08774.
  • Alhuzali and Ananiadou (2021) Alhuzali, H.; and Ananiadou, S. 2021. SpanEmo: Casting multi-label emotion classification as span-prediction. arXiv preprint arXiv:2101.10038.
  • Baumeister et al. (2001) Baumeister, R. F.; Bratslavsky, E.; Finkenauer, C.; and Vohs, K. D. 2001. Bad is stronger than good. Review of general psychology, 5(4): 323–370.
  • Brady et al. (2021) Brady, W. J.; McLoughlin, K.; Doan, T. N.; and Crockett, M. J. 2021. How social learning amplifies moral outrage expression in online social networks. Science Advances, 7(33): eabe5641.
  • Brady et al. (2017) Brady, W. J.; Wills, J. A.; Jost, J. T.; Tucker, J. A.; and Van Bavel, J. J. 2017. Emotion shapes the diffusion of moralized content in social networks. Proceedings of the National Academy of Sciences, 114(28): 7313–7318.
  • Brunvand (2002) Brunvand, J. H. 2002. Encyclopedia of urban legends. WW Norton & Company.
  • Burghardt et al. (2023) Burghardt, K.; Rao, A.; Guo, S.; He, Z.; Chochlakis, G.; Sabyasachee, B.; Rojecki, A.; Narayanan, S.; and Lerman, K. 2023. Socio-Linguistic Characteristics of Coordinated Inauthentic Accounts. arXiv preprint arXiv:2305.11867.
  • Chen, Lerman, and Ferrara (2020) Chen, E.; Lerman, K.; and Ferrara, E. 2020. Tracking Social Media Discourse About the COVID-19 Pandemic: Development of a Public Coronavirus Twitter Data Set. JMIR Public Health Surveill, 6(2): e19273.
  • Chen and Guestrin (2016) Chen, T.; and Guestrin, C. 2016. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining, 785–794.
  • Chochlakis et al. (2023) Chochlakis, G.; Mahajan, G.; Baruah, S.; Burghardt, K.; Lerman, K.; and Narayanan, S. 2023. Leveraging Label Correlations in a Multi-Label Setting: a Case Study in Emotion. In ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 1–5.
  • Chochlakis et al. (2024) Chochlakis, G.; Potamianos, A.; Lerman, K.; and Narayanan, S. 2024. The Strong Pull of Prior Knowledge in Large Language Models and Its Impact on Emotion Recognition. arXiv preprint arXiv:2403.17125.
  • Choi et al. (2022) Choi, V. K.; Shrestha, S.; Pan, X.; and Gelfand, M. J. 2022. When danger strikes: A linguistic tool for tracking America’s collective response to threats. Proceedings of the National Academy of Sciences, 119(4): e2113891119.
  • Clifton and Kerry (0) Clifton, J. D. W.; and Kerry, N. 0. Belief in a Dangerous World Does Not Explain Substantial Variance in Political Attitudes, But Other World Beliefs Do. Social Psychological and Personality Science, 0(0): 19485506221119324.
  • Collaboration (2015) Collaboration, O. S. 2015. Estimating the reproducibility of psychological science. Science, 349(6251): aac4716.
  • Davani, Díaz, and Prabhakaran (2022) Davani, A. M.; Díaz, M.; and Prabhakaran, V. 2022. Dealing with disagreements: Looking beyond the majority vote in subjective annotations. Transactions of the Association for Computational Linguistics, 10: 92–110.
  • Devlin et al. (2018) Devlin, J.; Chang, M.-W.; Lee, K.; and Toutanova, K. 2018. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
  • Dey and Salem (2017) Dey, R.; and Salem, F. M. 2017. Gate-variants of gated recurrent unit (GRU) neural networks. In 2017 IEEE 60th international midwest symposium on circuits and systems (MWSCAS), 1597–1600. IEEE.
  • Donavan, Mowen, and Chakraborty (1999) Donavan, D. T.; Mowen, J. C.; and Chakraborty, G. 1999. Urban legends: The word-of-mouth communication of morality through negative story content. Marketing Letters, 10: 23–35.
  • Dredze et al. (2013) Dredze, M.; Paul, M. J.; Bergsma, S.; and Tran, H. 2013. Carmen: A twitter geolocation system with applications to public health. In Workshops at the twenty-seventh AAAI conference on artificial intelligence.
  • Duppada, Jain, and Hiray (2018) Duppada, V.; Jain, R.; and Hiray, S. 2018. Seernet at semeval-2018 task 1: Domain adaptation for affect in tweets. arXiv preprint arXiv:1804.06137.
  • Ecker et al. (2022) Ecker, U. K.; Lewandowsky, S.; Cook, J.; Schmid, P.; Fazio, L. K.; Brashier, N.; Kendeou, P.; Vraga, E. K.; and Amazeen, M. A. 2022. The psychological drivers of misinformation belief and its resistance to correction. Nature Reviews Psychology, 1(1): 13–29.
  • Federico, Hunt, and Ergun (2009) Federico, C. M.; Hunt, C. V.; and Ergun, D. 2009. Political expertise, social worldviews, and ideology: Translating “competitive jungles” and “dangerous worlds” into ideological reality. Social Justice Research, 22: 259–279.
  • Felbo et al. (2017) Felbo, B.; Mislove, A.; Søgaard, A.; Rahwan, I.; and Lehmann, S. 2017. Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm. arXiv preprint arXiv:1708.00524.
  • Ferrara and Yang (2015) Ferrara, E.; and Yang, Z. 2015. Quantifying the effect of sentiment on information diffusion in social media. PeerJ Computer Science, 1: e26.
  • Fessler (2019) Fessler, D. 2019. Believing chicken little: Evolutionary perspectives on credulity and danger. DRUMS: Distortions, rumours, untruths, misinformation & smears, 17–36.
  • Fessler, Pisor, and Holbrook (2017a) Fessler, D. M.; Pisor, A. C.; and Holbrook, C. 2017a. Political orientation predicts credulity regarding putative hazards. Psychological Science, 28(5): 651–660.
  • Fessler, Pisor, and Holbrook (2017b) Fessler, D. M. T.; Pisor, A. C.; and Holbrook, C. 2017b. Political Orientation Predicts Credulity Regarding Putative Hazards. Psychological Science, 28(5): 651–660. PMID: 28362568.
  • Fessler, Pisor, and Navarrete (2014) Fessler, D. M. T.; Pisor, A. C.; and Navarrete, C. D. 2014. Negatively-Biased Credulity and the Cultural Evolution of Beliefs. PLOS ONE, 9(4): 1–8.
  • Fiske (1980) Fiske, S. T. 1980. Attention and weight in person perception: The impact of negative and extreme behavior. Journal of personality and Social Psychology, 38(6): 889.
  • Forgas (2019) Forgas, J. P. 2019. On the role of affect in gullibility: Can positive mood increase, and negative mood reduce credulity? The social psychology of gullibility, 179–197.
  • Gebru et al. (2021) Gebru, T.; Morgenstern, J.; Vecchione, B.; Vaughan, J. W.; Wallach, H.; III, H. D.; and Crawford, K. 2021. Datasheets for datasets. Commun. ACM, 64(12): 86–92.
  • Guo et al. (2022) Guo, S.; Burghardt, K.; Rao, A.; and Lerman, K. 2022. Emotion regulation and dynamics of moral concerns during the early covid-19 pandemic. arXiv preprint arXiv:2203.03608.
  • Hearst et al. (1998) Hearst, M. A.; Dumais, S. T.; Osuna, E.; Platt, J.; and Scholkopf, B. 1998. Support vector machines. IEEE Intelligent Systems and their applications, 13(4): 18–28.
  • Hibbing, Smith, and Alford (2014) Hibbing, J. R.; Smith, K. B.; and Alford, J. R. 2014. Differences in negativity bias underlie variations in political ideology. Behavioral and brain sciences, 37(3): 297–307.
  • Ho (1995) Ho, T. K. 1995. Random decision forests. In Proceedings of 3rd international conference on document analysis and recognition, volume 1, 278–282. IEEE.
  • Hochreiter and Schmidhuber (1997) Hochreiter, S.; and Schmidhuber, J. 1997. Long short-term memory. Neural computation, 9(8): 1735–1780.
  • Holbrook et al. (2022) Holbrook, C.; Yoon, L.; Fessler, D. M. T.; Moser, C.; Delgado, S. J.; and Kim, H. 2022. Moral parochialism and causal appraisal of transgressive harm in Seoul and Los Angeles. Scientific Reports, 12(1): 14227.
  • Hutto and Gilbert (2014) Hutto, C.; and Gilbert, E. 2014. Vader: A parsimonious rule-based model for sentiment analysis of social media text. In Proceedings of the international AAAI conference on web and social media, volume 8, 216–225.
  • Kartik, Ottaviani, and Squintani (2007) Kartik, N.; Ottaviani, M.; and Squintani, F. 2007. Credulity, lies, and costly talk. Journal of Economic Theory, 134(1): 93–116.
  • Lilienfeld and Latzman (2014) Lilienfeld, S. O.; and Latzman, R. D. 2014. Threat bias, not negativity bias, underpins differences in political ideology. Behavioral & Brain Sciences, 37(3).
  • Little (2017) Little, A. T. 2017. Propaganda and credulity. Games and Economic Behavior, 102: 224–232.
  • Luceri et al. (2023) Luceri, L.; Pantè, V.; Burghardt, K.; and Ferrara, E. 2023. Unmasking the web of deceit: Uncovering coordinated activity to expose information operations on twitter. arXiv preprint arXiv:2310.09884.
  • Martel, Pennycook, and Rand (2020) Martel, C.; Pennycook, G.; and Rand, D. G. 2020. Reliance on emotion promotes belief in fake news. Cognitive Research: Principles and Implications, 5(1): 47.
  • Öhman and Mineka (2001) Öhman, A.; and Mineka, S. 2001. Fears, phobias, and preparedness: toward an evolved module of fear and fear learning. Psychological review, 108(3): 483.
  • Pedregosa et al. (2011) Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; Vanderplas, J.; Passos, A.; Cournapeau, D.; Brucher, M.; Perrot, M.; and Duchesnay, E. 2011. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research, 12: 2825–2830.
  • Peng et al. (2024) Peng, L.; Zhang, Z.; Pang, T.; Han, J.; Zhao, H.; Chen, H.; and Schuller, B. W. 2024. Customising General Large Language Models for Specialised Emotion Recognition Tasks. In ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 11326–11330. IEEE.
  • Pennebaker, Francis, and Booth (2001) Pennebaker, J. W.; Francis, M. E.; and Booth, R. J. 2001. Linguistic inquiry and word count: LIWC 2001. Mahway: Lawrence Erlbaum Associates, 71(2001): 2001.
  • Phillips et al. (2024) Phillips, S.; Wang, S. Y. N.; Carley, K. M.; Rand, D.; and Pennycook, G. 2024. Emotional language reduces belief in false claims.
  • Rao et al. (2023) Rao, A.; Guo, S.; Wang, S. Y. N.; Morstatter, F.; and Lerman, K. 2023. Pandemic Culture Wars: Partisan Differences in the Moral Language of COVID-19 Discussions. In 2023 IEEE International Conference on Big Data (BigData), 413–422.
  • Rao, Morstatter, and Lerman (2022) Rao, A.; Morstatter, F.; and Lerman, K. 2022. Partisan asymmetries in exposure to misinformation. Scientific Reports, 12(1): 15671.
  • Reimers and Gurevych (2019) Reimers, N.; and Gurevych, I. 2019. Sentence-bert: Sentence embeddings using siamese bert-networks. arXiv preprint arXiv:1908.10084.
  • Rozin and Royzman (2001) Rozin, P.; and Royzman, E. B. 2001. Negativity bias, negativity dominance, and contagion. Personality and social psychology review, 5(4): 296–320.
  • Samore et al. (2018) Samore, T.; Fessler, D. M. T.; Holbrook, C.; and Sparks, A. M. 2018. Electoral fortunes reverse, mindsets do not. PLOS ONE, 13(12): 1–15.
  • Stone, Dunphy, and Smith (1966) Stone, P. J.; Dunphy, D. C.; and Smith, M. S. 1966. The general inquirer: A computer approach to content analysis.
  • Vaswani et al. (2017) Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A. N.; Kaiser, Ł.; and Polosukhin, I. 2017. Attention is all you need. Advances in neural information processing systems, 30.
  • Veselovsky, Ribeiro, and West (2023) Veselovsky, V.; Ribeiro, M. H.; and West, R. 2023. Artificial artificial artificial intelligence: Crowd workers widely use large language models for text production tasks. arXiv preprint arXiv:2306.07899.
  • Vosoughi, Roy, and Aral (2018) Vosoughi, S.; Roy, D.; and Aral, S. 2018. The spread of true and false news online. science, 359(6380): 1146–1151.
  • Wei et al. (2022) Wei, J.; Wang, X.; Schuurmans, D.; Bosma, M.; Xia, F.; Chi, E.; Le, Q. V.; Zhou, D.; et al. 2022. Chain-of-thought prompting elicits reasoning in large language models. Advances in neural information processing systems, 35: 24824–24837.
  • Youngblood et al. (2023) Youngblood, M.; Stubbersfield, J. M.; Morin, O.; Glassman, R.; and Acerbi, A. 2023. Negativity bias in the spread of voter fraud conspiracy theory tweets during the 2020 US election. Humanities and Social Sciences Communications, 10(1): 573.

Supplementary Materials

Figure 12: Post frequency over time for coordinated and non-coordinated accounts.
Figure 13: Post frequency over time for (a) authentic non-coordinated, and (b) inauthentic coordinated accounts. Vertical lines correspond to the Hamas attack on October 7th, 2023, and the Israel Defence Force (IDF) entering Gaza on October 27th.
Figure 14: Percent of posts mentioning Hamas over time for (a) authentic non-coordinated, and (b) inauthentic coordinated accounts. Vertical lines correspond to the Hamas attack on October 7th, 2023, and the Israel Defence Force (IDF) entering Gaza on October 27th.
Figure 15: Mean joy emotion confidence over time for (a) authentic non-coordinated, and (b) inauthentic coordinated accounts using Demux (Chochlakis et al. 2023). Vertical lines correspond to the Hamas attack on October 7th, 2023, and the Israel Defence Force (IDF) entering Gaza on October 27th.
Figure 16: Mean optimism emotion confidence over time for (a) authentic non-coordinated, and (b) inauthentic coordinated accounts using Demux (Chochlakis et al. 2023). Vertical lines correspond to the Hamas attack on October 7th, 2023, and the Israel Defence Force (IDF) entering Gaza on October 27th.
Dataset | Text | Ground truth label
Random posts | Aircraft crashes into residential building in Russian city near Crimea, killing at least three https://xxxx | Hazard
Random posts | @XXXX @YYYY But he is a liar and was thrown out because he is a liar, we know you love him but really you have to be aware that he is a Liar x | No hazard
2014 Urban Legends | Two friends had been practicing shooting beer cans off each other’s heads in Johannesburg, South Africa, when one of them accidentally shot the other in the face, causing serious injury. | Hazard
2014 Urban Legends | A job candidate was described as monosyllabic to the boss. The boss questioned where Monosyllabia was, but the person who made the comment played along and said it was south of Elbonia. The boss then incorrectly guessed that it was near Croatia. | No hazard

Dataset | Text | Hazard model confidence
COVID-19 | Ever just have a cry for no particular reason? These COVID times eh? Related: I’m listening to Nick Cave’s The Ship Song. | 0.04
COVID-19 | Mass shootings held in check during the pandemic (tho we saw a huge rise in domestic abuse). Now indications are, it’s going to get worse if Congress doesn’t act. Irrational “angry white males” have become the new face of terrorism. We must protect the country from that! | 0.95
2022 Urban Legends | A man gives a lift to a young girl dressed in an evening gown who claims her car broke down and asks to be taken home. As he pulls up to the address she provided, he realizes she has vanished from the back seat. Upon ringing the doorbell, he discovers that the girl he gave a lift to has been dead for two years and her father has been experiencing similar encounters. | 0.0
2022 Urban Legends | 19-year-old construction worker from north China died of a heart attack after smoking 100 cigarettes in one sitting in order to win a bet with his friend. The teenager collapsed and died as a crowd of passers-by watched. The attending doctor attributed the cause of death to the excessive intake of cigarette smoke and acute nicotine poisoning. The two friends had dreamed up the bet out of boredom, with the loser agreeing to pay for all the tobacco. The deceased continued smoking, emboldened by a growing crowd of onlookers, until he hit the 100-cigarette mark, while his friend gave up after 40 cigarettes. | 1.0
2023 Hamas-Israel War | [Text was in Arabic] RT @XXXX, Head of the Prisoners Affairs Commission to ’Al-Ghad’: The West has revealed its true face, and we have no choice but to stand firm against Israel. #AlGhadChannel #Palestine #GazaNow #AlAqsaFlood https://t.co/xxxx | 0.05
2023 Hamas-Israel War | @XXXX Palestinian govt gave American weapons to leaders of Hamas. Look at the countless pictures. Hamas used those weapons armed with hate and rhetoric propaganda against Israel. Women and children were killed by animals whose punishment should be swift and without prejudice. | 0.94
Table 1: Hazard ground truth labels and model prediction examples.