-
Reddit-Impacts: A Named Entity Recognition Dataset for Analyzing Clinical and Social Effects of Substance Use Derived from Social Media
Authors:
Yao Ge,
Sudeshna Das,
Karen O'Connor,
Mohammed Ali Al-Garadi,
Graciela Gonzalez-Hernandez,
Abeed Sarker
Abstract:
Substance use disorders (SUDs) are a growing concern globally, necessitating enhanced understanding of the problem and its trends through data-driven research. Social media are unique and important sources of information about SUDs, particularly since the data in such sources are often generated by people with lived experiences. In this paper, we introduce Reddit-Impacts, a challenging Named Entity Recognition (NER) dataset curated from subreddits dedicated to discussions on prescription and illicit opioids, as well as medications for opioid use disorder. The dataset specifically concentrates on the lesser-studied, yet critically important, aspects of substance use--its clinical and social impacts. We collected data from chosen subreddits using the publicly available Application Programming Interface for Reddit. We manually annotated text spans representing clinical and social impacts reported by people who also reported personal nonmedical use of substances including but not limited to opioids, stimulants and benzodiazepines. Our objective is to create a resource that can enable the development of systems that can automatically detect clinical and social impacts of substance use from text-based social media data. The successful development of such systems may enable us to better understand how nonmedical use of substances affects individual health and societal dynamics, aiding the development of effective public health strategies. In addition to creating the annotated data set, we applied several machine learning models to establish baseline performances. Specifically, we experimented with transformer models such as BERT and RoBERTa, one few-shot learning model (DANN) leveraging the full training dataset, and GPT-3.5 using one-shot learning, for automatic NER of clinical and social impacts. The dataset has been made available through the 2024 SMM4H shared tasks.
Submitted 9 May, 2024;
originally announced May 2024.
-
Sampling low-fidelity outputs for estimation of high-fidelity density and its tails
Authors:
Minji Kim,
Vladas Pipiras,
Kevin O'Connor,
Themistoklis Sapsis
Abstract:
In a multifidelity setting, data are available under the same conditions from two (or more) sources, e.g. computer codes, one being lower-fidelity but computationally cheaper, and the other higher-fidelity and more expensive. This work studies for which low-fidelity outputs, one should obtain high-fidelity outputs, if the goal is to estimate the probability density function of the latter, especially when it comes to the distribution tails and extremes. It is suggested to approach this problem from the perspective of the importance sampling of low-fidelity outputs according to some proposal distribution, combined with special considerations for the distribution tails based on extreme value theory. The notion of an optimal proposal distribution is introduced and investigated, in both theory and simulations. The approach is motivated and illustrated with an application to estimate the probability density function of record extremes of ship motions, obtained through two computer codes of different fidelities.
Submitted 27 February, 2024;
originally announced February 2024.
-
Almost perfect nonlinear power functions with exponents expressed as fractions
Authors:
Daniel J. Katz,
Kathleen R. O'Connor,
Kyle Pacheco,
Yakov Sapozhnikov
Abstract:
Let $F$ be a finite field, let $f$ be a function from $F$ to $F$, and let $a$ be a nonzero element of $F$. The discrete derivative of $f$ in direction $a$ is $Δ_a f \colon F \to F$ with $(Δ_a f)(x)=f(x+a)-f(x)$. The differential spectrum of $f$ is the multiset of cardinalities of all the fibers of all the derivatives $Δ_a f$ as $a$ runs through $F^*$. The function $f$ is almost perfect nonlinear (APN) if the largest cardinality in the differential spectrum is $2$. Almost perfect nonlinear functions are of interest as cryptographic primitives. If $d$ is a positive integer, the power function over $F$ with exponent $d$ is the function $f \colon F \to F$ with $f(x)=x^d$ for every $x \in F$. There is a small number of known infinite families of APN power functions. In this paper, we re-express the exponents for one such family in a more convenient form. This enables us to give the differential spectrum and, even more, to determine the sizes of individual fibers of derivatives.
Submitted 28 July, 2023;
originally announced July 2023.
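The definitions in the abstract above (discrete derivative, differential spectrum, APN) can be checked by brute force on small examples. The following is a minimal sketch, assuming a prime field $F = \mathbb{Z}/p\mathbb{Z}$ so that field arithmetic is plain modular arithmetic, and assuming that empty fibers are counted in the spectrum; the function names are illustrative and do not come from the paper, which treats a specific infinite family rather than these toy cases.

```python
from collections import Counter

def differential_spectrum(d, p):
    """Multiset of fiber sizes of all derivatives of x -> x^d over F_p (p prime).

    Assumption: fibers are taken over every b in F_p, so empty fibers
    (size 0) are included in the multiset.
    """
    spectrum = Counter()
    for a in range(1, p):  # a runs through the nonzero elements of F_p
        # Count how many x satisfy f(x + a) - f(x) = b, for each b.
        fibers = Counter((pow(x + a, d, p) - pow(x, d, p)) % p for x in range(p))
        for b in range(p):
            spectrum[fibers[b]] += 1  # a missing key counts as an empty fiber
    return spectrum

def is_apn(d, p):
    """APN means the largest fiber size in the differential spectrum is 2."""
    return max(differential_spectrum(d, p)) == 2

if __name__ == "__main__":
    print(dict(differential_spectrum(3, 7)))
    print(is_apn(3, 7), is_apn(2, 7))
```

Over $F_7$ the cube map has quadratic derivatives, so no fiber exceeds size 2, while the squaring map has linear (bijective) derivatives and every fiber has size 1.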
-
The 4-player gambler's ruin problem
Authors:
Kathryn O'Connor,
Laurent Saloff-Coste
Abstract:
This work explains how to utilize earlier results by P. Diaconis, K. Houston-Edwards and the second author to estimate probabilities related to the 4-player gambler's ruin problem. For instance, we show that the probability that a very dominant player (i.e., a player starting with all but 3 chips distributed among the remaining players) is first to lose is of order $N^{-α}$ where $α$ is approximately $5.68$. In the $3$-player game, this probability is of order $N^{-3}$. We note it is futile to attempt to give heuristic/intuitive explanations for the value of $α$. This value is obtained via an explicit formula relating $α$ to the Dirichlet eigenvalue $λ$ (zero boundary condition) of the spherical Laplacian in the equilateral spherical triangle on the unit sphere $\mathbb S^2$ that corresponds to a unit simplex with one vertex placed at the origin in Euclidean $3$-space. The value of $λ$ is estimated using a finite-difference-type algorithm developed by Grady Wright.
Submitted 12 September, 2022;
originally announced September 2022.
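The "first to lose" probability discussed in the abstract above can be explored numerically for small chip counts. The sketch below is a Monte Carlo illustration only, under an assumed transition rule that the abstract does not state: at each step an ordered pair of distinct players is chosen uniformly at random and the first gives one chip to the second. The function names and parameters are hypothetical.

```python
import random

def first_loser(chips, rng):
    """Play until some player has zero chips; return that player's index.

    Assumed rule (not taken from the abstract): a uniformly random ordered
    pair (i, j) of distinct players is drawn and i pays one chip to j.
    """
    chips = list(chips)
    n = len(chips)
    while min(chips) > 0:
        i, j = rng.sample(range(n), 2)
        chips[i] -= 1
        chips[j] += 1
    return chips.index(0)

def estimate_first_to_lose(chips, player, trials, seed=0):
    """Fraction of simulated games in which `player` is the first eliminated."""
    rng = random.Random(seed)
    hits = sum(first_loser(chips, rng) == player for _ in range(trials))
    return hits / trials

if __name__ == "__main__":
    # Dominant player 0 holds all but 3 of N = 8 chips.
    print(estimate_first_to_lose((5, 1, 1, 1), player=0, trials=20000))
```

Because the probability in question decays like a power of $N$, plain simulation becomes impractical quickly; it is useful only as a sanity check for very small $N$.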
-
JWST Imaging of Earendel, the Extremely Magnified Star at Redshift $z=6.2$
Authors:
Brian Welch,
Dan Coe,
Erik Zackrisson,
S. E. de Mink,
Swara Ravindranath,
Jay Anderson,
Gabriel Brammer,
Larry Bradley,
Jinmi Yoon,
Patrick Kelly,
Jose M. Diego,
Rogier Windhorst,
Adi Zitrin,
Paola Dimauro,
Yolanda Jimenez-Teja,
Abdurro'uf,
Mario Nonino,
Ana Acebron,
Felipe Andrade-Santos,
Roberto J. Avila,
Matthew B. Bayliss,
Alex Benitez,
Tom Broadhurst,
Rachana Bhatawdekar,
Marusa Bradac
, et al. (38 additional authors not shown)
Abstract:
The gravitationally lensed star WHL0137-LS, nicknamed Earendel, was identified with a photometric redshift $z_{phot} = 6.2 \pm 0.1$ based on images taken with the Hubble Space Telescope. Here we present James Webb Space Telescope (JWST) Near Infrared Camera (NIRCam) images of Earendel in 8 filters spanning 0.8--5.0$μ$m. In these higher resolution images, Earendel remains a single unresolved point source on the lensing critical curve, increasing the lower limit on the lensing magnification to $μ> 4000$ and restricting the source plane radius further to $r < 0.02$ pc, or $\sim 4000$ AU. These new observations strengthen the conclusion that Earendel is best explained by an individual star or multiple star system, and support the previous photometric redshift estimate. Fitting grids of stellar spectra to our photometry yields a stellar temperature of $T_{\mathrm{eff}} \simeq 13000$--16000 K assuming the light is dominated by a single star. The delensed bolometric luminosity in this case ranges from $\log(L) = 5.8$--6.6 $L_{\odot}$, which is in the range where one expects luminous blue variable stars. Follow-up observations, including JWST NIRSpec scheduled for late 2022, are needed to further unravel the nature of this object, which presents a unique opportunity to study massive stars in the first billion years of the universe.
Submitted 9 November, 2022; v1 submitted 18 August, 2022;
originally announced August 2022.
-
A targeted search for strongly lensed supernovae and expectations for targeted searches in the Rubin era
Authors:
Peter Craig,
Kyle O'Connor,
Sukanya Chakrabarti,
Steven A. Rodney,
Justin R. Pierel,
Curtis McCully,
Ismael Perez-Fournon
Abstract:
Gravitationally lensed supernovae (glSNe) are of interest for time delay cosmology and SN physics. However, glSNe detections are rare, owing to the intrinsic rarity of SN explosions, the necessity of alignment with a foreground lens, and the relatively short window of detectability. We present the Las Cumbres Observatory Lensed Supernova Search, LCOLSS, a targeted survey designed for detecting glSNe in known strong-lensing systems. Using cadenced $r^\prime$-band imaging, LCOLSS targeted 112 galaxy-galaxy lensing systems with high expected SN rates, based on estimated star formation rates. No plausible glSN was detected by LCOLSS over two years of observing. The analysis performed here measures a detection efficiency for these observations and runs a Monte Carlo simulation using the predicted supernova rates to determine the expected number of glSN detections. The results of the simulation suggest an expected number of detections and $68\%$ Poisson confidence intervals, $N_{SN} = 0.20, [0,2.1] $, $N_{Ia} = 0.08, [0,2.0]$, $N_{CC} = 0.12, [0,2.0]$, for all SN, Type Ia, and core-collapse (CC) SNe respectively. These results are broadly consistent with the absence of a detection in our survey. Analysis of the survey strategy can provide insights for future efforts to develop targeted glSN discovery programs. We thereby forecast expected detection rates for the Rubin observatory for such a targeted survey, finding that a single visit depth of 24.7 mag with the Rubin observatory will detect $0.63 \pm 0.38$ SNe per year, with $0.47 \pm 0.28$ core collapse SNe per year and $0.16 \pm 0.10$ Type Ia SNe per year.
Submitted 2 November, 2021;
originally announced November 2021.
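The statement in the abstract above that the expected counts are consistent with a null result can be illustrated with a short calculation. This sketch assumes the number of detections is Poisson distributed with the quoted means, so the probability of seeing nothing is $e^{-N}$; the helper name is illustrative.

```python
import math

def prob_no_detection(expected):
    """P(0 events) for a Poisson variable with the given mean."""
    return math.exp(-expected)

# Expected detection counts quoted in the abstract.
for label, mean in [("all SN", 0.20), ("Type Ia", 0.08), ("core-collapse", 0.12)]:
    print(f"{label}: P(0 detections) = {prob_no_detection(mean):.2f}")
```

With a mean of 0.20, a survey would record no detection roughly four times out of five, which is why the absence of a glSN is unsurprising.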
-
Estimation of Stationary Optimal Transport Plans
Authors:
Kevin O'Connor,
Kevin McGoff,
Andrew B Nobel
Abstract:
We study optimal transport for stationary stochastic processes taking values in finite spaces. In order to reflect the stationarity of the underlying processes, we restrict attention to stationary couplings, also known as joinings. The resulting optimal joining problem captures differences in the long run average behavior of the processes of interest. We introduce estimators of both optimal joinings and the optimal joining cost, and we establish consistency of the estimators under mild conditions. Furthermore, under stronger mixing assumptions we establish finite-sample error rates for the estimated optimal joining cost that extend the best known results in the iid case. Finally, we extend the consistency and rate analysis to an entropy-penalized version of the optimal joining problem.
Submitted 10 December, 2021; v1 submitted 25 July, 2021;
originally announced July 2021.
-
A Gravitationally Lensed Supernova with an Observable Two-Decade Time Delay
Authors:
Steven A. Rodney,
Gabriel B. Brammer,
Justin D. R. Pierel,
Johan Richard,
Sune Toft,
Kyle F. O'Connor,
Mohammad Akhshik,
Katherine Whitaker
Abstract:
When the light from a distant object passes very near to a foreground galaxy or cluster, gravitational lensing can cause it to appear as multiple images on the sky. If the source is variable, it can be used to constrain the cosmic expansion rate and dark energy models. Achieving these cosmological goals requires many lensed transients with precise time delay measurements. Lensed supernovae (SN) are attractive for this purpose because they have relatively simple photometric behavior, with well-understood light curve shapes and colours $-$ in contrast to the stochastic variation of quasars. Here we report the discovery of a multiply-imaged supernova, AT2016jka ("SN Requiem"). It appeared in an evolved galaxy at $z=1.95$, gravitationally lensed by a foreground galaxy cluster. It is likely a Type Ia supernova $-$ the explosion of a low-mass stellar remnant, whose light curve can be used to measure cosmic distances. In archival Hubble Space Telescope imaging, three lensed images of the supernova are detected with relative time delays of $<$200 days. We predict a fourth image will appear close to the cluster core in the year 2037$\pm$2. Observation of the fourth image could provide a time delay precision of $\approx$7 days, $<1\%$ of the extraordinary 20 year baseline. The SN classification and the predicted reappearance time could be improved with further lens modelling and a comprehensive analysis of systematic uncertainties.
Submitted 12 July, 2021; v1 submitted 16 June, 2021;
originally announced June 2021.
-
Alignment and Comparison of Directed Networks via Transition Couplings of Random Walks
Authors:
Bongsoo Yi,
Kevin O'Connor,
Kevin McGoff,
Andrew B. Nobel
Abstract:
We describe and study a transport based procedure called NetOTC (network optimal transition coupling) for the comparison and alignment of two networks. The networks of interest may be directed or undirected, weighted or unweighted, and may have distinct vertex sets of different sizes. Given two networks and a cost function relating their vertices, NetOTC finds a transition coupling of their associated random walks having minimum expected cost. The minimizing cost quantifies the difference between the networks, while the optimal transport plan itself provides alignments of both the vertices and the edges of the two networks. Coupling of the full random walks, rather than their marginal distributions, ensures that NetOTC captures local and global information about the networks, and preserves edges. NetOTC has no free parameters, and does not rely on randomization. We investigate a number of theoretical properties of NetOTC and present experiments establishing its empirical performance.
Submitted 5 February, 2024; v1 submitted 13 June, 2021;
originally announced June 2021.
-
Neo-humanism and COVID-19: Opportunities for a socially and environmentally sustainable world
Authors:
Francesco Sarracino,
Kelsey J. O'Connor
Abstract:
A series of crises, culminating with COVID-19, shows that going Beyond GDP is urgently necessary. Social and environmental degradation are consequences of emphasizing GDP as a measure of progress. This degradation created the conditions for the COVID-19 pandemic and limited the efficacy of counter-measures. Additionally, rich countries did not fare much better in the pandemic than poor ones. COVID-19 thrived on inequalities and a lack of cooperation. In this article we leverage defensive growth models to explain the complex relationships between these factors, and we put forward the idea of neo-humanism, a cultural movement grounded on evidence from quality-of-life studies. The movement proposes a new culture leading towards a socially and environmentally sustainable future. Specifically, neo-humanism suggests that prioritizing well-being by, for instance, promoting social relations, would benefit the environment and enable collective action to address public issues, which in turn positively affects productivity and health, among other behavioral outcomes, and thereby instills a virtuous cycle. Arguably, such a society would have been better endowed to cope with COVID-19, and possibly even prevented the pandemic. Neo-humanism proposes a world in which the well-being of people comes before the well-being of markets, in which promoting cooperation and social relations represents the starting point for better lives, and a peaceful and respectful coexistence with other species on Earth.
Submitted 2 May, 2021;
originally announced May 2021.
-
Optimal Transport for Stationary Markov Chains via Policy Iteration
Authors:
Kevin O'Connor,
Kevin McGoff,
Andrew B. Nobel
Abstract:
We study the optimal transport problem for pairs of stationary finite-state Markov chains, with an emphasis on the computation of optimal transition couplings. Transition couplings are a constrained family of transport plans that capture the dynamics of Markov chains. Solutions of the optimal transition coupling (OTC) problem correspond to alignments of the two chains that minimize long-term average cost. We establish a connection between the OTC problem and Markov decision processes, and show that solutions of the OTC problem can be obtained via an adaptation of policy iteration. For settings with large state spaces, we develop a fast approximate algorithm based on an entropy-regularized version of the OTC problem, and provide bounds on its per-iteration complexity. We establish a stability result for both the regularized and unregularized algorithms, from which a statistical consistency result follows as a corollary. We validate our theoretical results empirically through a simulation study, demonstrating that the approximate algorithm exhibits faster overall runtime with low error. Finally, we extend the setting and application of our methods to hidden Markov models, and illustrate the potential use of the proposed algorithms in practice with an application to computer-generated music.
Submitted 16 September, 2021; v1 submitted 14 June, 2020;
originally announced June 2020.
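The abstract above describes transition couplings only informally. As a hedged illustration, the sketch below assumes the usual reading: a transition coupling of two transition matrices `P` and `Q` is a transition matrix `R` on pairs of states whose row at `(x, y)` has marginals `P[x]` and `Q[y]`. It builds the simplest such object, the independent coupling, and checks the marginal constraints. All names are illustrative, and this is not the policy-iteration algorithm from the paper.

```python
def independent_coupling(P, Q):
    """R[(x, y)][(x2, y2)] = P[x][x2] * Q[y][y2]: the two chains move independently."""
    nx, ny = len(P), len(Q)
    return {
        (x, y): {(x2, y2): P[x][x2] * Q[y][y2] for x2 in range(nx) for y2 in range(ny)}
        for x in range(nx)
        for y in range(ny)
    }

def is_transition_coupling(R, P, Q, tol=1e-12):
    """Check that every row R[(x, y)] has marginals P[x] and Q[y] (assumed definition)."""
    nx, ny = len(P), len(Q)
    for (x, y), row in R.items():
        for x2 in range(nx):
            if abs(sum(row[(x2, y2)] for y2 in range(ny)) - P[x][x2]) > tol:
                return False
        for y2 in range(ny):
            if abs(sum(row[(x2, y2)] for x2 in range(nx)) - Q[y][y2]) > tol:
                return False
    return True

if __name__ == "__main__":
    P = [[0.9, 0.1], [0.5, 0.5]]
    Q = [[0.2, 0.8], [0.6, 0.4]]
    R = independent_coupling(P, Q)
    print(is_transition_coupling(R, P, Q))
```

The independent coupling is always feasible but generally not optimal; an optimal transition coupling would be chosen among all feasible `R` to minimize long-term average cost.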
-
The BUFFALO HST Survey
Authors:
Charles L. Steinhardt,
Mathilde Jauzac,
Ana Acebron,
Hakim Atek,
Peter Capak,
Iary Davidzon,
Dominique Eckert,
David Harvey,
Anton M. Koekemoer,
Claudia D. P. Lagos,
Guillaume Mahler,
Mireia Montes,
Anna Niemiec,
Mario Nonino,
P. A. Oesch,
Johan Richard,
Steven A. Rodney,
Matthieu Schaller,
Keren Sharon,
Louis-Gregory Strolger,
Joseph Allingham,
Adam Amara,
Yannick Bah'e,
Celine Boehm,
Sownak Bose
, et al. (70 additional authors not shown)
Abstract:
The Beyond Ultra-deep Frontier Fields and Legacy Observations (BUFFALO) is a 101 orbit + 101 parallel Cycle 25 Hubble Space Telescope Treasury program taking data from 2018-2020. BUFFALO will expand existing coverage of the Hubble Frontier Fields (HFF) in WFC3/IR F105W, F125W, and F160W and ACS/WFC F606W and F814W around each of the six HFF clusters and flanking fields. This additional area has not been observed by HST but is already covered by deep multi-wavelength datasets, including Spitzer and Chandra. As with the original HFF program, BUFFALO is designed to take advantage of gravitational lensing from massive clusters to simultaneously find high-redshift galaxies which would otherwise lie below HST detection limits and model foreground clusters to study properties of dark matter and galaxy assembly. The expanded area will provide a first opportunity to study both cosmic variance at high redshift and galaxy assembly in the outskirts of the large HFF clusters. Five additional orbits are reserved for transient followup. BUFFALO data including mosaics, value-added catalogs and cluster mass distribution models will be released via MAST on a regular basis, as the observations and analysis are completed for the six individual clusters.
Submitted 13 February, 2020; v1 submitted 27 January, 2020;
originally announced January 2020.
-
Turaev Hyperbolicity of Classical and Virtual Knots
Authors:
Colin Adams,
Or Eisenberg,
Jonah Greenberg,
Kabir Kapoor,
Zhen Liang,
Kate O'Connor,
Natalia Pacheco-Tallaj,
Yi Wang
Abstract:
By work of W. Thurston, knots and links in the 3-sphere are known to either be torus links, or to contain an essential torus in their complement, or to be hyperbolic, in which case a unique hyperbolic volume can be calculated for their complement. We employ a construction of Turaev to associate a family of hyperbolic 3-manifolds of finite volume to any classical or virtual link, even if non-hyperbolic. These are in turn used to define the Turaev volume of a link, which is the minimal volume among all the hyperbolic 3-manifolds associated via this Turaev construction. In the case of a classical link, we can also define the classical Turaev volume, which is the minimal volume among all the hyperbolic 3-manifolds associated via this Turaev construction for the classical projections only. We then investigate these new invariants.
Submitted 19 December, 2019;
originally announced December 2019.
-
TG-Hyperbolicity of Virtual Links
Authors:
Colin Adams,
Or Eisenberg,
Jonah Greenberg,
Kabir Kapoor,
Zhen Liang,
Kate O'Connor,
Natalia Pacheco-Tallaj,
Yi Wang
Abstract:
We extend the theory of hyperbolicity of links in the 3-sphere to tg-hyperbolicity of virtual links, using the fact that the theory of virtual links can be translated into the theory of links living in closed orientable thickened surfaces. When the boundary surfaces are taken to be totally geodesic, we obtain a tg-hyperbolic structure with a unique associated volume. We prove that all virtual alternating links are tg-hyperbolic. We further extend tg-hyperbolicity to several classes of non-alternating virtual links. We then consider bounds on volumes of virtual links and include a table for volumes of the 116 nontrivial virtual knots of four or fewer crossings, all of which, with the exception of the trefoil knot, turn out to be tg-hyperbolic.
Submitted 12 April, 2019;
originally announced April 2019.
-
Deep Neural Networks Ensemble for Detecting Medication Mentions in Tweets
Authors:
Davy Weissenbacher,
Abeed Sarker,
Ari Klein,
Karen O'Connor,
Arjun Magge Ranganatha,
Graciela Gonzalez-Hernandez
Abstract:
Objective: After years of research, Twitter posts are now recognized as an important source of patient-generated data, providing unique insights into population health. A fundamental step to incorporating Twitter data in pharmacoepidemiological research is to automatically recognize medication mentions in tweets. Given that lexical searches for medication names may fail due to misspellings or ambiguity with common words, we propose a more advanced method to recognize them. Methods: We present Kusuri, an Ensemble Learning classifier, able to identify tweets mentioning drug products and dietary supplements. Kusuri ("medication" in Japanese) is composed of two modules. First, four different classifiers (lexicon-based, spelling-variant-based, pattern-based and one based on a weakly-trained neural network) are applied in parallel to discover tweets potentially containing medication names. Second, an ensemble of deep neural networks encoding morphological, semantic and long-range dependencies of important words in the tweets discovered is used to make the final decision. Results: On a balanced (50-50) corpus of 15,005 tweets, Kusuri demonstrated performances close to human annotators with 93.7% F1-score, the best score achieved thus far on this corpus. On a corpus made of all tweets posted by 113 Twitter users (98,959 tweets, with only 0.26% mentioning medications), Kusuri obtained 76.3% F1-score. No prior drug extraction system offers a comparison on such an extremely unbalanced dataset. Conclusion: The system identifies tweets mentioning drug names with performance high enough to ensure its usefulness, and it is ready to be integrated in larger natural language processing systems.
Submitted 30 September, 2019; v1 submitted 10 April, 2019;
originally announced April 2019.
-
Nanoscale Thermal Imaging of VO$_2$ via Poole-Frenkel Conduction
Authors:
Alyson Spitzig,
Adam Pivonka,
Alex Frenzel,
Jeehoon Kim,
Changhyun Ko,
You Zhou,
Kevin O'Connor,
Eric Hudson,
Shriram Ramanathan,
Jennifer E. Hoffman,
Jason Hoffman
Abstract:
We present a method for nanoscale thermal imaging of insulating thin films using atomic force microscopy (AFM), and we demonstrate its utility on VO$_2$. We sweep the applied voltage $V$ to a conducting AFM tip in contact mode and measure the local current $I$ through the film. By fitting the $IV$ curves to a Poole-Frenkel conduction model at low $V$, we calculate the local temperature with spatial resolution better than 50 nm using only fundamental constants and known film properties. Our thermometry technique enables local temperature measurement of \textit{any} insulating film dominated by the Poole-Frenkel conduction mechanism, and can be extended to insulators that display other conduction mechanisms.
Submitted 13 April, 2022; v1 submitted 7 March, 2019;
originally announced March 2019.
-
Exchangeable Generative Models with Flow Scans
Authors:
Christopher Bender,
Kevin O'Connor,
Yang Li,
Juan Jose Garcia,
Manzil Zaheer,
Junier Oliva
Abstract:
In this work, we develop a new approach to generative density estimation for exchangeable, non-i.i.d. data. The proposed framework, FlowScan, combines invertible flow transformations with a sorted scan to flexibly model the data while preserving exchangeability. Unlike most existing methods, FlowScan exploits the intradependencies within sets to learn both global and local structure. FlowScan represents the first approach that is able to apply sequential methods to exchangeable density estimation without resorting to averaging over all possible permutations. We achieve new state-of-the-art performance on point cloud and image set modeling.
Submitted 18 September, 2019; v1 submitted 5 February, 2019;
originally announced February 2019.
-
Numbers Represented by a Finite Set of Binary Quadratic Forms
Authors:
Christopher Donnay,
Havi Ellers,
Kate O'Connor,
Katherine Thompson,
Erin Wood
Abstract:
Every quadratic form represents 0; therefore, if we take any number of quadratic forms and ask which integers are simultaneously represented by all members of the collection, we are guaranteed a nonempty set. But when is that set more than just the "trivial" 0? We address this question in the case of integral, positive-definite, reduced, binary quadratic forms. For forms of the same discriminant, we can use the structure of the underlying class group. If, however, the forms have different discriminants, we must apply class field theory.
Submitted 16 August, 2017;
originally announced August 2017.
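The question posed in the abstract above, which integers are represented by every form in a collection, can be explored by brute force for small bounds. The sketch below is an illustration under assumptions: the two example forms, $x^2 + y^2$ (discriminant $-4$) and $x^2 + 2y^2$ (discriminant $-8$), are chosen here and are not taken from the paper, and the function name is hypothetical. The search box uses the identity $4af = (2ax + by)^2 + |D|\,y^2$, which bounds $|y|$ (and symmetrically $|x|$) for a positive-definite form.

```python
import math

def represented(a, b, c, bound):
    """Integers in [0, bound] of the form a*x^2 + b*x*y + c*y^2.

    Assumes the form is positive definite (a > 0 and b^2 - 4ac < 0).
    """
    disc = 4 * a * c - b * b  # |D|, positive for a positive-definite form
    if a <= 0 or disc <= 0:
        raise ValueError("form must be positive definite")
    x_max = math.isqrt(4 * c * bound // disc)
    y_max = math.isqrt(4 * a * bound // disc)
    values = set()
    for x in range(-x_max, x_max + 1):
        for y in range(-y_max, y_max + 1):
            v = a * x * x + b * x * y + c * y * y
            if v <= bound:
                values.add(v)
    return values

if __name__ == "__main__":
    common = represented(1, 0, 1, 20) & represented(1, 0, 2, 20)
    print(sorted(common))  # → [0, 1, 2, 4, 8, 9, 16, 17, 18]
```

Enumeration like this only suggests patterns; the paper's tools (class groups, class field theory) are what explain which integers appear.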
-
Entangled Rings
Authors:
Kevin M. O'Connor,
William K. Wootters
Abstract:
Consider a ring of N qubits in a translationally invariant quantum state. We ask to what extent each pair of nearest neighbors can be entangled. Under certain assumptions about the form of the state, we find a formula for the maximum possible nearest-neighbor entanglement. We then compare this maximum with the entanglement achieved by the ground state of an antiferromagnetic ring consisting of an even number of spin-1/2 particles. We find that, though the antiferromagnetic ground state does not maximize the nearest-neighbor entanglement relative to all other states, it does so relative to other states having zero z-component of spin.
Submitted 14 September, 2000; v1 submitted 11 September, 2000;
originally announced September 2000.