-
3D human motion generation from the text via gesture action classification and the autoregressive model
Authors:
Gwantae Kim,
Youngsuk Ryu,
Junyeop Lee,
David K. Han,
Jeongmin Bae,
Hanseok Ko
Abstract:
In this paper, a deep learning-based model for 3D human motion generation from text is proposed via gesture action classification and an autoregressive model. The model focuses on generating special gestures that express human thinking, such as waving and nodding. To achieve this goal, the proposed method predicts expression from sentences using a text classification model based on a pretrained language model and generates gestures using a gated recurrent unit-based autoregressive model. In particular, we propose a loss on the embedding space for restoring raw motions and generating intermediate motions well. Moreover, a novel data augmentation method and a stop token are proposed to generate variable-length motions. To evaluate the text classification model and the 3D human motion generation model, a gesture action classification dataset and an action-based gesture dataset are collected. In several experiments, the proposed method successfully generates perceptually natural and realistic 3D human motion from text. Moreover, we verified the effectiveness of the proposed method using a publicly available action recognition dataset to evaluate cross-dataset generalization performance.
Submitted 17 November, 2022;
originally announced November 2022.
-
Machine learning building-block-flow wall model for large-eddy simulation
Authors:
Adrián Lozano-Durán,
H. Jane Bae
Abstract:
A wall model for large-eddy simulation (LES) is proposed by devising the flow as a combination of building blocks. The core assumption of the model is that a finite set of simple canonical flows contains the essential physics to predict the wall-shear stress in more complex scenarios. The model is constructed to predict zero/favourable/adverse mean pressure gradient wall turbulence, separation, statistically unsteady turbulence with mean flow three-dimensionality, and laminar flow. The approach is implemented using two types of artificial neural networks: a classifier, which identifies the contribution of each building block in the flow, and a predictor, which estimates the wall-shear stress via combination of the building-block flows. The training data are directly obtained from wall-modelled LES (WMLES) optimised to reproduce the correct mean quantities. This approach guarantees the consistency of the training data with the numerical discretisation and the gridding strategy of the flow solver. The output of the model is accompanied by a confidence score in the prediction that aids the detection of regions where the model underperforms. The model is validated in canonical flows (e.g. laminar/turbulent boundary layers, turbulent channels, turbulent Poiseuille-Couette flow, turbulent pipe) and two realistic aircraft configurations: the NASA Common Research Model High-lift and NASA Juncture Flow experiment. It is shown that the building-block-flow wall model outperforms (or matches) the predictions by an equilibrium wall model. It is also concluded that further improvements in WMLES should incorporate advances in subgrid-scale modelling to minimise error propagation to the wall model.
Submitted 30 April, 2023; v1 submitted 14 November, 2022;
originally announced November 2022.
-
MoNET: Tackle State Momentum via Noise-Enhanced Training for Dialogue State Tracking
Authors:
Haoning Zhang,
Junwei Bao,
Haipeng Sun,
Youzheng Wu,
Wenye Li,
Shuguang Cui,
Xiaodong He
Abstract:
Dialogue state tracking (DST) aims to convert the dialogue history into dialogue states which consist of slot-value pairs. As condensed structural information memorizing all history information, the dialogue state in the last turn is typically adopted as the input for predicting the current state by DST models. However, these models tend to keep the predicted slot values unchanged, which is defined as state momentum in this paper. Specifically, the models struggle to update slot values that need to be changed and correct wrongly predicted slot values in the last turn. To this end, we propose MoNET to tackle state momentum via noise-enhanced training. First, the previous state of each turn in the training data is noised via replacing some of its slot values. Then, the noised previous state is used as the input to learn to predict the current state, improving the model's ability to update and correct slot values. Furthermore, a contrastive context matching framework is designed to narrow the representation distance between a state and its corresponding noised variant, which reduces the impact of noised state and makes the model better understand the dialogue history. Experimental results on MultiWOZ datasets show that MoNET outperforms previous DST methods. Ablations and analysis verify the effectiveness of MoNET in alleviating state momentum and improving anti-noise ability.
Submitted 18 June, 2023; v1 submitted 10 November, 2022;
originally announced November 2022.
-
An Empirical Study on L2 Accents of Cross-lingual Text-to-Speech Systems via Vowel Space
Authors:
Jihwan Lee,
Jae-Sung Bae,
Seongkyu Mun,
Heejin Choi,
Joun Yeop Lee,
Hoon-Young Cho,
Chanwoo Kim
Abstract:
With the recent developments in cross-lingual Text-to-Speech (TTS) systems, L2 (second-language, or foreign) accent problems arise. Moreover, running a subjective evaluation for such cross-lingual TTS systems is troublesome. The vowel space analysis, which is often utilized to explore various aspects of language including L2 accents, is a great alternative analysis tool. In this study, we apply the vowel space analysis method to explore L2 accents of cross-lingual TTS systems. Through the vowel space analysis, we observe the following three findings: a) a parallel architecture (Glow-TTS) is less L2-accented than an auto-regressive one (Tacotron); b) L2 accents are more dominant in non-shared vowels in a language pair; and c) L2 accents of cross-lingual TTS systems share some phenomena with those of human L2 learners. Our findings imply that it is necessary for TTS systems to handle each language pair differently, depending on their linguistic characteristics such as non-shared vowels. They also hint that we can further incorporate linguistic knowledge in developing cross-lingual TTS systems.
Submitted 6 November, 2022;
originally announced November 2022.
-
Large Language Models Are Human-Level Prompt Engineers
Authors:
Yongchao Zhou,
Andrei Ioan Muresanu,
Ziwen Han,
Keiran Paster,
Silviu Pitis,
Harris Chan,
Jimmy Ba
Abstract:
By conditioning on natural language instructions, large language models (LLMs) have displayed impressive capabilities as general-purpose computers. However, task performance depends significantly on the quality of the prompt used to steer the model, and most effective prompts have been handcrafted by humans. Inspired by classical program synthesis and the human approach to prompt engineering, we propose Automatic Prompt Engineer (APE) for automatic instruction generation and selection. In our method, we treat the instruction as the "program," optimized by searching over a pool of instruction candidates proposed by an LLM in order to maximize a chosen score function. To evaluate the quality of the selected instruction, we evaluate the zero-shot performance of another LLM following the selected instruction. Experiments on 24 NLP tasks show that our automatically generated instructions outperform the prior LLM baseline by a large margin and achieve better or comparable performance to the instructions generated by human annotators on 19/24 tasks. We conduct extensive qualitative and quantitative analyses to explore the performance of APE. We show that APE-engineered prompts can be applied to steer models toward truthfulness and/or informativeness, as well as to improve few-shot learning performance by simply prepending them to standard in-context learning prompts. Please check out our webpage at https://sites.google.com/view/automatic-prompt-engineer.
Submitted 10 March, 2023; v1 submitted 3 November, 2022;
originally announced November 2022.
-
Reduce Catastrophic Forgetting of Dense Retrieval Training with Teleportation Negatives
Authors:
Si Sun,
Chenyan Xiong,
Yue Yu,
Arnold Overwijk,
Zhiyuan Liu,
Jie Bao
Abstract:
In this paper, we investigate the instability in standard dense retrieval training, which iterates between model training and hard negative selection using the model being trained. We show the catastrophic forgetting phenomena behind the training instability, where models learn and forget different negative groups during training iterations. We then propose ANCE-Tele, which accumulates momentum negatives from past iterations and approximates future iterations using lookahead negatives, as "teleportations" along the time axis to smooth the learning process. On web search and OpenQA, ANCE-Tele outperforms previous state-of-the-art systems of similar size, eliminates the dependency on sparse retrieval negatives, and is competitive among systems using significantly more (50x) parameters. Our analysis demonstrates that teleportation negatives reduce catastrophic forgetting and improve convergence speed for dense retrieval training. Our code is available at https://github.com/OpenMatch/ANCE-Tele.
Submitted 31 October, 2022;
originally announced October 2022.
-
Coupling between colossal charge density wave ordering and magnetism in Ho2Ir3Si5
Authors:
Sitaram Ramakrishnan,
Jin-Ke Bao,
Claudio Eisele,
Bikash Patra,
Minoru Nohara,
Biplab Bag,
Leila Noohinejad,
Martin Tolkiehn,
Carsten Paulmann,
Achim M. Schaller,
Toms Rekis,
Surya Rohith Kotla,
Andreas Schönleber,
Arumugam Thamizhavel,
Bahadur Singh,
Srinivasan Ramakrishnan,
Sander van Smaalen
Abstract:
Ho2Ir3Si5 belongs to the family of three-dimensional (3D) R2Ir3Si5 (R = Lu, Er and Ho) compounds that exhibit a colossal first-order charge density wave (CDW) transition where there is a strong orthorhombic-to-triclinic distortion of the lattice accompanied by superlattice reflections. The analysis by single-crystal X-ray diffraction (SXRD) has revealed that the Ir-Ir zigzag chains along c are responsible for the CDW in all three compounds. The replacement of the rare earth element from non-magnetic Lu to magnetic Er or Ho lowers $T_{CDW}$, where $T_{CDW}^{Lu}$ = 200 K, $T_{CDW}^{Er}$ = 150 K and $T_{CDW}^{Ho}$ = 90 K. Out of the three compounds, Ho2Ir3Si5 is the only system where second-order superlattice reflections could be observed, indicative of an anharmonic shape of the modulation wave. The CDW transition is observed as anomalies in the temperature dependencies of the specific heat, electrical conductivity and magnetic susceptibility, which includes a large hysteresis of 90 to 130 K for all measured properties, thus corroborating the SXRD measurements. Similar to previously reported Er2Ir3Si5, there appears to be a coupling between CDW and magnetism such that the Ho$^{3+}$ magnetic moments are influenced by the CDW transition, even in the paramagnetic state. Moreover, earlier investigations on polycrystalline material revealed antiferromagnetic (AFM) ordering at $T_N$ = 5.1 K, whereas AFM order is suppressed and only the CDW is present in our highly ordered single-crystal. First-principles calculations predict Ho2Ir3Si5 to be a metal with coexisting electron and hole pockets at the Fermi level. The Ho and Ir atoms have spherically symmetric metallic-type charge density distributions that are prone to CDW distortion. Phonon calculations affirm that the Ir atoms are primarily responsible for the CDW distortion, which is in agreement with the experiment.
Submitted 28 October, 2022;
originally announced October 2022.
-
Structured Distributions of Gas and Solids in Protoplanetary Disks
Authors:
Jaehan Bae,
Andrea Isella,
Zhaohuan Zhu,
Rebecca Martin,
Satoshi Okuzumi,
Scott Suriano
Abstract:
Recent spatially-resolved observations of protoplanetary disks revealed a plethora of substructures, including concentric rings and gaps, inner cavities, misalignments, spiral arms, and azimuthal asymmetries. This is the major breakthrough in studies of protoplanetary disks since Protostars and Planets VI and is reshaping the field of planet formation. However, while the capability of imaging substructures in protoplanetary disks has been steadily improving, the origins of many substructures are still largely debated. The structured distributions of gas and solids in protoplanetary disks likely reflect the outcome of physical processes at work, including the formation of planets. Yet, the diverse properties among the observed protoplanetary disk population, for example, the number and radial location of rings and gaps in the dust distribution, suggest that the controlling process may differ between disks and/or the outcome may be sensitive to stellar or disk properties. In this review, we (1) summarize the existing observations of protoplanetary disk substructures collected from the literature; (2) provide a comprehensive theoretical review of various processes proposed to explain observed protoplanetary disk substructures; (3) compare current theoretical predictions with existing observations and highlight future research directions to distinguish between different origins; and (4) discuss implications of state-of-the-art protoplanetary disk observations for protoplanetary disk and planet formation theory.
Submitted 16 January, 2023; v1 submitted 24 October, 2022;
originally announced October 2022.
-
Orbital hybridization-driven charge density wave transition in CsV3Sb5 kagome superconductor
Authors:
Shulun Han,
Chi Sin Tang,
Linyang Li,
Yi Liu,
Huimin Liu,
Jian Gou,
Jing Wu,
Difan Zhou,
Ping Yang,
Caozheng Diao,
Jiacheng Ji,
Jinke Bao,
Lingfeng Zhang,
Mingwen Zhao,
M. V. Milošević,
Yanqun Guo,
Lijun Tian,
Mark B. H. Breese,
Guanghan Cao,
Chuanbing Cai,
Andrew T. S. Wee,
Xinmao Yin
Abstract:
Owing to its inherent non-trivial geometry, the unique structural motif of the recently discovered Kagome topological superconductor AV3Sb5 is an ideal host of diverse topologically non-trivial phenomena, including giant anomalous Hall conductivity, topological charge order, charge density wave, and unconventional superconductivity. Despite possessing a normal-state CDW order in the form of topological chiral charge order and diverse superconducting gap structures, it remains unclear how fundamental atomic-level properties and many-body effects including Fermi surface nesting, electron-phonon coupling, and orbital hybridization contribute to these symmetry-breaking phenomena. Here, we report the direct participation of the V3d-Sb5p orbital hybridization in mediating the CDW phase transition in CsV3Sb5. The combination of temperature-dependent X-ray absorption and first-principles studies clearly indicates the Inverse Star of David structure as the preferred reconstruction in the low-temperature CDW phase. Our results highlight the critical role that Sb orbitals play and establish orbital hybridization as the direct mediator of the CDW states and structural transition dynamics in Kagome unconventional superconductors. This is a significant step towards the fundamental understanding and control of the emerging correlated phases from the Kagome lattice through the orbital interactions and provides promising approaches to novel regimes in unconventional orders and topology.
Submitted 8 December, 2022; v1 submitted 23 October, 2022;
originally announced October 2022.
-
P$^3$LM: Probabilistically Permuted Prophet Language Modeling for Generative Pre-Training
Authors:
Junwei Bao,
Yifan Wang,
Jiangyong Ying,
Yeyun Gong,
Jing Zhao,
Youzheng Wu,
Xiaodong He
Abstract:
Conventional autoregressive left-to-right (L2R) sequence generation faces two issues during decoding: it is limited to unidirectional target sequence modeling and constrained by strong local dependencies. To address these problems, we propose P$^3$LM, a probabilistically permuted prophet language model, which strengthens the modeling of bidirectional information and long token dependencies for sequence generation. Specifically, P$^3$LM learns to generate tokens in permuted order upon an order-aware transformer decoder, as well as to generate the corresponding future $N$ tokens with a multi-stream attention mechanism. Extensive experiments are conducted on the GLGE benchmark, which includes four datasets for summarization, two for question generation, one for conversational question answering, and one for dialog response generation, where P$^3$LM achieves state-of-the-art results compared with strong publicly available generative pre-training methods.
Submitted 21 October, 2022;
originally announced October 2022.
-
MuGER$^2$: Multi-Granularity Evidence Retrieval and Reasoning for Hybrid Question Answering
Authors:
Yingyao Wang,
Junwei Bao,
Chaoqun Duan,
Youzheng Wu,
Xiaodong He,
Tiejun Zhao
Abstract:
Hybrid question answering (HQA) aims to answer questions over heterogeneous data, including tables and passages linked to table cells. The heterogeneous data can provide evidence of different granularity to HQA models, e.g., column, row, cell, and link. Conventional HQA models usually retrieve coarse- or fine-grained evidence to reason the answer. Through comparison, we find that coarse-grained evidence is easier to retrieve but contributes less to the reasoner, while fine-grained evidence is the opposite. To preserve the advantage and eliminate the disadvantage of evidence of different granularity, we propose MuGER$^2$, a Multi-Granularity Evidence Retrieval and Reasoning approach. In evidence retrieval, a unified retriever is designed to learn the multi-granularity evidence from the heterogeneous data. In answer reasoning, an evidence selector is proposed to navigate the fine-grained evidence for the answer reader based on the learned multi-granularity evidence. Experimental results on the HybridQA dataset show that MuGER$^2$ significantly boosts the HQA performance. Further ablation analysis verifies the effectiveness of both the retrieval and reasoning designs.
Submitted 19 October, 2022;
originally announced October 2022.
-
Mars: Modeling Context & State Representations with Contrastive Learning for End-to-End Task-Oriented Dialog
Authors:
Haipeng Sun,
Junwei Bao,
Youzheng Wu,
Xiaodong He
Abstract:
Traditional end-to-end task-oriented dialog systems first convert dialog context into belief state and action state before generating the system response. The system response performance is significantly affected by the quality of the belief state and action state. We first explore what dialog context representation is beneficial to improving the quality of the belief state and action state, which further enhances the generated response quality. To carry out this exploration, we propose Mars, an end-to-end task-oriented dialog system with two contrastive learning strategies to model the relationship between dialog context and belief/action state representations. Empirical results show that dialog context representations that are more different from semantic state representations are more conducive to multi-turn task-oriented dialog. Moreover, our proposed Mars achieves state-of-the-art performance on MultiWOZ 2.0, CamRest676, and CrossWOZ.
Submitted 10 July, 2023; v1 submitted 17 October, 2022;
originally announced October 2022.
-
UniRPG: Unified Discrete Reasoning over Table and Text as Program Generation
Authors:
Yongwei Zhou,
Junwei Bao,
Chaoqun Duan,
Youzheng Wu,
Xiaodong He,
Tiejun Zhao
Abstract:
Question answering requiring discrete reasoning, e.g., arithmetic computing, comparison, and counting, over knowledge is a challenging task. In this paper, we propose UniRPG, a semantic-parsing-based approach advanced in interpretability and scalability, to perform unified discrete reasoning over heterogeneous knowledge resources, i.e., table and text, as program generation. Concretely, UniRPG consists of a neural programmer and a symbolic program executor, where a program is the composition of a set of pre-defined general atomic and higher-order operations and arguments extracted from table and text. First, the programmer parses a question into a program by generating operations and copying arguments, and then the executor derives answers from table and text based on the program. To alleviate the costly program annotation issue, we design a distant supervision approach for programmer learning, where pseudo programs are automatically constructed without annotated derivations. Extensive experiments on the TAT-QA dataset show that UniRPG achieves tremendous improvements and enhances interpretability and scalability compared with state-of-the-art methods, even without derivation annotation. Moreover, it achieves promising performance on the textual dataset DROP without derivations.
Submitted 15 October, 2022;
originally announced October 2022.
-
CSS: Combining Self-training and Self-supervised Learning for Few-shot Dialogue State Tracking
Authors:
Haoning Zhang,
Junwei Bao,
Haipeng Sun,
Huaishao Luo,
Wenye Li,
Shuguang Cui
Abstract:
Few-shot dialogue state tracking (DST) is a realistic problem that trains the DST model with limited labeled data. Existing few-shot methods mainly transfer knowledge learned from external labeled dialogue data (e.g., from question answering, dialogue summarization, machine reading comprehension tasks, etc.) into DST, whereas collecting a large amount of external labeled data is laborious, and the external data may not effectively contribute to the DST-specific task. In this paper, we propose a few-shot DST framework called CSS, which Combines Self-training and Self-supervised learning methods. The unlabeled data of the DST task is incorporated into the self-training iterations, where the pseudo labels are predicted by a DST model trained on limited labeled data in advance. Besides, a contrastive self-supervised method is used to learn better representations, where the data is augmented by the dropout operation to train the model. Experimental results on the MultiWOZ dataset show that our proposed CSS achieves competitive performance in several few-shot scenarios.
Submitted 11 October, 2022;
originally announced October 2022.
-
AlphaTuning: Quantization-Aware Parameter-Efficient Adaptation of Large-Scale Pre-Trained Language Models
Authors:
Se Jung Kwon,
Jeonghoon Kim,
Jeongin Bae,
Kang Min Yoo,
Jin-Hwa Kim,
Baeseong Park,
Byeongwook Kim,
Jung-Woo Ha,
Nako Sung,
Dongsoo Lee
Abstract:
There is growing interest in adapting large-scale language models using parameter-efficient fine-tuning methods. However, accelerating the model itself and achieving better inference efficiency through model compression has not been thoroughly explored yet. Model compression could provide the benefits of reducing memory footprints, enabling low-precision computations, and ultimately achieving cost-effective inference. To combine parameter-efficient adaptation and model compression, we propose AlphaTuning, which consists of post-training quantization of the pre-trained language model and fine-tuning only some parts of the quantized parameters for a target task. Specifically, AlphaTuning works by employing binary-coding quantization, which factorizes the full-precision parameters into binary parameters and a separate set of scaling factors. During the adaptation phase, the binary values are frozen for all tasks, while the scaling factors are fine-tuned for the downstream task. We demonstrate that AlphaTuning, when applied to GPT-2 and OPT, performs competitively with full fine-tuning on a variety of downstream tasks while achieving >10x compression ratio under 4-bit quantization and >1,000x reduction in the number of trainable parameters.
Submitted 7 October, 2022;
originally announced October 2022.
-
Channel Modeling for UAV-to-Ground Communications with Posture Variation and Fuselage Scattering Effect
Authors:
Boyu Hua,
Haoran Ni,
Qiuming Zhu,
Cheng-Xiang Wang,
Tongtong Zhou,
Kai Mao,
Junwei Bao,
Xiaofei Zhang
Abstract:
Unmanned aerial vehicle (UAV)-to-ground (U2G) channel models play a pivotal role in reliable communications between UAV and ground terminal. This paper proposes a three-dimensional (3D) non-stationary hybrid model including both large-scale and small-scale fading for U2G multiple-input-multiple-output (MIMO) channels. Distinctive channel characteristics under U2G scenarios, i.e., 3D trajectory and posture of UAV, fuselage scattering effect (FSE), and posture variation fading (PVF), are incorporated into the proposed model. The channel parameters, i.e., path loss (PL), shadow fading (SF), path delay, and path angle, are generated incorporating machine learning (ML) and ray tracing (RT) techniques to capture the structure-related characteristics. In order to guarantee the physical continuity of channel parameters such as Doppler phase and path power, the time evolution methods of inter- and intra-stationary intervals are proposed. Key statistical properties, i.e., temporal autocorrelation function (ACF), power delay profile (PDP), level crossing rate (LCR), average fading duration (AFD), and stationary interval (SI) are given, and the impact of the change of fuselage and posture variation is analyzed. It is demonstrated that both posture variation and fuselage scattering have crucial effects on channel characteristics. The validity and practicability of the proposed model are verified by comparing the simulation results with the measured ones.
△ Less
Submitted 13 October, 2022; v1 submitted 5 October, 2022;
originally announced October 2022.
-
Physical interpretation of nonlocal quantum correlation through local description of subsystems
Authors:
Tanumoy Pramanik,
Xiaojiong Chen,
Yu Xiang,
Xudong Li,
Jun Mao,
Jueming Bao,
Yaohao Deng,
Tianxiang Dai,
Bo Tang,
Yan Yang,
Zhihua Li,
Qihuang Gong,
Qiongyi He,
Jianwei Wang
Abstract:
Characterization and categorization of quantum correlations are both fundamentally and practically important in quantum information science. Although quantum correlations such as non-separability, steerability, and non-locality can be characterized by different theoretical models in different scenarios with either known (trusted) or unknown (untrusted) knowledge of the associated systems, such characterization is sometimes ambiguous to experimentalists. In this work, we propose a physical interpretation of nonlocal quantum correlation between two systems. In the absence of a {\it complete local description} of one of the subsystems quantified by the {\it local uncertainty relation}, the correlation between subsystems becomes nonlocal. Remarkably, different nonlocal quantum correlations can be discriminated from a single uncertainty relation derived under local hidden state (LHS)-LHS model only. We experimentally characterize the two-qubit Werner state in different scenarios.
Submitted 1 October, 2022;
originally announced October 2022.
-
Tunable Exciton-Hybridized Magnon Interactions in a Layered Semiconductor
Authors:
Geoffrey M. Diederich,
John Cenker,
Yafei Ren,
Jordan Fonseca,
Daniel G. Chica,
Youn Jue Bae,
Xiaoyang Zhu,
Xavier Roy,
Ting Cao,
Di Xiao,
Xiaodong Xu
Abstract:
The interaction between distinct excitations in solids is of both fundamental interest and technological importance. One example of such interactions is coupling between an exciton, a Coulomb bound electron-hole pair, and a magnon, a collective spin excitation. The recent emergence of van der Waals magnetic semiconductors provides a powerful platform for exploring these exciton-magnon interactions and their fundamental properties, such as strong correlation, as well as their photo-spintronic and quantum transduction applications. Here we demonstrate precise control of coherent exciton-magnon interactions in the layered magnetic semiconductor CrSBr. We show that by controlling the direction of applied magnetic fields relative to the crystal axes, and thus the rotational symmetry of the magnetic system, we can tune not only the exciton coupling to the bright magnon, but also to an optically dark mode via magnon hybridization. The exciton-magnon coupling and associated magnon dispersion curves can be further modulated by applying a uniaxial strain. At the critical strain, a dispersionless dark magnon band emerges. Our results demonstrate unprecedented control of the opto-mechanical-magnonic coupling, and a step towards the predictable and controllable implementation of hybrid quantum magnonics.
Submitted 27 September, 2022;
originally announced September 2022.
-
Exploring Low Rank Training of Deep Neural Networks
Authors:
Siddhartha Rao Kamalakara,
Acyr Locatelli,
Bharat Venkitesh,
Jimmy Ba,
Yarin Gal,
Aidan N. Gomez
Abstract:
Training deep neural networks in low rank, i.e. with factorised layers, is of particular interest to the community: it offers efficiency over unfactorised training in terms of both memory consumption and training time. Prior work has focused on low rank approximations of pre-trained networks and training in low rank space with additional objectives, offering various ad hoc explanations for chosen practice. We analyse techniques that work well in practice, and through extensive ablations on models such as GPT2 we provide evidence falsifying common beliefs in the field, hinting in the process at exciting research opportunities that still need answering.
Submitted 27 September, 2022;
originally announced September 2022.
-
Theory and Experiments of Pressure-Tunable Broadband Light Emission from Self-Trapped Excitons in Metal Halide Crystals
Authors:
Shenyu Dai,
Xinxin Xing,
Viktor G. Hadjiev,
Zhaojun Qin,
Tian Tong,
Guang Yang,
Chong Wang,
Lijuan Hou,
Liangzi Deng,
Zhiming Wang,
Guoying Feng,
Jiming Bao
Abstract:
Hydrostatic pressure has been commonly applied to tune broadband light emissions from self-trapped excitons (STE) in perovskites for producing white light and studying basic electron-phonon interactions. However, a general theory is still lacking to understand pressure-driven evolution of STE emissions. In this work we first identify a theoretical model that predicts the effect of hydrostatic pressure on STE emission spectrum; we then report the observation of extremely broadband photoluminescence emission and its wide pressure spectral tuning in 2D indirect bandgap CsPb2Br5 crystals. An excellent agreement is found between the theory and experiment on the peculiar experimental observation of STE emission with a nearly constant spectral bandwidth but linearly increasing energy with pressure below 2 GPa. Further analysis by the theory and experiment under higher pressure reveals that two types of STE are involved and respond differently to external pressure. We subsequently surveyed published STE emissions and discovered that most of them show a spectral blue-shift under pressure, as predicted by the theory. The identification of an appropriate theoretical model and its application to STE emission through the coordinate configuration diagram pave the way for engineering the STE emission and basic understanding of electron-phonon interaction.
Submitted 23 September, 2022;
originally announced September 2022.
-
Interior estimates of derivatives and a Liouville type theorem for Parabolic $k$-Hessian equations
Authors:
Jiguang Bao,
Jiechen Qiang,
Zhongwei Tang,
Cong Wang
Abstract:
In this paper, we establish the gradient and Pogorelov estimates for $k$-convex-monotone solutions to parabolic $k$-Hessian equations of the form $-u_tσ_k(λ(D^2u))=ψ(x,t,u)$. We also apply such estimates to obtain a Liouville type result, which states that any $k$-convex-monotone and $C^{4,2}$ solution $u$ to $-u_tσ_k(λ(D^2u))=1$ in $\mathbb{R}^n\times(-\infty,0]$ must be a linear function of $t$ plus a quadratic polynomial of $x$, under some growth assumptions on $u$.
Submitted 13 January, 2023; v1 submitted 22 September, 2022;
originally announced September 2022.
-
A Realistic 3D Non-Stationary Channel Model for UAV-to-Vehicle Communications Incorporating Fuselage Posture
Authors:
Boyu Hua,
Tongtong Zhou,
Qiuming Zhu,
Kai Mao,
Junwei Bao,
Weizhi Zhong,
Naeem Ahmed
Abstract:
Considering the unmanned aerial vehicle (UAV) three-dimensional (3D) posture, a novel 3D non-stationary geometry-based stochastic model (GBSM) is proposed for multiple-input multiple-output (MIMO) UAV-to-vehicle (U2V) channels. It consists of line-of-sight (LoS) and non-line-of-sight (NLoS) components. The factor of fuselage posture is considered by introducing a time-variant 3D posture matrix. Some important statistical properties, i.e. the temporal autocorrelation function (ACF) and spatial cross correlation function (CCF), are derived and investigated. Simulation results show that the fuselage posture has a significant impact on the U2V channel characteristics and aggravates the non-stationarity. The agreements between analytical, simulated, and measured results verify the correctness of the proposed model and derivations. Moreover, it is demonstrated that the proposed model is also compatible with the existing GBSM without considering fuselage posture.
Submitted 19 September, 2022;
originally announced September 2022.
-
ALMA Detection of Dust Trapping around Lagrangian Points in the LkCa 15 Disk
Authors:
Feng Long,
Sean M. Andrews,
Shangjia Zhang,
Chunhua Qi,
Myriam Benisty,
Stefano Facchini,
Andrea Isella,
David J. Wilner,
Jaehan Bae,
Jane Huang,
Ryan A. Loomis,
Karin I. Öberg,
Zhaohuan Zhu
Abstract:
We present deep high-resolution ($\sim$50 mas, 8 au) ALMA 0.88 and 1.3 mm continuum observations of the LkCa 15 disk. The emission morphology shows an inner cavity and three dust rings at both wavelengths, but with slightly narrower rings at the longer wavelength. Along a faint ring at 42 au, we identify two excess emission features at $\sim$10$σ$ significance at both wavelengths: one as an unresolved clump and the other as an extended arc, separated by roughly 120 degrees in azimuth. The clump is unlikely to be a circumplanetary disk (CPD) as the emission peak shifts between the two wavelengths even after accounting for orbital motion. Instead, the morphology of the 42 au ring strongly resembles the characteristic horseshoe orbit produced in planet--disk interaction models, where the clump and the arc trace dust accumulation around Lagrangian points $L_{4}$ and $L_{5}$, respectively. The shape of the 42 au ring, dust trapping in the outer adjacent ring, and the coincidence of the horseshoe ring location with a gap in near-IR scattered light, are all consistent with the scenario of planet sculpting, with the planet likely having a mass between those of Neptune and Saturn. We do not detect point-like emission associated with a CPD around the putative planet location ($0.''27$ in projected separation from the central star at a position angle of $\sim$60\degr), with upper limits of 70 and 33 $μ$Jy at 0.88 and 1.3 mm, respectively, corresponding to dust mass upper limits of 0.02--0.03 $M_{\oplus}$.
Submitted 12 September, 2022;
originally announced September 2022.
-
Reversibly controlled ternary polar states and ferroelectric bias promoted by boosting square-tensile-strain
Authors:
Jun Han Lee,
Nguyen Xuan Duong,
Min-Hyoung Jung,
Hyun-Jae Lee,
Ahyoung Kim,
Youngki Yeo,
Junhyung Kim,
Gye-Hyeon Kim,
Byeong-Gwan Cho,
Jaegyu Kim,
Furqan Ul Hassan Naqvi,
Jong-Seong Bae,
Jeehoon Kim,
Chang Won Ahn,
Young-Min Kim,
Tae Kwon Song,
Jae-Hyeon Ko,
Tae-Yeong Koo,
Changhee Sohn,
Kibog Park,
Chan-Ho Yang,
Sang Mo Yang,
Jun Hee Lee,
Hu Young Jeong,
Tae Heon Kim
, et al. (1 additional authors not shown)
Abstract:
Interaction between dipoles often gives rise to intriguing physical phenomena, such as exchange bias in magnetic heterostructures and the magnetoelectric effect in multiferroics, which lead to advances in multifunctional heterostructures. However, the defect-dipole tends to be considered undesirable because it deteriorates electronic functionality. Here, we report deterministic switching between the ferroelectric and the pinched states by exploiting a new substrate of cubic perovskite, BaZrO$_{3}$, which boosts square-tensile-strain to BaTiO$_{3}$ and promotes four-variant in-plane spontaneous polarization with oxygen vacancy creation. First-principles calculations propose that a complex of an oxygen vacancy and two Ti$^{3+}$ ions forms a charge-neutral defect-dipole. Cooperative control of the defect-dipole and the spontaneous polarization reveals ternary in-plane polar states characterized by biased/pinched hysteresis loops. Furthermore, we experimentally demonstrate that three electrically controlled polar-ordering states lead to switchable and non-volatile dielectric states for application in non-destructive electro-dielectric memory. This discovery opens a new route to develop functional materials via manipulating defect-dipoles and offers a novel platform to advance heteroepitaxy beyond the prevalent perovskite substrates.
Submitted 12 September, 2022;
originally announced September 2022.
-
Effects of Radiative Diffusion on the Dynamical Corotation Torque in Three-Dimensional Protoplanetary Disks
Authors:
Han-Gyeol Yun,
Woong-Tae Kim,
Jaehan Bae,
Cheongho Han
Abstract:
The dynamical corotation torque arising from the deformation of the horseshoe orbits, along with the vortensity gradient in the background disk, is important for determining orbital migration rate and direction of low-mass planets. Previous two-dimensional studies predicted that the dynamical corotation torque is positive, decelerating the inward planet migration. In contrast, recent three-dimensional studies have shown that buoyancy resonance makes the dynamical corotation torque negative, accelerating the inward migration. In this paper, we study the dependence of the dynamical corotation torque on the thermal transport using three-dimensional simulations. We first show that our results are consistent with previous three-dimensional studies when the disk is fully adiabatic. In more realistic radiative disks, however, radiative diffusion suppresses the buoyancy resonance significantly, especially at high-altitude regions, and yields a positive dynamical corotation torque. This alleviates the issue of a rapid migration caused by the negative dynamical corotation torque in the adiabatic disks. Our results suggest that radiative diffusion together with stellar irradiation and accretion heating is needed to accurately describe the migration of low-mass planets.
Submitted 12 September, 2022;
originally announced September 2022.
-
If Influence Functions are the Answer, Then What is the Question?
Authors:
Juhan Bae,
Nathan Ng,
Alston Lo,
Marzyeh Ghassemi,
Roger Grosse
Abstract:
Influence functions efficiently estimate the effect of removing a single training data point on a model's learned parameters. While influence estimates align well with leave-one-out retraining for linear models, recent works have shown this alignment is often poor in neural networks. In this work, we investigate the specific factors that cause this discrepancy by decomposing it into five separate terms. We study the contributions of each term on a variety of architectures and datasets and how they vary with factors such as network width and training time. While practical influence function estimates may be a poor match to leave-one-out retraining for nonlinear networks, we show they are often a good approximation to a different object we term the proximal Bregman response function (PBRF). Since the PBRF can still be used to answer many of the questions motivating influence functions, such as identifying influential or mislabeled examples, our results suggest that current algorithms for influence function estimation give more informative results than previous error analyses would suggest.
Submitted 12 September, 2022;
originally announced September 2022.
-
Design of the ECCE Detector for the Electron Ion Collider
Authors:
J. K. Adkins,
Y. Akiba,
A. Albataineh,
M. Amaryan,
I. C. Arsene,
C. Ayerbe Gayoso,
J. Bae,
X. Bai,
M. D. Baker,
M. Bashkanov,
R. Bellwied,
F. Benmokhtar,
V. Berdnikov,
J. C. Bernauer,
F. Bock,
W. Boeglin,
M. Borysova,
E. Brash,
P. Brindza,
W. J. Briscoe,
M. Brooks,
S. Bueltmann,
M. H. S. Bukhari,
A. Bylinkin,
R. Capobianco
, et al. (259 additional authors not shown)
Abstract:
The EIC Comprehensive Chromodynamics Experiment (ECCE) detector has been designed to address the full scope of the proposed Electron Ion Collider (EIC) physics program as presented by the National Academy of Science and provide a deeper understanding of the quark-gluon structure of matter. To accomplish this, the ECCE detector offers nearly acceptance and energy coverage along with excellent tracking and particle identification. The ECCE detector was designed to be built within the budget envelope set out by the EIC project while simultaneously managing cost and schedule risks. This detector concept has been selected to be the basis for the EIC project detector.
Submitted 11 May, 2023; v1 submitted 6 September, 2022;
originally announced September 2022.
-
Utilizing Post-Hurricane Satellite Imagery to Identify Flooding Damage with Convolutional Neural Networks
Authors:
Jimmy Bao
Abstract:
Post-hurricane damage assessment is crucial towards managing resource allocations and executing an effective response. Traditionally, this evaluation is performed through field reconnaissance, which is slow, hazardous, and arduous. Instead, in this paper we furthered the idea of implementing deep learning through convolutional neural networks in order to classify post-hurricane satellite imagery of buildings as Flooded/Damaged or Undamaged. The experimentation was conducted employing a dataset containing post-hurricane satellite imagery from the Greater Houston area after Hurricane Harvey in 2017. This paper implemented three convolutional neural network model architectures paired with additional model considerations in order to achieve high accuracies (over 99%), reinforcing the effective use of machine learning in post-hurricane disaster assessment.
Submitted 5 September, 2022;
originally announced September 2022.
-
Disk Evolution Study Through Imaging of Nearby Young Stars (DESTINYS): Scattered light detection of a possible disk wind in RY Tau
Authors:
P. -G. Valegård,
C. Ginski,
C. Dominik,
J. Bae,
M. Benisty,
T. Birnstiel,
S. Facchini,
A. Garufi,
M. Hogerheijde,
R. G. van Holstein,
M. Langlois,
C. F. Manara,
P. Pinilla,
Ch. Rab,
Á. Ribas,
L. B. F. M. Waters,
J. Williams
Abstract:
Disk winds are an important mechanism for accretion and disk evolution around young stars. The accreting intermediate-mass T-Tauri star RY Tau has an active jet and a previously known disk wind. Archival optical and new near-infrared observations of the RY Tau system show two horn-like components stretching out as a cone from RY Tau. Scattered light from the disk around RY Tau is visible in near-infrared but not seen at optical wavelengths. In the near-infrared, dark wedges separate the horns from the disk, indicating we may see the scattered light from a disk wind. We use archived ALMA and SPHERE/ZIMPOL I-band observations combined with newly acquired SPHERE/IRDIS H-band observations and available literature to build a simple geometric model of the RY Tau disk and disk wind. We use Monte Carlo radiative transfer modelling \textit{MCMax3D} to create comparable synthetic observations that test the effect of a dusty wind on the optical effect in the observations. We constrain the grain size and dust mass needed in the disk wind to reproduce the effect from the observations. A model geometrically reminiscent of a dusty disk wind with small micron to sub-micron size grains elevated above the disk can reproduce the optical effect seen in the observations. The mass in the obscuring component of the wind has been constrained to $1\times10^{-9} M_{\odot} \leq M \leq 5\times10^{-8} M_{\odot}$ which corresponds to a lower limit mass loss rate in the wind of $\sim 1\times10^{-8}M_{\odot}\mathrm{yr}^{-1}$. While an illuminated dust cavity cannot be ruled out without measurements of the gas velocity, we argue that a magnetically launched disk wind is the most likely scenario.
Submitted 5 October, 2022; v1 submitted 5 September, 2022;
originally announced September 2022.
-
Detector Requirements and Simulation Results for the EIC Exclusive, Diffractive and Tagging Physics Program using the ECCE Detector Concept
Authors:
A. Bylinkin,
C. T. Dean,
S. Fegan,
D. Gangadharan,
K. Gates,
S. J. D. Kay,
I. Korover,
W. B. Li,
X. Li,
R. Montgomery,
D. Nguyen,
G. Penman,
J. R. Pybus,
N. Santiesteban,
R. Trotta,
A. Usman,
M. D. Baker,
J. Frantz,
D. I. Glazier,
D. W. Higinbotham,
T. Horn,
J. Huang,
G. Huber,
R. Reed,
J. Roche
, et al. (258 additional authors not shown)
Abstract:
This article presents a collection of simulation studies using the ECCE detector concept in the context of the EIC's exclusive, diffractive, and tagging physics program, which aims to further explore the rich quark-gluon structure of nucleons and nuclei. To successfully execute the program, ECCE proposed to utilize the detector system close to the beamline to ensure exclusivity and tag ion beam/fragments for a particular reaction of interest. Preliminary studies confirmed the proposed technology and design satisfy the requirements. The projected physics impact results are based on the projected detector performance from the simulation at 10 or 100 fb$^{-1}$ of integrated luminosity. Additionally, a few insights on the potential 2nd Interaction Region (IR) were also documented which could serve as a guidepost for the future development of a second EIC detector.
Submitted 6 March, 2023; v1 submitted 30 August, 2022;
originally announced August 2022.
-
Quiver Yangians and $\mathcal{W}$-Algebras for Generalized Conifolds
Authors:
Jiakang Bao
Abstract:
We focus on quiver Yangians for most generalized conifolds. We construct a coproduct of the quiver Yangian following the similar approach by Guay-Nakajima-Wendlandt. We also prove that the quiver Yangians related by Seiberg duality are indeed isomorphic. Then we discuss their connections to $\mathcal{W}$-algebras analogous to the study by Ueda. In particular, the universal enveloping algebras of the $\mathcal{W}$-algebras are truncations of the quiver Yangians, and therefore they naturally have truncated crystals as their representations.
Submitted 11 November, 2022; v1 submitted 29 August, 2022;
originally announced August 2022.
-
AutoQGS: Auto-Prompt for Low-Resource Knowledge-based Question Generation from SPARQL
Authors:
Guanming Xiong,
Junwei Bao,
Wen Zhao,
Youzheng Wu,
Xiaodong He
Abstract:
This study investigates the task of knowledge-based question generation (KBQG). Conventional KBQG works generated questions from fact triples in the knowledge graph, which could not express complex operations like aggregation and comparison in SPARQL. Moreover, due to the costly annotation of large-scale SPARQL-question pairs, KBQG from SPARQL under low-resource scenarios urgently needs to be explored. Generative pre-trained language models (PLMs) such as T5 and BART, typically trained in a natural language (NL)-to-NL paradigm, have recently proven effective for low-resource generation, but effectively utilizing them to generate NL questions from non-NL SPARQL remains challenging. To address these challenges, AutoQGS, an auto-prompt approach for low-resource KBQG from SPARQL, is proposed. Firstly, we propose generating questions directly from SPARQL for the KBQG task to handle complex operations. Secondly, we propose an auto-prompter trained on large-scale unsupervised data to rephrase SPARQL into an NL description, smoothing the low-resource transformation from non-NL SPARQL to NL question with PLMs. Experimental results on the WebQuestionsSP, ComplexWebQuestions 1.1, and PathQuestions show that our model achieves state-of-the-art performance, especially in low-resource settings. Furthermore, a corpus of 330k factoid complex question-SPARQL pairs is generated for further KBQG research.
Submitted 26 August, 2022;
originally announced August 2022.
-
MaskCLIP: Masked Self-Distillation Advances Contrastive Language-Image Pretraining
Authors:
Xiaoyi Dong,
Jianmin Bao,
Yinglin Zheng,
Ting Zhang,
Dongdong Chen,
Hao Yang,
Ming Zeng,
Weiming Zhang,
Lu Yuan,
Dong Chen,
Fang Wen,
Nenghai Yu
Abstract:
This paper presents a simple yet effective framework MaskCLIP, which incorporates a newly proposed masked self-distillation into contrastive language-image pretraining. The core idea of masked self-distillation is to distill representation from a full image to the representation predicted from a masked image. Such incorporation enjoys two vital benefits. First, masked self-distillation targets local patch representation learning, which is complementary to vision-language contrastive focusing on text-related representation. Second, masked self-distillation is also consistent with vision-language contrastive from the perspective of training objective as both utilize the visual encoder for feature aligning, and thus is able to learn local semantics getting indirect supervision from the language. We provide specially designed experiments with a comprehensive analysis to validate the two benefits. Symmetrically, we also introduce the local semantic supervision into the text branch, which further improves the pretraining performance. With extensive experiments, we show that MaskCLIP, when applied to various challenging downstream tasks, achieves superior results in linear probing, finetuning, and zero-shot performance with the guidance of the language encoder. Code will be released at \url{https://github.com/LightDXY/MaskCLIP}.
Submitted 9 April, 2023; v1 submitted 25 August, 2022;
originally announced August 2022.
-
FurryGAN: High Quality Foreground-aware Image Synthesis
Authors:
Jeongmin Bae,
Mingi Kwon,
Youngjung Uh
Abstract:
Foreground-aware image synthesis aims to generate images as well as their foreground masks. A common approach is to formulate an image as a masked blending of a foreground image and a background image. It is a challenging problem because it is prone to reach the trivial solution where either image overwhelms the other, i.e., the masks become completely full or empty, and the foreground and background are not meaningfully separated. We present FurryGAN with three key components: 1) imposing both the foreground image and the composite image to be realistic, 2) designing a mask as a combination of coarse and fine masks, and 3) guiding the generator by an auxiliary mask predictor in the discriminator. Our method produces realistic images with remarkably detailed alpha masks which cover hair, fur, and whiskers in a fully unsupervised manner.
Submitted 22 August, 2022;
originally announced August 2022.
-
L3: Accelerator-Friendly Lossless Image Format for High-Resolution, High-Throughput DNN Training
Authors:
Jonghyun Bae,
Woohyeon Baek,
Tae Jun Ham,
Jae W. Lee
Abstract:
The training process of deep neural networks (DNNs) is usually pipelined with stages for data preparation on CPUs followed by gradient computation on accelerators like GPUs. In an ideal pipeline, the end-to-end training throughput is eventually limited by the throughput of the accelerator, not by that of data preparation. In the past, the DNN training pipeline achieved a near-optimal throughput by utilizing datasets encoded with a lightweight, lossy image format like JPEG. However, as high-resolution, losslessly-encoded datasets become more popular for applications requiring high accuracy, a performance problem arises in the data preparation stage due to low-throughput image decoding on the CPU. Thus, we propose L3, a custom lightweight, lossless image format for high-resolution, high-throughput DNN training. The decoding process of L3 is effectively parallelized on the accelerator, thus minimizing CPU intervention for data preparation during DNN training. L3 achieves a 9.29x higher data preparation throughput than PNG, the most popular lossless image format, for the Cityscapes dataset on NVIDIA A100 GPU, which leads to 1.71x higher end-to-end training throughput. Compared to JPEG and WebP, two popular lossy image formats, L3 provides up to 1.77x and 2.87x higher end-to-end training throughput for ImageNet, respectively, at equivalent metric performance.
Submitted 18 August, 2022;
originally announced August 2022.
-
Detection of Intracluster Globular Clusters in the First JWST Images of the Gravitational Lens Cluster SMACS J0723.3-7327 at z = 0.39
Authors:
Myung Gyoon Lee,
Jang Ho Bae,
In Sung Jang
Abstract:
We present a survey of globular clusters (GCs) in the massive gravitational lens cluster SMACS J0723.3-7327 at $z=0.39$ based on the early released JWST/NIRCam images. In the color-magnitude diagrams of the point sources we find clearly a rich population of intracluster GCs that spread in a wide area of the cluster. Their ages, considering the cluster redshift, are younger than 9.5 Gyr. The F200W (AB) magnitudes of these GCs, $26.5<{F200W_0} <29.5$ mag, correspond to $-15.2<{M_{F200W}} <-12.2$ mag, showing that they belong to the brightest GCs (including ultracompact dwarfs). The spatial distributions of these GCs show a megaparsec-scale structure elongated along the major axis of the brightest cluster galaxy. In addition, they show a large number of substructures, some of which are consistent with the substructures seen in the map of diffuse intracluster light. The GC number density map is, in general, consistent with the dark matter mass density map based on the strong lensing analysis in the literature. The radial number density profile of the GCs in the outer region is steeper than the dark matter mass profile obtained from lensing models. These results are consistent with those for the GCs found in the deep HST images of Abell 2744, another massive cluster at $z=0.308$, and in simulated galaxy clusters. This shows that the intracluster GCs are an excellent independent tool to probe the dark matter distribution in galaxy clusters as well as to reveal the cluster assembly history in the JWST era.
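As a quick consistency check on the magnitudes quoted above, the offset between the apparent and absolute F200W ranges is the same at both ends. This is only arithmetic on the numbers in the abstract; which corrections (for example a K-correction or extinction) are folded into that offset is not stated here.

```latex
\begin{align*}
F200W_0 - M_{F200W} &= 26.5 - (-15.2) = 41.7 \ \text{mag (bright end)}, \\
F200W_0 - M_{F200W} &= 29.5 - (-12.2) = 41.7 \ \text{mag (faint end)}.
\end{align*}
```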
Submitted 12 October, 2022; v1 submitted 9 August, 2022;
originally announced August 2022.
-
Mapping the Complex Kinematic Substructure in the TW Hya Disk
Authors:
Richard Teague,
Jaehan Bae,
Sean M. Andrews,
Myriam Benisty,
Edwin A. Bergin,
Stefano Facchini,
Jane Huang,
Cristiano Longarini,
David Wilner
Abstract:
We present ALMA observations of CO $J = 2-1$ and CS $J = 5-4$ emission from the disk around TW Hydrae. Both molecules trace a predominantly Keplerian velocity structure, although a slowing of the rotation velocity is detected at the outer edge of the disk beyond $\approx 140$ au in CO emission. This was attributed to the enhanced pressure support from the gas density taper near the outer edge of the disk. Subtraction of an azimuthally symmetric background velocity structure reveals localized deviations in the gas kinematics traced by each of the molecules. Both CO and CS exhibit a 'Doppler flip' feature, centered nearly along the minor axis of the disk (${\rm PA} \sim 60^\circ$) at a radius of $1.35''$, coinciding with the large gap observed in scattered light and mm continuum. In addition, the CO emission, both through changes in intensity and its kinematics, traces a tightly wound spiral, previously seen with higher frequency CO $J = 3-2$ observations (Teague et al., 2019). Through comparison with linear models of the spiral wakes generated by embedded planets, we interpret these features in the context of interactions with a Saturn-mass planet within the gap at a position angle of ${\rm PA} = 60^\circ$, consistent with the theoretical predictions of Mentiplay et al. (2019). The lack of a corresponding spiral in the CS emission is attributed to the strong vertical dependence of the buoyancy spirals, which are believed to grow only in the atmosphere of the disk rather than in the regions traced by CS emission.
Submitted 9 August, 2022;
originally announced August 2022.
-
Spin and charge density waves in the quasi-one-dimensional KMn6Bi5
Authors:
Jin-Ke Bao,
Huibo Cao,
Matthew J. Krogstad,
Keith M. Taddei,
Chenfei Shi,
Shixun Cao,
Saul H. Lapidus,
Sander van Smaalen,
Duck Young Chung,
Mercouri G. Kanatzidis,
Stephan Rosenkranz,
Omar Chmaissem
Abstract:
AMn6Bi5 materials (A = Na, K, Rb and Cs) consisting of unique Mn-cluster chains emerge as a new family of superconductors with the suppression of their antiferromagnetic (AFM) order under high pressures. Here, we report transverse incommensurate spin density waves (SDWs) for the Mn atoms with a propagating direction along the chain axes as a ground state for KMn6Bi5 by single crystal neutron diffraction. The SDWs have a refined amplitude of ~2.46 Bohr magnetons for the Mn atoms in the pentagons and ~0.29 Bohr magnetons with a large standard deviation for Mn atoms in the center between the pentagons. AFM coupling dominates both the nearest-neighbor Mn-Mn interactions within the pentagon and the next-nearest-neighbor Mn-Mn interactions out of the pentagon (along the propagating wave). The SDWs exhibit both local and itinerant characteristics, probably formed by a cooperative interaction between local magnetic exchange and conduction electrons. A significant magnetoelastic effect during the AFM transition, especially along the chain direction, has been demonstrated by temperature-dependent x-ray powder diffraction. Single crystal x-ray diffraction below the AFM transition revealed satellite peaks originating from charge density waves along the chain direction with a q-vector twice as large as the SDW one, pointing to a strong real space coupling between them. Our work not only demonstrates a fascinating interplay among spin, charge, lattice and one dimensionality to trigger intertwined orders in KMn6Bi5 but also provides an important piece of information on the magnetic structure of the parent compound for understanding the mechanism of superconductivity in this new family.
Submitted 4 August, 2022;
originally announced August 2022.
-
Numerical and modeling error assessment of large-eddy simulation using direct-numerical-simulation-aided large-eddy simulation
Authors:
H. Jane Bae,
Adrian Lozano-Duran
Abstract:
We study the numerical errors of large-eddy simulation (LES) in isotropic and wall-bounded turbulence. A direct-numerical-simulation (DNS)-aided LES formulation, where the subgrid-scale (SGS) term of the LES is computed by using filtered DNS data, is introduced. We first verify that this formulation has zero error in the absence of commutation error between the filter and the differentiation operator of the numerical algorithm. This method allows the evaluation of the time evolution of numerical errors for various numerical schemes at grid resolutions relevant to LES. The analysis shows that the numerical errors are of the same order of magnitude as the modeling errors and often cancel each other. This supports the idea that supervised machine learning algorithms trained on filtered DNS data might not be suitable for robust SGS model development, as this approach disregards the existence of numerical errors in the system that accumulate over time. The assessment of errors in turbulent channel flow also identifies that numerical errors close to the wall dominate, which has implications for the development of wall models.
Submitted 3 August, 2022;
originally announced August 2022.
-
Composable Text Controls in Latent Space with ODEs
Authors:
Guangyi Liu,
Zeyu Feng,
Yuan Gao,
Zichao Yang,
Xiaodan Liang,
Junwei Bao,
Xiaodong He,
Shuguang Cui,
Zhen Li,
Zhiting Hu
Abstract:
Real-world text applications often involve composing a wide range of text control operations, such as editing the text w.r.t. an attribute, manipulating keywords and structure, and generating new text of desired properties. Prior work typically learns/finetunes a language model (LM) to perform individual or specific subsets of operations. Recent research has studied combining operations in a plug-and-play manner, often with costly search or optimization in the complex sequence space. This paper proposes a new efficient approach for composable text operations in the compact latent space of text. The low-dimensionality and differentiability of the text latent vector allow us to develop an efficient sampler based on ordinary differential equations (ODEs) given arbitrary plug-in operators (e.g., attribute classifiers). By connecting pretrained LMs (e.g., GPT2) to the latent space through efficient adaption, we then decode the sampled vectors into desired text sequences. The flexible approach permits diverse control operators (sentiment, tense, formality, keywords, etc.) acquired using any relevant data from different domains. Experiments show that composing those operators within our approach manages to generate or edit high-quality text, substantially improving over previous methods in terms of generation quality and efficiency.
Submitted 6 November, 2023; v1 submitted 1 August, 2022;
originally announced August 2022.
-
The Trivial Bound of Entropic Uncertainty Relations
Authors:
Minu J. Bae
Abstract:
Entropic uncertainty relations underpin the computation of quantitative security bounds in quantum cryptographic applications, such as quantum random number generation (QRNG) and quantum key distribution (QKD). All security proofs derive a relation between the information accessible to the legitimate parties and the maximum knowledge that an adversary, Eve, may have gained, exploiting entropic uncertainty relations to lower bound Eve's uncertainty about the raw key generated by one party, Alice. The standard entropic uncertainty relation uses the smooth min- and max-entropies to show the security of these applications by computing the overlap of two incompatible measurements or positive-operator valued measures (POVMs). This paper presents one case of the POVM version of the standard entropic uncertainty relation that yields the trivial bound, since the maximum overlap of the POVMs always produces the trivial value, "one." As a result, it fails to bound the smooth min-entropy and so cannot show the security of the quantum cryptographic application.
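To make the role of the overlap concrete, here is a minimal sketch that is not taken from the paper. It assumes the simplest setting of two projective measurements given by orthonormal bases, where the overlap is the largest squared inner product between basis vectors and the familiar bound is $\log_2(1/c)$. The function name `overlap` and the example bases are illustrative choices. Measuring the same basis twice is used only to show what an overlap of one does to the bound; it is not the POVM case studied in the paper.

```python
import math

def overlap(basis_a, basis_b):
    """Largest squared inner product between vectors of two orthonormal bases."""
    return max(
        abs(sum(a.conjugate() * b for a, b in zip(u, v))) ** 2
        for u in basis_a
        for v in basis_b
    )

s = 1.0 / math.sqrt(2.0)
z_basis = [(1.0, 0.0), (0.0, 1.0)]  # computational basis
x_basis = [(s, s), (s, -s)]         # Hadamard basis

# Incompatible measurements: overlap 1/2, so the bound is one bit.
c_incompatible = overlap(z_basis, x_basis)
bound_incompatible = -math.log2(c_incompatible)

# Same measurement twice: overlap 1, so the bound collapses to zero.
c_trivial = overlap(z_basis, z_basis)
bound_trivial = -math.log2(c_trivial)
```

When the overlap equals one, the logarithm vanishes and the relation says nothing about the adversary's uncertainty, which is the sense in which the abstract calls the bound trivial.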
Submitted 19 January, 2023; v1 submitted 30 July, 2022;
originally announced August 2022.
-
ECCE unpolarized TMD measurements
Authors:
R. Seidl,
A. Vladimirov,
J. K. Adkins,
Y. Akiba,
A. Albataineh,
M. Amaryan,
I. C. Arsene,
C. Ayerbe Gayoso,
J. Bae,
X. Bai,
M. D. Baker,
M. Bashkanov,
R. Bellwied,
F. Benmokhtar,
V. Berdnikov,
J. C. Bernauer,
F. Bock,
W. Boeglin,
M. Borysova,
E. Brash,
P. Brindza,
W. J. Briscoe,
M. Brooks,
S. Bueltmann,
M. H. S. Bukhari
, et al. (258 additional authors not shown)
Abstract:
We performed feasibility studies for various measurements that are related to unpolarized TMD distribution and fragmentation functions. The processes studied include semi-inclusive deep inelastic scattering (SIDIS) where single hadrons (pions and kaons) were detected in addition to the scattered DIS lepton. The single hadron cross sections and multiplicities were extracted as a function of the DIS variables $x$ and $Q^2$, as well as the semi-inclusive variables $z$, which corresponds to the momentum fraction the detected hadron carries relative to the struck parton, and $P_T$, which corresponds to the transverse momentum of the detected hadron relative to the virtual photon. The expected statistical precision of such measurements is extrapolated to accumulated luminosities of 10 fb$^{-1}$ and potential systematic uncertainties are approximated given the deviations between true and reconstructed yields.
Submitted 22 July, 2022;
originally announced July 2022.
-
ECCE Sensitivity Studies for Single Hadron Transverse Single Spin Asymmetry Measurements
Authors:
R. Seidl,
A. Vladimirov,
D. Pitonyak,
A. Prokudin,
J. K. Adkins,
Y. Akiba,
A. Albataineh,
M. Amaryan,
I. C. Arsene,
C. Ayerbe Gayoso,
J. Bae,
X. Bai,
M. D. Baker,
M. Bashkanov,
R. Bellwied,
F. Benmokhtar,
V. Berdnikov,
J. C. Bernauer,
F. Bock,
W. Boeglin,
M. Borysova,
E. Brash,
P. Brindza,
W. J. Briscoe,
M. Brooks
, et al. (260 additional authors not shown)
Abstract:
We performed feasibility studies for various single transverse spin measurements that are related to the Sivers effect, transversity and the tensor charge, and the Collins fragmentation function. The processes studied include semi-inclusive deep inelastic scattering (SIDIS) where single hadrons (pions and kaons) were detected in addition to the scattered DIS lepton. The data were obtained in PYTHIA6 and GEANT4 simulated e+p collisions at 18 GeV on 275 GeV, 18 on 100, 10 on 100, and 5 on 41 that use the ECCE detector configuration. Typical DIS kinematics were selected, most notably $Q^2 > 1$ GeV$^2$, and cover the $x$ range from $10^{-4}$ to $1$. The single spin asymmetries were extracted as a function of $x$ and $Q^2$, as well as the semi-inclusive variables $z$ and $P_T$. They are obtained in azimuthal moments in combinations of the azimuthal angles of the hadron transverse momentum and transverse spin of the nucleon relative to the lepton scattering plane. The initially unpolarized Monte Carlo was re-weighted in the true kinematic variables, hadron types and parton flavors based on global fits of fixed target SIDIS experiments and $e^+e^-$ annihilation data. The expected statistical precision of such measurements is extrapolated to 10 fb$^{-1}$ and potential systematic uncertainties are approximated given the deviations between true and reconstructed yields. The impact on the knowledge of the Sivers functions, transversity and tensor charges, and the Collins function has then been evaluated in the same phenomenological extractions as in the Yellow Report. The impact is found to be comparable to that obtained with the parameterized Yellow Report detector and shows that the ECCE detector configuration can fulfill the physics goals on these quantities.
Submitted 22 July, 2022;
originally announced July 2022.
-
Open Heavy Flavor Studies for the ECCE Detector at the Electron Ion Collider
Authors:
X. Li,
J. K. Adkins,
Y. Akiba,
A. Albataineh,
M. Amaryan,
I. C. Arsene,
C. Ayerbe Gayoso,
J. Bae,
X. Bai,
M. D. Baker,
M. Bashkanov,
R. Bellwied,
F. Benmokhtar,
V. Berdnikov,
J. C. Bernauer,
F. Bock,
W. Boeglin,
M. Borysova,
E. Brash,
P. Brindza,
W. J. Briscoe,
M. Brooks,
S. Bueltmann,
M. H. S. Bukhari,
A. Bylinkin
, et al. (262 additional authors not shown)
Abstract:
The ECCE detector has been recommended as the selected reference detector for the future Electron-Ion Collider (EIC). A series of simulation studies have been carried out to validate the physics feasibility of the ECCE detector. In this paper, detailed studies of heavy flavor hadron and jet reconstruction and physics projections with the ECCE detector performance and different magnet options will be presented. The ECCE detector has enabled precise EIC heavy flavor hadron and jet measurements with a broad kinematic coverage. These proposed heavy flavor measurements will help systematically study the hadronization process in vacuum and nuclear medium especially in the underexplored kinematic region.
Submitted 23 July, 2022; v1 submitted 21 July, 2022;
originally announced July 2022.
-
Normalized solutions to lower critical Choquard equation with a local perturbation
Authors:
Xinfu Li,
Jianguang Bao,
Wenguang Tang
Abstract:
In this paper, we study the existence and non-existence of normalized solutions to the lower critical Choquard equation with a local perturbation \begin{equation*} \begin{cases} -Δu+λu=γ(I_α\ast|u|^{\frac{N+α}{N}})|u|^{\frac{N+α}{N}-2}u+μ|u|^{q-2}u,\quad \text{in}\ \mathbb{R}^N, \\ \int_{\mathbb{R}^N}|u|^2dx=c^2, \end{cases} \end{equation*} where $γ, μ, c>0$, $2<q\leq 2+\frac{4}{N}$, and $λ\in\mathbb{R}$ is an unknown parameter that appears as a Lagrange multiplier. The results of this paper about this equation answer some questions proposed by Yao, Chen, Rǎdulescu and Sun [SIAM J. Math. Anal., 54(3) (2022), 3696-3723]. Moreover, based on the results obtained, we study the multiplicity of normalized solutions to the non-autonomous Choquard equation \begin{equation*} \begin{cases} -Δu+λu=(I_α\ast [h(εx)|u|^{\frac{N+α}{N}}])h(εx)|u|^{\frac{N+α}{N}-2}u+μ|u|^{q-2}u,\ x\in \mathbb{R}^N, \\ \int_{\mathbb{R}^N}|u|^2dx=c^2, \end{cases} \end{equation*} where $ε>0$, $2<q<2+\frac{4}{N}$, and $h$ is a positive and continuous function. It is proved that the number of normalized solutions is at least the number of global maximum points of $h$ when $ε$ is small enough.
Submitted 18 August, 2022; v1 submitted 21 July, 2022;
originally announced July 2022.
-
Exclusive J/$ψ$ Detection and Physics with ECCE
Authors:
X. Li,
J. K. Adkins,
Y. Akiba,
A. Albataineh,
M. Amaryan,
I. C. Arsene,
C. Ayerbe Gayoso,
J. Bae,
X. Bai,
M. D. Baker,
M. Bashkanov,
R. Bellwied,
F. Benmokhtar,
V. Berdnikov,
J. C. Bernauer,
F. Bock,
W. Boeglin,
M. Borysova,
E. Brash,
P. Brindza,
W. J. Briscoe,
M. Brooks,
S. Bueltmann,
M. H. S. Bukhari,
A. Bylinkin
, et al. (262 additional authors not shown)
Abstract:
Exclusive heavy quarkonium photoproduction is one of the most popular processes in EIC, which has a large cross section and a simple final state. Due to the gluonic nature of the exchange Pomeron, this process can be related to the gluon distributions in the nucleus. The momentum transfer dependence of this process is sensitive to the interaction sites, which provides a powerful tool to probe the spatial distribution of gluons in the nucleus. Recently the problem of the origin of hadron mass has received lots of attention in determining the anomaly contribution $M_{a}$. The trace anomaly is sensitive to the gluon condensate, and exclusive production of quarkonia such as J/$ψ$ and $Υ$ can serve as a sensitive probe to constrain it. In this paper, we present the performance of the ECCE detector for exclusive J/$ψ$ detection and the capability of this process to investigate the above physics opportunities with ECCE.
Submitted 21 July, 2022;
originally announced July 2022.
-
Search for $e\toτ$ Charged Lepton Flavor Violation at the EIC with the ECCE Detector
Authors:
J. -L. Zhang,
S. Mantry,
J. K. Adkins,
Y. Akiba,
A. Albataineh,
M. Amaryan,
I. C. Arsene,
C. Ayerbe Gayoso,
J. Bae,
X. Bai,
M. D. Baker,
M. Bashkanov,
R. Bellwied,
F. Benmokhtar,
V. Berdnikov,
J. C. Bernauer,
F. Bock,
W. Boeglin,
M. Borysova,
E. Brash,
P. Brindza,
W. J. Briscoe,
M. Brooks,
S. Bueltmann,
M. H. S. Bukhari
, et al. (262 additional authors not shown)
Abstract:
The recently approved Electron-Ion Collider (EIC) will provide a unique new opportunity for searches of charged lepton flavor violation (CLFV) and other new physics scenarios. In contrast to the $e \leftrightarrow μ$ CLFV transition for which very stringent limits exist, there is still a relatively large discovery space for the $e \to τ$ CLFV transition, potentially to be explored by the EIC. With the latest detector design of ECCE (EIC Comprehensive Chromodynamics Experiment) and projected integral luminosity of the EIC, we find the $τ$-leptons created in the DIS process $ep\to τX$ are expected to be identified with high efficiency. A first ECCE simulation study, restricted to the 3-prong $τ$-decay mode and with limited statistics for the Standard Model backgrounds, estimates that the EIC will be able to improve the current exclusion limit on $e\to τ$ CLFV by an order of magnitude.
Submitted 20 July, 2022;
originally announced July 2022.
-
Design and Simulated Performance of Calorimetry Systems for the ECCE Detector at the Electron Ion Collider
Authors:
F. Bock,
N. Schmidt,
P. K. Wang,
N. Santiesteban,
T. Horn,
J. Huang,
J. Lajoie,
C. Munoz Camacho,
J. K. Adkins,
Y. Akiba,
A. Albataineh,
M. Amaryan,
I. C. Arsene,
C. Ayerbe Gayoso,
J. Bae,
X. Bai,
M. D. Baker,
M. Bashkanov,
R. Bellwied,
F. Benmokhtar,
V. Berdnikov,
J. C. Bernauer,
W. Boeglin,
M. Borysova,
E. Brash
, et al. (263 additional authors not shown)
Abstract:
We describe the design and performance of the calorimeter systems used in the ECCE detector design to achieve the overall performance specifications cost-effectively, with careful consideration of appropriate technical and schedule risks. The calorimeter systems consist of three electromagnetic calorimeters, covering the combined pseudorapidity range from -3.7 to 3.8, and two hadronic calorimeters. Key calorimeter performance metrics, which include energy and position resolutions, reconstruction efficiency, and particle identification, are presented.
Submitted 19 July, 2022;
originally announced July 2022.
-
Quantum Walk Random Number Generation: Memory-based Models
Authors:
Minu J. Bae
Abstract:
The semi-source independent quantum walk random number generator (SI-QW-QRNG) is a cryptographic protocol that extracts a string of true random bits from a quantum random walk in which an adversary controls the randomness source but the dimension of the system is known. This paper analyzes SI-QW-QRNG protocols with a memory-based quantum walk state. The new protocol utilizes a generalized coin operator with various parameters to optimize the randomness of the quantum walk state. We focus on evaluations of the protocols in multiple scenarios and walk configurations. Moreover, we show some interesting behavior of the system depending on the size of the memory space and the number of quantum coins.
Submitted 11 October, 2022; v1 submitted 18 July, 2022;
originally announced July 2022.
-
Bootstrapped Masked Autoencoders for Vision BERT Pretraining
Authors:
Xiaoyi Dong,
Jianmin Bao,
Ting Zhang,
Dongdong Chen,
Weiming Zhang,
Lu Yuan,
Dong Chen,
Fang Wen,
Nenghai Yu
Abstract:
We propose bootstrapped masked autoencoders (BootMAE), a new approach for vision BERT pretraining. BootMAE improves the original masked autoencoders (MAE) with two core designs: 1) momentum encoder that provides online feature as extra BERT prediction targets; 2) target-aware decoder that tries to reduce the pressure on the encoder to memorize target-specific information in BERT pretraining. The first design is motivated by the observation that using a pretrained MAE to extract the features as the BERT prediction target for masked tokens can achieve better pretraining performance. Therefore, we add a momentum encoder in parallel with the original MAE encoder, which bootstraps the pretraining performance by using its own representation as the BERT prediction target. In the second design, we introduce target-specific information (e.g., pixel values of unmasked patches) from the encoder directly to the decoder to reduce the pressure on the encoder of memorizing the target-specific information. Thus, the encoder focuses on semantic modeling, which is the goal of BERT pretraining, and does not need to waste its capacity in memorizing the information of unmasked tokens related to the prediction target. Through extensive experiments, our BootMAE achieves $84.2\%$ Top-1 accuracy on ImageNet-1K with ViT-B backbone, outperforming MAE by $+0.8\%$ under the same pre-training epochs. BootMAE also gets $+1.0$ mIoU improvements on semantic segmentation on ADE20K and $+1.3$ box AP, $+1.4$ mask AP improvement on object detection and segmentation on COCO dataset. Code is released at https://github.com/LightDXY/BootMAE.
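The abstract does not say how the momentum encoder is updated. A common choice for momentum encoders in self-supervised learning is an exponential moving average (EMA) of the online encoder's parameters, and the sketch below assumes that choice. The function name `ema_update`, the decay value, and the toy parameter lists are all hypothetical and stand in for real network weights.

```python
def ema_update(momentum_params, online_params, decay=0.999):
    """Return momentum-encoder parameters after one EMA step.

    Assumption: the momentum encoder tracks the online encoder through an
    exponential moving average; `decay` is an illustrative value.
    """
    return [
        decay * m + (1.0 - decay) * o
        for m, o in zip(momentum_params, online_params)
    ]

# Toy parameters standing in for encoder weights (hypothetical values).
online = [1.0, -2.0, 0.5]
momentum = [0.0, 0.0, 0.0]

# Three update steps with a fixed online encoder; the momentum copy
# moves slowly toward the online parameters.
for _ in range(3):
    momentum = ema_update(momentum, online, decay=0.9)
```

Under this assumption the slowly moving copy gives more stable prediction targets than the online encoder itself, which is the usual motivation for bootstrapping from a momentum encoder.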
Submitted 14 July, 2022;
originally announced July 2022.