-
Is Flash Attention Stable?
Authors:
Alicia Golden,
Samuel Hsia,
Fei Sun,
Bilge Acun,
Basil Hosmer,
Yejin Lee,
Zachary DeVito,
Jeff Johnson,
Gu-Yeon Wei,
David Brooks,
Carole-Jean Wu
Abstract:
Training large-scale machine learning models poses distinct system challenges, given both the size and complexity of today's workloads. Recently, many organizations training state-of-the-art Generative AI models have reported cases of instability during training, often taking the form of loss spikes. Numeric deviation has emerged as a potential cause of this training instability, although quantifying this is especially challenging given the costly nature of training runs. In this work, we develop a principled approach to understanding the effects of numeric deviation, and construct proxies to put observations into context when downstream effects are difficult to quantify. As a case study, we apply this framework to analyze the widely-adopted Flash Attention optimization. We find that Flash Attention sees roughly an order of magnitude more numeric deviation as compared to Baseline Attention at BF16 when measured during an isolated forward pass. We then use a data-driven analysis based on the Wasserstein Distance to provide upper bounds on how this numeric deviation impacts model weights during training, finding that the numerical deviation present in Flash Attention is 2-5 times less significant than low-precision training.
Submitted 4 May, 2024;
originally announced May 2024.
-
Generative AI Beyond LLMs: System Implications of Multi-Modal Generation
Authors:
Alicia Golden,
Samuel Hsia,
Fei Sun,
Bilge Acun,
Basil Hosmer,
Yejin Lee,
Zachary DeVito,
Jeff Johnson,
Gu-Yeon Wei,
David Brooks,
Carole-Jean Wu
Abstract:
As the development of large-scale Generative AI models evolve beyond text (1D) generation to include image (2D) and video (3D) generation, processing spatial and temporal information presents unique challenges to quality, performance, and efficiency. We present the first work towards understanding this new system design space for multi-modal text-to-image (TTI) and text-to-video (TTV) generation models. Current model architecture designs are bifurcated into 2 categories: Diffusion- and Transformer-based models. Our systematic performance characterization on a suite of eight representative TTI/TTV models shows that after state-of-the-art optimization techniques such as Flash Attention are applied, Convolution accounts for up to 44% of execution time for Diffusion-based TTI models, while Linear layers consume up to 49% of execution time for Transformer-based models. We additionally observe that Diffusion-based TTI models resemble the Prefill stage of LLM inference, and benefit from 1.1-2.5x greater speedup from Flash Attention than Transformer-based TTI models that resemble the Decode phase. Since optimizations designed for LLMs do not map directly onto TTI/TTV models, we must conduct a thorough characterization of these workloads to gain insights for new optimization opportunities. In doing so, we define sequence length in the context of TTI/TTV models and observe sequence length can vary up to 4x in Diffusion model inference. We additionally observe temporal aspects of TTV workloads pose unique system bottlenecks, with Temporal Attention accounting for over 60% of total Attention time. Overall, our in-depth system performance characterization is a critical first step towards designing efficient and deployable systems for emerging TTI/TTV workloads.
Submitted 5 May, 2024; v1 submitted 21 December, 2023;
originally announced December 2023.
-
Gemini: A Family of Highly Capable Multimodal Models
Authors:
Gemini Team,
Rohan Anil,
Sebastian Borgeaud,
Jean-Baptiste Alayrac,
Jiahui Yu,
Radu Soricut,
Johan Schalkwyk,
Andrew M. Dai,
Anja Hauth,
Katie Millican,
David Silver,
Melvin Johnson,
Ioannis Antonoglou,
Julian Schrittwieser,
Amelia Glaese,
Jilin Chen,
Emily Pitler,
Timothy Lillicrap,
Angeliki Lazaridou,
Orhan Firat,
James Molloy,
Michael Isard,
Paul R. Barham,
Tom Hennigan,
Benjamin Lee
, et al. (1325 additional authors not shown)
Abstract:
This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultra model advances the state of the art in 30 of 32 of these benchmarks - notably being the first model to achieve human-expert performance on the well-studied exam benchmark MMLU, and improving the state of the art in every one of the 20 multimodal benchmarks we examined. We believe that the new capabilities of the Gemini family in cross-modal reasoning and language understanding will enable a wide variety of use cases. We discuss our approach toward post-training and deploying Gemini models responsibly to users through services including Gemini, Gemini Advanced, Google AI Studio, and Cloud Vertex AI.
Submitted 17 June, 2024; v1 submitted 18 December, 2023;
originally announced December 2023.
-
Chat Vector: A Simple Approach to Equip LLMs with Instruction Following and Model Alignment in New Languages
Authors:
Shih-Cheng Huang,
Pin-Zu Li,
Yu-Chi Hsu,
Kuang-Ming Chen,
Yu Tung Lin,
Shih-Kai Hsiao,
Richard Tzong-Han Tsai,
Hung-yi Lee
Abstract:
Recently, the development of open-source large language models (LLMs) has advanced rapidly. Nevertheless, due to data constraints, the capabilities of most open-source LLMs are primarily focused on English. To address this issue, we introduce the concept of $\textit{chat vector}$ to equip pre-trained language models with instruction following and human value alignment via simple model arithmetic. The chat vector is derived by subtracting the weights of a pre-trained base model (e.g. LLaMA2) from those of its corresponding chat model (e.g. LLaMA2-chat). By simply adding the chat vector to a continual pre-trained model's weights, we can endow the model with chat capabilities in new languages without the need for further training. Our empirical studies demonstrate the superior efficacy of the chat vector from three different aspects: instruction following, toxicity mitigation, and multi-turn dialogue. Moreover, to showcase the adaptability of our approach, we extend our experiments to encompass various languages, base models, and chat vectors. The results underscore the chat vector's simplicity, effectiveness, and wide applicability, making it a compelling solution for efficiently enabling conversational capabilities in pre-trained language models. Our code is available at https://github.com/aqweteddy/ChatVector.
Submitted 7 June, 2024; v1 submitted 7 October, 2023;
originally announced October 2023.
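The model arithmetic described in the Chat Vector abstract can be sketched as follows. This is a minimal illustration only, not the authors' code: the weight dictionaries, the parameter name `layer.weight`, and the helper names `chat_vector` and `add_chat_vector` are hypothetical, and real checkpoints would hold tensors rather than short Python lists.

```python
from typing import Dict, List

# Hypothetical stand-in for a checkpoint: parameter name -> flat list of weights.
Weights = Dict[str, List[float]]


def chat_vector(chat: Weights, base: Weights) -> Weights:
    """Subtract the base model's weights from the chat model's weights."""
    return {
        name: [c - b for c, b in zip(chat[name], base[name])]
        for name in base
    }


def add_chat_vector(continual: Weights, vector: Weights) -> Weights:
    """Add the chat vector to a continually pre-trained model's weights."""
    return {
        name: [w + v for w, v in zip(continual[name], vector[name])]
        for name in continual
    }


# Toy values chosen only for illustration.
base = {"layer.weight": [0.10, -0.20, 0.30]}       # pre-trained base model
chat = {"layer.weight": [0.15, -0.25, 0.30]}       # corresponding chat model
continual = {"layer.weight": [0.40, 0.00, 0.10]}   # model adapted to a new language

vector = chat_vector(chat, base)
merged = add_chat_vector(continual, vector)
```

The sketch assumes all three models share the same architecture and parameter names, which is what makes element-wise subtraction and addition meaningful.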
-
MAD Max Beyond Single-Node: Enabling Large Machine Learning Model Acceleration on Distributed Systems
Authors:
Samuel Hsia,
Alicia Golden,
Bilge Acun,
Newsha Ardalani,
Zachary DeVito,
Gu-Yeon Wei,
David Brooks,
Carole-Jean Wu
Abstract:
Training and deploying large-scale machine learning models is time-consuming, requires significant distributed computing infrastructures, and incurs high operational costs. Our analysis, grounded in real-world large model training on datacenter-scale infrastructures, reveals that 14~32% of all GPU hours are spent on communication with no overlapping computation. To minimize this outstanding communication latency and other inherent at-scale inefficiencies, we introduce an agile performance modeling framework, MAD-Max. This framework is designed to optimize parallelization strategies and facilitate hardware-software co-design opportunities. Through the application of MAD-Max to a suite of real-world large-scale ML models on state-of-the-art GPU clusters, we showcase potential throughput enhancements of up to 2.24x for pre-training and up to 5.2x for inference scenarios, respectively.
Submitted 10 June, 2024; v1 submitted 4 October, 2023;
originally announced October 2023.
-
LoRA-like Calibration for Multimodal Deception Detection using ATSFace Data
Authors:
Shun-Wen Hsiao,
Cheng-Yuan Sun
Abstract:
Deception detection in videos of people has recently attracted considerable attention and has many potential applications. AI models in this domain achieve high accuracy, but they tend to be non-interpretable black boxes. We introduce an attention-aware neural network addressing challenges inherent in video data and deception dynamics. This model, through its continuous assessment of visual, audio, and text features, pinpoints deceptive cues. We employ a multimodal fusion strategy that enhances accuracy; our approach yields a 92\% accuracy rate on a real-life trial dataset. Most importantly, the model indicates the attention focus in the videos, providing valuable insights into deception cues. Hence, our method adeptly detects deceit and elucidates the underlying process. We further enriched our study with an experiment involving students answering questions either truthfully or deceitfully, resulting in a new dataset of 309 video clips, named ATSFace. Using this dataset, we also introduce a calibration method, inspired by Low-Rank Adaptation (LoRA), to refine individual-based deception detection accuracy.
Submitted 4 September, 2023;
originally announced September 2023.
-
MP-Rec: Hardware-Software Co-Design to Enable Multi-Path Recommendation
Authors:
Samuel Hsia,
Udit Gupta,
Bilge Acun,
Newsha Ardalani,
Pan Zhong,
Gu-Yeon Wei,
David Brooks,
Carole-Jean Wu
Abstract:
Deep learning recommendation systems serve personalized content under diverse tail-latency targets and input-query loads. In order to do so, state-of-the-art recommendation models rely on terabyte-scale embedding tables to learn user preferences over large bodies of contents. The reliance on a fixed embedding representation of embedding tables not only imposes significant memory capacity and bandwidth requirements but also limits the scope of compatible system solutions. This paper challenges the assumption of fixed embedding representations by showing how synergies between embedding representations and hardware platforms can lead to improvements in both algorithmic- and system performance. Based on our characterization of various embedding representations, we propose a hybrid embedding representation that achieves higher quality embeddings at the cost of increased memory and compute requirements. To address the system performance challenges of the hybrid representation, we propose MP-Rec -- a co-design technique that exploits heterogeneity and dynamic selection of embedding representations and underlying hardware platforms.
On real system hardware, we demonstrate how matching custom accelerators, i.e., GPUs, TPUs, and IPUs, with compatible embedding representations can lead to 16.65x performance speedup. Additionally, in query-serving scenarios, MP-Rec achieves 2.49x and 3.76x higher correct prediction throughput and 0.19% and 0.22% better model quality on a CPU-GPU system for the Kaggle and Terabyte datasets, respectively.
Submitted 21 February, 2023;
originally announced February 2023.
-
Attack Tactic Identification by Transfer Learning of Language Model
Authors:
Ling-Hsuan Lin,
Shun-Wen Hsiao
Abstract:
Cybersecurity has become a primary global concern with the rapid increase in security attacks and data breaches. Artificial intelligence is promising to help humans analyzing and identifying attacks. However, labeling millions of packets for supervised learning is never easy. This study aims to leverage transfer learning technique that stores the knowledge gained from well-defined attack lifecycle documents and applies it to hundred thousands of unlabeled attacks (packets) for identifying their attack tactics. We anticipate the knowledge of an attack is well-described in the documents, and the cutting edge transformer-based language model can embed the knowledge into a high-dimensional latent space. Then, reusing the information from the language model for the learning of attack tactic carried by packets to improve the learning efficiency. We propose a system, PELAT, that fine-tunes BERT model with 1,417 articles from MITRE ATT&CK lifecycle framework to enhance its attack knowledge (including syntax used and semantic meanings embedded). PELAT then transfers its knowledge to perform semi-supervised learning for unlabeled packets to generate their tactic labels. Further, when a new attack packet arrives, the packet payload will be processed by the PELAT language model with a downstream classifier to predict its tactics. In this way, we can effectively reduce the burden of manually labeling big datasets. In a one-week honeypot attack dataset (227 thousand packets per day), PELAT performs 99% of precision, recall, and F1 on testing dataset. PELAT can infer over 99% of tactics on two other testing datasets (while nearly 90% of tactics are identified).
Submitted 1 September, 2022;
originally announced September 2022.
-
Exploring Nanofibrous Networks with X-ray Photon Correlation Spectroscopy
Authors:
Tomas Rosén,
HongRui He,
Ruifu Wang,
Korneliya Gordeyeva,
Ahmad Reza Motezakker,
Andrei Fluerasu,
L. Daniel Söderberg,
Benjamin S. Hsiao
Abstract:
Nanofibrous networks are the foundation and natural building strategy for all life forms on our planet. Apart from providing structural integrity to cells and tissues, they also provide a porous scaffold allowing transport of substances, where the resulting properties rely on the nanoscale network structure. Recently, there has been a great deal of interest in extracting and reassembling biobased nanofibers to create sustainable, advanced materials with applications ranging from high-performance textiles to artificial tissues. However, achieving structural control of the extracted nanofibers is challenging as it is strongly dependent on the extraction methods and source materials. Furthermore, the small nanofiber cross-sections and fast Brownian dynamics make them notoriously difficult to characterize in dispersions. In this work, we study the diffusive motion of spherical gold nanoparticles in semi-dilute networks of cellulose nanofibers (CNFs) using X-ray Photon Correlation Spectroscopy (XPCS). We find that the motion becomes increasingly subdiffusive with higher CNF concentration, where the dynamics can be decomposed into several superdiffusive relaxation modes in reciprocal space. Using simulations of confined Brownian dynamics in combination with simulated XPCS-experiments, we observe that the dynamic modes can be connected to pore sizes and inter-pore transport properties in the network. The demonstrated analytical strategy by combining experiments using tracer particles with a digital twin may be the key to understand nanoscale properties of nanofibrous networks.
Submitted 18 August, 2022;
originally announced August 2022.
-
Sequence Feature Extraction for Malware Family Analysis via Graph Neural Network
Authors:
S. W. Hsiao,
P. Y. Chu
Abstract:
Malicious software (malware) causes much harm to our devices and life. We are eager to understand the malware behavior and the threat it made. Most of the record files of malware are variable length and text-based files with time stamps, such as event log data and dynamic analysis profiles. Using the time stamps, we can sort such data into sequence-based data for the following analysis. However, dealing with the text-based sequences with variable lengths is difficult. In addition, unlike natural language text data, most sequential data in information security have specific properties and structure, such as loop, repeated call, noise, etc. To deeply analyze the API call sequences with their structure, we use graphs to represent the sequences, which can further investigate the information and structure, such as the Markov model. Therefore, we design and implement an Attention Aware Graph Neural Network (AWGCN) to analyze the API call sequences. Through AWGCN, we can obtain the sequence embeddings to analyze the behavior of the malware. Moreover, the classification experiment result shows that AWGCN outperforms other classifiers in the call-like datasets, and the embedding can further improve the classic model's performance.
Submitted 10 August, 2022;
originally announced August 2022.
-
Temporally-ultralong biphotons with a linewidth of 50 kHz
Authors:
Yu-Sheng Wang,
Kai-Bo Li,
Chao-Feng Chang,
Tan-Wen Lin,
Jian-Qing Li,
Shih-Si Hsiao,
Jia-Mou Chen,
Yi-Hua Lai,
Ying-Cheng Chen,
Yong-Fan Chen,
Chih-Sung Chuu,
Ite A. Yu
Abstract:
We report the generation of biphotons, with a temporal full width at the half maximum (FWHM) of 13.4$\pm$0.3 $μ$s and a spectral FWHM of 50$\pm$1 kHz, via the process of spontaneous four-wave mixing. The temporal width is the longest, and the spectral linewidth is the narrowest up to date. This is also the first biphoton result that obtains a linewidth below 100 kHz, reaching a new milestone. The very long biphoton wave packet has a signal-to-background ratio of 3.4, which violates the Cauchy-Schwarz inequality for classical light by 4.8 folds. Furthermore, we demonstrated a highly-tunable-linewidth biphoton source and showed that while the biphoton source's temporal and spectral width were controllably varied by about 24 folds, its generation rate only changed by less than 15\%. A spectral brightness or generation rate per pump power per linewidth of 1.2$\times$10$^6$ pairs/(s$\cdot$mW$\cdot$MHz) was achieved at the temporal width of 13.4 $μ$s. The above results were made possible by the low decoherence rate and high optical depth of the experimental system, as well as the nearly phase-mismatch-free scheme employed in the experiment. This work has demonstrated a high-efficiency ultranarrow-linewidth biphoton source, and has made a substantial advancement in the quantum technology utilizing heralded single photons.
Submitted 27 May, 2022;
originally announced May 2022.
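The relation between the quoted signal-to-background ratio and the Cauchy-Schwarz violation factor can be checked with a short calculation. This is a sketch under two assumptions that the abstract does not state: the peak cross-correlation is taken as 1 + SBR, and both autocorrelations are taken as 2 (the value for thermal-like statistics). The function names are hypothetical.

```python
def peak_cross_correlation(sbr: float) -> float:
    """Assumed relation: peak cross-correlation = 1 + signal-to-background ratio."""
    return 1.0 + sbr


def cauchy_schwarz_violation(g_cross: float,
                             g_auto_signal: float = 2.0,
                             g_auto_idler: float = 2.0) -> float:
    """Factor by which g_cross^2 exceeds the classical bound g_auto_signal * g_auto_idler.

    Classical light satisfies g_cross^2 <= g_auto_signal * g_auto_idler, so a
    ratio above 1 indicates a violation. The default autocorrelation values
    of 2 are an assumption, not a measured quantity from the abstract.
    """
    return g_cross ** 2 / (g_auto_signal * g_auto_idler)


# SBR of 3.4 as quoted in the abstract above.
violation = cauchy_schwarz_violation(peak_cross_correlation(3.4))
```

Under these assumptions the ratio comes out near 4.8, in line with the figure quoted in the abstract. Measured autocorrelations below 2 would raise the factor.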
-
Temporal profile of biphotons generated from a hot atomic vapor and spectrum of electromagnetically induced transparency
Authors:
Shih-Si Hsiao,
Wei-Kai Huang,
Yi-Min Lin,
Jia-Mou Chen,
Chia-Yu Hsu,
Ite A. Yu
Abstract:
We systematically studied the temporal profile of biphotons, i.e., pairs of time-correlated single photons, generated from a hot atomic vapor via the spontaneous four-wave mixing process. The measured temporal width of biphoton wave packet or two-photon correlation function against the coupling power was varied from about 70 to 580 ns. We derived an analytical expression of the biphoton's spectral profile in the Doppler-broadened medium. The analytical expression reveals that the spectral profile is mainly determined by the effect of electromagnetically induced transparency (EIT), and behaves like a Lorentzian function with a linewidth approximately equal to the EIT linewidth. Consequently, the biphoton's temporal profile influenced by the Doppler broadening is an exponential-decay function, which was consistent with the experimental data. Employing a weak input probe field of classical light, we further measured the EIT spectra under the same experimental conditions as those in the biphoton measurements. The theoretical predictions of the biphoton wave packets calculated with the parameters determined by the classical-light EIT spectra are consistent with the experimental data. The consistency demonstrates that in the Doppler-broadened medium, the classical-light EIT spectrum is a good indicator for the biphoton's temporal profile. Besides, the measured biphoton's temporal widths well approximated to the predictions of the analytical formula based on the biphoton's EIT effect. This study provides an analytical way to quantitatively understand the biphoton's spectral and temporal profiles in the Doppler-broadened medium.
Submitted 5 April, 2022; v1 submitted 29 March, 2022;
originally announced March 2022.
-
Increasing decoherence rate of Rydberg polaritons due to accumulating dark Rydberg atoms
Authors:
Ko-Tang Chen,
Bongjune Kim,
Chia-Chen Su,
Shih-Si Hsiao,
Shou-Jou Huang,
Wen-Te Liao,
Ite A. Yu
Abstract:
We experimentally observed an accumulative type of nonlinear attenuation and distortion of slow light, i.e., Rydberg polaritons, with the Rydberg state $|32D_{5/2}\rangle$ in the weak-interaction regime. The present effect of attenuation and distortion cannot be explained by considering only the dipole-dipole interaction (DDI) between Rydberg atoms in $|32D_{5/2}\rangle$. Our observation can be attributed to the atoms in the dark Rydberg states other than those in the bright Rydberg state, i.e., $|32D_{5/2}\rangle$, driven by the coupling field. The dark Rydberg states are all the possible states, in which the population decaying from $|32D_{5/2}\rangle$ accumulated over time, and they were not driven by the coupling field. Consequently, the DDI between the dark and bright Rydberg atoms increased the decoherence rate of the Rydberg polaritons. We performed three different experiments to verify the above hypothesis, to confirm the existence of the dark Rydberg states, and to measure the decay rate from the bright to dark Rydberg states. In the theoretical model, we included the decay process from the bright to dark Rydberg states and the DDI effect induced by both the bright and dark Rydberg atoms. All the experimental data of slow light taken at various probe Rabi frequencies were in good agreement with the theoretical predictions based on the model. This study pointed out an additional decoherence rate in the Rydberg-EIT effect, and provides a better understanding of the Rydberg-polariton system.
Submitted 9 April, 2022; v1 submitted 13 October, 2021;
originally announced October 2021.
-
Room-temperature biphoton source with a spectral brightness near the ultimate limit
Authors:
Jia-Mou Chen,
Chia-Yu Hsu,
Wei-Kai Huang,
Shih-Si Hsiao,
Fu-Chen Huang,
Yi-Hsin Chen,
Chih-Sung Chuu,
Ying-Cheng Chen,
Yong-Fan Chen,
Ite A. Yu
Abstract:
The biphotons, generated from a hot atomic vapor via the process of spontaneous four-wave mixing (SFWM), have the following merits: stable and tunable frequencies as well as linewidth. Such merits are very useful in the applications of long-distance quantum communication. However, the hot-atom SFWM biphoton sources previously had far lower values of generation rate per linewidth, i.e., spectral brightness, as compared with the sources of biphotons generated by the spontaneous parametric down conversion (SPDC) process. Here, we report a hot-atom SFWM source of biphotons with a linewidth of 960 kHz and a generation rate of 3.7$\times$ $10^5$ pairs/s. The high generation rate, together with the narrow linewidth, results in a spectral brightness of 3.8$\times$ $10^5$ pairs/s/MHz, which is 17 times of the previous best result with atomic vapors and also better than all known results with all kinds of media. The all-copropagating scheme together with a large optical depth (OD) of the atomic vapor is the key improvement, enabling the achieved spectral brightness to be about one quarter of the ultimate limit. Furthermore, this biphoton source had a signal-to-background ratio (SBR) of 2.7, which violated the Cauchy-Schwartz inequality for classical light by about 3.6 folds. Although an increasing spectral brightness usually leads to a decreasing SBR, our systematic study indicates that both of the present spectral brightness and SBR can be enhanced by further increasing the OD. This work demonstrates a significant advancement and provides useful knowledge in the quantum technology using photons.
Submitted 8 May, 2022; v1 submitted 19 September, 2021;
originally announced September 2021.
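Spectral brightness, defined in the abstract above as generation rate per linewidth, can be reproduced from the two quoted figures. The function name and the unit handling below are illustrative choices, not taken from the paper.

```python
def spectral_brightness(rate_pairs_per_s: float, linewidth_hz: float) -> float:
    """Generation rate per linewidth, in pairs/s/MHz."""
    linewidth_mhz = linewidth_hz / 1e6
    return rate_pairs_per_s / linewidth_mhz


# Figures quoted in the abstract: 3.7e5 pairs/s at a linewidth of 960 kHz.
brightness = spectral_brightness(3.7e5, 960e3)
```

With the rounded inputs from the abstract, the result is about 3.85e5 pairs/s/MHz, consistent with the quoted 3.8$\times$ $10^5$ pairs/s/MHz once rounding of the inputs is taken into account.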
-
RecPipe: Co-designing Models and Hardware to Jointly Optimize Recommendation Quality and Performance
Authors:
Udit Gupta,
Samuel Hsia,
Jeff Zhang,
Mark Wilkening,
Javin Pombra,
Hsien-Hsin S. Lee,
Gu-Yeon Wei,
Carole-Jean Wu,
David Brooks
Abstract:
Deep learning recommendation systems must provide high quality, personalized content under strict tail-latency targets and high system loads. This paper presents RecPipe, a system to jointly optimize recommendation quality and inference performance. Central to RecPipe is decomposing recommendation models into multi-stage pipelines to maintain quality while reducing compute complexity and exposing distinct parallelism opportunities. RecPipe implements an inference scheduler to map multi-stage recommendation engines onto commodity, heterogeneous platforms (e.g., CPUs, GPUs). While the hardware-aware scheduling improves ranking efficiency, the commodity platforms suffer from many limitations requiring specialized hardware. Thus, we design RecPipeAccel (RPAccel), a custom accelerator that jointly optimizes quality, tail-latency, and system throughput. RPAccel is designed specifically to exploit the distinct design space opened via RecPipe. In particular, RPAccel processes queries in sub-batches to pipeline recommendation stages, implements dual static and dynamic embedding caches, a set of top-k filtering units, and a reconfigurable systolic array. Compared to prior-art and at iso-quality, we demonstrate that RPAccel improves latency and throughput by 3x and 6x.
Submitted 22 May, 2021; v1 submitted 18 May, 2021;
originally announced May 2021.
-
RecSSD: Near Data Processing for Solid State Drive Based Recommendation Inference
Authors:
Mark Wilkening,
Udit Gupta,
Samuel Hsia,
Caroline Trippel,
Carole-Jean Wu,
David Brooks,
Gu-Yeon Wei
Abstract:
Neural personalized recommendation models are used across a wide variety of datacenter applications including search, social media, and entertainment. State-of-the-art models comprise large embedding tables that have billions of parameters requiring large memory capacities. Unfortunately, large and fast DRAM-based memories levy high infrastructure costs. Conventional SSD-based storage solutions offer an order of magnitude larger capacity, but have worse read latency and bandwidth, degrading inference performance. RecSSD is a near data processing based SSD memory system customized for neural recommendation inference that reduces end-to-end model inference latency by 2X compared to using COTS SSDs across eight industry-representative models.
Submitted 29 January, 2021;
originally announced February 2021.
-
Generation of sub-MHz and spectrally-bright biphotons from hot atomic vapors with a phase mismatch-free scheme
Authors:
Chia-Yu Hsu,
Yu-Sheng Wang,
Jia-Mou Chen,
Fu-Chen Huang,
Yi-Ting Ke,
Emily Kay Huang,
Weilun Hung,
Kai-Lin Chao,
Shih-Si Hsiao,
Yi-Hsin Chen,
Chih-Sung Chuu,
Ying-Cheng Chen,
Yong-Fan Chen,
Ite A. Yu
Abstract:
We utilized the all-copropagating scheme, which maintains the phase-match condition, in the spontaneous four-wave mixing (SFWM) process to generate biphotons from a hot atomic vapor. The scheme enables our biphotons not only to surpass those in the previous works of hot-atom SFWM, but also to compete with the biphotons that are generated by either the cold-atom SFWM or the cavity-assisted spontaneous parametric down conversion. The biphoton linewidth in this work is tunable over an order of magnitude. With the linewidth tuned to 610 kHz, the maximum two-photon correlation function, $g_{s,as}^{(2)}$, of the biphotons is 42. This $g_{s,as}^{(2)}$ violates the Cauchy-Schwarz inequality for classical light by a factor of 440, and demonstrates that the biphotons have a high purity. The generation rate per linewidth of the 610-kHz biphoton source is 1,500 pairs/(s$\cdot$MHz), which is the best result of all the sub-MHz biphoton sources in the literature. By increasing the pump power 16-fold, we further enhanced the generation rate per linewidth to 2.3$\times$10$^4$ pairs/(s$\cdot$MHz), while the maximum $g_{s,as}^{(2)}$ became 6.7. In addition, we are able to tune the linewidth down to 290$\pm$20 kHz. This is the narrowest linewidth to date, among all the various kinds of single-mode biphotons.
Submitted 2 February, 2021; v1 submitted 9 December, 2020;
originally announced December 2020.
-
Cross-Stack Workload Characterization of Deep Recommendation Systems
Authors:
Samuel Hsia,
Udit Gupta,
Mark Wilkening,
Carole-Jean Wu,
Gu-Yeon Wei,
David Brooks
Abstract:
Deep learning based recommendation systems form the backbone of most personalized cloud services. Though the computer architecture community has recently started to take notice of deep recommendation inference, the resulting solutions have taken wildly different approaches - ranging from near memory processing to at-scale optimizations. To better design future hardware systems for deep recommendation inference, we must first systematically examine and characterize the underlying systems-level impact of design decisions across the different levels of the execution stack. In this paper, we characterize eight industry-representative deep recommendation models at three different levels of the execution stack: algorithms and software, systems platforms, and hardware microarchitectures. Through this cross-stack characterization, we first show that system deployment choices (i.e., CPUs or GPUs, batch size granularity) can give us up to 15x speedup. To better understand the bottlenecks for further optimization, we look at both software operator usage breakdown and CPU frontend and backend microarchitectural inefficiencies. Finally, we model the correlation between key algorithmic model architecture features and hardware bottlenecks, revealing the absence of a single dominant algorithmic component behind each hardware bottleneck.
Submitted 10 October, 2020;
originally announced October 2020.
-
A Weakly-Interacting Many-Body System of Rydberg Polaritons Based on Electromagnetically Induced Transparency
Authors:
Bongjune Kim,
Ko-Tang Chen,
Shih-Si Hsiao,
Sheng-Yang Wang,
Kai-Bo Li,
Julius Ruseckas,
Gediminas Juzeliunas,
Teodora Kirova,
Marcis Auzinsh,
Ying-Cheng Chen,
Yong-Fan Chen,
Ite A. Yu
Abstract:
We proposed utilizing a medium with a high optical depth (OD) and a Rydberg state of low principal quantum number, $n$, to create a weakly-interacting many-body system of Rydberg polaritons, based on the effect of electromagnetically induced transparency (EIT). We experimentally verified the mean field approach to weakly-interacting Rydberg polaritons, and observed the phase shift and attenuation induced by the dipole-dipole interaction (DDI). The DDI-induced phase shift or attenuation can be viewed as a consequence of the elastic or inelastic collisions among the Rydberg polaritons. Using a weakly-interacting system, we further observed that a larger DDI strength caused the width of the momentum distribution of Rydberg polaritons at the exit of the system to become notably smaller than that at the entrance. In this study, we took $n = 32$ and an atomic (or polariton) density of 5$\times10^{10}$ (or 2$\times10^{9}$) cm$^{-3}$. The observations demonstrate that the elastic collisions are sufficient to drive the thermalization process in this weakly-interacting many-body system. The combination of the $μ$s-long interaction time due to the high-OD EIT medium and the $μ$m$^2$-size collision cross section due to the DDI suggests a new and feasible platform for the Bose-Einstein condensation of the Rydberg polaritons.
Submitted 24 May, 2021; v1 submitted 24 June, 2020;
originally announced June 2020.
-
Mean field theory of weakly-interacting Rydberg polaritons in the EIT system based on the nearest-neighbor distribution
Authors:
Shih-Si Hsiao,
Ko-Tang Chen,
Ite A. Yu
Abstract:
The combination of high optical nonlinearity in the electromagnetically induced transparency (EIT) effect and strong electric dipole-dipole interaction (DDI) among the Rydberg-state atoms can lead to important applications in quantum information processing and many-body physics. One can utilize the Rydberg-EIT system in the strongly-interacting regime to mediate photon-photon interaction or qubit-qubit operation. One can also employ the Rydberg-EIT system in the weakly-interacting regime to study the Bose-Einstein condensation of Rydberg polaritons. Most of the present theoretical models dealt with the strongly-interacting cases. Here, we consider the weakly-interacting regime and develop a mean field model based on the nearest-neighbor distribution. Using the mean field model, we further derive the analytical formulas for the attenuation coefficient and phase shift of the output probe field. The predictions from the formulas are consistent with the experimental data in the weakly-interacting regime, verifying the validity of our model. As the DDI-induced phase shift and attenuation can be seen as the consequences of elastic and inelastic collisions among particles, this work provides a very useful tool for conceiving ideas relevant to the EIT system of weakly-interacting Rydberg polaritons, and for evaluating experimental feasibility.
Submitted 11 September, 2020; v1 submitted 16 June, 2020;
originally announced June 2020.
-
DeepRecSys: A System for Optimizing End-To-End At-scale Neural Recommendation Inference
Authors:
Udit Gupta,
Samuel Hsia,
Vikram Saraph,
Xiaodong Wang,
Brandon Reagen,
Gu-Yeon Wei,
Hsien-Hsin S. Lee,
David Brooks,
Carole-Jean Wu
Abstract:
Neural personalized recommendation is the cornerstone of a wide collection of cloud services and products, constituting significant compute demand of the cloud infrastructure. Thus, improving the execution efficiency of neural recommendation directly translates into infrastructure capacity saving. In this paper, we devise a novel end-to-end modeling infrastructure, DeepRecInfra, that adopts an algorithm and system co-design methodology to custom-design systems for recommendation use cases. Leveraging the insights from the recommendation characterization, a new dynamic scheduler, DeepRecSched, is proposed to maximize latency-bounded throughput by taking into account characteristics of inference query size and arrival patterns, recommendation model architectures, and underlying hardware systems. By doing so, system throughput is doubled across the eight industry-representative recommendation models. Finally, design, deployment, and evaluation in an at-scale production datacenter show over 30% latency reduction across a wide variety of recommendation models running on hundreds of machines.
Submitted 8 January, 2020;
originally announced January 2020.
-
Effect of laser frequency fluctuation on the decay rate of Rydberg coherence
Authors:
Bongjune Kim,
Ko-Tang Chen,
Chia-Yu Hsu,
Shih-Si Hsiao,
Yu-Chih Tseng,
Chin-Yuan Lee,
Shih-Lun Liang,
Yi-Hua Lai,
Julius Ruseckas,
Gediminas Juzeliunas,
Ite A. Yu
Abstract:
The effect of electromagnetically induced transparency (EIT) combined with Rydberg-state atoms provides high optical nonlinearity to efficiently mediate the photon-photon interaction. However, the decay rate of Rydberg coherence, i.e., the decoherence rate, plays an important role in optical nonlinear efficiency, and can be largely influenced by laser frequency fluctuation. In this work, we carried out a systematic study of the effect of laser frequency fluctuation on the decoherence rate. We derived an analytical formula that quantitatively describes the relationship between the decoherence rate and laser frequency fluctuation. The formula was experimentally verified by using the $Λ$-type EIT system of laser-cooled $^{87}$Rb atoms, in which one can either completely eliminate or controllably introduce the effect of laser frequency fluctuation. We also included the effect of Doppler shift caused by the atomic thermal motion in the formula, which can be negligible in the $Λ$-type EIT experiment but significant in the Rydberg-EIT experiment. Using atoms at 350 $μ$K, we studied the decoherence rate in the Rydberg-EIT system involving the state $|32D_{5/2}\rangle$. The experimental data are consistent with the predictions from the formula. We were able to achieve a rather low decoherence rate of $2π\times$48 kHz at a moderate coupling Rabi frequency of $2π\times$4.3 MHz.
Submitted 13 June, 2019; v1 submitted 26 February, 2019;
originally announced February 2019.
-
Virtual Machine Introspection Based Malware Behavior Profiling and Family Grouping
Authors:
Shun-Wen Hsiao,
Yeali S. Sun,
Meng Chang Chen
Abstract:
The proliferation of malware has been attributed to alterations of a handful of original malware source codes. Malware altered from the same origin shares some intrinsic behaviors and forms a malware family. Promptly identifying the family of a malware sample when it is first seen on the Internet can provide useful clues for mitigating the threat. In this paper, a malware profiler (VMP) is proposed to profile the execution behaviors of malware by leveraging the virtual machine introspection (VMI) technique. The VMP inserts plug-ins inside the virtual machine monitor (VMM) to record the invoked API calls with their input parameters and return values as the profile of the malware. A popular similarity measurement, the Jaccard distance, and a phylogenetic tree construction method are adopted to discover malware families. The studies of malware profiles show that samples from the same malware family are very similar to each other and distinct from other malware families as well as benign software. This paper also examines VMP against existing anti-malware detection engines and some well-known malware grouping methods to compare the quality of their malware family constructions. A peer voting approach is proposed and the results show VMP is better than almost all of the compared anti-malware engines, and compatible with the fine-tuned text-mining approach and high-order N-gram approaches. We also establish a malware profiling website based on VMP for malware research.
Submitted 4 May, 2017;
originally announced May 2017.
-
Supercharacters, symmetric functions in noncommuting variables, and related Hopf algebras
Authors:
Marcelo Aguiar,
Carlos Andre,
Carolina Benedetti,
Nantel Bergeron,
Zhi Chen,
Persi Diaconis,
Anders Hendrickson,
Samuel Hsiao,
I. Martin Isaacs,
Andrea Jedwab,
Kenneth Johnson,
Gizem Karaali,
Aaron Lauve,
Tung Le,
Stephen Lewis,
Huilan Li,
Kay Magaard,
Eric Marberg,
Jean-Christophe Novelli,
Amy Pang,
Franco Saliola,
Lenny Tevlin,
Jean-Yves Thibon,
Nathaniel Thiem,
Vidya Venkateswaran
, et al. (3 additional authors not shown)
Abstract:
We identify two seemingly disparate structures: supercharacters, a useful way of doing Fourier analysis on the group of unipotent upper triangular matrices with coefficients in a finite field, and the ring of symmetric functions in noncommuting variables. Each is a Hopf algebra and the two are isomorphic as such. This allows developments in each to be transferred. The identification suggests a rich class of examples for the emerging field of combinatorial Hopf algebras.
Submitted 22 December, 2011; v1 submitted 21 September, 2010;
originally announced September 2010.
-
Multigraded combinatorial Hopf algebras and refinements of odd and even subalgebras
Authors:
Samuel K. Hsiao,
Gizem Karaali
Abstract:
We develop a theory of multigraded (i.e., $N^l$-graded) combinatorial Hopf algebras modeled on the theory of graded combinatorial Hopf algebras developed by Aguiar, Bergeron, and Sottile [Compos. Math. 142 (2006), 1--30]. In particular we introduce the notion of canonical $k$-odd and $k$-even subalgebras associated with any multigraded combinatorial Hopf algebra, extending simultaneously the work of Aguiar et al. and Ehrenborg. Among our results are specific categorical results for higher level quasisymmetric functions, several basis change formulas, and a generalization of the descents-to-peaks map.
Submitted 10 February, 2011; v1 submitted 30 October, 2009;
originally announced October 2009.
-
A semigroup approach to wreath-product extensions of Solomon's descent algebras
Authors:
Samuel K. Hsiao
Abstract:
There is a well-known combinatorial definition, based on ordered set partitions, of the semigroup of faces of the braid arrangement. We generalize this definition to obtain a semigroup Sigma_n^G associated with G wr S_n, the wreath product of the symmetric group S_n with an arbitrary group G. Techniques of Bidigare and Brown are adapted to construct an anti-homomorphism from the S_n-invariant subalgebra of the semigroup algebra of Sigma_n^G into the group algebra of G wr S_n. The generalized descent algebras of Mantaci and Reutenauer are obtained as homomorphic images when G is abelian.
Submitted 15 October, 2007; v1 submitted 10 October, 2007;
originally announced October 2007.
-
Random walks on quasisymmetric functions
Authors:
Patricia Hersh,
Samuel K. Hsiao
Abstract:
Conditions are provided under which an endomorphism on quasisymmetric functions gives rise to a left random walk on the descent algebra which is also a lumping of a left random walk on permutations. Spectral results are also obtained. Several well-studied random walks are now realized this way: Stanley's QS-distribution results from endomorphisms given by evaluation maps, a-shuffles result from the a-th convolution power of the universal character, and the Tchebyshev operator of the second kind introduced recently by Ehrenborg and Readdy yields traditional riffle shuffles. A conjecture of Ehrenborg regarding the spectra for a family of random walks on ab-words is proven. A theorem of Stembridge from the theory of enriched P-partitions is also recovered as a special case.
Submitted 10 September, 2007;
originally announced September 2007.
-
Peak Quasisymmetric Functions and Eulerian Enumeration
Authors:
Louis J. Billera,
Samuel K. Hsiao,
Stephanie van Willigenburg
Abstract:
Via duality of Hopf algebras, there is a direct association between peak quasisymmetric functions and enumeration of chains in Eulerian posets. We study this association explicitly, showing that the notion of $\mathbf{cd}$-index, long studied in the context of convex polytopes and Eulerian posets, arises as the dual basis to a natural basis of peak quasisymmetric functions introduced by Stembridge. Thus Eulerian posets having a nonnegative $\mathbf{cd}$-index (for example, face lattices of convex polytopes) correspond to peak quasisymmetric functions having a nonnegative representation in terms of this basis. We diagonalize the operator that associates the basis of descent sets for all quasisymmetric functions to that of peak sets for the algebra of peak functions, and study the $g$-polynomial for Eulerian posets as an algebra homomorphism.
Submitted 23 June, 2007;
originally announced June 2007.
-
Colored posets and colored quasisymmetric functions
Authors:
Samuel K. Hsiao,
T. Kyle Petersen
Abstract:
The colored quasisymmetric functions, like the classic quasisymmetric functions, are known to form a Hopf algebra with a natural peak subalgebra. We show how these algebras arise as the image of the algebra of colored posets. To effect this approach we introduce colored analogs of $P$-partitions and enriched $P$-partitions. We also frame our results in terms of Aguiar, Bergeron, and Sottile's theory of combinatorial Hopf algebras and its colored analog.
Submitted 31 October, 2006;
originally announced October 2006.
-
The Hopf algebras of type B quasisymmetric functions and peak functions
Authors:
Samuel K. Hsiao,
T. Kyle Petersen
Abstract:
We show that with the appropriate choice of coproduct, the type B quasisymmetric functions form a Hopf algebra, and the recently introduced type B peak functions form a Hopf subalgebra.
Submitted 31 October, 2006;
originally announced October 2006.
-
Enumeration in convex geometries and associated polytopal subdivisions of spheres
Authors:
Louis J. Billera,
Samuel K. Hsiao,
J. Scott Provan
Abstract:
We construct CW spheres from the lattices that arise as the closed sets of a convex closure, the meet-distributive lattices. These spheres are nearly polytopal, in the sense that their barycentric subdivisions are simplicial polytopes. The complete information on the numbers of faces and chains of faces in these spheres can be obtained from the defining lattices in a manner analogous to the relation between arrangements of hyperplanes and their underlying geometric intersection lattices.
Submitted 19 April, 2006; v1 submitted 26 May, 2005;
originally announced May 2005.
-
Canonical characters on quasi-symmetric functions and bivariate Catalan numbers
Authors:
Marcelo Aguiar,
Samuel K. Hsiao
Abstract:
Every character on a graded connected Hopf algebra decomposes uniquely as a product of an even character and an odd character (Aguiar, Bergeron, and Sottile, math.CO/0310016).
We obtain explicit formulas for the even and odd parts of the universal character on the Hopf algebra of quasi-symmetric functions. They can be described in terms of Legendre's beta function evaluated at half-integers, or in terms of bivariate Catalan numbers:
$$ C(m,n)=\frac{(2m)!(2n)!}{m!(m+n)!n!}. $$
Properties of characters and of quasi-symmetric functions are then used to derive several interesting identities among bivariate Catalan numbers and in particular among Catalan numbers and central binomial coefficients.
Submitted 4 August, 2004;
originally announced August 2004.
-
Pseudovector versus pseudoscalar coupling in kaon photoproduction - revisited
Authors:
S. S. Hsiao,
D. H. Lu,
Shin Nan Yang
Abstract:
The question of pseudovector versus pseudoscalar coupling schemes for the kaon-hyperon-nucleon interaction is re-examined for the reaction $γp\to K^+ Λ$ in several isobaric models. These models typically include Born terms, $K^*$- and $K_1$-exchange in the $t$-channel, and a few different combinations of spin-1/2 baryon resonances in the $s$- and $u$-channels. The coupling constants are obtained by fitting to a large data set. We find that both pseudoscalar and pseudovector couplings can allow for a satisfactory description of the present database. The resulting coupling constants, $g_{KΛN}$ and $g_{KΣN}$, in the pseudovector coupling scheme are smaller than those predicted using flavor SU(3) symmetry, but consistent with the values obtained in a QCD sum rule calculation.
Submitted 5 April, 2000;
originally announced April 2000.