Search | arXiv e-print repository

Optimal Network Pairwise Comparison

Authors: Jiashun Jin, Zheng Tracy Ke, Shengming Luo, Yucong Ma

Abstract: We are interested in the problem of two-sample network hypothesis testing: given two networks with the same set of nodes, we wish to test whether the underlying Bernoulli probability matrices of the two networks are the same or not. We propose Interlacing Balance Measure (IBM) as a new two-sample testing approach. We consider the {\it Degree-Corrected Mixed-Membership (DCMM)} model for undirected… ▽ More We are interested in the problem of two-sample network hypothesis testing: given two networks with the same set of nodes, we wish to test whether the underlying Bernoulli probability matrices of the two networks are the same or not. We propose Interlacing Balance Measure (IBM) as a new two-sample testing approach. We consider the {\it Degree-Corrected Mixed-Membership (DCMM)} model for undirected networks, where we allow severe degree heterogeneity, mixed-memberships, flexible sparsity levels, and weak signals. In such a broad setting, how to find a test that has a tractable limiting null and optimal testing performances is a challenging problem. We show that IBM is such a test: in a broad DCMM setting with only mild regularity conditions, IBM has $N(0,1)$ as the limiting null and achieves the optimal phase transition. While the above is for undirected networks, IBM is a unified approach and is directly implementable for directed networks. For a broad directed-DCMM (extension of DCMM for directed networks) setting, we show that IBM has $N(0, 1/2)$ as the limiting null and continues to achieve the optimal phase transition. We have also applied IBM to the Enron email network and a gene co-expression network, with interesting results. △ Less

Submitted 13 August, 2024; originally announced August 2024.

Comments: 92 pages

MSC Class: 62H30; 91C20

arXiv:2407.17889 [pdf]

An Error Discovery and Correction for the Family of V-Shaped BPSO Algorithms

Authors: Qing Zhao, Chengkui Zhang, Hao Li, Ting Ke

Abstract: BPSO algorithm is a swarm intelligence optimization algorithm, which has the characteristics of good optimization effect, high efficiency and easy to implement. In recent years, it has been used to optimize a variety of machine learning and deep learning models, such as CNN, LSTM, SVM, etc. But it is easy to fall into local optimum for the lack of exploitation ability. It is found that in the arti… ▽ More BPSO algorithm is a swarm intelligence optimization algorithm, which has the characteristics of good optimization effect, high efficiency and easy to implement. In recent years, it has been used to optimize a variety of machine learning and deep learning models, such as CNN, LSTM, SVM, etc. But it is easy to fall into local optimum for the lack of exploitation ability. It is found that in the article, which is different from previous studies, The reason for the poor performance is an error existing in their velocity update function, which leads to abnormal and chaotic behavior of particles. This not only makes the algorithm difficult to converge, but also often searches the repeated space. So, traditionally, it has to rely on a low w value in the later stage to force these algorithms to converge, but also makes them quickly lose their search ability and prone to getting trapped in local optima. This article proposes a velocity legacy term correction method for all V-shaped BPSOs. Experimentals based on 0/1 knapsack problems show that it has a significant effect on accuracy and efficiency for all of the 4 commonly used V-Shaped BPSOs. Therefore it is an significant breakthrough in the field of swarm intelligence. △ Less

Submitted 25 July, 2024; originally announced July 2024.

Comments: 25 pages, 11 figures

arXiv:2406.17746 [pdf, other]

Recite, Reconstruct, Recollect: Memorization in LMs as a Multifaceted Phenomenon

Authors: USVSN Sai Prashanth, Alvin Deng, Kyle O'Brien, Jyothir S V, Mohammad Aflah Khan, Jaydeep Borkar, Christopher A. Choquette-Choo, Jacob Ray Fuehne, Stella Biderman, Tracy Ke, Katherine Lee, Naomi Saphra

Abstract: Memorization in language models is typically treated as a homogenous phenomenon, neglecting the specifics of the memorized data. We instead model memorization as the effect of a set of complex factors that describe each sample and relate it to the model and corpus. To build intuition around these factors, we break memorization down into a taxonomy: recitation of highly duplicated sequences, recons… ▽ More Memorization in language models is typically treated as a homogenous phenomenon, neglecting the specifics of the memorized data. We instead model memorization as the effect of a set of complex factors that describe each sample and relate it to the model and corpus. To build intuition around these factors, we break memorization down into a taxonomy: recitation of highly duplicated sequences, reconstruction of inherently predictable sequences, and recollection of sequences that are neither. We demonstrate the usefulness of our taxonomy by using it to construct a predictive model for memorization. By analyzing dependencies and inspecting the weights of the predictive model, we find that different factors influence the likelihood of memorization differently depending on the taxonomic category. △ Less

Submitted 25 June, 2024; originally announced June 2024.

arXiv:2406.08698 [pdf, other]

Constraints on Ultra Heavy Dark Matter Properties from Dwarf Spheroidal Galaxies with LHAASO Observations

Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen , et al. (255 additional authors not shown)

Abstract: In this work we try to search for signals generated by ultra-heavy dark matter at the Large High Altitude Air Shower Observatory (LHAASO) data. We look for possible gamma-ray by dark matter annihilation or decay from 16 dwarf spheroidal galaxies in the field of view of LHAASO. Dwarf spheroidal galaxies are among the most promising targets for indirect detection of dark matter which have low fluxes… ▽ More In this work we try to search for signals generated by ultra-heavy dark matter at the Large High Altitude Air Shower Observatory (LHAASO) data. We look for possible gamma-ray by dark matter annihilation or decay from 16 dwarf spheroidal galaxies in the field of view of LHAASO. Dwarf spheroidal galaxies are among the most promising targets for indirect detection of dark matter which have low fluxes of astrophysical $γがんま$-ray background while large amount of dark matter. By analyzing more than 700 days observational data at LHAASO, no significant dark matter signal from 1 TeV to 1 EeV is detected. Accordingly we derive the most stringent constraints on the ultra-heavy dark matter annihilation cross-section up to EeV. The constraints on the lifetime of dark matter in decay mode are also derived. △ Less

Submitted 12 June, 2024; originally announced June 2024.

Comments: 17 pages, 12 figures, accepted by PRL

arXiv:2405.17806 [pdf, other]

Entry-Wise Eigenvector Analysis and Improved Rates for Topic Modeling on Short Documents

Authors: Zheng Tracy Ke, Jingming Wang

Abstract: Topic modeling is a widely utilized tool in text analysis. We investigate the optimal rate for estimating a topic model. Specifically, we consider a scenario with $n$ documents, a vocabulary of size $p$, and document lengths at the order $N$. When $N\geq c\cdot p$, referred to as the long-document case, the optimal rate is established in the literature at $\sqrt{p/(Nn)}$. However, when $N=o(p)$, r… ▽ More Topic modeling is a widely utilized tool in text analysis. We investigate the optimal rate for estimating a topic model. Specifically, we consider a scenario with $n$ documents, a vocabulary of size $p$, and document lengths at the order $N$. When $N\geq c\cdot p$, referred to as the long-document case, the optimal rate is established in the literature at $\sqrt{p/(Nn)}$. However, when $N=o(p)$, referred to as the short-document case, the optimal rate remains unknown. In this paper, we first provide new entry-wise large-deviation bounds for the empirical singular vectors of a topic model. We then apply these bounds to improve the error rate of a spectral algorithm, Topic-SCORE. Finally, by comparing the improved error rate with the minimax lower bound, we conclude that the optimal rate is still $\sqrt{p/(Nn)}$ in the short-document case. △ Less

Submitted 28 May, 2024; originally announced May 2024.

Comments: 50 pages

MSC Class: 62H12

Journal ref: MDPI Mathematics, 2024

arXiv:2405.07691 [pdf, other]

Discovery of Very-high-energy Gamma-ray Emissions from the Low Luminosity AGN NGC 4278 by LHAASO

Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen , et al. (255 additional authors not shown)

Abstract: The first source catalog of Large High Altitude Air Shower Observatory reported the detection of a very-high-energy gamma ray source, 1LHAASO J1219+2915. In this paper a further detailed study of the spectral and temporal behavior of this point-like source have been carried. The best-fit position of the TeV source ($\rm{RA}=185.05^{\circ}\pm0.04^{\circ}$, $\rm{Dec}=29.25^{\circ}\pm0.03^{\circ}$) i… ▽ More The first source catalog of Large High Altitude Air Shower Observatory reported the detection of a very-high-energy gamma ray source, 1LHAASO J1219+2915. In this paper a further detailed study of the spectral and temporal behavior of this point-like source have been carried. The best-fit position of the TeV source ($\rm{RA}=185.05^{\circ}\pm0.04^{\circ}$, $\rm{Dec}=29.25^{\circ}\pm0.03^{\circ}$) is compatible with NGC 4278 within $\sim0.03$ degree. Variation analysis shows an indication of the variability at a few months level in the TeV band, which is consistent with low frequency observations. Based on these observations, we report the detection of TeV $γがんま$-ray emissions from this low-luminosity AGN NGC 4278. The observations by LHAASO-WCDA during active period has a significance level of 8.8\,$σしぐま$ with best-fit photon spectral index $\varGamma=2.56\pm0.14$ and a flux $f_{1-10\,\rm{TeV}}=(7.0\pm1.1_{\rm{sta}}\pm0.35_{\rm{syst}})\times10^{-13}\,\rm{photons\,cm^{-2}\,s^{-1}}$, or approximately $5\%$ of the Crab Nebula. The discovery of VHE from NGC 4278 indicates that the compact, weak radio jet can efficiently accelerate particles and emit TeV photons. △ Less

Submitted 13 May, 2024; originally announced May 2024.

Comments: 11 pages, 5 figures

arXiv:2405.01507 [pdf, other]

Accelerating Convergence in Bayesian Few-Shot Classification

Authors: Tianjun Ke, Haoqun Cao, Feng Zhou

Abstract: Bayesian few-shot classification has been a focal point in the field of few-shot learning. This paper seamlessly integrates mirror descent-based variational inference into Gaussian process-based few-shot classification, addressing the challenge of non-conjugate inference. By leveraging non-Euclidean geometry, mirror descent achieves accelerated convergence by providing the steepest descent directi… ▽ More Bayesian few-shot classification has been a focal point in the field of few-shot learning. This paper seamlessly integrates mirror descent-based variational inference into Gaussian process-based few-shot classification, addressing the challenge of non-conjugate inference. By leveraging non-Euclidean geometry, mirror descent achieves accelerated convergence by providing the steepest descent direction along the corresponding manifold. It also exhibits the parameterization invariance property concerning the variational distribution. Experimental results demonstrate competitive classification accuracy, improved uncertainty quantification, and faster convergence compared to baseline models. Additionally, we investigate the impact of hyperparameters and components. Code is publicly available at https://github.com/keanson/MD-BSFC. △ Less

Submitted 7 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

arXiv:2404.04801 [pdf, ps, other]

doi 10.1007/s41605-024-00467-8

LHAASO-KM2A detector simulation using Geant4

Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen , et al. (254 additional authors not shown)

Abstract: KM2A is one of the main sub-arrays of LHAASO, working on gamma ray astronomy and cosmic ray physics at energies above 10 TeV. Detector simulation is the important foundation for estimating detector performance and data analysis. It is a big challenge to simulate the KM2A detector in the framework of Geant4 due to the need to track numerous photons from a large number of detector units (>6000) with… ▽ More KM2A is one of the main sub-arrays of LHAASO, working on gamma ray astronomy and cosmic ray physics at energies above 10 TeV. Detector simulation is the important foundation for estimating detector performance and data analysis. It is a big challenge to simulate the KM2A detector in the framework of Geant4 due to the need to track numerous photons from a large number of detector units (>6000) with large altitude difference (30 m) and huge coverage (1.3 km^2). In this paper, the design of the KM2A simulation code G4KM2A based on Geant4 is introduced. The process of G4KM2A is optimized mainly in memory consumption to avoid memory overffow. Some simpliffcations are used to signiffcantly speed up the execution of G4KM2A. The running time is reduced by at least 30 times compared to full detector simulation. The particle distributions and the core/angle resolution comparison between simulation and experimental data of the full KM2A array are also presented, which show good agreement. △ Less

Submitted 7 April, 2024; originally announced April 2024.

arXiv:2403.11013 [pdf, other]

Improved Algorithm and Bounds for Successive Projection

Authors: Jiashun Jin, Zheng Tracy Ke, Gabriel Moryoussef, Jiajun Tang, Jingming Wang

Abstract: Given a $K$-vertex simplex in a $d$-dimensional space, suppose we measure $n$ points on the simplex with noise (hence, some of the observed points fall outside the simplex). Vertex hunting is the problem of estimating the $K$ vertices of the simplex. A popular vertex hunting algorithm is successive projection algorithm (SPA). However, SPA is observed to perform unsatisfactorily under strong noise… ▽ More Given a $K$-vertex simplex in a $d$-dimensional space, suppose we measure $n$ points on the simplex with noise (hence, some of the observed points fall outside the simplex). Vertex hunting is the problem of estimating the $K$ vertices of the simplex. A popular vertex hunting algorithm is successive projection algorithm (SPA). However, SPA is observed to perform unsatisfactorily under strong noise or outliers. We propose pseudo-point SPA (pp-SPA). It uses a projection step and a denoise step to generate pseudo-points and feed them into SPA for vertex hunting. We derive error bounds for pp-SPA, leveraging on extreme value theory of (possibly) high-dimensional random vectors. The results suggest that pp-SPA has faster rates and better numerical performances than SPA. Our analysis includes an improved non-asymptotic bound for the original SPA, which is of independent interest. △ Less

Submitted 16 March, 2024; originally announced March 2024.

Comments: 32 pages, 5 figures

arXiv:2403.10010 [pdf, other]

doi 10.1103/PhysRevLett.132.131002

Measurements of All-Particle Energy Spectrum and Mean Logarithmic Mass of Cosmic Rays from 0.3 to 30 PeV with LHAASO-KM2A

Authors: The LHAASO Collaboration, Zhen Cao, F. Aharonian, Q. An, A. Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen , et al. (256 additional authors not shown)

Abstract: We present the measurements of all-particle energy spectrum and mean logarithmic mass of cosmic rays in the energy range of 0.3-30 PeV using data collected from LHAASO-KM2A between September 2021 and December 2022, which is based on a nearly composition-independent energy reconstruction method, achieving unprecedented accuracy. Our analysis reveals the position of the knee at… ▽ More We present the measurements of all-particle energy spectrum and mean logarithmic mass of cosmic rays in the energy range of 0.3-30 PeV using data collected from LHAASO-KM2A between September 2021 and December 2022, which is based on a nearly composition-independent energy reconstruction method, achieving unprecedented accuracy. Our analysis reveals the position of the knee at $3.67 \pm 0.05 \pm 0.15$ PeV. Below the knee, the spectral index is found to be -$2.7413 \pm 0.0004 \pm 0.0050$, while above the knee, it is -$3.128 \pm 0.005 \pm 0.027$, with the sharpness of the transition measured with a statistical error of 2%. The mean logarithmic mass of cosmic rays is almost heavier than helium in the whole measured energy range. It decreases from 1.7 at 0.3 PeV to 1.3 at 3 PeV, representing a 24% decline following a power law with an index of -$0.1200 \pm 0.0003 \pm 0.0341$. This is equivalent to an increase in abundance of light components. Above the knee, the mean logarithmic mass exhibits a power law trend towards heavier components, which is reversal to the behavior observed in the all-particle energy spectrum. Additionally, the knee position and the change in power-law index are approximately the same. These findings suggest that the knee observed in the all-particle spectrum corresponds to the knee of the light component, rather than the medium-heavy components. △ Less

Submitted 26 March, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

Comments: 8 pages, 3 figures

Journal ref: Physical Review Letters 132, 131002 (2024)

arXiv:2402.10885 [pdf, other]

3D Diffuser Actor: Policy Diffusion with 3D Scene Representations

Authors: Tsung-Wei Ke, Nikolaos Gkanatsios, Katerina Fragkiadaki

Abstract: Diffusion policies are conditional diffusion models that learn robot action distributions conditioned on the robot and environment state. They have recently shown to outperform both deterministic and alternative action distribution learning formulations. 3D robot policies use 3D scene feature representations aggregated from a single or multiple camera views using sensed depth. They have shown to g… ▽ More Diffusion policies are conditional diffusion models that learn robot action distributions conditioned on the robot and environment state. They have recently shown to outperform both deterministic and alternative action distribution learning formulations. 3D robot policies use 3D scene feature representations aggregated from a single or multiple camera views using sensed depth. They have shown to generalize better than their 2D counterparts across camera viewpoints. We unify these two lines of work and present 3D Diffuser Actor, a neural policy equipped with a novel 3D denoising transformer that fuses information from the 3D visual scene, a language instruction and proprioception to predict the noise in noised 3D robot pose trajectories. 3D Diffuser Actor sets a new state-of-the-art on RLBench with an absolute performance gain of 18.1% over the current SOTA on a multi-view setup and an absolute gain of 13.1% on a single-view setup. On the CALVIN benchmark, it improves over the current SOTA by a 9% relative increase. It also learns to control a robot manipulator in the real world from a handful of demonstrations. Through thorough comparisons with the current SOTA policies and ablations of our model, we show 3D Diffuser Actor's design choices dramatically outperform 2D representations, regression and classification objectives, absolute attentions, and holistic non-tokenized 3D scene embeddings. △ Less

Submitted 25 July, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

Comments: First two authors contributed equally

arXiv:2402.06559 [pdf, other]

Diffusion-ES: Gradient-free Planning with Diffusion for Autonomous Driving and Zero-Shot Instruction Following

Authors: Brian Yang, Huangyuan Su, Nikolaos Gkanatsios, Tsung-Wei Ke, Ayush Jain, Jeff Schneider, Katerina Fragkiadaki

Abstract: Diffusion models excel at modeling complex and multimodal trajectory distributions for decision-making and control. Reward-gradient guided denoising has been recently proposed to generate trajectories that maximize both a differentiable reward function and the likelihood under the data distribution captured by a diffusion model. Reward-gradient guided denoising requires a differentiable reward fun… ▽ More Diffusion models excel at modeling complex and multimodal trajectory distributions for decision-making and control. Reward-gradient guided denoising has been recently proposed to generate trajectories that maximize both a differentiable reward function and the likelihood under the data distribution captured by a diffusion model. Reward-gradient guided denoising requires a differentiable reward function fitted to both clean and noised samples, limiting its applicability as a general trajectory optimizer. In this paper, we propose DiffusionES, a method that combines gradient-free optimization with trajectory denoising to optimize black-box non-differentiable objectives while staying in the data manifold. Diffusion-ES samples trajectories during evolutionary search from a diffusion model and scores them using a black-box reward function. It mutates high-scoring trajectories using a truncated diffusion process that applies a small number of noising and denoising steps, allowing for much more efficient exploration of the solution space. We show that DiffusionES achieves state-of-the-art performance on nuPlan, an established closed-loop planning benchmark for autonomous driving. Diffusion-ES outperforms existing sampling-based planners, reactive deterministic or diffusion-based policies, and reward-gradient guidance. Additionally, we show that unlike prior guidance methods, our method can optimize non-differentiable language-shaped reward functions generated by few-shot LLM prompting. When guided by a human teacher that issues instructions to follow, our method can generate novel, highly complex behaviors, such as aggressive lane weaving, which are not present in the training data. This allows us to solve the hardest nuPlan scenarios which are beyond the capabilities of existing trajectory optimization methods and driving policies. △ Less

Submitted 16 July, 2024; v1 submitted 9 February, 2024; originally announced February 2024.

arXiv:2401.00775 [pdf, other]

doi 10.1146/annurev-statistics-040522-022138

Recent Advances in Text Analysis

Authors: Zheng Tracy Ke, Pengsheng Ji, Jiashun Jin, Wanshan Li

Abstract: Text analysis is an interesting research area in data science and has various applications, such as in artificial intelligence, biomedical research, and engineering. We review popular methods for text analysis, ranging from topic modeling to the recent neural language models. In particular, we review Topic-SCORE, a statistical approach to topic modeling, and discuss how to use it to analyze MADSta… ▽ More Text analysis is an interesting research area in data science and has various applications, such as in artificial intelligence, biomedical research, and engineering. We review popular methods for text analysis, ranging from topic modeling to the recent neural language models. In particular, we review Topic-SCORE, a statistical approach to topic modeling, and discuss how to use it to analyze MADStat - a dataset on statistical publications that we collected and cleaned. The application of Topic-SCORE and other methods on MADStat leads to interesting findings. For example, $11$ representative topics in statistics are identified. For each journal, the evolution of topic weights over time can be visualized, and these results are used to analyze the trends in statistical research. In particular, we propose a new statistical model for ranking the citation impacts of $11$ topics, and we also build a cross-topic citation graph to illustrate how research results on different topics spread to one another. The results on MADStat provide a data-driven picture of the statistical research in $1975$--$2015$, from a text analysis perspective. △ Less

Submitted 7 February, 2024; v1 submitted 1 January, 2024; originally announced January 2024.

Journal ref: Annual Review of Statistics and Its Application 2024 11:1

arXiv:2311.16102 [pdf, other]

Diffusion-TTA: Test-time Adaptation of Discriminative Models via Generative Feedback

Authors: Mihir Prabhudesai, Tsung-Wei Ke, Alexander C. Li, Deepak Pathak, Katerina Fragkiadaki

Abstract: The advancements in generative modeling, particularly the advent of diffusion models, have sparked a fundamental question: how can these models be effectively used for discriminative tasks? In this work, we find that generative models can be great test-time adapters for discriminative models. Our method, Diffusion-TTA, adapts pre-trained discriminative models such as image classifiers, segmenters… ▽ More The advancements in generative modeling, particularly the advent of diffusion models, have sparked a fundamental question: how can these models be effectively used for discriminative tasks? In this work, we find that generative models can be great test-time adapters for discriminative models. Our method, Diffusion-TTA, adapts pre-trained discriminative models such as image classifiers, segmenters and depth predictors, to each unlabelled example in the test set using generative feedback from a diffusion model. We achieve this by modulating the conditioning of the diffusion model using the output of the discriminative model. We then maximize the image likelihood objective by backpropagating the gradients to discriminative model's parameters. We show Diffusion-TTA significantly enhances the accuracy of various large-scale pre-trained discriminative models, such as, ImageNet classifiers, CLIP models, image pixel labellers and image depth predictors. Diffusion-TTA outperforms existing test-time adaptation methods, including TTT-MAE and TENT, and particularly shines in online adaptation setups, where the discriminative model is continually adapted to each example in the test set. We provide access to code, results, and visualizations on our website: https://diffusion-tta.github.io/. △ Less

Submitted 29 November, 2023; v1 submitted 27 November, 2023; originally announced November 2023.

Comments: Accepted at NeurIPS 2023 Webpage with Code: https://diffusion-tta.github.io/

arXiv:2310.17082 [pdf, ps, other]

Does or did the supernova remnant Cassiopeia A operate as a PeVatron?

Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen , et al. (255 additional authors not shown)

Abstract: For decades, supernova remnants (SNRs) have been considered the prime sources of Galactic Cosmic rays (CRs). But whether SNRs can accelerate CR protons to PeV energies and thus dominate CR flux up to the knee is currently under intensive theoretical and phenomenological debate. The direct test of the ability of SNRs to operate as CR PeVatrons can be provided by ultrahigh-energy (UHE;… ▽ More For decades, supernova remnants (SNRs) have been considered the prime sources of Galactic Cosmic rays (CRs). But whether SNRs can accelerate CR protons to PeV energies and thus dominate CR flux up to the knee is currently under intensive theoretical and phenomenological debate. The direct test of the ability of SNRs to operate as CR PeVatrons can be provided by ultrahigh-energy (UHE; $E_γがんま\geq 100$~TeV) $γがんま$-rays. In this context, the historical SNR Cassiopeia A (Cas A) is considered one of the most promising target for UHE observations. This paper presents the observation of Cas A and its vicinity by the LHAASO KM2A detector. The exceptional sensitivity of LHAASO KM2A in the UHE band, combined with the young age of Cas A, enabled us to derive stringent model-independent limits on the energy budget of UHE protons and nuclei accelerated by Cas A at any epoch after the explosion. The results challenge the prevailing paradigm that Cas A-type SNRs are major suppliers of PeV CRs in the Milky Way. △ Less

Submitted 25 October, 2023; originally announced October 2023.

Comments: 11 pages, 3 figures, Accepted by the APJL

arXiv:2310.10379 [pdf, other]

Revisiting Logistic-softmax Likelihood in Bayesian Meta-Learning for Few-Shot Classification

Authors: Tianjun Ke, Haoqun Cao, Zenan Ling, Feng Zhou

Abstract: Meta-learning has demonstrated promising results in few-shot classification (FSC) by learning to solve new problems using prior knowledge. Bayesian methods are effective at characterizing uncertainty in FSC, which is crucial in high-risk fields. In this context, the logistic-softmax likelihood is often employed as an alternative to the softmax likelihood in multi-class Gaussian process classificat… ▽ More Meta-learning has demonstrated promising results in few-shot classification (FSC) by learning to solve new problems using prior knowledge. Bayesian methods are effective at characterizing uncertainty in FSC, which is crucial in high-risk fields. In this context, the logistic-softmax likelihood is often employed as an alternative to the softmax likelihood in multi-class Gaussian process classification due to its conditional conjugacy property. However, the theoretical property of logistic-softmax is not clear and previous research indicated that the inherent uncertainty of logistic-softmax leads to suboptimal performance. To mitigate these issues, we revisit and redesign the logistic-softmax likelihood, which enables control of the \textit{a priori} confidence level through a temperature parameter. Furthermore, we theoretically and empirically show that softmax can be viewed as a special case of logistic-softmax and logistic-softmax induces a larger family of data distribution than softmax. Utilizing modified logistic-softmax, we integrate the data augmentation technique into the deep kernel based Gaussian process meta-learning framework, and derive an analytical mean-field approximation for task-specific updates. Our approach yields well-calibrated uncertainty estimates and achieves comparable or superior results on standard benchmark datasets. Code is publicly available at \url{https://github.com/keanson/revisit-logistic-softmax}. △ Less

Submitted 16 October, 2023; originally announced October 2023.

arXiv:2310.08845 [pdf, other]

doi 10.1126/sciadv.adj2778

Very high energy gamma-ray emission beyond 10 TeV from GRB 221009A

Authors: Zhen Cao, F. Aharonian, Q. An, A. Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen , et al. (255 additional authors not shown)

Abstract: The highest energy gamma-rays from gamma-ray bursts (GRBs) have important implications for their radiation mechanism. Here we report for the first time the detection of gamma-rays up to 13 TeV from the brightest GRB 221009A by the Large High Altitude Air-shower Observatory (LHAASO). The LHAASO-KM2A detector registered more than 140 gamma-rays with energies above 3 TeV during 230$-$900s after the t… ▽ More The highest energy gamma-rays from gamma-ray bursts (GRBs) have important implications for their radiation mechanism. Here we report for the first time the detection of gamma-rays up to 13 TeV from the brightest GRB 221009A by the Large High Altitude Air-shower Observatory (LHAASO). The LHAASO-KM2A detector registered more than 140 gamma-rays with energies above 3 TeV during 230$-$900s after the trigger. The intrinsic energy spectrum of gamma-rays can be described by a power-law after correcting for extragalactic background light (EBL) absorption. Such a hard spectrum challenges the synchrotron self-Compton (SSC) scenario of relativistic electrons for the afterglow emission above several TeV. Observations of gamma-rays up to 13 TeV from a source with a measured redshift of z=0.151 hints more transparency in intergalactic space than previously expected. Alternatively, one may invoke new physics such as Lorentz Invariance Violation (LIV) or an axion origin of very high energy (VHE) signals. △ Less

Submitted 22 November, 2023; v1 submitted 13 October, 2023; originally announced October 2023.

Comments: 49pages, 11figures

Journal ref: Science Advances, 9, eadj2778 (2023) 15 November 2023

arXiv:2306.05363 [pdf, other]

Subject clustering by IF-PCA and several recent methods

Authors: Dieyi Chen, Jiashun Jin, Zheng Tracy Ke

Abstract: Subject clustering (i.e., the use of measured features to cluster subjects, such as patients or cells, into multiple groups) is a problem of great interest. In recent years, many approaches were proposed, among which unsupervised deep learning (UDL) has received a great deal of attention. Two interesting questions are (a) how to combine the strengths of UDL and other approaches, and (b) how these… ▽ More Subject clustering (i.e., the use of measured features to cluster subjects, such as patients or cells, into multiple groups) is a problem of great interest. In recent years, many approaches were proposed, among which unsupervised deep learning (UDL) has received a great deal of attention. Two interesting questions are (a) how to combine the strengths of UDL and other approaches, and (b) how these approaches compare to one other. We combine Variational Auto-Encoder (VAE), a popular UDL approach, with the recent idea of Influential Feature PCA (IF-PCA), and propose IF-VAE as a new method for subject clustering. We study IF-VAE and compare it with several other methods (including IF-PCA, VAE, Seurat, and SC3) on $10$ gene microarray data sets and $8$ single-cell RNA-seq data sets. We find that IF-VAE significantly improves over VAE, but still underperforms IF-PCA. We also find that IF-PCA is quite competitive, which slightly outperforms Seurat and SC3 over the $8$ single-cell data sets. IF-PCA is conceptually simple and permits delicate analysis. We demonstrate that IF-PCA is capable of achieving the phase transition in a Rare/Weak model. Comparatively, Seurat and SC3 are more complex and theoretically difficult to analyze (for these reasons, their optimality remains unclear). △ Less

Submitted 8 June, 2023; originally announced June 2023.

arXiv:2306.01089 [pdf, other]

Semi-supervised Community Detection via Structural Similarity Metrics

Authors: Yicong Jiang, Tracy Ke

Abstract: Motivated by social network analysis and network-based recommendation systems, we study a semi-supervised community detection problem in which the objective is to estimate the community label of a new node using the network topology and partially observed community labels of existing nodes. The network is modeled using a degree-corrected stochastic block model, which allows for severe degree heter… ▽ More Motivated by social network analysis and network-based recommendation systems, we study a semi-supervised community detection problem in which the objective is to estimate the community label of a new node using the network topology and partially observed community labels of existing nodes. The network is modeled using a degree-corrected stochastic block model, which allows for severe degree heterogeneity and potentially non-assortative communities. We propose an algorithm that computes a `structural similarity metric' between the new node and each of the $K$ communities by aggregating labeled and unlabeled data. The estimated label of the new node corresponds to the value of $k$ that maximizes this similarity metric. Our method is fast and numerically outperforms existing semi-supervised algorithms. Theoretically, we derive explicit bounds for the misclassification error and show the efficiency of our method by comparing it with an ideal classifier. Our findings highlight, to the best of our knowledge, the first semi-supervised community detection algorithm that offers theoretical guarantees. △ Less

Submitted 1 June, 2023; originally announced June 2023.

Comments: 9 pages, 8 figures, accepted by the 11th International Conference on Learning Representations (ICLR 2023)

arXiv:2305.17030 [pdf, other]

doi 10.3847/1538-4365/acfd29

The First LHAASO Catalog of Gamma-Ray Sources

Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen , et al. (255 additional authors not shown)

Abstract: We present the first catalog of very-high energy and ultra-high energy gamma-ray sources detected by the Large High Altitude Air Shower Observatory (LHAASO). The catalog was compiled using 508 days of data collected by the Water Cherenkov Detector Array (WCDA) from March 2021 to September 2022 and 933 days of data recorded by the Kilometer Squared Array (KM2A) from January 2020 to September 2022.… ▽ More We present the first catalog of very-high energy and ultra-high energy gamma-ray sources detected by the Large High Altitude Air Shower Observatory (LHAASO). The catalog was compiled using 508 days of data collected by the Water Cherenkov Detector Array (WCDA) from March 2021 to September 2022 and 933 days of data recorded by the Kilometer Squared Array (KM2A) from January 2020 to September 2022. This catalog represents the main result from the most sensitive large coverage gamma-ray survey of the sky above 1 TeV, covering declination from $-$20$^{\circ}$ to 80$^{\circ}$. In total, the catalog contains 90 sources with an extended size smaller than $2^\circ$ and a significance of detection at $> 5σしぐま$. Based on our source association criteria, 32 new TeV sources are proposed in this study. Among the 90 sources, 43 sources are detected with ultra-high energy ($E > 100$ TeV) emission at $> 4σしぐま$ significance level. We provide the position, extension, and spectral characteristics of all the sources in this catalog. △ Less

Submitted 27 November, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

Comments: 40 pages, 13 figures, 4 tables

Journal ref: The Astrophysical Journal Supplement Series, 271 (2024) 25

arXiv:2305.05372 [pdf, other]

doi 10.1103/PhysRevLett.131.151001

Measurement of ultra-high-energy diffuse gamma-ray emission of the Galactic plane from 10 TeV to 1 PeV with LHAASO-KM2A

Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen , et al. (255 additional authors not shown)

Abstract: The diffuse Galactic $γがんま$-ray emission, mainly produced via interactions between cosmic rays and the interstellar medium and/or radiation field, is a very important probe of the distribution, propagation, and interaction of cosmic rays in the Milky Way. In this work we report the measurements of diffuse $γがんま$-rays from the Galactic plane between 10 TeV and 1 PeV energies, with the square kilometer ar… ▽ More The diffuse Galactic $γがんま$-ray emission, mainly produced via interactions between cosmic rays and the interstellar medium and/or radiation field, is a very important probe of the distribution, propagation, and interaction of cosmic rays in the Milky Way. In this work we report the measurements of diffuse $γがんま$-rays from the Galactic plane between 10 TeV and 1 PeV energies, with the square kilometer array of the Large High Altitude Air Shower Observatory (LHAASO). Diffuse emissions from the inner ($15^{\circ}<l<125^{\circ}$, $|b|<5^{\circ}$) and outer ($125^{\circ}<l<235^{\circ}$, $|b|<5^{\circ}$) Galactic plane are detected with $29.1σしぐま$ and $12.7σしぐま$ significance, respectively. The outer Galactic plane diffuse emission is detected for the first time in the very- to ultra-high-energy domain ($E>10$~TeV). The energy spectrum in the inner Galaxy regions can be described by a power-law function with an index of $-2.99\pm0.04$, which is different from the curved spectrum as expected from hadronic interactions between locally measured cosmic rays and the line-of-sight integrated gas content. Furthermore, the measured flux is higher by a factor of $\sim3$ than the prediction. A similar spectrum with an index of $-2.99\pm0.07$ is found in the outer Galaxy region, and the absolute flux for $10\lesssim E\lesssim60$ TeV is again higher than the prediction for hadronic cosmic ray interactions. The latitude distributions of the diffuse emission are consistent with the gas distribution, while the longitude distributions show clear deviation from the gas distribution. The LHAASO measurements imply that either additional emission sources exist or cosmic ray intensities have spatial variations. △ Less

Submitted 19 August, 2023; v1 submitted 9 May, 2023; originally announced May 2023.

Comments: 12 pages, 8 figures, 5 tables; accepted for publication in Physical Review Letters; source mask file provided as ancillary file

Journal ref: Phys. Rev. Lett. 131, 151001 (2023)

arXiv:2303.05024 [pdf, other]

Phase transition for detecting a small community in a large network

Authors: Jiashun Jin, Zheng Tracy Ke, Paxton Turner, Anru R. Zhang

Abstract: How to detect a small community in a large network is an interesting problem, including clique detection as a special case, where a naive degree-based $χかい^2$-test was shown to be powerful in the presence of an Erdős-Renyi background. Using Sinkhorn's theorem, we show that the signal captured by the $χかい^2$-test may be a modeling artifact, and it may disappear once we replace the Erdős-Renyi model by… ▽ More How to detect a small community in a large network is an interesting problem, including clique detection as a special case, where a naive degree-based $χかい^2$-test was shown to be powerful in the presence of an Erdős-Renyi background. Using Sinkhorn's theorem, we show that the signal captured by the $χかい^2$-test may be a modeling artifact, and it may disappear once we replace the Erdős-Renyi model by a broader network model. We show that the recent SgnQ test is more appropriate for such a setting. The test is optimal in detecting communities with sizes comparable to the whole network, but has never been studied for our setting, which is substantially different and more challenging. Using a degree-corrected block model (DCBM), we establish phase transitions of this testing problem concerning the size of the small community and the edge densities in small and large communities. When the size of the small community is larger than $\sqrt{n}$, the SgnQ test is optimal for it attains the computational lower bound (CLB), the information lower bound for methods allowing polynomial computation time. When the size of the small community is smaller than $\sqrt{n}$, we establish the parameter regime where the SgnQ test has full power and make some conjectures of the CLB. We also study the classical information lower bound (LB) and show that there is always a gap between the CLB and LB in our range of interest. △ Less

Submitted 8 March, 2023; originally announced March 2023.

arXiv:2301.01381 [pdf, other]

Testing High-dimensional Multinomials with Applications to Text Analysis

Authors: T. Tony Cai, Zheng Tracy Ke, Paxton Turner

Abstract: Motivated by applications in text mining and discrete distribution inference, we investigate the testing for equality of probability mass functions of $K$ groups of high-dimensional multinomial distributions. A test statistic, which is shown to have an asymptotic standard normal distribution under the null, is proposed. The optimal detection boundary is established, and the proposed test is shown… ▽ More Motivated by applications in text mining and discrete distribution inference, we investigate the testing for equality of probability mass functions of $K$ groups of high-dimensional multinomial distributions. A test statistic, which is shown to have an asymptotic standard normal distribution under the null, is proposed. The optimal detection boundary is established, and the proposed test is shown to achieve this optimal detection boundary across the entire parameter space of interest. The proposed method is demonstrated in simulation studies and applied to analyze two real-world datasets to examine variation among consumer reviews of Amazon movies and diversity of statistical paper abstracts. △ Less

Submitted 24 November, 2023; v1 submitted 3 January, 2023; originally announced January 2023.

arXiv:2210.00314 [pdf, other]

Learning Hierarchical Image Segmentation For Recognition and By Recognition

Authors: Tsung-Wei Ke, Sangwoo Mo, Stella X. Yu

Abstract: Large vision and language models learned directly through image-text associations often lack detailed visual substantiation, whereas image segmentation tasks are treated separately from recognition, supervisedly learned without interconnections. Our key observation is that, while an image can be recognized in multiple ways, each has a consistent part-and-whole visual organization. Segmentation thu… ▽ More Large vision and language models learned directly through image-text associations often lack detailed visual substantiation, whereas image segmentation tasks are treated separately from recognition, supervisedly learned without interconnections. Our key observation is that, while an image can be recognized in multiple ways, each has a consistent part-and-whole visual organization. Segmentation thus should be treated not as an end task to be mastered through supervised learning, but as an internal process that evolves with and supports the ultimate goal of recognition. We propose to integrate a hierarchical segmenter into the recognition process, train and adapt the entire model solely on image-level recognition objectives. We learn hierarchical segmentation for free alongside recognition, automatically uncovering part-to-whole relationships that not only underpin but also enhance recognition. Enhancing the Vision Transformer (ViT) with adaptive segment tokens and graph pooling, our model surpasses ViT in unsupervised part-whole discovery, semantic segmentation, image classification, and efficiency. Notably, our model (trained on unlabeled 1M ImageNet images) outperforms SAM (trained on 11M images and 1 billion masks) by absolute 8% in mIoU on PartImageNet object segmentation. △ Less

Submitted 2 May, 2024; v1 submitted 1 October, 2022; originally announced October 2022.

Comments: ICLR 2024 (spotlight). First two authors contributed equally. Code available at https://github.com/twke18/CAST

ACM Class: I.4.6; I.4.10; I.5.3

arXiv:2207.12601 [pdf]

doi 10.1088/1674-1137/ac9371

Flux Variations of Cosmic Ray Air Showers Detected by LHAASO-KM2A During a Thunderstorm on 10 June 2021

Authors: LHAASO Collaboration, F. Aharonian, Q. An, Axikegu, L. X. Bai, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Zhe Cao, Zhen Cao, J. Chang, J. F. Chang, E. S. Chen, Liang Chen, Liang Chen, Long Chen, M. J. Chen, M. L. Chen, S. H. Chen, S. Z. Chen, T. L. Chen, X. J. Chen , et al. (248 additional authors not shown)

Abstract: The Large High Altitude Air Shower Observatory (LHAASO) has three sub-arrays, KM2A, WCDA and WFCTA. The flux variations of cosmic ray air showers were studied by analyzing the KM2A data during the thunderstorm on 10 June 2021. The number of shower events that meet the trigger conditions increases significantly in atmospheric electric fields, with maximum fractional increase of 20%. The variations… ▽ More The Large High Altitude Air Shower Observatory (LHAASO) has three sub-arrays, KM2A, WCDA and WFCTA. The flux variations of cosmic ray air showers were studied by analyzing the KM2A data during the thunderstorm on 10 June 2021. The number of shower events that meet the trigger conditions increases significantly in atmospheric electric fields, with maximum fractional increase of 20%. The variations of trigger rates (increases or decreases) are found to be strongly dependent on the primary zenith angle. The flux of secondary particles increases significantly, following a similar trend with that of the shower events. To better understand the observed behavior, Monte Carlo simulations are performed with CORSIKA and G4KM2A (a code based on GEANT4). We find that the experimental data (in saturated negative fields) are in good agreement with simulations, assuming the presence of a uniform upward electric field of 700 V/cm with a thickness of 1500 m in the atmosphere above the observation level. Due to the acceleration/deceleration and deflection by the atmospheric electric field, the number of secondary particles with energy above the detector threshold is modified, resulting in the changes in shower detection rate. △ Less

Submitted 6 December, 2022; v1 submitted 25 July, 2022; originally announced July 2022.

Comments: 18 pages, 11 figures

Journal ref: Chinese Phys. C 47 015001 (2023)

arXiv:2207.06933 [pdf, other]

doi 10.1021/acs.nanolett.2c03130

Controlling Andreev bound states with the magnetic vector potential

Authors: Christian M. Moehle, Prasanna K. Rout, Nayan A. Jainandunsing, Dibyendu Kuiri, Chung Ting Ke, Di Xiao, Candice Thomas, Michael J. Manfra, Michal P. Nowak, Srijit Goswami

Abstract: Tunneling spectroscopy measurements are often used to probe the energy spectrum of Andreev bound states (ABSs) in semiconductor-superconductor hybrids. Recently, this spectroscopy technique has been incorporated into planar Josephson junctions (JJs) formed in two-dimensional electron gases, a potential platform to engineer phase-controlled topological superconductivity. Here, we perform ABS spectr… ▽ More Tunneling spectroscopy measurements are often used to probe the energy spectrum of Andreev bound states (ABSs) in semiconductor-superconductor hybrids. Recently, this spectroscopy technique has been incorporated into planar Josephson junctions (JJs) formed in two-dimensional electron gases, a potential platform to engineer phase-controlled topological superconductivity. Here, we perform ABS spectroscopy at the two ends of planar JJs and study the effects of the magnetic vector potential on the ABS spectrum. We show that the local superconducting phase difference arising from the vector potential is equal in magnitude and opposite in sign at the two ends, in agreement with a model that assumes localized ABSs near the tunnel barriers. Complemented with microscopic simulations, our experiments demonstrate that the local phase difference can be used to estimate the relative position of localized ABSs separated by a few hundred nanometers. △ Less

Submitted 15 November, 2022; v1 submitted 14 July, 2022; originally announced July 2022.

Journal ref: Nano Lett. 22, 8601 (2022)

arXiv:2204.12087 [pdf, other]

Optimal Network Membership Estimation Under Severe Degree Heterogeneity

Authors: Zheng Tracy Ke, Jingming Wang

Abstract: Real networks often have severe degree heterogeneity, with the maximum, average, and minimum node degrees differing significantly. This paper examines the impact of degree heterogeneity on statistical limits of network data analysis. Introducing the heterogeneity distribution (HD) under a degree-corrected mixed-membership network model, we show that the optimal rate of mixed membership estimation… ▽ More Real networks often have severe degree heterogeneity, with the maximum, average, and minimum node degrees differing significantly. This paper examines the impact of degree heterogeneity on statistical limits of network data analysis. Introducing the heterogeneity distribution (HD) under a degree-corrected mixed-membership network model, we show that the optimal rate of mixed membership estimation is an explicit functional of the HD. This result confirms that severe degree heterogeneity may decelerate the error rate, even when the overall sparsity remains unchanged. To obtain a rate-optimal method, we modify an existing spectral algorithm, Mixed-SCORE, by adding a pre-PCA normalization step. This step normalizes the adjacency matrix by a diagonal matrix consisting of the $b$th power of node degrees, for some $b\in \mathbb{R}$. We discover that $b = 1/2$ is universally favorable. The resulting spectral algorithm is rate-optimal for networks with arbitrary degree heterogeneity. A technical component in our proofs is entry-wise eigenvector analysis of the normalized graph Laplacian. △ Less

Submitted 22 July, 2024; v1 submitted 26 April, 2022; originally announced April 2022.

Comments: 109 pages, 8 figures

arXiv:2204.11432 [pdf, other]

Unsupervised Hierarchical Semantic Segmentation with Multiview Cosegmentation and Clustering Transformers

Authors: Tsung-Wei Ke, Jyh-Jing Hwang, Yunhui Guo, Xudong Wang, Stella X. Yu

Abstract: Unsupervised semantic segmentation aims to discover groupings within and across images that capture object and view-invariance of a category without external supervision. Grouping naturally has levels of granularity, creating ambiguity in unsupervised segmentation. Existing methods avoid this ambiguity and treat it as a factor outside modeling, whereas we embrace it and desire hierarchical groupin… ▽ More Unsupervised semantic segmentation aims to discover groupings within and across images that capture object and view-invariance of a category without external supervision. Grouping naturally has levels of granularity, creating ambiguity in unsupervised segmentation. Existing methods avoid this ambiguity and treat it as a factor outside modeling, whereas we embrace it and desire hierarchical grouping consistency for unsupervised segmentation. We approach unsupervised segmentation as a pixel-wise feature learning problem. Our idea is that a good representation shall reveal not just a particular level of grouping, but any level of grouping in a consistent and predictable manner. We enforce spatial consistency of grouping and bootstrap feature learning with co-segmentation among multiple views of the same image, and enforce semantic consistency across the grouping hierarchy with clustering transformers between coarse- and fine-grained features. We deliver the first data-driven unsupervised hierarchical semantic segmentation method called Hierarchical Segment Grouping (HSG). Capturing visual similarity and statistical co-occurrences, HSG also outperforms existing unsupervised segmentation methods by a large margin on five major object- and scene-centric benchmarks. Our code is publicly available at https://github.com/twke18/HSG . △ Less

Submitted 25 April, 2022; originally announced April 2022.

Comments: In CVPR 2022. Webpage & Code: https://twke18.github.io/projects/hsg.html

arXiv:2204.11194 [pdf, other]

Co-citation and Co-authorship Networks of Statisticians

Authors: Pengsheng Ji, Jiashun Jin, Zheng Tracy Ke, Wanshan Li

Abstract: We collected and cleaned a large data set on publications in statistics. The data set consists of the coauthor relationships and citation relationships of 83, 331 papers published in 36 representative journals in statistics, probability, and machine learning, spanning 41 years. The data set allows us to construct many different networks, and motivates a number of research problems about the resear… ▽ More We collected and cleaned a large data set on publications in statistics. The data set consists of the coauthor relationships and citation relationships of 83, 331 papers published in 36 representative journals in statistics, probability, and machine learning, spanning 41 years. The data set allows us to construct many different networks, and motivates a number of research problems about the research patterns and trends, research impacts, and network topology of the statistics community. In this paper we focus on (i) using the citation relationships to estimate the research interests of authors, and (ii) using the coauthor relationships to study the network topology. Using co-citation networks we constructed, we discover a "statistics triangle", reminiscent of the statistical philosophy triangle (Efron, 1998). We propose new approaches to constructing the "research map" of statisticians, as well as the "research trajectory" for a given author to visualize his/her research interest evolvement. Using co-authorship networks we constructed, we discover a multi-layer community tree and produce a Sankey diagram to visualize the author migrations in different sub-areas. We also propose several new metrics for research diversity of individual authors. We find that "Bayes", "Biostatistics", and "Nonparametric" are three primary areas in statistics. We also identify 15 sub-areas, each of which can be viewed as a weighted average of the primary areas, and identify several underlying reasons for the formation of co-authorship communities. We also find that the research interests of statisticians have evolved significantly in the 41-year time window we studied: some areas (e.g., biostatistics, high-dimensional data analysis, etc.) have become increasingly more popular. △ Less

Submitted 24 April, 2022; originally announced April 2022.

Comments: 61 pages, 16 figures

arXiv:2204.11109 [pdf, other]

Power Enhancement and Phase Transitions for Global Testing of the Mixed Membership Stochastic Block Model

Authors: Louis Cammarata, Zheng Tracy Ke

Abstract: The mixed-membership stochastic block model (MMSBM) is a common model for social networks. Given an $n$-node symmetric network generated from a $K$-community MMSBM, we would like to test $K=1$ versus $K>1$. We first study the degree-based $χかい^2$ test and the orthodox Signed Quadrilateral (oSQ) test. These two statistics estimate an order-2 polynomial and an order-4 polynomial of a "signal" matrix,… ▽ More The mixed-membership stochastic block model (MMSBM) is a common model for social networks. Given an $n$-node symmetric network generated from a $K$-community MMSBM, we would like to test $K=1$ versus $K>1$. We first study the degree-based $χかい^2$ test and the orthodox Signed Quadrilateral (oSQ) test. These two statistics estimate an order-2 polynomial and an order-4 polynomial of a "signal" matrix, respectively. We derive the asymptotic null distribution and power for both tests. However, for each test, there exists a parameter regime where its power is unsatisfactory. It motivates us to propose a power enhancement (PE) test to combine the strengths of both tests. We show that the PE test has a tractable null distribution and improves the power of both tests. To assess the optimality of PE, we consider a randomized setting, where the $n$ membership vectors are independently drawn from a distribution on the standard simplex. We show that the success of global testing is governed by a quantity $βべーた_n(K,P,h)$, which depends on the community structure matrix $P$ and the mean vector $h$ of memberships. For each given $(K, P, h)$, a test is called $\textit{ optimal}$ if it distinguishes two hypotheses when $βべーた_n(K, P,h)\to\infty$. A test is called $\textit{optimally adaptive}$ if it is optimal for all $(K, P, h)$. We show that the PE test is optimally adaptive, while many existing tests are only optimal for some particular $(K, P, h)$, hence, not optimally adaptive. △ Less

Submitted 23 April, 2022; originally announced April 2022.

Comments: 78 pages, 6 figures

arXiv:2204.11097 [pdf, other]

The SCORE normalization, especially for highly heterogeneous network and text data

Authors: Zheng Tracy Ke, Jiashun Jin

Abstract: SCORE was introduced as a spectral approach to network community detection. Since many networks have severe degree heterogeneity, the ordinary spectral clustering (OSC) approach to community detection may perform unsatisfactorily. SCORE alleviates the effect of degree heterogeneity by introducing a new normalization idea in the spectral domain and makes OSC more effective. SCORE is easy to use and… ▽ More SCORE was introduced as a spectral approach to network community detection. Since many networks have severe degree heterogeneity, the ordinary spectral clustering (OSC) approach to community detection may perform unsatisfactorily. SCORE alleviates the effect of degree heterogeneity by introducing a new normalization idea in the spectral domain and makes OSC more effective. SCORE is easy to use and computationally fast. It adapts easily to new directions and sees an increasing interest in practice. In this paper, we review the basics of SCORE, the adaption of SCORE to network mixed membership estimation and topic modeling, and the application of SCORE in real data, including two datasets on the publications of statisticians. We also review the theoretical 'ideology' underlying SCORE. We show that in the spectral domain, SCORE converts a simplicial cone to a simplex, and provides a simple and direct link between the simplex and network memberships. SCORE attains an exponential rate and a sharp phase transition in community detection, and achieves optimal rates in mixed membership estimation and topic modeling. △ Less

Submitted 23 April, 2022; originally announced April 2022.

Comments: 34 pages, 5 figures, 7 tables

arXiv:2203.15075 [pdf, other]

A Comparison of Hamming Errors of Representative Variable Selection Methods

Authors: Zheng Tracy Ke, Longlin Wang

Abstract: Lasso is a celebrated method for variable selection in linear models, but it faces challenges when the variables are moderately or strongly correlated. This motivates alternative approaches such as using a non-convex penalty, adding a ridge regularization, or conducting a post-Lasso thresholding. In this paper, we compare Lasso with 5 other methods: Elastic net, SCAD, forward selection, thresholde… ▽ More Lasso is a celebrated method for variable selection in linear models, but it faces challenges when the variables are moderately or strongly correlated. This motivates alternative approaches such as using a non-convex penalty, adding a ridge regularization, or conducting a post-Lasso thresholding. In this paper, we compare Lasso with 5 other methods: Elastic net, SCAD, forward selection, thresholded Lasso, and forward backward selection. We measure their performances theoretically by the expected Hamming error, assuming that the regression coefficients are iid drawn from a two-point mixture and that the Gram matrix is block-wise diagonal. By deriving the rates of convergence of Hamming errors and the phase diagrams, we obtain useful conclusions about the pros and cons of different methods. △ Less

Submitted 28 March, 2022; originally announced March 2022.

Comments: 87 pages; 13 figures

Journal ref: Tenth International Conference on Learning Representations (ICLR 2022)

arXiv:2111.06545 [pdf, ps, other]

doi 10.1126/science.abg5137

Peta-electron volt gamma-ray emission from the Crab Nebula

Authors: The LHAASO Collaboration, Zhen Cao, F. Aharonian, Q. An, Axikegu, L. X. Bai, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, H. Cai, J. T. Cai, Zhe Cao, J. Chang, J. F. Chang, B. M. Chen, E. S. Chen, J. Chen, Liang Chen, Liang Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen , et al. (250 additional authors not shown)

Abstract: The Crab pulsar and the surrounding nebula powered by the pulsar's rotational energy through the formation and termination of a relativistic electron-positron wind is a bright source of gamma-rays carrying crucial information about this complex conglomerate. We report the detection of $γがんま$-rays with a spectrum showing gradual steepening over three energy decades, from $5\times 10^{-4}$ to $1.1$ pet… ▽ More The Crab pulsar and the surrounding nebula powered by the pulsar's rotational energy through the formation and termination of a relativistic electron-positron wind is a bright source of gamma-rays carrying crucial information about this complex conglomerate. We report the detection of $γがんま$-rays with a spectrum showing gradual steepening over three energy decades, from $5\times 10^{-4}$ to $1.1$ petaelectronvolt (PeV). The ultra-high-energy photons exhibit the presence of a PeV electron accelerator (a pevatron) with an acceleration rate exceeding 15% of the absolute theoretical limit. Assuming that unpulsed $γがんま$-rays are produced at the termination of the pulsar's wind, we constrain the pevatron's size, between $0.025$ and $0.1$ pc, and the magnetic field $\approx 110 μみゅー$G. The production rate of PeV electrons, $2.5 \times 10^{36}$ erg $\rm s^{-1}$, constitutes 0.5% of the pulsar's spin-down luminosity, although we do not exclude a non-negligible contribution of PeV protons to the production of the highest energy $γがんま$-rays. △ Less

Submitted 11 November, 2021; originally announced November 2021.

Comments: 43 pages, 13 figures, 2 tables; Published in Science

Journal ref: Science, 2021, Vol 373, Issue 6553, pp. 425-430

arXiv:2110.04381 [pdf, other]

doi 10.1002/sta4.441

Allocation of COVID-19 Testing Budget on a Commute Network of Counties

Authors: Yaxuan Huang, Zheng Tracy Ke, Jiashun Jin

Abstract: The screening testing is an effective tool to control the early spread of an infectious disease such as COVID-19. When the total testing capacity is limited, we aim to optimally allocate testing resources among n counties. We build a (weighted) commute network on counties, with the weight between two counties a decreasing function of their traffic distance. We introduce a network-based disease mod… ▽ More The screening testing is an effective tool to control the early spread of an infectious disease such as COVID-19. When the total testing capacity is limited, we aim to optimally allocate testing resources among n counties. We build a (weighted) commute network on counties, with the weight between two counties a decreasing function of their traffic distance. We introduce a network-based disease model, in which the number of newly confirmed cases of each county depends on the numbers of hidden cases of all counties on the network. Our proposed testing allocation strategy first uses historical data to learn model parameters and then decides the testing rates for all counties by solving an optimization problem. We apply the method on the commute networks of Massachusetts, USA and Hubei, China and observe its advantages over testing allocation strategies that ignore the network structure. Our approach can also be extended to study the vaccine allocation problem. △ Less

Submitted 24 March, 2022; v1 submitted 8 October, 2021; originally announced October 2021.

arXiv:2105.10437 [pdf, other]

doi 10.1021/acs.nanolett.1c03520

InSbAs two-dimensional electron gases as a platform for topological superconductivity

Authors: Christian M. Moehle, Chung Ting Ke, Qingzhen Wang, Candice Thomas, Di Xiao, Saurabh Karwal, Mario Lodari, Vincent van de Kerkhof, Ruben Termaat, Geoffrey C. Gardner, Giordano Scappucci, Michael J. Manfra, Srijit Goswami

Abstract: Topological superconductivity can be engineered in semiconductors with strong spin-orbit interaction coupled to a superconductor. Experimental advances in this field have often been triggered by the development of new hybrid material systems. Among these, two-dimensional electron gases (2DEGs) are of particular interest due to their inherent design flexibility and scalability. Here we discuss resu… ▽ More Topological superconductivity can be engineered in semiconductors with strong spin-orbit interaction coupled to a superconductor. Experimental advances in this field have often been triggered by the development of new hybrid material systems. Among these, two-dimensional electron gases (2DEGs) are of particular interest due to their inherent design flexibility and scalability. Here we discuss results on a 2D platform based on a ternary 2DEG (InSbAs) coupled to in-situ grown Aluminum. The spin-orbit coupling in these 2DEGs can be tuned with the As concentration, reaching values up to 400 meV$\unicode{xC5}$, thus exceeding typical values measured in its binary constituents. In addition to a large Landé g-factor $\sim$ 55 (comparable to InSb), we show that the clean superconductor-semiconductor interface leads to a hard induced superconducting gap. Using this new platform we demonstrate the basic operation of phase-controllable Josephson junctions, superconducting islands and quasi-1D systems, prototypical device geometries used to study Majorana zero modes. △ Less

Submitted 4 November, 2021; v1 submitted 21 May, 2021; originally announced May 2021.

arXiv:2105.00957 [pdf, other]

Universal Weakly Supervised Segmentation by Pixel-to-Segment Contrastive Learning

Authors: Tsung-Wei Ke, Jyh-Jing Hwang, Stella X. Yu

Abstract: Weakly supervised segmentation requires assigning a label to every pixel based on training instances with partial annotations such as image-level tags, object bounding boxes, labeled points and scribbles. This task is challenging, as coarse annotations (tags, boxes) lack precise pixel localization whereas sparse annotations (points, scribbles) lack broad region coverage. Existing methods tackle th… ▽ More Weakly supervised segmentation requires assigning a label to every pixel based on training instances with partial annotations such as image-level tags, object bounding boxes, labeled points and scribbles. This task is challenging, as coarse annotations (tags, boxes) lack precise pixel localization whereas sparse annotations (points, scribbles) lack broad region coverage. Existing methods tackle these two types of weak supervision differently: Class activation maps are used to localize coarse labels and iteratively refine the segmentation model, whereas conditional random fields are used to propagate sparse labels to the entire image. We formulate weakly supervised segmentation as a semi-supervised metric learning problem, where pixels of the same (different) semantics need to be mapped to the same (distinctive) features. We propose 4 types of contrastive relationships between pixels and segments in the feature space, capturing low-level image similarity, semantic annotation, co-occurrence, and feature affinity They act as priors; the pixel-wise feature can be learned from training images with any partial annotations in a data-driven fashion. In particular, unlabeled pixels in training images participate not only in data-driven grouping within each image, but also in discriminative feature learning within and across images. We deliver a universal weakly supervised segmenter with significant gains on Pascal VOC and DensePose. Our code is publicly available at https://github.com/twke18/SPML. △ Less

Submitted 10 May, 2021; v1 submitted 3 May, 2021; originally announced May 2021.

Comments: In ICLR 2021. Webpage & Code: https://twke18.github.io/projects/spml.html

arXiv:2011.11730 [pdf, other]

RISE-SLAM: A Resource-aware Inverse Schmidt Estimator for SLAM

Authors: Tong Ke, Kejian J. Wu, Stergios I. Roumeliotis

Abstract: In this paper, we present the RISE-SLAM algorithm for performing visual-inertial simultaneous localization and mapping (SLAM), while improving estimation consistency. Specifically, in order to achieve real-time operation, existing approaches often assume previously-estimated states to be perfectly known, which leads to inconsistent estimates. Instead, based on the idea of the Schmidt-Kalman filter… ▽ More In this paper, we present the RISE-SLAM algorithm for performing visual-inertial simultaneous localization and mapping (SLAM), while improving estimation consistency. Specifically, in order to achieve real-time operation, existing approaches often assume previously-estimated states to be perfectly known, which leads to inconsistent estimates. Instead, based on the idea of the Schmidt-Kalman filter, which has processing cost linear in the size of the state vector but quadratic memory requirements, we derive a new consistent approximate method in the information domain, which has linear memory requirements and adjustable (constant to linear) processing cost. In particular, this method, the resource-aware inverse Schmidt estimator (RISE), allows trading estimation accuracy for computational efficiency. Furthermore, and in order to better address the requirements of a SLAM system during an exploration vs. a relocalization phase, we employ different configurations of RISE (in terms of the number and order of states updated) to maximize accuracy while preserving efficiency. Lastly, we evaluate the proposed RISE-SLAM algorithm on publicly-available datasets and demonstrate its superiority, both in terms of accuracy and efficiency, as compared to alternative visual-inertial SLAM systems. △ Less

Submitted 23 November, 2020; originally announced November 2020.

Comments: IROS 2019

arXiv:2011.09594 [pdf, other]

Deep Multi-view Depth Estimation with Predicted Uncertainty

Authors: Tong Ke, Tien Do, Khiem Vuong, Kourosh Sartipi, Stergios I. Roumeliotis

Abstract: In this paper, we address the problem of estimating dense depth from a sequence of images using deep neural networks. Specifically, we employ a dense-optical-flow network to compute correspondences and then triangulate the point cloud to obtain an initial depth map.Parts of the point cloud, however, may be less accurate than others due to lack of common observations or small parallax. To further i… ▽ More In this paper, we address the problem of estimating dense depth from a sequence of images using deep neural networks. Specifically, we employ a dense-optical-flow network to compute correspondences and then triangulate the point cloud to obtain an initial depth map.Parts of the point cloud, however, may be less accurate than others due to lack of common observations or small parallax. To further increase the triangulation accuracy, we introduce a depth-refinement network (DRN) that optimizes the initial depth map based on the image's contextual cues. In particular, the DRN contains an iterative refinement module (IRM) that improves the depth accuracy over iterations by refining the deep features. Lastly, the DRN also predicts the uncertainty in the refined depths, which is desirable in applications such as measurement selection for scene reconstruction. We show experimentally that our algorithm outperforms state-of-the-art approaches in terms of depth accuracy, and verify that our predicted uncertainty is highly correlated to the actual depth error. △ Less

Submitted 27 March, 2021; v1 submitted 18 November, 2020; originally announced November 2020.

Comments: IEEE International Conference on Robotics and Automation (ICRA 2021)

arXiv:2010.08132 [pdf, other]

Power of Knockoff: The Impact of Ranking Algorithm, Augmented Design, and Symmetric Statistic

Authors: Zheng Tracy Ke, Jun S. Liu, Yucong Ma

Abstract: The knockoff filter is a recent false discovery rate (FDR) control method for high-dimensional linear models. We point out that knockoff has three key components: ranking algorithm, augmented design, and symmetric statistic, and each component admits multiple choices. By considering various combinations of the three components, we obtain a collection of variants of knockoff. All these variants gua… ▽ More The knockoff filter is a recent false discovery rate (FDR) control method for high-dimensional linear models. We point out that knockoff has three key components: ranking algorithm, augmented design, and symmetric statistic, and each component admits multiple choices. By considering various combinations of the three components, we obtain a collection of variants of knockoff. All these variants guarantee finite-sample FDR control, and our goal is to compare their power. We assume a Rare and Weak signal model on regression coefficients and compare the power of different variants of knockoff by deriving explicit formulas of false positive rate and false negative rate. Our results provide new insights on how to improve power when controlling FDR at a targeted level. We also compare the power of knockoff with its propotype - a method that uses the same ranking algorithm but has access to an ideal threshold. The comparison reveals the additional price one pays by finding a data-driven threshold to control FDR. △ Less

Submitted 13 February, 2024; v1 submitted 15 October, 2020; originally announced October 2020.

Comments: 67 pages, 13 figures

Journal ref: Journal of Machine Learning Research, 2024

arXiv:2009.09177 [pdf, other]

Optimal Estimation of the Number of Communities

Authors: Jiashun Jin, Zheng Tracy Ke, Shengming Luo, Minzhe Wang

Abstract: In network analysis, how to estimate the number of communities $K$ is a fundamental problem. We consider a broad setting where we allow severe degree heterogeneity and a wide range of sparsity levels, and propose Stepwise Goodness-of-Fit (StGoF) as a new approach. This is a stepwise algorithm, where for $m = 1, 2, \ldots$, we alternately use a community detection step and a goodness-of-fit (GoF) s… ▽ More In network analysis, how to estimate the number of communities $K$ is a fundamental problem. We consider a broad setting where we allow severe degree heterogeneity and a wide range of sparsity levels, and propose Stepwise Goodness-of-Fit (StGoF) as a new approach. This is a stepwise algorithm, where for $m = 1, 2, \ldots$, we alternately use a community detection step and a goodness-of-fit (GoF) step. We adapt SCORE \cite{SCORE} for community detection, and propose a new GoF metric. We show that at step $m$, the GoF metric diverges to $\infty$ in probability for all $m < K$ and converges to $N(0,1)$ if $m = K$. This gives rise to a consistent estimate for $K$. Also, we discover the right way to define the signal-to-noise ratio (SNR) for our problem and show that consistent estimates for $K$ do not exist if $\mathrm{SNR} \goto 0$, and StGoF is uniformly consistent for $K$ if $\mathrm{SNR} \goto \infty$. Therefore, StGoF achieves the optimal phase transition. Similar stepwise methods (e.g., \cite{wang2017likelihood, ma2018determining}) are known to face analytical challenges. We overcome the challenges by using a different stepwise scheme in StGoF and by deriving sharp results that are not available before. The key to our analysis is to show that SCORE has the {\it Non-Splitting Property (NSP)}. Primarily due to a non-tractable rotation of eigenvectors dictated by the Davis-Kahan $\sin(θしーた)$ theorem, the NSP is non-trivial to prove and requires new techniques we develop. △ Less

Submitted 25 January, 2022; v1 submitted 19 September, 2020; originally announced September 2020.

MSC Class: 62H12; 62H30; 91C20

arXiv:2008.00092 [pdf, other]

Deep Depth Estimation from Visual-Inertial SLAM

Authors: Kourosh Sartipi, Tien Do, Tong Ke, Khiem Vuong, Stergios I. Roumeliotis

Abstract: This paper addresses the problem of learning to complete a scene's depth from sparse depth points and images of indoor scenes. Specifically, we study the case in which the sparse depth is computed from a visual-inertial simultaneous localization and mapping (VI-SLAM) system. The resulting point cloud has low density, it is noisy, and has non-uniform spatial distribution, as compared to the input f… ▽ More This paper addresses the problem of learning to complete a scene's depth from sparse depth points and images of indoor scenes. Specifically, we study the case in which the sparse depth is computed from a visual-inertial simultaneous localization and mapping (VI-SLAM) system. The resulting point cloud has low density, it is noisy, and has non-uniform spatial distribution, as compared to the input from active depth sensors, e.g., LiDAR or Kinect. Since the VI-SLAM produces point clouds only over textured areas, we compensate for the missing depth of the low-texture surfaces by leveraging their planar structures and their surface normals which is an important intermediate representation. The pre-trained surface normal network, however, suffers from large performance degradation when there is a significant difference in the viewing direction (especially the roll angle) of the test image as compared to the trained ones. To address this limitation, we use the available gravity estimate from the VI-SLAM to warp the input image to the orientation prevailing in the training dataset. This results in a significant performance gain for the surface normal estimate, and thus the dense depth estimates. Finally, we show that our method outperforms other state-of-the-art approaches both on training (ScanNet and NYUv2) and testing (collected with Azure Kinect) datasets. △ Less

Submitted 14 August, 2020; v1 submitted 31 July, 2020; originally announced August 2020.

Comments: 9 pages

arXiv:2007.07498 [pdf, other]

Measurement error models: from nonparametric methods to deep neural networks

Authors: Zhirui Hu, Zheng Tracy Ke, Jun S Liu

Abstract: The success of deep learning has inspired recent interests in applying neural networks in statistical inference. In this paper, we investigate the use of deep neural networks for nonparametric regression with measurement errors. We propose an efficient neural network design for estimating measurement error models, in which we use a fully connected feed-forward neural network (FNN) to approximate t… ▽ More The success of deep learning has inspired recent interests in applying neural networks in statistical inference. In this paper, we investigate the use of deep neural networks for nonparametric regression with measurement errors. We propose an efficient neural network design for estimating measurement error models, in which we use a fully connected feed-forward neural network (FNN) to approximate the regression function $f(x)$, a normalizing flow to approximate the prior distribution of $X$, and an inference network to approximate the posterior distribution of $X$. Our method utilizes recent advances in variational inference for deep neural networks, such as the importance weight autoencoder, doubly reparametrized gradient estimator, and non-linear independent components estimation. We conduct an extensive numerical study to compare the neural network approach with classical nonparametric methods and observe that the neural network approach is more flexible in accommodating different classes of regression functions and performs superior or comparable to the best available method in nearly all settings. △ Less

Submitted 15 July, 2020; originally announced July 2020.

Comments: 37 pages, 8 figures

arXiv:2006.00436 [pdf, other]

Estimation of the number of spiked eigenvalues in a covariance matrix by bulk eigenvalue matching analysis

Authors: Zheng Tracy Ke, Yucong Ma, Xihong Lin

Abstract: The spiked covariance model has gained increasing popularity in high-dimensional data analysis. A fundamental problem is determination of the number of spiked eigenvalues, $K$. For estimation of $K$, most attention has focused on the use of $top$ eigenvalues of sample covariance matrix, and there is little investigation into proper ways of utilizing $bulk$ eigenvalues to estimate $K$. We propose a… ▽ More The spiked covariance model has gained increasing popularity in high-dimensional data analysis. A fundamental problem is determination of the number of spiked eigenvalues, $K$. For estimation of $K$, most attention has focused on the use of $top$ eigenvalues of sample covariance matrix, and there is little investigation into proper ways of utilizing $bulk$ eigenvalues to estimate $K$. We propose a principled approach to incorporating bulk eigenvalues in the estimation of $K$. Our method imposes a working model on the residual covariance matrix, which is assumed to be a diagonal matrix whose entries are drawn from a gamma distribution. Under this model, the bulk eigenvalues are asymptotically close to the quantiles of a fixed parametric distribution. This motivates us to propose a two-step method: the first step uses bulk eigenvalues to estimate parameters of this distribution, and the second step leverages these parameters to assist the estimation of $K$. The resulting estimator $\hat{K}$ aggregates information in a large number of bulk eigenvalues. We show the consistency of $\hat{K}$ under a standard spiked covariance model. We also propose a confidence interval estimate for $K$. Our extensive simulation studies show that the proposed method is robust and outperforms the existing methods in a range of scenarios. We apply the proposed method to analysis of a lung cancer microarray data set and the 1000 Genomes data set. △ Less

Submitted 5 January, 2021; v1 submitted 31 May, 2020; originally announced June 2020.

Comments: 48 pages, 8 figures, 5 tables

arXiv:1910.07309 [pdf, other]

doi 10.1103/PhysRevApplied.13.041003

Stable quantum dots in an InSb two-dimensional electron gas

Authors: Ivan Kulesh, Chung Ting Ke, Candice Thomas, Saurabh Karwal, Christian M. Moehle, Sara Metti, Ray Kallaher, Geoffrey C. Gardner, Michael J. Manfra, Srijit Goswami

Abstract: Indium antimonide (InSb) two-dimensional electron gases (2DEGs) have a unique combination of material properties: high electron mobility, strong spin-orbit interaction, large Landé g-factor, and small effective mass. This makes them an attractive platform to explore a variety of mesoscopic phenomena ranging from spintronics to topological superconductivity. However, there exist limited studies of… ▽ More Indium antimonide (InSb) two-dimensional electron gases (2DEGs) have a unique combination of material properties: high electron mobility, strong spin-orbit interaction, large Landé g-factor, and small effective mass. This makes them an attractive platform to explore a variety of mesoscopic phenomena ranging from spintronics to topological superconductivity. However, there exist limited studies of quantum confined systems in these 2DEGs, often attributed to charge instabilities and gate drifts. We overcome this by removing the $δでるた$-doping layer from the heterostructure, and induce carriers electrostatically. This allows us to perform the first detailed study of stable gate-defined quantum dots in InSb 2DEGs. We demonstrate two distinct strategies for carrier confinement and study the charge stability of the dots. The small effective mass results in a relatively large single particle spacing, allowing for the observation of an even-odd variation in the addition energy. By tracking the Coulomb oscillations in a parallel magnetic field we determine the ground state spin configuration and show that the large g-factor ($\sim$30) results in a singlet-triplet transition at magnetic fields as low as 0.3 T. △ Less

Submitted 16 October, 2019; originally announced October 2019.

Comments: Includes supplementary information. Data files available at https://doi.org/10.4121/uuid:28a121af-1e08-429d-9d53-1cc53764e91e

Journal ref: Phys. Rev. Applied 13, 041003 (2020)

arXiv:1909.06503 [pdf, other]

Community Detection for Hypergraph Networks via Regularized Tensor Power Iteration

Authors: Zheng Tracy Ke, Feng Shi, Dong Xia

Abstract: To date, social network analysis has been largely focused on pairwise interactions. The study of higher-order interactions, via a hypergraph network, brings in new insights. We study community detection in a hypergraph network. A popular approach is to project the hypergraph to a graph and then apply community detection methods for graph networks, but we show that this approach may cause unwanted… ▽ More To date, social network analysis has been largely focused on pairwise interactions. The study of higher-order interactions, via a hypergraph network, brings in new insights. We study community detection in a hypergraph network. A popular approach is to project the hypergraph to a graph and then apply community detection methods for graph networks, but we show that this approach may cause unwanted information loss. We propose a new method for community detection that operates directly on the hypergraph. At the heart of our method is a regularized higher-order orthogonal iteration (reg-HOOI) algorithm that computes an approximate low-rank decomposition of the network adjacency tensor. Compared with existing tensor decomposition methods such as HOSVD and vanilla HOOI, reg-HOOI yields better performance, especially when the hypergraph is sparse. Given the output of tensor decomposition, we then generalize the community detection method SCORE (Jin, 2015) from graph networks to hypergraph networks. We call our new method Tensor-SCORE. In theory, we introduce a degree-corrected block model for hypergraphs (hDCBM), and show that Tensor-SCORE yields consistent community detection for a wide range of network sparsity and degree heterogeneity. As a byproduct, we derive the rates of convergence on estimating the principal subspace by reg-HOOI, with different initializations, including the two new initialization methods we propose, a diagonal-removed HOSVD and a randomized graph projection. We apply our method to several real hypergraph networks which yields encouraging results. It suggests that exploring higher-order interactions provides additional information not seen in graph representations. △ Less

Submitted 2 January, 2020; v1 submitted 13 September, 2019; originally announced September 2019.

Comments: 53 pages, 5 figures

arXiv:1906.07935 [pdf, other]

doi 10.1103/PhysRevResearch.1.033084

2$Φふぁい_{0}$-periodic magnetic interference in ballistic graphene Josephson junctions

Authors: C. T. Ke, A. W. Draelos, A. Seredinski, M. T. Wei, H. Li, M. Hernandez-Rivera, K. Watanabe, T. Taniguchi, M. Yamamoto, S. Tarucha, Y. Bomze, I. V. Borzenets, F. Amet, G. Finkelstein

Abstract: We investigate supercurrent interference patterns measured as a function of magnetic field in ballistic graphene Josephson junctions. At high doping, the expected $Φふぁい_{0}$-periodic "Fraunhofer" pattern is observed, indicating a uniform current distribution. Close to the Dirac point, we find anomalous interference patterns with an apparent 2$Φふぁい_{0}$ periodicity, similar to that predicted for topologi… ▽ More We investigate supercurrent interference patterns measured as a function of magnetic field in ballistic graphene Josephson junctions. At high doping, the expected $Φふぁい_{0}$-periodic "Fraunhofer" pattern is observed, indicating a uniform current distribution. Close to the Dirac point, we find anomalous interference patterns with an apparent 2$Φふぁい_{0}$ periodicity, similar to that predicted for topological Andreev bound states carrying a charge of $e$ instead of $2e$. This feature persists with increasing temperature, ruling out a non-sinusoidal current-phase relationship. It also persists in junctions in which sharp vacuum edges are eliminated. Our results indicate that the observed behavior may originate from an intrinsic property of ballistic graphene Josephson junctions, though the exact mechanism remains unclear. △ Less

Submitted 19 June, 2019; v1 submitted 19 June, 2019; originally announced June 2019.

Comments: Main text+ supplementary

Journal ref: Phys. Rev. Research 1, 033084 (2019)

arXiv:1906.00051 [pdf, other]

Diagonally-Dominant Principal Component Analysis

Authors: Zheng Tracy Ke, Lingzhou Xue, Fan Yang

Abstract: We consider the problem of decomposing a large covariance matrix into the sum of a low-rank matrix and a diagonally dominant matrix, and we call this problem the "Diagonally-Dominant Principal Component Analysis (DD-PCA)". DD-PCA is an effective tool for designing statistical methods for strongly correlated data. We showcase the use of DD-PCA in two statistical problems: covariance matrix estimati… ▽ More We consider the problem of decomposing a large covariance matrix into the sum of a low-rank matrix and a diagonally dominant matrix, and we call this problem the "Diagonally-Dominant Principal Component Analysis (DD-PCA)". DD-PCA is an effective tool for designing statistical methods for strongly correlated data. We showcase the use of DD-PCA in two statistical problems: covariance matrix estimation, and global detection in multiple testing. Using the output of DD-PCA, we propose a new estimator for estimating a large covariance matrix with factor structure. Thanks to a nice property of diagonally dominant matrices, this estimator enjoys the advantage of simultaneous good estimation of the covariance matrix and the precision matrix (by a plain inversion). A plug-in of this estimator to linear discriminant analysis and portfolio optimization yields appealing performance in real data. We also propose two new tests for testing the global null hypothesis in multiple testing when the $z$-scores have a factor covariance structure. Both tests first use DD-PCA to adjust the individual $p$-values and then plug in the adjusted $p$-values to the Higher Criticism (HC) test. These new tests significantly improve over the HC test and compare favorably with other existing tests. For computation of DD-PCA, we propose an iterative projection algorithm and an ADMM algorithm. △ Less

Submitted 31 May, 2019; originally announced June 2019.

arXiv:1905.06485 [pdf, other]

Parallel Search for Information

Authors: T. Tony Ke, Wenpin Tang, J. Miguel Villas-Boas, Yuming Zhang

Abstract: We consider the problem of a decision-maker searching for information on multiple alternatives when information is learned on all alternatives simultaneously. The decision-maker has a running cost of searching for information, and has to decide when to stop searching for information and choose one alternative. The expected payoff of each alternative evolves as a diffusion process when information… ▽ More We consider the problem of a decision-maker searching for information on multiple alternatives when information is learned on all alternatives simultaneously. The decision-maker has a running cost of searching for information, and has to decide when to stop searching for information and choose one alternative. The expected payoff of each alternative evolves as a diffusion process when information is being learned. We present necessary and sufficient conditions for the solution, establishing existence and uniqueness. We show that the optimal boundary where search is stopped (free boundary) is star-shaped, and present an asymptotic characterization of the value function and the free boundary. We show properties of how the distance between the free boundary and the diagonal varies with the number of alternatives, and how the free boundary under parallel search relates to the one under sequential search, with and without economies of scale on the search costs. △ Less

Submitted 9 April, 2020; v1 submitted 15 May, 2019; originally announced May 2019.

Comments: 33 pages, 3 figures

arXiv:1904.11689 [pdf, other]

doi 10.1103/PhysRevB.100.121403

Chiral Quasiparticle Tunneling Between Quantum Hall Edges in Proximity with a Superconductor

Authors: M. T. Wei, A. W. Draelos, A. Seredinski, C. T. Ke, H. Li, Y. Mehta, K. Watanabe, T. Taniguchi, M. Yamamoto, S. Tarucha, G. Finkelstein, F. Amet, I. V. Borzenets

Abstract: We study a two-terminal graphene Josephson junction with contacts shaped to form a narrow constriction, less than 100nm in length. The contacts are made from type II superconducting contacts and able to withstand magnetic fields high enough to reach the quantum Hall (QH) regime in graphene. In this regime, the device conductance is determined by edge states, plus the contribution from the constric… ▽ More We study a two-terminal graphene Josephson junction with contacts shaped to form a narrow constriction, less than 100nm in length. The contacts are made from type II superconducting contacts and able to withstand magnetic fields high enough to reach the quantum Hall (QH) regime in graphene. In this regime, the device conductance is determined by edge states, plus the contribution from the constricted region. In particular, the constriction area can support supercurrents up to fields of ~2.5T. Moreover, enhanced conductance is observed through a wide range of magnetic fields and gate voltages. This additional conductance and the appearance of supercurrent is attributed to the tunneling between counter-propagating quantum Hall edge states along opposite superconducting contacts. △ Less

Submitted 26 April, 2019; originally announced April 2019.

Comments: 4 pages, 3 figures

Journal ref: Phys. Rev. B 100, 121403 (2019)

arXiv:1904.09532 [pdf, other]

Optimal Adaptivity of Signed-Polygon Statistics for Network Testing

Authors: Jiashun Jin, Zheng Tracy Ke, Shengming Luo

Abstract: Given a symmetric social network, we are interested in testing whether it has only one community or multiple communities. The desired tests should (a) accommodate severe degree heterogeneity, (b) accommodate mixed-memberships, (c) have a tractable null distribution, and (d) adapt automatically to different levels of sparsity, and achieve the optimal phase diagram. How to find such a test is a chal… ▽ More Given a symmetric social network, we are interested in testing whether it has only one community or multiple communities. The desired tests should (a) accommodate severe degree heterogeneity, (b) accommodate mixed-memberships, (c) have a tractable null distribution, and (d) adapt automatically to different levels of sparsity, and achieve the optimal phase diagram. How to find such a test is a challenging problem. We propose the Signed Polygon as a class of new tests. Fixing $m \geq 3$, for each $m$-gon in the network, define a score using the centered adjacency matrix. The sum of such scores is then the $m$-th order Signed Polygon statistic. The Signed Triangle (SgnT) and the Signed Quadrilateral (SgnQ) are special examples of the Signed Polygon. We show that both the SgnT and SgnQ tests satisfy (a)-(d), and especially, they work well for both very sparse and less sparse networks. Our proposed tests compare favorably with the existing tests. For example, the EZ and GC tests behave unsatisfactorily in the less sparse case and do not achieve the optimal phase diagram. Also, many existing tests do not allow for severe heterogeneity or mixed-memberships, and they behave unsatisfactorily in our settings. The analysis of the SgnT and SgnQ tests is delicate and extremely tedious, and the main reason is that we need a unified proof that covers a wide range of sparsity levels and a wide range of degree heterogeneity. For lower bound theory, we use a phase transition framework, which includes the standard minimax argument, but is more informative. The proof uses classical theorems on matrix scaling. △ Less

Submitted 21 May, 2019; v1 submitted 20 April, 2019; originally announced April 2019.

MSC Class: 62H15; 62H20; 62C20

Showing 1–50 of 80 results for author: Ke, T