Search | arXiv e-print repository

Showing 1–50 of 196 results for author: Yang, H

Searching in archive stat. Search in all archives.
  1. arXiv:2409.02363  [pdf, other]

    cs.LG stat.ML

    Optimal Neural Network Approximation for High-Dimensional Continuous Functions

    Authors: Ayan Maiti, Michelle Michelle, Haizhao Yang

    Abstract: Recently, the authors of Shen Yang Zhang (JMLR, 2022) developed a neural network with width $36d(2d + 1)$ and depth $11$, which utilizes a special activation function called the elementary universal activation function, to achieve the super approximation property for functions in $C([a,b]^d)$. That is, the constructed network only requires a fixed number of neurons to approximate a $d$-variate con…

    Submitted 10 September, 2024; v1 submitted 3 September, 2024; originally announced September 2024.

  2. arXiv:2408.15670  [pdf, other]

    stat.ME

    Adaptive Weighted Random Isolation (AWRI): a simple design to estimate causal effects under network interference

    Authors: Changhao Shi, Haoyu Yang, Yichen Qin, Yang Li

    Abstract: Recently, causal inference under interference has gained increasing attention in the literature. In this paper, we focus on randomized designs for estimating the total treatment effect (TTE), defined as the average difference in potential outcomes between fully treated and fully controlled groups. We propose a simple design called weighted random isolation (WRI) along with a restricted difference-…

    Submitted 28 August, 2024; originally announced August 2024.

    Comments: 26 pages, 5 figures

  3. arXiv:2408.04313  [pdf, other]

    stat.ML cs.LG stat.ME

    Better Locally Private Sparse Estimation Given Multiple Samples Per User

    Authors: Yuheng Ma, Ke Jia, Hanfang Yang

    Abstract: Previous studies yielded discouraging results for item-level locally differentially private linear regression with $s^*$-sparsity assumption, where the minimax rate for $nm$ samples is $\mathcal{O}(s^{*}d / nm\varepsilon^2)$. This can be challenging for high-dimensional data, where the dimension $d$ is extremely large. In this work, we investigate user-level locally differentially private sparse l…

    Submitted 8 August, 2024; originally announced August 2024.

    Journal ref: ICML2024 Proceedings

  4. arXiv:2408.03307  [pdf, other]

    stat.ML cs.LG

    Pre-training and in-context learning IS Bayesian inference a la De Finetti

    Authors: Naimeng Ye, Hanming Yang, Andrew Siah, Hongseok Namkoong

    Abstract: Accurately gauging uncertainty on the underlying environment is a longstanding goal of intelligent systems. We characterize which latent concepts pre-trained sequence models are naturally able to reason with. We go back to De Finetti's predictive view of Bayesian reasoning: instead of modeling latent parameters through priors and likelihoods like topic models do, De Finetti has long advocated for…

    Submitted 6 August, 2024; originally announced August 2024.

  5. arXiv:2407.12996  [pdf, other]

    stat.ML cs.LG

    Sharpness-diversity tradeoff: improving flat ensembles with SharpBalance

    Authors: Haiquan Lu, Xiaotian Liu, Yefan Zhou, Qunli Li, Kurt Keutzer, Michael W. Mahoney, Yujun Yan, Huanrui Yang, Yaoqing Yang

    Abstract: Recent studies on deep ensembles have identified the sharpness of the local minima of individual learners and the diversity of the ensemble members as key factors in improving test-time performance. Building on this, our study investigates the interplay between sharpness and diversity within deep ensembles, illustrating their crucial role in robust generalization to both in-distribution (ID) and o…

    Submitted 17 July, 2024; originally announced July 2024.

  6. arXiv:2406.16708  [pdf, other]

    cs.LG stat.ME

    CausalFormer: An Interpretable Transformer for Temporal Causal Discovery

    Authors: Lingbai Kong, Wengen Li, Hanchen Yang, Yichao Zhang, Jihong Guan, Shuigeng Zhou

    Abstract: Temporal causal discovery is a crucial task aimed at uncovering the causal relations within time series data. The latest temporal causal discovery methods usually train deep learning models on prediction tasks to uncover the causality between time series. They capture causal relations by analyzing the parameters of some components of the trained models, e.g., attention weights and convolution weig…

    Submitted 24 June, 2024; originally announced June 2024.

  7. arXiv:2406.03242  [pdf, other]

    cs.LG stat.CO

    Variational Pseudo Marginal Methods for Jet Reconstruction in Particle Physics

    Authors: Hanming Yang, Antonio Khalil Moretti, Sebastian Macaluso, Philippe Chlenski, Christian A. Naesseth, Itsik Pe'er

    Abstract: Reconstructing jets, which provide vital insights into the properties and histories of subatomic particles produced in high-energy collisions, is a main problem in data analyses in collider physics. This intricate task deals with estimating the latent structure of a jet (binary tree) and involves parameters such as particle energy, momentum, and types. While Bayesian methods offer a natural approa…

    Submitted 5 June, 2024; originally announced June 2024.

  8. arXiv:2406.01416  [pdf, other]

    cs.LG stat.ML

    Adapting Conformal Prediction to Distribution Shifts Without Labels

    Authors: Kevin Kasa, Zhiyu Zhang, Heng Yang, Graham W. Taylor

    Abstract: Conformal prediction (CP) enables machine learning models to output prediction sets with guaranteed coverage rate, assuming exchangeable data. Unfortunately, the exchangeability assumption is frequently violated due to distribution shifts in practice, and the challenge is often compounded by the lack of ground truth labels at test time. Focusing on classification in this paper, our goal is to impr…

    Submitted 3 June, 2024; originally announced June 2024.

  9. arXiv:2405.15505  [pdf, other]

    cs.LG cs.AI stat.ML

    Revisiting Counterfactual Regression through the Lens of Gromov-Wasserstein Information Bottleneck

    Authors: Hao Yang, Zexu Sun, Hongteng Xu, Xu Chen

    Abstract: As a promising individualized treatment effect (ITE) estimation method, counterfactual regression (CFR) maps individuals' covariates to a latent space and predicts their counterfactual outcomes. However, the selection bias between control and treatment groups often imbalances the two groups' latent distributions and negatively impacts this method's performance. In this study, we revisit counterfac…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: 19 pages

  10. arXiv:2405.13481  [pdf, other]

    stat.ML cs.CR cs.LG

    Locally Private Estimation with Public Features

    Authors: Yuheng Ma, Ke Jia, Hanfang Yang

    Abstract: We initiate the study of locally differentially private (LDP) learning with public features. We define semi-feature LDP, where some features are publicly available while the remaining ones, along with the label, require protection under local differential privacy. Under semi-feature LDP, we demonstrate that the mini-max convergence rate for non-parametric regression is significantly reduced compar…

    Submitted 22 May, 2024; originally announced May 2024.

  11. arXiv:2405.03720  [pdf, other]

    cs.LG stat.ME stat.ML

    Spatial Transfer Learning with Simple MLP

    Authors: Hongjian Yang

    Abstract: First step to investigate the potential of transfer learning applied to the field of spatial statistics

    Submitted 5 May, 2024; originally announced May 2024.

  12. arXiv:2405.02551  [pdf, ps, other]

    stat.ME math.ST stat.AP

    Power-Enhanced Two-Sample Mean Tests for High-Dimensional Compositional Data with Application to Microbiome Data Analysis

    Authors: Danning Li, Lingzhou Xue, Haoyi Yang, Xiufan Yu

    Abstract: Testing differences in mean vectors is a fundamental task in the analysis of high-dimensional compositional data. Existing methods may suffer from low power if the underlying signal pattern is in a situation that does not favor the deployed test. In this work, we develop two-sample power-enhanced mean tests for high-dimensional compositional data based on the combination of $p$-values, which integ…

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: 25 pages

  13. arXiv:2404.17615  [pdf]

    stat.ME cs.LG stat.CO stat.ML

    DeepVARMA: A Hybrid Deep Learning and VARMA Model for Chemical Industry Index Forecasting

    Authors: Xiang Li, Hu Yang

    Abstract: Since the chemical industry index is one of the important indicators to measure the development of the chemical industry, forecasting it is critical for understanding the economic situation and trends of the industry. Taking the multivariable nonstationary series-synthetic material index as the main research object, this paper proposes a new prediction model: DeepVARMA, and its variants Deep-VARMA…

    Submitted 26 April, 2024; originally announced April 2024.

  14. arXiv:2404.16023  [pdf, other]

    stat.AP cs.LG

    Learning Car-Following Behaviors Using Bayesian Matrix Normal Mixture Regression

    Authors: Chengyuan Zhang, Kehua Chen, Meixin Zhu, Hai Yang, Lijun Sun

    Abstract: Learning and understanding car-following (CF) behaviors are crucial for microscopic traffic simulation. Traditional CF models, though simple, often lack generalization capabilities, while many data-driven methods, despite their robustness, operate as "black boxes" with limited interpretability. To bridge this gap, this work introduces a Bayesian Matrix Normal Mixture Regression (MNMR) model that s…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: 6 pages, Accepted by the 35th IEEE Intelligent Vehicles Symposium

  15. arXiv:2404.09402  [pdf, other]

    cs.LG cs.AI stat.ML

    Neural McKean-Vlasov Processes: Distributional Dependence in Diffusion Processes

    Authors: Haoming Yang, Ali Hasan, Yuting Ng, Vahid Tarokh

    Abstract: McKean-Vlasov stochastic differential equations (MV-SDEs) provide a mathematical description of the behavior of an infinite number of interacting particles by imposing a dependence on the particle density. As such, we study the influence of explicitly including distributional information in the parameterization of the SDE. We propose a series of semi-parametric methods for representing MV-SDEs, an…

    Submitted 14 April, 2024; originally announced April 2024.

    Comments: Appears in AISTATS 2024

  16. arXiv:2404.04800  [pdf, other]

    cs.LG cs.CV stat.ML

    Coordinated Sparse Recovery of Label Noise

    Authors: Yukun Yang, Naihao Wang, Haixin Yang, Ruirui Li

    Abstract: Label noise is a common issue in real-world datasets that inevitably impacts the generalization of models. This study focuses on robust classification tasks where the label noise is instance-dependent. Estimating the transition matrix accurately in this task is challenging, and methods based on sample selection often exhibit confirmation bias to varying degrees. Sparse over-parameterized training…

    Submitted 6 April, 2024; originally announced April 2024.

    Comments: Pre-print prior to submission to journal

  17. arXiv:2402.02720  [pdf, other]

    cs.LG stat.ML

    Discounted Adaptive Online Learning: Towards Better Regularization

    Authors: Zhiyu Zhang, David Bombara, Heng Yang

    Abstract: We study online learning in adversarial nonstationary environments. Since the future can be very different from the past, a critical challenge is to gracefully forget the history while new data comes in. To formalize this intuition, we revisit the discounted regret in online convex optimization, and propose an adaptive (i.e., instance optimal), FTRL-based algorithm that improves the widespread non…

    Submitted 18 June, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

    Comments: ICML 2024

  18. arXiv:2401.16421  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation

    Authors: Zhenyu He, Guhao Feng, Shengjie Luo, Kai Yang, Liwei Wang, Jingjing Xu, Zhi Zhang, Hongxia Yang, Di He

    Abstract: In this work, we leverage the intrinsic segmentation of language sequences and design a new positional encoding method called Bilevel Positional Encoding (BiPE). For each position, our BiPE blends an intra-segment encoding and an inter-segment encoding. The intra-segment encoding identifies the locations within a segment and helps the model capture the semantic information therein via absolute pos…

    Submitted 17 June, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

    Comments: 17 pages, 7 figures, 8 tables; ICML 2024 Camera Ready version; Code: https://github.com/zhenyuhe00/BiPE

  19. arXiv:2312.11863  [pdf, other]

    cs.LG cs.AI stat.ML

    Neural Network Approximation for Pessimistic Offline Reinforcement Learning

    Authors: Di Wu, Yuling Jiao, Li Shen, Haizhao Yang, Xiliang Lu

    Abstract: Deep reinforcement learning (RL) has shown remarkable success in specific offline decision-making scenarios, yet its theoretical guarantees are still under development. Existing works on offline RL theory primarily emphasize a few trivial settings, such as linear MDP or general function approximation with strong assumptions and independent data, which lack guidance for practical use. The coupling…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: Full version of the paper accepted to the 38th Annual AAAI Conference on Artificial Intelligence (AAAI 2024)

  20. arXiv:2312.04658  [pdf, other]

    cs.LG stat.ML

    PAC-Bayes Generalization Certificates for Learned Inductive Conformal Prediction

    Authors: Apoorva Sharma, Sushant Veer, Asher Hancock, Heng Yang, Marco Pavone, Anirudha Majumdar

    Abstract: Inductive Conformal Prediction (ICP) provides a practical and effective approach for equipping deep learning models with uncertainty estimates in the form of set-valued predictions which are guaranteed to contain the ground truth with high probability. Despite the appeal of this coverage guarantee, these sets may not be efficient: the size and contents of the prediction sets are not directly contr…

    Submitted 7 December, 2023; originally announced December 2023.

    Comments: NeurIPS 2023

  21. arXiv:2312.03386  [pdf, other]

    cs.LG stat.ML

    An Infinite-Width Analysis on the Jacobian-Regularised Training of a Neural Network

    Authors: Taeyoung Kim, Hongseok Yang

    Abstract: The recent theoretical analysis of deep neural networks in their infinite-width limits has deepened our understanding of initialisation, feature learning, and training of those networks, and brought new practical techniques for finding appropriate hyperparameters, learning network weights, and performing inference. In this paper, we broaden this line of research by showing that this infinite-width…

    Submitted 21 August, 2024; v1 submitted 6 December, 2023; originally announced December 2023.

    Comments: Accepted at ICML 2024. 74 pages, 18 figures

  22. arXiv:2312.01046  [pdf, other]

    stat.ML cs.LG math.ST

    Bagged Regularized $k$-Distances for Anomaly Detection

    Authors: Yuchao Cai, Yuheng Ma, Hanfang Yang, Hanyuan Hang

    Abstract: We consider the paradigm of unsupervised anomaly detection, which involves the identification of anomalies within a dataset in the absence of labeled examples. Though distance-based methods are top-performing for unsupervised anomaly detection, they suffer heavily from the sensitivity to the choice of the number of the nearest neighbors. In this paper, we propose a new distance-based algorithm cal…

    Submitted 13 February, 2024; v1 submitted 2 December, 2023; originally announced December 2023.

  23. arXiv:2311.11369  [pdf, other]

    stat.ML cs.CR cs.LG

    Optimal Locally Private Nonparametric Classification with Public Data

    Authors: Yuheng Ma, Hanfang Yang

    Abstract: In this work, we investigate the problem of public data assisted non-interactive Local Differentially Private (LDP) learning with a focus on non-parametric classification. Under the posterior drift assumption, we for the first time derive the mini-max optimal convergence rate with LDP constraint. Then, we present a novel approach, the locally differentially private classification tree, which attai…

    Submitted 2 June, 2024; v1 submitted 19 November, 2023; originally announced November 2023.

  24. arXiv:2310.07999  [pdf, other]

    cs.LG stat.ML

    LEMON: Lossless model expansion

    Authors: Yite Wang, Jiahao Su, Hanlin Lu, Cong Xie, Tianyi Liu, Jianbo Yuan, Haibin Lin, Ruoyu Sun, Hongxia Yang

    Abstract: Scaling of deep neural networks, especially Transformers, is pivotal for their surging performance and has further led to the emergence of sophisticated reasoning capabilities in foundation models. Such scaling generally requires training large models from scratch with random initialization, failing to leverage the knowledge acquired by their smaller counterparts, which are already resource-intens…

    Submitted 11 October, 2023; originally announced October 2023.

    Comments: Preprint

  25. arXiv:2310.06389  [pdf, other]

    cs.CV stat.ML

    Learning Stackable and Skippable LEGO Bricks for Efficient, Reconfigurable, and Variable-Resolution Diffusion Modeling

    Authors: Huangjie Zheng, Zhendong Wang, Jianbo Yuan, Guanghan Ning, Pengcheng He, Quanzeng You, Hongxia Yang, Mingyuan Zhou

    Abstract: Diffusion models excel at generating photo-realistic images but come with significant computational costs in both training and sampling. While various techniques address these computational challenges, a less-explored issue is designing an efficient and adaptable network backbone for iterative refinement. Current options like U-Net and Vision Transformer often rely on resource-intensive deep netwo…

    Submitted 27 June, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

  26. arXiv:2309.16044  [pdf, ps, other]

    cs.LG stat.ML

    Improving Adaptive Online Learning Using Refined Discretization

    Authors: Zhiyu Zhang, Heng Yang, Ashok Cutkosky, Ioannis Ch. Paschalidis

    Abstract: We study unconstrained Online Linear Optimization with Lipschitz losses. Motivated by the pursuit of instance optimality, we propose a new algorithm that simultaneously achieves ($i$) the AdaGrad-style second order gradient adaptivity; and ($ii$) the comparator norm adaptivity also known as "parameter freeness" in the literature. In particular, - our algorithm does not employ the impractical dou…

    Submitted 22 February, 2024; v1 submitted 27 September, 2023; originally announced September 2023.

    Comments: ALT 2024

  27. arXiv:2307.00126  [pdf, other]

    math.OC cs.LG stat.ML

    Accelerating Inexact HyperGradient Descent for Bilevel Optimization

    Authors: Haikuo Yang, Luo Luo, Chris Junchi Li, Michael I. Jordan

    Abstract: We present a method for solving general nonconvex-strongly-convex bilevel optimization problems. Our method -- the \emph{Restarted Accelerated HyperGradient Descent} (\texttt{RAHGD}) method -- finds an $\epsilon$-first-order stationary point of the objective with $\tilde{\mathcal{O}}(\kappa^{3.25}\epsilon^{-1.75})$ oracle complexity, where $\kappa$ is the condition number of the lower-level objective and $\epsilon$ is the desir…

    Submitted 30 June, 2023; originally announced July 2023.

  28. arXiv:2306.00356  [pdf, other]

    cs.LG cs.AI stat.ML

    Regularizing Towards Soft Equivariance Under Mixed Symmetries

    Authors: Hyunsu Kim, Hyungi Lee, Hongseok Yang, Juho Lee

    Abstract: Datasets often have their intrinsic symmetries, and particular deep-learning models called equivariant or invariant models have been developed to exploit these symmetries. However, if some or all of these symmetries are only approximate, which frequently happens in practice, these models may be suboptimal due to the architectural restrictions imposed on them. We tackle this issue of approximate sy…

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: Proceedings of the International Conference on Machine Learning (ICML), 2023

  29. arXiv:2302.08406  [pdf, other]

    cs.LG stat.ML

    Entity Aware Modelling: A Survey

    Authors: Rahul Ghosh, Haoyu Yang, Ankush Khandelwal, Erhu He, Arvind Renganathan, Somya Sharma, Xiaowei Jia, Vipin Kumar

    Abstract: Personalized prediction of responses for individual entities caused by external drivers is vital across many disciplines. Recent machine learning (ML) advances have led to new state-of-the-art response prediction models. Models built at a population level often lead to sub-optimal performance in many personalized prediction settings due to heterogeneity in data across entities (tasks). In personal…

    Submitted 16 February, 2023; originally announced February 2023.

    Comments: Submitted to IJCAI, Survey Track

  30. arXiv:2302.01002  [pdf, other]

    stat.ML cs.LG math.OC

    Over-parameterised Shallow Neural Networks with Asymmetrical Node Scaling: Global Convergence Guarantees and Feature Learning

    Authors: Francois Caron, Fadhel Ayed, Paul Jung, Hoil Lee, Juho Lee, Hongseok Yang

    Abstract: We consider the optimisation of large and shallow neural networks via gradient flow, where the output of each hidden node is scaled by some positive parameter. We focus on the case where the node scalings are non-identical, differing from the classical Neural Tangent Kernel (NTK) parameterisation. We prove that, for large neural networks, with high probability, gradient flow converges to a global…

    Submitted 2 February, 2023; originally announced February 2023.

  31. arXiv:2212.14194  [pdf, ps, other]

    math.ST stat.CO stat.ME stat.ML

    Theoretical Guarantees for Sparse Principal Component Analysis based on the Elastic Net

    Authors: Teng Zhang, Haoyi Yang, Lingzhou Xue

    Abstract: Sparse principal component analysis (SPCA) is widely used for dimensionality reduction and feature extraction in high-dimensional data analysis. Despite many methodological and theoretical developments in the past two decades, the theoretical guarantees of the popular SPCA algorithm proposed by Zou, Hastie & Tibshirani (2006) are still unknown. This paper aims to address this critical gap. We firs…

    Submitted 27 April, 2023; v1 submitted 29 December, 2022; originally announced December 2022.

    Comments: 60 pages

  32. arXiv:2212.11481  [pdf, other]

    stat.ML cs.LG

    A Mathematical Framework for Learning Probability Distributions

    Authors: Hongkang Yang

    Abstract: The modeling of probability distributions, specifically generative modeling and density estimation, has become an immensely popular subject in recent years by virtue of its outstanding performance on sophisticated data such as images and texts. Nevertheless, a theoretical understanding of its success is still incomplete. One mystery is the paradox between memorization and generalization: In theory…

    Submitted 28 December, 2022; v1 submitted 21 December, 2022; originally announced December 2022.

    Comments: fixed typos

    MSC Class: 68T07; 62G05; 60-08

    Journal ref: Journal of Machine Learning 1 (2022) 373-431

  33. arXiv:2211.14605  [pdf, other]

    cs.LG cs.CV stat.ML

    Looking at the posterior: accuracy and uncertainty of neural-network predictions

    Authors: H. Linander, O. Balabanov, H. Yang, B. Mehlig

    Abstract: Bayesian inference can quantify uncertainty in the predictions of neural networks using posterior distributions for model parameters and network output. By looking at these posterior distributions, one can separate the origin of uncertainty into aleatoric and epistemic contributions. One goal of uncertainty quantification is to inform on prediction accuracy. Here we show that prediction accuracy d…

    Submitted 22 November, 2023; v1 submitted 26 November, 2022; originally announced November 2022.

    Comments: 26 pages, 10 figures, 5 tables

    Journal ref: Machine Learning: Science and Technology 4 (2023) 045032

  34. arXiv:2210.08668  [pdf]

    cs.LG stat.ML

    Temporal-Spatial dependencies ENhanced deep learning model (TSEN) for household leverage series forecasting

    Authors: Hu Yang, Yi Huang, Haijun Wang, Yu Chen

    Abstract: Analyzing both temporal and spatial patterns for an accurate forecasting model for financial time series forecasting is a challenge due to the complex nature of temporal-spatial dynamics: time series from different locations often have distinct patterns; and for the same time series, patterns may vary as time goes by. Inspired by the successful applications of deep learning, we propose a new model…

    Submitted 16 October, 2022; originally announced October 2022.

  35. arXiv:2210.05672  [pdf, other]

    q-bio.NC stat.ME

    Interpretable AI for relating brain structural and functional connectomes

    Authors: Haoming Yang, Steven Winter, Zhengwu Zhang, David Dunson

    Abstract: One of the central problems in neuroscience is understanding how brain structure relates to function. Naively one can relate the direct connections of white matter fiber tracts between brain regions of interest (ROIs) to the increased co-activation in the same pair of ROIs, but the link between structural and functional connectomes (SCs and FCs) has proven to be much more complex. To learn a reali…

    Submitted 29 August, 2023; v1 submitted 10 October, 2022; originally announced October 2022.

  36. arXiv:2209.08858  [pdf, other]

    cs.AI cs.LG stat.ML

    Rethinking Knowledge Graph Evaluation Under the Open-World Assumption

    Authors: Haotong Yang, Zhouchen Lin, Muhan Zhang

    Abstract: Most knowledge graphs (KGs) are incomplete, which motivates one important research topic on automatically complementing knowledge graphs. However, evaluation of knowledge graph completion (KGC) models often ignores the incompleteness -- facts in the test set are ranked against all unknown triplets which may contain a large number of missing facts not included in the KG yet. Treating all unknown tr…

    Submitted 19 September, 2022; originally announced September 2022.

    Comments: Accepted at NeurIPS 2022

  37. arXiv:2209.05998  [pdf, other]

    econ.EM stat.AP stat.ML

    Interpreting and predicting the economy flows: A time-varying parameter global vector autoregressive integrated the machine learning model

    Authors: Yukang Jiang, Xueqin Wang, Zhixi Xiong, Haisheng Yang, Ting Tian

    Abstract: The paper proposes a time-varying parameter global vector autoregressive (TVP-GVAR) framework for predicting and analysing developed region economic variables. We want to provide an easily accessible approach for the economy application settings, where a variety of machine learning models can be incorporated for out-of-sample prediction. The LASSO-type technique for numerically efficient model sel…

    Submitted 31 July, 2022; originally announced September 2022.

  38. arXiv:2208.09819  [pdf, other]

    stat.ML cs.LG

    Robust Tests in Online Decision-Making

    Authors: Gi-Soo Kim, Hyun-Joon Yang, Jane P. Kim

    Abstract: Bandit algorithms are widely used in sequential decision problems to maximize the cumulative reward. One potential application is mobile health, where the goal is to promote the user's health through personalized interventions based on user specific information acquired through wearable devices. Important considerations include the type of, and frequency with which data is collected (e.g. GPS, or…

    Submitted 21 August, 2022; originally announced August 2022.

    Comments: 17 pages, 1 figure, supplementary material for "Robust Tests in Online Decision-Making" published in Proceedings of the AAAI Conference on Artificial Intelligence (2022)

  39. arXiv:2207.03935  [pdf, other]

    stat.ML cs.LG

    ControlBurn: Nonlinear Feature Selection with Sparse Tree Ensembles

    Authors: Brian Liu, Miaolan Xie, Haoyue Yang, Madeleine Udell

    Abstract: ControlBurn is a Python package to construct feature-sparse tree ensembles that support nonlinear feature selection and interpretable machine learning. The algorithms in this package first build large tree ensembles that prioritize basis functions with few features and then select a feature-sparse subset of these basis functions using a weighted lasso optimization criterion. The package includes v…

    Submitted 8 July, 2022; originally announced July 2022.

    Comments: 22 pages

  40. arXiv:2206.07766  [pdf, other]

    cs.LG stat.ML

    Pareto Invariant Risk Minimization: Towards Mitigating the Optimization Dilemma in Out-of-Distribution Generalization

    Authors: Yongqiang Chen, Kaiwen Zhou, Yatao Bian, Binghui Xie, Bingzhe Wu, Yonggang Zhang, Kaili Ma, Han Yang, Peilin Zhao, Bo Han, James Cheng

    Abstract: Recently, there has been a growing surge of interest in enabling machine learning systems to generalize well to Out-of-Distribution (OOD) data. Most efforts are devoted to advancing optimization objectives that regularize models to capture the underlying invariance; however, there often are compromises in the optimization process of these OOD objectives: i) Many OOD objectives have to be relaxed a…

    Submitted 2 March, 2023; v1 submitted 15 June, 2022; originally announced June 2022.

    Comments: ICLR 2023, 50 pages, 58 figures

  41. arXiv:2205.09459  [pdf, other]

    cs.LG stat.ML

    Neural Network Architecture Beyond Width and Depth

    Authors: Zuowei Shen, Haizhao Yang, Shijun Zhang

    Abstract: This paper proposes a new neural network architecture by introducing an additional dimension called height beyond width and depth. Neural network architectures with height, width, and depth as hyper-parameters are called three-dimensional architectures. It is shown that neural networks with three-dimensional architectures are significantly more expressive than the ones with two-dimensional archite…

    Submitted 14 January, 2023; v1 submitted 19 May, 2022; originally announced May 2022.

    Journal ref: Advances in Neural Information Processing Systems, 35:5669--5681, 2022

  42. arXiv:2205.08187  [pdf, other]

    stat.ML cs.LG math.PR math.ST

    Deep neural networks with dependent weights: Gaussian Process mixture limit, heavy tails, sparsity and compressibility

    Authors: Hoil Lee, Fadhel Ayed, Paul Jung, Juho Lee, Hongseok Yang, François Caron

    Abstract: This article studies the infinite-width limit of deep feedforward neural networks whose weights are dependent, and modelled via a mixture of Gaussian distributions. Each hidden node of the network is assigned a nonnegative random variable that controls the variance of the outgoing weights of that node. We make minimal assumptions on these per-node random variables: they are iid and their sum, in e…

    Submitted 11 September, 2023; v1 submitted 17 May, 2022; originally announced May 2022.

    Comments: 96 pages, 15 figures, 9 tables

    MSC Class: 68T07 (Primary); 62M45; 60F99 (Secondary)

  43. arXiv:2202.10670  [pdf, other]

    stat.ML cs.LG

    From Optimization Dynamics to Generalization Bounds via Łojasiewicz Gradient Inequality

    Authors: Fusheng Liu, Haizhao Yang, Soufiane Hayou, Qianxiao Li

    Abstract: Optimization and generalization are two essential aspects of statistical machine learning. In this paper, we propose a framework to connect optimization with generalization by analyzing the generalization error based on the optimization trajectory under the gradient flow algorithm. The key ingredient of this framework is the Uniform-LGI, a property that is generally satisfied when training machine…

    Submitted 12 October, 2022; v1 submitted 21 February, 2022; originally announced February 2022.

    Journal ref: Transactions on Machine Learning Research 2022

  44. arXiv:2202.08057  [pdf, other]

    cs.LG cs.CR stat.ML

    Understanding and Improving Graph Injection Attack by Promoting Unnoticeability

    Authors: Yongqiang Chen, Han Yang, Yonggang Zhang, Kaili Ma, Tongliang Liu, Bo Han, James Cheng

    Abstract: Recently Graph Injection Attack (GIA) emerges as a practical attack scenario on Graph Neural Networks (GNNs), where the adversary can merely inject few malicious nodes instead of modifying existing nodes or edges, i.e., Graph Modification Attack (GMA). Although GIA has achieved promising results, little is known about why it is successful and whether there is any pitfall behind the success. To und…

    Submitted 5 April, 2022; v1 submitted 16 February, 2022; originally announced February 2022.

    Comments: ICLR2022, 42 pages, 22 figures

  45. arXiv:2201.10745  [pdf, other]

    stat.CO stat.ME

    Control Variate Polynomial Chaos: Optimal Fusion of Sampling and Surrogates for Multifidelity Uncertainty Quantification

    Authors: Hang Yang, Yuji Fujii, K. W. Wang, Alex A. Gorodetsky

    Abstract: We present a hybrid sampling-surrogate approach for reducing the computational expense of uncertainty quantification in nonlinear dynamical systems. Our motivation is to enable rapid uncertainty quantification in complex mechanical systems such as automotive propulsion systems. Our approach is to build upon ideas from multifidelity uncertainty quantification to leverage the benefits of both sampli…

    Submitted 25 January, 2022; originally announced January 2022.

    MSC Class: 62-08; 65Pxx; 65D15; 65C05; 41A10

  46. arXiv:2201.06461  [pdf, other]

    gr-qc astro-ph.HE cs.LG stat.ML

    Using machine learning to parametrize postmerger signals from binary neutron stars

    Authors: Tim Whittaker, William E. East, Stephen R. Green, Luis Lehner, Huan Yang

    Abstract: There is growing interest in the detection and characterization of gravitational waves from postmerger oscillations of binary neutron stars. These signals contain information about the nature of the remnant and the high-density and out-of-equilibrium physics of the postmerger processes, which would complement any electromagnetic signal. However, the construction of binary neutron star postmerger w…

    Submitted 17 January, 2022; originally announced January 2022.

    Journal ref: Phys. Rev. D 105, 124021 (2022)

  47. arXiv:2201.00217  [pdf, other]

    stat.ML cs.LG

    Deep Nonparametric Estimation of Operators between Infinite Dimensional Spaces

    Authors: Hao Liu, Haizhao Yang, Minshuo Chen, Tuo Zhao, Wenjing Liao

    Abstract: Learning operators between infinitely dimensional spaces is an important learning task arising in wide applications in machine learning, imaging science, mathematical modeling and simulations, etc. This paper studies the nonparametric estimation of Lipschitz operators using deep neural networks. Non-asymptotic upper bounds are derived for the generalization error of the empirical risk minimizer ov…

    Submitted 1 January, 2022; originally announced January 2022.

  48. arXiv:2111.07964  [pdf, other]

    cs.LG stat.ML

    Deep Network Approximation in Terms of Intrinsic Parameters

    Authors: Zuowei Shen, Haizhao Yang, Shijun Zhang

    Abstract: One of the arguments to explain the success of deep learning is the powerful approximation capacity of deep neural networks. Such capacity is generally accompanied by the explosive growth of the number of parameters, which, in turn, leads to high computational costs. It is of great interest to ask whether we can achieve successful deep learning with a small number of learnable parameters adapting…

    Submitted 14 June, 2022; v1 submitted 15 November, 2021; originally announced November 2021.

    Journal ref: Proceedings of the 39th International Conference on Machine Learning, PMLR 162:19909-19934, 2022

  49. arXiv:2109.00531  [pdf, other]

    stat.ML cs.LG

    Under-bagging Nearest Neighbors for Imbalanced Classification

    Authors: Hanyuan Hang, Yuchao Cai, Hanfang Yang, Zhouchen Lin

    Abstract: In this paper, we propose an ensemble learning algorithm called \textit{under-bagging $k$-nearest neighbors} (\textit{under-bagging $k$-NN}) for imbalanced classification problems. On the theoretical side, by developing a new learning theory analysis, we show that with properly chosen parameters, i.e., the number of nearest neighbors $k$, the expected sub-sample size $s$, and the bagging rounds…

    Submitted 1 September, 2021; originally announced September 2021.

  50. arXiv:2107.13090  [pdf, other]

    math.OC cs.GT cs.LG stat.ML

    Policy Gradient Methods Find the Nash Equilibrium in N-player General-sum Linear-quadratic Games

    Authors: Ben Hambly, Renyuan Xu, Huining Yang

    Abstract: We consider a general-sum N-player linear-quadratic game with stochastic dynamics over a finite horizon and prove the global convergence of the natural policy gradient method to the Nash equilibrium. In order to prove the convergence of the method, we require a certain amount of noise in the system. We give a condition, essentially a lower bound on the covariance of the noise in terms of the model…

    Submitted 15 August, 2022; v1 submitted 27 July, 2021; originally announced July 2021.