-
Achieving Robust Data-driven Contextual Decision Making in a Data Augmentation Way
Authors:
Zhaoen Li,
Maoqi Liu,
Zhi-Hai Zhang
Abstract:
This paper focuses on the contextual optimization problem, in which a decision is subject to uncertain parameters and covariates with some predictive power on those parameters are available before the decision is made. More specifically, we focus on solving the Wasserstein-distance-based distributionally robust optimization (DRO) model for the problem, which maximizes the worst-case expected objective over an uncertainty set that includes all distributions close enough to a nominal distribution with respect to the Wasserstein distance. We develop a stochastic gradient descent algorithm based on the idea of data augmentation to solve the model efficiently. The algorithm iteratively a) draws a bootstrap sample from the nominal distribution; b) perturbs the sample adversarially; and c) updates the decision. Accordingly, the computational time of the algorithm is determined only by the number of iterations and the complexity of computing the gradient of a single sample. Besides solving the model efficiently, the proposed algorithm can cope with any nominal distribution and is therefore extendable to the online setting. We also prove that the algorithm converges to the optimal solution of the DRO model at a rate of $O(1/\sqrt{T})$, where $T$ is the number of bootstrapping iterations. Consequently, the performance guarantee of the algorithm is that of the DRO model plus $O(1/\sqrt{T})$. Through extensive numerical experiments, we demonstrate the superior performance of the proposed algorithm over several benchmarks.
Submitted 9 August, 2024; v1 submitted 8 August, 2024;
originally announced August 2024.
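The three-step loop named in the abstract above (bootstrap, adversarial perturbation, decision update) can be illustrated with a toy sketch. This is not the authors' algorithm: the squared loss, the penalty weight `lam` (a Lagrangian stand-in for the Wasserstein radius), the inner gradient-ascent perturbation, the minimization form of the problem, and every name in the code are assumptions made for illustration only.

```python
import random


def dro_sgd(data, lam=4.0, steps=2000, lr0=0.5, inner_steps=5, inner_lr=0.1, seed=0):
    """Toy penalized-DRO SGD for the loss (z - xi)^2 (illustrative assumption).

    Each iteration solves, approximately, the inner problem
        max_{xi'} (z - xi')^2 - lam * (xi' - xi)^2,
    which is concave in xi' when lam > 1.
    """
    rng = random.Random(seed)
    z = 0.0    # current decision
    avg = 0.0  # running average of the iterates
    for t in range(1, steps + 1):
        # (a) bootstrap: resample one point from the empirical (nominal) distribution
        xi = rng.choice(data)
        # (b) adversarial perturbation: a few gradient-ascent steps on the inner problem
        xi_adv = xi
        for _ in range(inner_steps):
            grad_xi = -2.0 * (z - xi_adv) - 2.0 * lam * (xi_adv - xi)
            xi_adv += inner_lr * grad_xi
        # (c) decision update: SGD step on the loss at the perturbed sample
        grad_z = 2.0 * (z - xi_adv)
        z -= lr0 / t ** 0.5 * grad_z
        avg += (z - avg) / t
    return avg


if __name__ == "__main__":
    print(dro_sgd([1.0, 2.0, 3.0, 4.0, 5.0]))
```

Each iteration touches a single resampled point, so in this sketch the cost per step does not depend on the size of `data`, which mirrors the complexity claim in the abstract. For this particular toy loss the robust and non-robust minimizers coincide at the sample mean, so the example shows the mechanics of the loop rather than the benefit of robustness.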
-
Rational Curves on Real Classical Groups
Authors:
Zijia Li,
Ke Ye
Abstract:
This paper is concerned with rational curves on real classical groups. Our contributions are three-fold: (i) We determine the structure of quadratic rational curves on real classical groups. As a consequence, we completely classify quadratic rational curves on $\mathrm{U}_n$, $\mathrm{O}_n(\mathbb{R})$, $\mathrm{O}_{n-1,1}(\mathbb{R})$ and $\mathrm{O}_{n-2,2}(\mathbb{R})$. (ii) We prove a decomposition theorem for rational curves on real classical groups, which can be regarded as a non-commutative generalization of the fundamental theorem of algebra and partial fraction decomposition. (iii) As an application of (i) and (ii), we generalize Kempe's Universality Theorem to rational curves on homogeneous spaces.
Submitted 8 August, 2024;
originally announced August 2024.
-
An AI-aided algorithm for multivariate polynomial reconstruction on Cartesian grids and the PLG finite difference method
Authors:
Qinghai Zhang,
Yuke Zhu,
Zhixuan Li
Abstract:
Polynomial reconstruction on Cartesian grids is a key problem in developing finite difference methods as well as in many other engineering applications, yet it is still an open problem how to construct for a finite subset $K$ of $\mathbb{Z}^{\textsf{D}}$ a lattice $\mathcal{T}\subset K$ so that multivariate polynomial interpolation on this lattice is unisolvent. In this work, we solve this open problem of poised lattice generation (PLG) through interdisciplinary research spanning approximation theory, abstract algebra, and artificial intelligence (AI). Specifically, we focus on the triangular lattices in approximation theory, study group actions of permutations upon triangular lattices, prove an isomorphism between the group of permutations and that of triangular lattices, and dynamically organize the AI state space of permutations so that a depth-first search of poised lattices has optimal efficiency. Based on this algorithm, we further develop the PLG finite difference method that retains the simplicity of Cartesian grids yet overcomes the disadvantage of legacy finite difference methods in handling irregular geometries. Results of various numerical tests demonstrate the effectiveness of our algorithm and the simplicity, efficiency, and fourth-order accuracy of the PLG finite difference method.
Submitted 7 August, 2024;
originally announced August 2024.
-
A complete characterization of split digraphs with a strong arc decomposition
Authors:
Jiangdong Ai,
Fankang He,
Zhaoxiang Li,
Zhongmei Qin,
Changxin Wang
Abstract:
A \textbf{strong arc decomposition} of a (multi-)digraph $D(V, A)$ is a partition of its arc set $A$ into two disjoint arc sets $A_1$ and $A_2$ such that both of the spanning subdigraphs $D(V, A_1)$ and $D(V, A_2)$ are strong. In this paper, we fully characterize all split digraphs that do not have a strong decomposition. This resolves two problems proposed by Bang-Jensen and Wang and contributes to a series of efforts aimed at addressing this problem for specific graph classes. This work continues the research on semicomplete composition [Bang-Jensen, Gutin and Yeo, J. Graph Theory, 2020]; on locally semicomplete digraphs [Bang-Jensen and Huang, J. Combin. Theory Ser. B, 2010]; on a type of tournaments [Bang-Jensen and Yeo, Combinatorica, 2004].
Submitted 5 August, 2024;
originally announced August 2024.
-
A Robust Compressed Push-Pull Method for Decentralized Nonconvex Optimization
Authors:
Yiwei Liao,
Zhuorui Li,
Shi Pu,
Tsung-Hui Chang
Abstract:
In the modern paradigm of multi-agent networks, communication has become one of the main bottlenecks for decentralized optimization, where a large number of agents are involved in minimizing the average of the local cost functions. In this paper, we propose a robust compressed push-pull algorithm (RCPP) that combines gradient tracking with communication compression. In particular, RCPP is robust under a much more general class of compression operators that allow both relative and absolute compression errors, in contrast to existing works, which can handle only one of them or assume convex problems. We show that RCPP enjoys a sublinear convergence rate for smooth and possibly nonconvex objective functions over general directed networks. Moreover, under the additional Polyak-Łojasiewicz condition, RCPP achieves a linear convergence rate. Numerical examples verify the theoretical findings and demonstrate the efficiency, flexibility, and robustness of the proposed algorithm.
Submitted 3 August, 2024;
originally announced August 2024.
-
Universal Approximation of Dynamical Systems by Semi-Autonomous Neural ODEs and Applications
Authors:
Ziqian Li,
Kang Liu,
Lorenzo Liverani,
Enrique Zuazua
Abstract:
In this paper, we introduce semi-autonomous neural ordinary differential equations (SA-NODEs), a variation of the vanilla NODEs, employing fewer parameters. We investigate the universal approximation properties of SA-NODEs for dynamical systems from both a theoretical and a numerical perspective. Within the assumption of a finite-time horizon, under general hypotheses we establish an asymptotic approximation result, demonstrating that the error vanishes as the number of parameters goes to infinity. Under additional regularity assumptions, we further specify this convergence rate in relation to the number of parameters, utilizing quantitative approximation results in the Barron space. Based on the previous result, we prove an approximation rate for transport equations by their neural counterparts. Our numerical experiments validate the effectiveness of SA-NODEs in capturing the dynamics of various ODE systems and transport equations. Additionally, we compare SA-NODEs with vanilla NODEs, highlighting the superior performance and reduced complexity of our approach.
Submitted 25 July, 2024; v1 submitted 24 July, 2024;
originally announced July 2024.
-
Uniform K-stability of $G$-varieties of complexity 1
Authors:
Yan Li,
Zhenye Li
Abstract:
Let ${\rm k}$ be an algebraically closed field of characteristic 0 and $G$ a connected, reductive group over it. Let $X$ be a projective $G$-variety of complexity 1. We classify $G$-equivariant normal test configurations of $X$ with integral central fibre via the combinatorial data. We also give a formula for anti-canonical divisors on $X$. Based on this formula, when $X$ is $\mathbb Q$-Fano, we give an expression for the Futaki invariant, and derive a criterion for uniform K-stability in terms of the combinatorial data.
Submitted 20 July, 2024;
originally announced July 2024.
-
Independent GUE minor processes of perfect matchings on rail-yard graphs
Authors:
Zhongyang Li
Abstract:
We study perfect matchings on the rail-yard graphs in which the right boundary condition is given by the empty partition and the left boundary can be divided into finitely many alternating line segments where all the vertices along each line segment are either removed or retained. When the edge weights satisfy certain conditions, we show that the distributions of the locations of certain types of dimers near the right boundary converge to the spectra of independent GUE minor processes. The proof is based on a new quantitative analysis of a formula to compute Schur functions at general points discovered in \cite{ZL18}.
Submitted 19 July, 2024;
originally announced July 2024.
-
Non-intrusive Least-Squares Functional A Posteriori Error Estimator: Linear and Nonlinear Problems with Plain Convergence
Authors:
Ziyan Li,
Shun Zhang
Abstract:
The a posteriori error estimator using the least-squares functional can be used for adaptive mesh refinement and error control even if the numerical approximations are not obtained from the corresponding least-squares method. This suggests the development of a versatile non-intrusive a posteriori error estimator. In this paper, we present a systematic approach for applying the least-squares functional error estimator to linear and nonlinear problems that are not solved by least-squares finite element methods. For the case of an elliptic PDE solved by the standard conforming finite element method, we minimize the least-squares functional with the conforming approximation inserted to recover the other physically meaningful variable. By combining the numerical approximation from the original method with the auxiliary recovery approximation, we construct the least-squares functional a posteriori error estimator. Furthermore, we introduce a new interpretation that views the non-intrusive least-squares functional error estimator as an estimator for the combined solve-recover process. This simplifies the reliability and efficiency analysis. We extend the idea to a model nonlinear problem. Plain convergence results are proved for adaptive algorithms of the general second order elliptic equation and a model nonlinear problem with the non-intrusive least-squares functional a posteriori error estimators.
Submitted 18 July, 2024;
originally announced July 2024.
-
Graph Linear Canonical Transform: Definition, Vertex-Frequency Analysis and Filter Design
Authors:
Jian Yi Chen,
Bing Zhao Li
Abstract:
This paper proposes a graph linear canonical transform (GLCT) by decomposing the linear canonical parameter matrix into fractional Fourier transform, scale transform, and chirp modulation for graph signal processing. The GLCT enables adjustable smoothing modes, enhancing alignment with graph signals. Leveraging traditional fractional domain time-frequency analysis, we investigate vertex-frequency analysis in the graph linear canonical domain, aiming to overcome limitations in capturing local information. Filter design methods, including optimal design and learning with stochastic gradient descent, are analyzed and applied to image classification tasks. The proposed GLCT and vertex-frequency analysis present innovative approaches to signal processing challenges, with potential applications in various fields.
Submitted 2 July, 2024;
originally announced July 2024.
-
On the mean $Ψ$-intermediate dimensions
Authors:
Yu Liu,
Bilel Selmi,
Zhiming Li
Abstract:
In this paper, we introduce the mean $Ψ$-intermediate dimension which has a value between the mean Hausdorff dimension and the metric mean dimension, and prove the equivalent definition of the mean Hausdorff dimension and the metric mean dimension. Furthermore, we delve into the core properties of the mean $Ψ$-intermediate dimensions. Additionally, we establish the mass distribution principle, a Frostman-type lemma, Hölder distortion, and derive the corresponding product formula. Finally, we provide illustrative examples of the mean $Ψ$-intermediate dimension, demonstrating its practical applications.
Submitted 13 July, 2024;
originally announced July 2024.
-
A restriction estimate for a hyperbolic paraboloid in $\mathbb{R}^5$
Authors:
Zhuoran Li
Abstract:
In this paper, we prove a restriction estimate for a hyperbolic paraboloid in $\mathbb{R}^5$ by the polynomial partitioning method.
Submitted 23 July, 2024; v1 submitted 11 July, 2024;
originally announced July 2024.
-
Polynomial tail solutions of the non-cutoff Boltzmann equation near local Maxwellians
Authors:
Renjun Duan,
Zongguang Li
Abstract:
This paper aims to incorporate Caflisch's decomposition into the macro-micro decomposition in Boltzmann theory so that the microscopic component is allowed to exhibit only a polynomial tail at large velocities. In particular, we treat the Cauchy problem for the non-cutoff Boltzmann equation under the compressible Euler scaling in the three-dimensional whole space. Up to a finite time we construct the Boltzmann solution around a local Maxwellian corresponding to small-amplitude classical solutions of the full compressible Euler system around constant states. We design a new energy functional which can capture the convergence rate in the small Knudsen number $\varepsilon$ and allow the microscopic part of solutions to decay polynomially at large velocities. Moreover, the energy norm of perturbations can be of the order $\varepsilon^{1/2}$, which the usual method of Hilbert expansion fails to obtain. As a byproduct of the proof, our estimates immediately yield a global-in-time existence result when the Euler solutions are taken to be constant states.
Submitted 11 July, 2024;
originally announced July 2024.
-
Computational Graph Representation of Equations System Constructors in Hierarchical Circuit Simulation
Authors:
Zichao Long,
Lin Li,
Lei Han,
Xianglong Meng,
Chongjun Ding,
Ruiyan Li,
Wu Jiang,
Fuchen Ding,
Jiaqing Yue,
Zhichao Li,
Yisheng Hu,
Ding Li,
Heng Liao
Abstract:
Equations system constructors of hierarchical circuits play a central role in device modeling, nonlinear equations solving, and circuit design automation. However, existing constructors present limitations in applications to different extents. For example, the costs of developing and reusing device models -- especially coarse-grained equivalent models of circuit modules -- remain high while parameter sensitivity analysis is complex and inefficient. Inspired by differentiable programming and leveraging the ecosystem benefits of open-source software, we propose an equations system constructor using the computational graph representation, along with its JSON format netlist, to address these limitations. This representation allows for runtime dependencies between signals and subcircuit/device parameters. The proposed method streamlines the model development process and facilitates end-to-end computation of gradients of equations remainders with respect to parameters. This paper discusses in detail the overarching concept of hierarchical subcircuit/device decomposition and nested invocation by drawing parallels to functions in programming languages, and introduces rules for parameters passing and gradient propagation across hierarchical circuit modules. The presented numerical examples, including (1) an uncoupled CMOS model representation using "equivalent circuit decomposition+dynamic parameters" and (2) operational amplifier (OpAmp) auto device sizing, have demonstrated that the proposed method supports circuit simulation and design and particularly subcircuit modeling with improved efficiency, simplicity, and decoupling compared to existing techniques.
Submitted 4 July, 2024;
originally announced July 2024.
-
On p-torsions of geometric Brauer groups
Authors:
Zhenghui Li,
Yanshuai Qin
Abstract:
Let $X$ be a smooth projective integral variety over a finitely generated field $k$ of characteristic $p>0$. We show that the finiteness of the exponent of the $p$-primary part of $\mathrm{Br}(X_{k^s})^{G_k}$ is equivalent to the Tate conjecture for divisors, generalizing D'Addezio's theorem for abelian varieties to arbitrary smooth projective varieties. As a result, we show that the cokernel of $\mathrm{Br}_{\mathrm{nr}}(K(X)) \rightarrow \mathrm{Br}(X_{k^s})^{G_k}$ is of finite exponent and complete the $p$-primary part of the generalization of Artin-Grothendieck's theorem to higher relative dimensions.
Submitted 27 June, 2024;
originally announced June 2024.
-
Algebraic Connectivity Control and Maintenance in Multi-Agent Networks under Attack
Authors:
Wenjie Zhao,
Diego Deplano,
Zhiwu Li,
Alessandro Giua,
Mauro Franceschelli
Abstract:
This paper studies the problem of increasing the connectivity of an ad-hoc peer-to-peer network subject to cyber-attacks targeting the agents in the network. The adopted strategy involves the design of local interaction rules for the agents to locally modify the graph topology by adding and removing links with neighbors. Two distributed protocols are presented to boost the algebraic connectivity of the network graph beyond $k-2\sqrt{k-1}$ where $k\in \mathbb{N}$ is a free design parameter; these two protocols are achieved through the distributed construction of random (approximate) regular graphs. One protocol leverages coordinated actions between pairs of neighboring agents and is mathematically proven to converge to the desired graph topology. The other protocol relies solely on the uncoordinated actions of individual agents and it is validated by a spectral analysis through Monte-Carlo simulations. Numerical simulations offer a comparative analysis with other state-of-the-art algorithms, showing the ability of both proposed protocols to maintain high levels of connectivity despite attacks carried out with full knowledge of the network structure, and highlighting their superior performance.
Submitted 26 June, 2024;
originally announced June 2024.
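The threshold $k-2\sqrt{k-1}$ in the abstract above has a short spectral reading, sketched here under the assumption that the target graph is exactly $k$-regular and connected. The symbols $L$, $A$, $\lambda_2$ and $\mu_2$ are notation introduced for this note and are not taken from the paper.

```latex
L = kI - A
\;\Longrightarrow\;
\lambda_2(L) = k - \mu_2(A),
\qquad
\mu_2(A) \le 2\sqrt{k-1}
\;\Longrightarrow\;
\lambda_2(L) \ge k - 2\sqrt{k-1}.
```

Here $L$ is the graph Laplacian, $A$ the adjacency matrix, $\lambda_2(L)$ the algebraic connectivity, and $\mu_2(A)$ the second-largest adjacency eigenvalue. The bound $\mu_2(A)\le 2\sqrt{k-1}$ holds for Ramanujan graphs, and random $k$-regular graphs are known to satisfy it up to a small additive error with high probability, which is presumably why the protocols target random (approximately) regular graphs.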
-
Moment-based parameter inference with error guarantees for stochastic reaction networks
Authors:
Zekai Li,
Mauricio Barahona,
Philipp Thomas
Abstract:
Inferring parameters of models of biochemical kinetics from single-cell data remains challenging because of the uncertainty arising from the intractability of the likelihood function of stochastic reaction networks. Such uncertainty falls beyond current error quantification measures, which focus on the effects of finite sample size and identifiability but lack theoretical guarantees when likelihood approximations are needed. Here, we propose an inference method for stochastic reaction networks with nonlinear and rational propensities at steady state that provides bounds on the parameters via convex optimisation over sets constrained by moment equations and moment matrices. Our approach takes observations from the stochastic reaction network and forms moment intervals, which are then used to constrain parameters through convex sets. The bounds on the parameters contain the true parameters under the condition that the moment intervals contain the true stationary moments, thus providing uncertainty quantification and error guarantees. Our approach does not need to predict moments and distributions for given parameters (i.e., it avoids solving or simulating the forward problem), and hence circumvents intractable likelihood computations or computationally expensive simulations. We demonstrate its use for uncertainty quantification, data integration and prediction of latent species statistics through synthetic data from common nonlinear biochemical models including the Schlögl model, the toggle switch and post-transcriptional regulation.
Submitted 25 June, 2024;
originally announced June 2024.
-
A refined uniqueness result of Leray's problem in an infinite-long pipe with the Navier-slip boundary condition
Authors:
Zijin Li,
Ning Liu,
Taoran Zhou
Abstract:
In the recent paper \cite{LPY2024SCM}, the authors proved the existence, uniqueness, regularity and exponential decay property of the solution to the generalized Leray's problem in a distorted infinite-long pipe with the Navier-slip boundary condition, where the friction ratio $α>0$ and the flux $Φ$ is no bigger than a critical flux $Φ_0=\frac{Cα}{1+α}$.
In this paper, we consider the generalized Leray's problem with the Navier-slip boundary condition in a straight pipe $\mathcal{D}=Σ\times\mathbb{R}$. We show that if the flux of the solution is no larger than a critical value that is independent of $α$, the solution to the problem must be the Poiseuille flow with the given flux. This smallness condition on the flux is weaker than that of the previous result, particularly when the friction ratio is small (close to the total slip Navier boundary condition).
Our proof relies primarily on a refined gradient estimate of the Poiseuille flow with the Navier-slip boundary condition. Additionally, we provide an exact lower bound of the critical flux $Φ_0$ for flow in an infinite unit cylindrical pipe. Some discussions on the essential differences between 2D and 3D problems are also given.
Submitted 25 June, 2024;
originally announced June 2024.
-
Towards Dynamic Resource Allocation and Client Scheduling in Hierarchical Federated Learning: A Two-Phase Deep Reinforcement Learning Approach
Authors:
Xiaojing Chen,
Zhenyuan Li,
Wei Ni,
Xin Wang,
Shunqing Zhang,
Yanzan Sun,
Shugong Xu,
Qingqi Pei
Abstract:
Federated learning (FL) is a viable technique to train a shared machine learning model without sharing data. Hierarchical FL (HFL) systems have yet to be studied regarding their multiple levels of energy, computation, communication, and client scheduling, especially when it comes to clients relying on energy harvesting to power their operations. This paper presents a new two-phase deep deterministic policy gradient (DDPG) framework, referred to as ``TP-DDPG'', to balance online the learning delay and model accuracy of an FL process in an energy harvesting-powered HFL system. The key idea is that we divide optimization decisions into two groups, and employ DDPG to learn one group in the first phase, while interpreting the other group as part of the environment to provide rewards for training the DDPG in the second phase. Specifically, the DDPG learns the selection of participating clients, their CPU configurations, and their transmission powers. A new straggler-aware client association and bandwidth allocation (SCABA) algorithm efficiently optimizes the other decisions and evaluates the reward for the DDPG. Experiments demonstrate that, with a substantially reduced number of learnable parameters, the TP-DDPG can quickly converge to effective policies that can shorten the training time of HFL by 39.4% compared to its benchmarks, when the required test accuracy of HFL is 0.9.
Submitted 21 June, 2024;
originally announced June 2024.
-
Braid Group Action and Quantum Queer Superalgebra
Authors:
Jianmin Chen,
Zhenhua Li,
Hongying Zhu
Abstract:
In this paper, we present explicit actions of the braid group on the universal enveloping superalgebra ${\boldsymbol U}(\mathfrak{q}_n)$ and the quantum queer superalgebra ${\boldsymbol U}_{\!{v}}(\mathfrak{q}_{n})$. Then we provide a new definition of root vectors and explicit expressions for them. With these procedures, we obtain a PBW-type basis consisting of products of root vectors.
Submitted 17 June, 2024; v1 submitted 15 June, 2024;
originally announced June 2024.
-
Perfect Matchings and Essential Spanning Forests in Hyperbolic Double Circle Packings
Authors:
Zhongyang Li
Abstract:
We investigate perfect matchings and essential spanning forests in planar hyperbolic graphs via circle packings.
We prove the existence of nonconstant harmonic Dirichlet functions that vanish in a closed set of the boundary, generalizing a result in \cite{bsinv}. We then prove the existence of extremal infinite volume measures for uniform spanning forests with partially wired boundary conditions and partially free boundary conditions, generalizing a result in \cite{BLPS01}.
Using the double circle packing for a pair of dual graphs, we relate the inverse of the weighted adjacency matrix to the difference of Green's functions plus an explicit harmonic Dirichlet function. This gives explicit formulas for the probabilities of any cylindrical events.
We prove that the infinite-volume Gibbs measure obtained from approximations by finite domains with exactly two convex white corners converging to two distinct points along the boundary is extremal, yet not invariant with respect to a finite-orbit subgroup of the automorphism group. We then show that under this measure, a.s.~there are no infinite contours in the symmetric difference of two i.i.d.~random perfect matchings.
As an application, we prove that the variance of the height difference of two i.i.d.~uniformly weighted perfect matchings under the boundary condition above on a transitive nonamenable planar graph is always finite; in contrast to the 2D uniformly weighted dimer model on a transitive amenable planar graph as proved in \cite{RK01,KOS06}, where the variance of height difference grows in the order of $\log n$, with $n$ being the graph distance to the boundary. This also implies that a.s.~each point is surrounded by finitely many cycles in the symmetric difference of two i.i.d.~perfect matchings, again in contrast to the 2D Euclidean case.
Submitted 29 June, 2024; v1 submitted 12 June, 2024;
originally announced June 2024.
-
Uplink resource allocation optimization for user-centric cell-free MIMO networks
Authors:
Zehua Li,
Raviraj Adve
Abstract:
We examine the problem of optimizing resource allocation in the uplink for a user-centric, cell-free, multi-input multi-output network. We start by modeling and developing resource allocation algorithms for two standard network operation modes. The centralized mode provides high data rates but suffers from multiple issues, including scalability. On the other hand, the distributed mode has the opposite problem: relatively low rates, but is scalable. To address these challenges, we combine the strengths of the two standard modes, creating a new semi-distributed operation mode. To avoid the need for information exchange between access points, we introduce a new quality of service metric to decentralize the resource allocation algorithms. Our results show that we can eliminate the need for information exchange with a relatively small penalty on data rates.
Submitted 8 June, 2024;
originally announced June 2024.
-
Whitney Stratification of Algebraic Boundaries of Convex Semi-algebraic Sets
Authors:
Zihao Dai,
Zijia Li,
Zhi-Hong Yang,
Lihong Zhi
Abstract:
Algebraic boundaries of convex semi-algebraic sets are closely related to polynomial optimization problems. Building upon Rainer Sinn's work, we refine the stratification of iterated singular loci to a Whitney (a) stratification, which gives a list of candidates of varieties whose dual is an irreducible component of the algebraic boundary of the dual convex body. We also present an algorithm based on Teissier's criterion to compute Whitney (a) stratifications, which employs conormal spaces and prime decomposition.
Submitted 7 June, 2024;
originally announced June 2024.
-
An iterative constraint energy minimizing generalized multiscale finite element method for contact problem
Authors:
Zishang Li,
Changqing Ye,
Eric T. Chung
Abstract:
This work presents an Iterative Constraint Energy Minimizing Generalized Multiscale Finite Element Method (ICEM-GMsFEM) for solving the contact problem with high contrast coefficients. The model problem can be characterized by a variational inequality, where we add a penalty term to convert this problem into a non-smooth and non-linear unconstrained minimization problem. The minimizer is characterized by the variational form of a mixed Dirichlet-Neumann-Robin boundary value problem. So we apply CEM-GMsFEM iteratively and introduce special boundary correctors along with multiscale spaces to achieve an optimal convergence rate. Numerical experiments are conducted for different highly heterogeneous permeability fields, validating the fast convergence of the CEM-GMsFEM iteration in handling the contact boundary and illustrating the stability of the proposed method with different sets of parameters. We also prove the fast convergence of the proposed iterative CEM-GMsFEM method and provide an error estimate of the multiscale solution under a mild assumption.
Submitted 5 June, 2024;
originally announced June 2024.
-
Entropy density and large deviation principles without upper semi-continuity of entropy
Authors:
Zhiqiang Li,
Xianghui Shi
Abstract:
Expanding Thurston maps were introduced by M. Bonk and D. Meyer with motivation from complex dynamics and Cannon's conjecture from geometric group theory via Sullivan's dictionary. In this paper, we show that the entropy map of an expanding Thurston map is upper semi-continuous if and only if the map has no periodic critical points. For all expanding Thurston maps, even in the presence of periodic critical points, we show that ergodic measures are entropy-dense and establish level-2 large deviation principles for the distribution of Birkhoff averages, periodic points, and iterated preimages. It follows that iterated preimages and periodic points are equidistributed with respect to the unique equilibrium state for an expanding Thurston map and a potential that is Hölder continuous with respect to a visual metric on $S^2$. In particular, our results answer two questions in [Li15].
The main technical tools in this paper are called subsystems of expanding Thurston maps, inspired by a translation of the notion of subgroups via Sullivan's dictionary.
Submitted 3 June, 2024;
originally announced June 2024.
-
A note on the Nearly Dispersability of Odd Toroidal Grids
Authors:
Xiaoxiang Yu,
Zeling Shao,
Zhiguo Li
Abstract:
The \emph{matching book thickness} $mbt(G)$ of $G$ is the minimum integer $m$ such that an $m$-page matching book embedding exists. A graph $G$ is called \emph{dispersable} if $mbt(G)=Δ(G)$, \emph{nearly dispersable} if $mbt(G)=Δ(G)+1$. Recently, the authors determined the nearly dispersability of odd toroidal grids $T_{s,t}$. In this note, we further present a brief proof for this result.
Submitted 1 June, 2024;
originally announced June 2024.
-
Comparison theorems for mean-field BSDEs whose generators depend on the law of the solution $(Y,Z)$
Authors:
Juan Li,
Zhanxin Li,
Chuanzhi Xing
Abstract:
For general mean-field backward stochastic differential equations (BSDEs) it is well known that we usually do not have the comparison theorem if the coefficients depend on the law of the $Z$-component of the solution process $(Y, Z)$. A natural question is whether general mean-field BSDEs whose coefficients depend on the law of $Z$ have the comparison theorem for some cases. In this paper we establish the comparison theorems for one-dimensional mean-field BSDEs whose coefficients also depend on the joint law of the solution process $(Y,Z)$. With the help of Malliavin calculus and a BMO martingale argument, we obtain two comparison theorems for different cases and a strong comparison result. In particular, in this framework, we compare not only the first component $Y$ of the solution $(Y,Z)$ for such mean-field BSDEs, but also the second component $Z$.
Submitted 31 May, 2024;
originally announced June 2024.
-
Improving Generalization and Convergence by Enhancing Implicit Regularization
Authors:
Mingze Wang,
Haotian He,
Jinbo Wang,
Zilin Wang,
Guanhua Huang,
Feiyu Xiong,
Zhiyu Li,
Weinan E,
Lei Wu
Abstract:
In this work, we propose an Implicit Regularization Enhancement (IRE) framework to accelerate the discovery of flat solutions in deep learning, thereby improving generalization and convergence. Specifically, IRE decouples the dynamics of flat and sharp directions, which boosts the sharpness reduction along flat directions while maintaining the training stability in sharp directions. We show that IRE can be practically incorporated with {\em generic base optimizers} without introducing significant computational overload. Experiments show that IRE consistently improves the generalization performance for image classification tasks across a variety of benchmark datasets (CIFAR-10/100, ImageNet) and models (ResNets and ViTs). Surprisingly, IRE also achieves a $2\times$ {\em speed-up} compared to AdamW in the pre-training of Llama models (of sizes ranging from 60M to 229M) on datasets including Wikitext-103, Minipile, and Openwebtext. Moreover, we provide theoretical guarantees, showing that IRE can substantially accelerate the convergence towards flat minima in Sharpness-aware Minimization (SAM).
Submitted 31 May, 2024;
originally announced May 2024.
-
Statistical Properties of Robust Satisficing
Authors:
Zhiyi Li,
Yunbei Xu,
Ruohan Zhan
Abstract:
The Robust Satisficing (RS) model is an emerging approach to robust optimization, offering streamlined procedures and robust generalization across various applications. However, the statistical theory of RS remains unexplored in the literature. This paper fills in the gap by comprehensively analyzing the theoretical properties of the RS model. Notably, the RS structure offers a more straightforward path to deriving statistical guarantees compared to the seminal Distributionally Robust Optimization (DRO), resulting in a richer set of results. In particular, we establish two-sided confidence intervals for the optimal loss without the need to solve a minimax optimization problem explicitly. We further provide finite-sample generalization error bounds for the RS optimizer. Importantly, our results extend to scenarios involving distribution shifts, where discrepancies exist between the sampling and target distributions. Our numerical experiments show that the RS model consistently outperforms the baseline empirical risk minimization in small-sample regimes and under distribution shifts. Furthermore, compared to the DRO model, the RS model exhibits lower sensitivity to hyperparameter tuning, highlighting its practicability for robustness considerations.
Submitted 30 May, 2024;
originally announced May 2024.
-
Efficient Optimal Control of Open Quantum Systems
Authors:
Wenhao He,
Tongyang Li,
Xiantao Li,
Zecheng Li,
Chunhao Wang,
Ke Wang
Abstract:
The optimal control problem for open quantum systems can be formulated as a time-dependent Lindbladian that is parameterized by a number of time-dependent control variables. Given an observable and an initial state, the goal is to tune the control variables so that the expected value of some observable with respect to the final state is maximized. In this paper, we present algorithms for solving this optimal control problem efficiently, i.e., having a poly-logarithmic dependency on the system dimension, which is exponentially faster than best-known classical algorithms. Our algorithms are hybrid, consisting of both quantum and classical components. The quantum procedure simulates time-dependent Lindblad evolution that drives the initial state to the final state, and it also provides access to the gradients of the objective function via quantum gradient estimation. The classical procedure uses the gradient information to update the control variables.
At the technical level, we provide the first (to the best of our knowledge) simulation algorithm for time-dependent Lindbladians with an $\ell_1$-norm dependence. As an alternative, we also present a simulation algorithm in the interaction picture to improve the algorithm for the cases where the time-independent component of a Lindbladian dominates the time-dependent part. On the classical side, we heavily adapt the state-of-the-art classical optimization analysis to interface with the quantum part of our algorithms. Both the quantum simulation techniques and the classical optimization analyses might be of independent interest.
Submitted 29 May, 2024;
originally announced May 2024.
-
A Converse to the Skoda $L^2$ Division Theorem
Authors:
Zhi Li,
Xiankui Meng,
Jiafu Ning,
Xiangyu Zhou
Abstract:
In this paper, we present a converse to a version of Skoda's $L^2$ division theorem by investigating the solvability of $\bar{\partial}$ equations of a specific type.
Submitted 30 July, 2024; v1 submitted 28 May, 2024;
originally announced May 2024.
-
Delay Performance Analysis of Delay-Deterministic Wireless Networks with Infinite and Finite Blocklength Transmission
Authors:
Hanxue Ding,
Shaoyi Xu,
Ziheng Xu,
Rongtao Xu,
Zonghui Li,
Junhui Zhao
Abstract:
In order to achieve stable and reliable industrial manufacturing, wireless networks must meet the stringent communication requirements of industrial automation, particularly the need for deterministic low latency communication. The limited wireless resources and time-varying fading channel contribute to the random fluctuations of transmission delay, making it challenging to realize delay-deterministic wireless networks. An open challenge in this context is to model delay determinism, also known as jitter, and analyze delay performance. In this paper, we model jitter as the variance of delay and conduct a comprehensive analysis of delay performance. Specifically, we consider two transmission regimes: infinite blocklength (IBL) and finite blocklength (FBL). In the IBL regime, the distribution of the transmission delay is analyzed, and the closed-form expressions for the average delay, jitter, and delay violation probability are derived. In the FBL regime, an upper bound on the transmission delay is first approximated at a high signal-to-noise ratio. Based on this upper bound, the delay distribution, delay violation probability, average delay, and jitter are derived. Finally, simulation results are provided to validate the accuracy of the analysis and derivations. Additionally, the impact of system parameters on jitter is analyzed to gain further insights.
Submitted 27 May, 2024;
originally announced May 2024.
-
A characterization of contact elements
Authors:
Zhenkun Li,
Shunyu Wan
Abstract:
We show that for a non-trivial element $c$ in $\widehat{HF}(-Y)$, there exists a tight contact structure $ξ$ on $Y$ whose contact invariant realizes $c$ if and only if there exists a non-trivial fibered knot $K$ such that $-τ_c(K)=g(K)$. Moreover, when such a fibered knot $K$ does exist, $ξ$ can be chosen to satisfy the extra condition that $K$ admits a Legendrian representative with Thurston-Bennequin number tb$(K)$ equal to $0$ in $(Y,ξ)$.
Submitted 7 July, 2024; v1 submitted 26 May, 2024;
originally announced May 2024.
-
On the dispersability of graph bundles over cycles
Authors:
Zeling Shao,
Xiaoxiang Yu,
Zhiguo Li
Abstract:
In this paper, the dispersability of the Cartesian graph bundle over two cycles is completely solved. We show the Cartesian graph bundle $G$ over two cycles is dispersable if $G$ is bipartite; otherwise, $G$ is nearly dispersable.
Submitted 25 May, 2024;
originally announced May 2024.
-
2-torsion in instanton Floer homology
Authors:
Zhenkun Li,
Fan Ye
Abstract:
This paper studies the existence of $2$-torsion in instanton Floer homology with $\mathbb{Z}$ coefficients for closed $3$-manifolds and singular knots. First, we show that the non-existence of $2$-torsion in the framed instanton Floer homology $I^\sharp(S_n^3(K);\mathbb{Z})$ of any nonzero integral $n$-surgery along a knot $K$ in $S^3$ would imply that $K$ is fibered. Also, we show that $I^\sharp(S_{r}^3(K);\mathbb{Z})$ for any nontrivial $K$ with $r=1,1/2,1/4$ always has $2$-torsion. These two results indicate that the existence of $2$-torsion is expected to be a generic phenomenon for Dehn surgeries along knots. Second, we show that for genus-one knots with nontrivial Alexander polynomials and for unknotting-number-one knots, the unreduced singular instanton knot homology $I^\sharp(S^3,K;\mathbb{Z})$ always has $2$-torsion. Finally, some crucial lemmas that help us demonstrate the existence of $2$-torsion are motivated by analogous results in Heegaard Floer theory, which may be of independent interest. In particular, we show that, for a knot $K$ in $S^3$, if there is a nonzero rational number $r$ such that the dual knot $\widetilde{K}_r$ inside $S^3_r(K)$ is Floer simple, then $S^3_r(K)$ must be an L-space and $K$ must be an L-space knot.
Submitted 25 May, 2024;
originally announced May 2024.
-
Classification of Lagrangian translators and Lagrangian self-expanders in $\mathbb{C}^{2}$
Authors:
Zhi Li,
Guoxin Wei
Abstract:
In this paper, we obtain several classification results for $2$-dimensional complete Lagrangian translators and Lagrangian self-expanders with constant squared norm $|\vec{H}|^{2}$ of the mean curvature vector in $\mathbb{C}^{2}$ by using a new Omori-Yau type maximum principle which was proved by Chen and Qiu \cite{CQ}. The same idea is also used to give a similar result for Lagrangian $ξ$-translators in $\mathbb{C}^{2}$.
Submitted 22 May, 2024;
originally announced May 2024.
-
Dynamical behavior and optimal control of a stochastic SAIRS epidemic model with two saturated incidences
Authors:
Xiaohui Zhang,
Zhiming Li,
Shenglong Chen,
Jikai Yang
Abstract:
Stochastic models are widely used to investigate the spread of epidemics in a complex environment. This paper extends a deterministic SAIRS epidemic model to a stochastic case with limited patient capacity and exposure. We first study the dynamical properties of the model under certain conditions, including persistence, extinction, and ergodicity. Then, we introduce vaccination and isolation into the model as control variables. The optimal control strategies are obtained based on the Pontryagin minimum principle. Finally, numerical simulations are given to illustrate our theoretical results.
Submitted 16 May, 2024; v1 submitted 16 May, 2024;
originally announced May 2024.
-
Uniqueness Problem for the Backward Differential Equation of a Continuous-State Branching Process
Authors:
Pei-Sen Li,
Zenghu Li
Abstract:
The distributional properties of a multi-dimensional continuous-state branching process are determined by its cumulant semigroup, which is defined by the backward differential equation. We provide a proof of the assertion of Ryzhov and Skorokhod (Theory Probab. Appl., 1970) on the uniqueness of the solutions to the equation, which is based on a characterization of the process as the pathwise unique solution to a system of stochastic equations.
Submitted 9 May, 2024;
originally announced May 2024.
-
General harmonic measures for distance-expanding dynamical systems
Authors:
Zhiqiang Li,
Ruicen Qiu
Abstract:
Partially motivated by the study of I. Binder, N. Makarov, and S. Smirnov [BMS03] on dimension spectra of polynomial Cantor sets, we initiate the investigation on some general harmonic measures, inspired by Sullivan's dictionary, for distance-expanding dynamical systems. Let $f\colon X\to X$ be an open distance-expanding map on a compact metric space $(X,ρ)$. A Gromov hyperbolic tile graph $Γ$ associated to the dynamical system $(X,f)$ is constructed following the ideas from M. Bonk, D. Meyer [BM17] and P. Haïssinsky, K. M. Pilgrim [HP09]. We consider a class of one-sided random walks associated with $(X,f)$ on $Γ$. They induce a Martin boundary of the tile graph, which may be different from the hyperbolic boundary. We show that the Martin boundary of such a random walk admits a surjection to $X$. We provide a class of examples to show that the surjection may not be a homeomorphism. Such random walks also induce measures on $X$ called harmonic measures. When $ρ$ is a visual metric, we establish an equality between the fractal dimension of the harmonic measure and the asymptotic quantities of the random walk.
Submitted 5 May, 2024;
originally announced May 2024.
-
Inexact Adaptive Cubic Regularization Algorithms on Riemannian Manifolds and Application
Authors:
Z. Y. Li,
X. M. Wang
Abstract:
The adaptive cubic regularization algorithm employing the inexact gradient and Hessian is proposed on general Riemannian manifolds, together with the iteration complexity to get an approximate second-order optimality under certain assumptions on accuracies about the inexact gradient and Hessian. The algorithm extends the inexact adaptive cubic regularization algorithm under true gradient in [Math. Program., 184(1-2): 35-70, 2020] to more general cases even in Euclidean settings. As an application, the algorithm is applied to solve the joint diagonalization problem on the Stiefel manifold. Numerical experiments illustrate that the algorithm performs better than the inexact trust-region algorithm in [Advances of the neural information processing systems, 31, 2018].
Submitted 4 May, 2024;
originally announced May 2024.
-
Omega Theorems for Logarithmic Derivatives of Zeta and L-functions Near the 1-line
Authors:
Zhonghua Li,
Shengbo Zhao
Abstract:
We establish an omega theorem for the logarithmic derivative of the Riemann zeta function near the 1-line by the resonance method. We show that the inequality $\left| ζ^{\prime}\left(σ_A+it\right)/ζ\left(σ_A+it\right) \right| \geqslant \left(\left(e^A-1\right)/A\right)\log_2 T + O\left(\log_2 T / \log_3 T\right)$ has a solution $t \in [T^β, T]$ for all sufficiently large $T,$ where $σ_A = 1 - A / \log_2 {T}.$ Furthermore, we give a conditional lower bound for the measure of the set of $t$ for which the logarithmic derivative of the Riemann zeta function is large. Moreover, similar results can be generalized to Dirichlet $L$-functions.
Submitted 26 April, 2024;
originally announced April 2024.
-
Finite-time blowup for Keller-Segel-Navier-Stokes system in three dimensions
Authors:
Zexing Li,
Tao Zhou
Abstract:
While finite-time blowup solutions have been studied in depth for the Keller-Segel equation, a fundamental model describing chemotaxis, the existence of finite-time blowup solutions to chemotaxis-fluid models remains largely unexplored. To fill this gap in the literature, we use a quantitative method to directly construct a smooth finite-time blowup solution for the Keller-Segel-Navier-Stokes system with buoyancy in 3D. The heart of the proof is to establish the non-radial finite-codimensional stability of an explicit self-similar blowup solution to the 3D Keller-Segel equation with the abstract semigroup tool from [Merle-Raphaël-Rodnianski-Szeftel, 2022], which partially generalizes the radial stability result [Glogić-Schörkhuber, 2024] to the non-radial setting. Additionally, we introduce a robust localization argument to find blowup solutions with non-negative density and finite mass.
Submitted 26 April, 2024;
originally announced April 2024.
-
The Grothendieck group of a triangulated category
Authors:
Xiao-Wu Chen,
Zhi-Wei Li,
Xiaojin Zhang,
Zhibing Zhao
Abstract:
We give a direct proof of the following known result: the Grothendieck group of a triangulated category with a silting subcategory is isomorphic to the split Grothendieck group of the silting subcategory. Moreover, we obtain its cluster-tilting analogue.
Submitted 31 July, 2024; v1 submitted 24 April, 2024;
originally announced April 2024.
-
Bifurcation for the Lotka-Volterra competition model
Authors:
Zaizheng Li,
Susanna Terracini
Abstract:
We analyze the bifurcation phenomenon for the following two-component competition system:
\begin{equation*}
\begin{cases}
-Δu_1=μu_1(1-u_1)-βαu_1u_2,& \text{in}\ B_1\subset \mathbb{R}^N,\\
-Δu_2=σu_2(1-u_2)-βγu_1u_2,& \text{in}\ B_1\subset \mathbb{R}^N,\\
\frac{\partial u_1}{\partial n}= \frac{\partial u_2}{\partial n} =0,&\text{on}\ \partial B_1,
\end{cases}
\end{equation*}
where $N\ge 2$, $α>γ>0$, $σ\ge μ>0$ and $β>\frac{σ}{γ}$. More precisely, treating $β$ as the bifurcation parameter, we first perform a local bifurcation analysis around the positive constant solutions, obtaining precise information about where bifurcation can occur, and determine the direction of bifurcation. As a byproduct, we establish the instability of the constant solution. Furthermore, we extend our exploration to a global bifurcation analysis.
Lastly, under the condition $σ=μ$, we demonstrate the limiting configuration on each bifurcation branch as the competition rate $β\rightarrow+\infty$.
Submitted 20 April, 2024;
originally announced April 2024.
-
Quantum Assisted Stochastic Economic Dispatch for Renewables Rich Power Systems
Authors:
Xutao Han,
Zhiyi Li,
Yue Xu
Abstract:
Considering widely dispersed uncertain renewable energy sources (RESs), scenario-based stochastic optimization is an effective method for the economic dispatch of renewables-rich power systems. However, on classical computers, generating the massive number of scenarios needed to simulate RES uncertainties with high accuracy is very time-consuming, and the resulting optimization problem is a high-dimensional, NP-hard mixed-integer program. To this end, we design a quantum-assisted scheme to accelerate the stochastic optimization for power system economic dispatch without losing accuracy. We first propose the unified quantum amplitude estimation to characterize RES uncertainties, thereby generating massive scenarios with a few qubits to reduce state variables. Then, strong Benders cuts corresponding to some specific scenarios are selected to control the solution scale of the Benders master problem in the iterative process, all of which are implemented by customized quantum approximate optimization algorithms. Finally, we perform numerical experiments on the modified IEEE 6-bus system to test the designed scheme.
Submitted 15 April, 2024;
originally announced April 2024.
-
A Novel State-Centric Necessary Condition for Time-Optimal Control of Controllable Linear Systems Based on Augmented Switching Laws
Authors:
Yunan Wang,
Chuxiong Hu,
Yujie Lin,
Zeyang Li,
Shize Lin,
Suqin He
Abstract:
Most existing necessary conditions for optimal control based on adjoining methods require both state information and costate information, yet the lack of costates for a given feasible trajectory in practice impedes the determination of optimality. This paper establishes a novel theoretical framework for time-optimal control of controllable linear systems, proposing the augmented switching law, which represents the input control and the feasibility in a compact form. Given a feasible trajectory, the disturbed trajectory under the constraints of the augmented switching law is guaranteed to be feasible, resulting in a novel state-centric necessary condition that does not depend on costate information. A first-order necessary condition is proposed, namely that the Jacobian matrix of the augmented switching law does not have full row rank, which also yields an approach to further optimizing a given feasible trajectory. The proposed necessary condition is applied to chain-of-integrators systems with full box constraints, yielding conclusions that are difficult to derive from traditional costate-based necessary conditions.
Submitted 13 April, 2024;
originally announced April 2024.
-
On testing mean of high dimensional compositional data
Authors:
Qianqian Jiang,
Wenbo Li,
Zeng Li
Abstract:
We investigate one/two-sample mean tests for high-dimensional compositional data when the number of variables is comparable with the sample size, as commonly encountered in microbiome research. Existing methods mainly focus on max-type test statistics, which are suitable for detecting sparse signals. In this paper, we instead introduce a novel approach using sum-type test statistics, which are capable of detecting weak but dense signals. By establishing the asymptotic independence between the max-type and sum-type test statistics, we further propose a combined max-sum type test to cover both cases. We derive the asymptotic null distributions and power functions for these test statistics. Simulation studies demonstrate the superiority of our max-sum type test statistics, which exhibit robust performance regardless of data sparsity.
Submitted 12 April, 2024;
originally announced April 2024.
-
Thermodynamic formalism for subsystems of expanding Thurston maps II
Authors:
Zhiqiang Li,
Xianghui Shi
Abstract:
Expanding Thurston maps were introduced by M. Bonk and D. Meyer with motivation from complex dynamics and Cannon's conjecture from geometric group theory via Sullivan's dictionary. In this paper, we study subsystems of expanding Thurston maps motivated via Sullivan's dictionary as analogs of some subgroups of Kleinian groups. We prove the uniqueness and various ergodic properties of the equilibrium states for strongly primitive subsystems and real-valued Hölder continuous potentials, and establish the equidistribution of preimages of subsystems with respect to the equilibrium states. Here, the sphere $S^{2}$ is equipped with a natural metric, called a visual metric, introduced by M. Bonk and D. Meyer. As a result, for strongly primitive subsystems of expanding Thurston maps without periodic critical points, we obtain a level-$2$ large deviation principle for Birkhoff averages and iterated preimages.
Submitted 10 April, 2024;
originally announced April 2024.
-
On the robustness of double-word addition algorithms
Authors:
Yuanyuan Yang,
XinYu Lyu,
Sida He,
Xiliang Lu,
Ji Qi,
Zhihao Li
Abstract:
We demonstrate that, even when there are moderate overlaps in the inputs of the sloppy or accurate double-word addition algorithms in the QD library, these algorithms still guarantee error bounds of $O(u^2(|a|+|b|))$ in faithful rounding. Furthermore, the accurate algorithm can achieve a relative error bound of $O(u^2)$ in the presence of moderate overlaps in the inputs when the rounding function is round-to-nearest. The relative error bound also holds in directed rounding, but certain additional conditions are required. Consequently, in double-word multiplication and addition operations, we can safely omit the normalization step of double-word multiplication and replace the accurate addition algorithm with the sloppy one. Numerical experiments confirm that this approach nearly doubles the performance of double-word multiplication and addition operations, with negligible precision costs. Moreover, in directed rounding mode, the signs of the errors of the two algorithms are consistent with the rounding direction, even in the presence of input overlap. This allows us to avoid changing the rounding mode in interval arithmetic. We also prove that the relative error bound of the sloppy addition algorithm exceeds $3u^2$ if and only if the input meets the condition of Sterbenz's Lemma when rounding to nearest. These findings suggest that the two addition algorithms are more robust than previously believed.
Submitted 10 April, 2024; v1 submitted 8 April, 2024;
originally announced April 2024.
-
Implicit Bias of AdamW: $\ell_\infty$ Norm Constrained Optimization
Authors:
Shuo Xie,
Zhiyuan Li
Abstract:
Adam with decoupled weight decay, also known as AdamW, is widely acclaimed for its superior performance in language modeling tasks, surpassing Adam with $\ell_2$ regularization in terms of generalization and optimization. However, this advantage is not theoretically well-understood. One challenge here is that though intuitively Adam with $\ell_2$ regularization optimizes the $\ell_2$ regularized loss, it is not clear if AdamW optimizes a specific objective. In this work, we make progress toward understanding the benefit of AdamW by showing that it implicitly performs constrained optimization. More concretely, we show in the full-batch setting, if AdamW converges with any non-increasing learning rate schedule whose partial sum diverges, it must converge to a KKT point of the original loss under the constraint that the $\ell_\infty$ norm of the parameter is bounded by the inverse of the weight decay factor. This result is built on the observation that Adam can be viewed as a smoothed version of SignGD, which is the normalized steepest descent with respect to $\ell_\infty$ norm, and a surprising connection between normalized steepest descent with weight decay and Frank-Wolfe.
Submitted 5 April, 2024;
originally announced April 2024.
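The $\ell_\infty$ constraint in the abstract above can be illustrated with a small Python sketch. It uses plain sign descent with decoupled weight decay, which is a simplification standing in for AdamW and is not the authors' construction or proof. The update is $x \leftarrow x - η(\mathrm{sign}(g) + λx)$; if $|x_i| \le 1/λ$ and $ηλ \le 1$, then $|x_i - η(\mathrm{sign}(g_i) + λx_i)| \le (1-ηλ)/λ + η = 1/λ$, so the iterates stay inside the box. The function name, the quadratic loss, and all numeric values are assumptions chosen for illustration.

```python
from typing import Callable, List


def sign(v: float) -> int:
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def sign_descent_with_decay(
    grad: Callable[[List[float]], List[float]],
    x0: List[float],
    lr: float,
    weight_decay: float,
    steps: int,
) -> List[float]:
    """Sign descent with decoupled weight decay (a simplified stand-in for AdamW)."""
    x = list(x0)
    for _ in range(steps):
        g = grad(x)
        # Decoupled decay: the decay term is added to the step, not to the gradient.
        x = [xi - lr * (sign(gi) + weight_decay * xi) for xi, gi in zip(x, g)]
    return x


# Hypothetical quadratic loss 0.5 * ||x - target||^2 with one target outside the box.
target = [5.0, -0.3]


def quadratic_grad(x: List[float]) -> List[float]:
    return [xi - ti for xi, ti in zip(x, target)]


weight_decay = 0.5  # the box is |x_i| <= 1 / weight_decay = 2
x_final = sign_descent_with_decay(
    quadratic_grad, [0.0, 0.0], lr=0.01, weight_decay=weight_decay, steps=2000
)
```

In this toy run the first coordinate, whose unconstrained minimizer is 5, settles at the boundary value $1/λ = 2$, while the second coordinate hovers near its unconstrained minimizer $-0.3$, which lies inside the box.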