(Translated by https://www.hiragana.jp/)
Optimal transport with constraints: from mirror descent to classical mechanics

Optimal transport with constraints: from mirror descent to classical mechanics

Abdullahi Adinoyi Ibrahim abdullahi.ibrahim@tuebingen.mpg.de Max Planck Institute for Intelligent Systems, Cyber Valley, Tübingen 72076, Germany    Michael Muehlebach michaelm@tuebingen.mpg.de Max Planck Institute for Intelligent Systems, Cyber Valley, Tübingen 72076, Germany    Caterina De Bacco caterina.debacco@tuebingen.mpg.de Max Planck Institute for Intelligent Systems, Cyber Valley, Tübingen 72076, Germany
Abstract

Finding optimal trajectories for multiple traffic demands in a congested network is a challenging task. Optimal transport theory is a principled approach that has been used successfully to study various transportation problems. Its usage is limited by the lack of principled and flexible ways to incorporate realistic constraints. We propose a principled physics-based approach to impose constraints flexibly in optimal transport problems. Constraints are included in mirror descent dynamics using the principle of D’Alembert-Lagrange from classical mechanics. This results in a sparse, local and linear approximation of the feasible set leading in many cases to closed-form updates.

Introduction.

Optimal transport in networks has important applications in different disciplines, in particular in urban transportation networks Arnott and Small (1994). Congestion not only increases travel time for users and decreases productivity, but it also drives air pollution. Reducing congestion and making transportation more efficient are also a core objective for EU policies, as highlighted throughout the EU Transport White Paper and the Strategic Plan 2020-2024 Commission (2011, 2020).

The design of efficient transportation networks is a complex task that requires a multifaceted solution. One of these facets is the problem of finding optimal routes for passengers. This is a well-studied problem in operations research Ahuja et al. (1988) where minimum-cost optimization is often considered to model discrete flows and can be solved using classical techniques from linear programming. In our work, we consider the continuous case, where flows are real-valued quantities. A variety of approaches have been suggested to model transport in networks using techniques from physics of complex systems Morris and Barthelemy (2012); Gao et al. (2019). Path optimality and congestion control have been studied in discrete settings Noh and Rieger (2002); Dobrin and Duxbury (2001); Bayati et al. (2008) or using the cavity method Yeung and Saad (2012); Yeung et al. (2013). These usually rely on ad-hoc algorithmic updates that depend on the specific type of constraints. The computational complexity of the ad-hoc updates is greatly influenced by the constraints. Other approaches have been proposed to investigate navigation in complex systems Solé-Ribalta et al. (2016); Gómez-Gardenes and Latora (2008); Lacasa et al. (2009); Sneppen et al. (2005); Rosvall et al. (2005); Zhao et al. (2005); Estrada et al. (2023), where the focus lies on investigating the properties of flows, rather than their optimization, as we consider here. In addition, these models often assume that passengers follow their shortest paths, an assumption, which may not be satisfied in practice. Adaptation dynamics Tero et al. (2010); Hu and Cai (2013); Ronellenfitsch and Katifori (2016) have been proposed to model biological distribution networks. However, these methods fall short of describing realistic scenarios where transport flows are limited by constraints.

In the following we cast the problem of designing efficient transportation networks under the broader framework of optimal transport theory (OT) Santambrogio (2015). This has been used to model and optimize various aspects of transport networks such as network design Tero et al. (2010); Ronellenfitsch and Katifori (2016); Baptista et al. (2020); Leite and De Bacco (2022) and traffic flows Bonifaci et al. (2012); Lonardi et al. (2021); Bohn and Magnasco (2007); Ibrahim et al. (2021); Lonardi et al. (2023a). These approaches guarantee a principled and computationally efficient way of solving transportation problems on networks. In addition, they model traffic congestion with a single tuning parameter that enables a transition between opposite traffic regimes, where traffic congestion can either be consolidated or discouraged. In standard OT methods, beyond few obvious constraints (e.g. conservation of mass), the amount of flow passing through an edge of the transportation network is unconstrained. As a result, traffic tends to concentrate on path trajectories that may be structurally unfeasible, which severely limits the applicability of OT models in real-world situations, where, for example, roads have a limited capacity of vehicles traveling at the same time. This letter proposes an approach to avoid this crucial flaw of OT models by imposing constraints. Applying this approach significantly impacts the overall network topology induced by the optimal flows, as the resulting path trajectories have different path lengths and traffic distribution than those obtained from unconstrained scenarios.

Our approach has not only a solid foundation via the principle of D’Alembert-Lagrange from classical mechanics Lanczos (1949), but also leads to algorithms that are computationally efficient and have a low implementation complexity. The key idea is to consider mirror descent dynamics of an OT problem, where constraints are included on a velocity level. This leads to a sparse, local and linear approximation of the feasible set which, in many cases, allows for a closed-form update rule, even in situations where the feasible set is nonconvex.

The model.

In analogy with electrical grids or hydraulic networks, we model mass flow on a transportation network using conductivities and flows on network edges. We consider a multi-commodity scenario Lonardi et al. (2021); Bonifaci et al. (2022), where mass of different type i=1,,M𝑖1𝑀i=1,\dots,Mitalic_i = 1 , … , italic_M can move along different trajectories. The flow Feisuperscriptsubscript𝐹𝑒𝑖F_{e}^{i}italic_F start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT of mass of type i𝑖iitalic_i along an edge e=(u,v)𝑒𝑢𝑣e=(u,v)italic_e = ( italic_u , italic_v ) can be described by Fei=μみゅーe(puipvi)/e,superscriptsubscript𝐹𝑒𝑖subscript𝜇𝑒superscriptsubscript𝑝𝑢𝑖superscriptsubscript𝑝𝑣𝑖subscript𝑒F_{e}^{i}=\mu_{e}(p_{u}^{i}-p_{v}^{i})/\ell_{e},italic_F start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT = italic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT ( italic_p start_POSTSUBSCRIPT italic_u end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT - italic_p start_POSTSUBSCRIPT italic_v end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ) / roman_ℓ start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT , where puisuperscriptsubscript𝑝𝑢𝑖p_{u}^{i}italic_p start_POSTSUBSCRIPT italic_u end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT is a pressure potential at node u𝑢uitalic_u for passenger of type i𝑖iitalic_i, esubscript𝑒\ell_{e}roman_ℓ start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT is the length of the edge e𝑒eitalic_e and μみゅーesubscript𝜇𝑒\mu_{e}italic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT its conductivity. This latter quantity can be seen as proportional to the size of an edge, and is the main variable of interest in determining optimal trajectories. Once the conductivity is known, the pressure differences can then be calculated from Kirchhoff’s law, which in turns determines the flows Feisuperscriptsubscript𝐹𝑒𝑖F_{e}^{i}italic_F start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT, see Supporting Material (SM) sup . In the absence of constraints, the optimal conductivities are the stationary solutions of the dynamics μみゅー˙=f˙𝜇𝑓\dot{\mu}=fover˙ start_ARG italic_μみゅー end_ARG = italic_f, where

fe=μみゅーeβべーたi(puipvi)2e2μみゅーeμみゅーeβべーた2|Fe|2μみゅーe,subscript𝑓𝑒superscriptsubscript𝜇𝑒𝛽subscript𝑖superscriptsuperscriptsubscript𝑝𝑢𝑖superscriptsubscript𝑝𝑣𝑖2superscriptsubscript𝑒2subscript𝜇𝑒superscriptsubscript𝜇𝑒𝛽2superscriptsubscript𝐹𝑒2subscript𝜇𝑒f_{e}=\mu_{e}^{\beta}\frac{\sum_{i}(p_{u}^{i}-p_{v}^{i})^{2}}{\ell_{e}^{2}}-% \mu_{e}\equiv\mu_{e}^{\beta-2}|F_{e}|^{2}-\mu_{e}\quad,italic_f start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT = italic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_βべーた end_POSTSUPERSCRIPT divide start_ARG ∑ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( italic_p start_POSTSUBSCRIPT italic_u end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT - italic_p start_POSTSUBSCRIPT italic_v end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG roman_ℓ start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG - italic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT ≡ italic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_βべーた - 2 end_POSTSUPERSCRIPT | italic_F start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT - italic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT , (1)

with Fe=(Fe1,,FeM)subscript𝐹𝑒superscriptsubscript𝐹𝑒1superscriptsubscript𝐹𝑒𝑀F_{e}=(F_{e}^{1},\dots,F_{e}^{M})italic_F start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT = ( italic_F start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT , … , italic_F start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_M end_POSTSUPERSCRIPT ) and |||\cdot|| ⋅ | denotes the Euclidean norm. Intuitively, this equation describes a positive feedback mechanism where conductivities increase for larger fluxes and decrease for negligible ones Tero et al. (2010). It can be shown that the dynamics in Eq. 1 admits a Lyapunov function 𝔏βべーたsubscript𝔏𝛽\mathfrak{L}_{\beta}fraktur_L start_POSTSUBSCRIPT italic_βべーた end_POSTSUBSCRIPT which can be interpreted as a combination of the cost to operate the network and that of building the infrastructure Lonardi et al. (2021), see SM sup . Moreover, we have that f=S𝔏βべーた𝑓𝑆subscript𝔏𝛽f=-S\,\nabla\mathfrak{L}_{\beta}italic_f = - italic_S ∇ fraktur_L start_POSTSUBSCRIPT italic_βべーた end_POSTSUBSCRIPT, where S𝑆Sitalic_S is a diagonal matrix with diagonal entries Se=2μみゅーeβべーた/esubscript𝑆𝑒2superscriptsubscript𝜇𝑒𝛽subscript𝑒S_{e}=2\mu_{e}^{\beta}/\ell_{e}italic_S start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT = 2 italic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_βべーた end_POSTSUPERSCRIPT / roman_ℓ start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT and Eq. 1 can therefore be seen as a mirror descent for the cost function 𝔏βべーたsubscript𝔏𝛽\mathfrak{L}_{\beta}fraktur_L start_POSTSUBSCRIPT italic_βべーた end_POSTSUBSCRIPT Bonifaci (2021). This scaling in S𝑆Sitalic_S has the advantage of ensuring good behavior of the resulting numerical methods. One can also reinterpret Eq. 1 as a classical gradient descent by applying a suitable transformation Facca and Benzi (2021), we do not explore this here.

Variants of these dynamics have been proposed to model distributions over networks Hu and Cai (2013); Bohn and Magnasco (2007); Katifori et al. (2010); Banavar et al. (2000); Ronellenfitsch and Katifori (2016). The constant βべーた(0,2)𝛽02\beta\in(0,2)italic_βべーた ∈ ( 0 , 2 ) regulates the desired transportation regime. The setting βべーた<1𝛽1\beta<1italic_βべーた < 1 penalizes traffic congestion by distributing paths on more edges, βべーた>1𝛽1\beta>1italic_βべーた > 1 encourages path consolidation into fewer highways, and βべーた=1𝛽1\beta=1italic_βべーた = 1 is shortest path-like.

In addition to imposing Kirchhoff’s law on nodes to ensure mass conservation, solving these dynamics outputs otherwise unconstrained optimal μみゅーesubscript𝜇𝑒\mu_{e}italic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT and Fesubscript𝐹𝑒F_{e}italic_F start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT (see SM sup ). While this may be enough in ideal cases, in more realistic scenarios it is important to further constrain the solution. For instance, structural constraints may limit the maximum amount of flow that an edge can carry, or a budget constraint may be used to limit the infrastructure cost for building the network. Hence, the dynamics μみゅー˙=f˙𝜇𝑓\dot{\mu}=fover˙ start_ARG italic_μみゅー end_ARG = italic_f must be altered to account for these additional constraints.
There are many ways in which constraints can be added. A popular approach is to add constraints on a so-called position level, which leads to gradient inclusions in continuous time (Aubin and Cellina, 1984, Ch 3.4), and projected gradient descent in discrete time. Unfortunately, the scope of projected gradients is limited, due to the fact that projections can only be efficiently evaluated for constraints that have a particular structure (such as a low-dimensional hyperplane, the probability simplex, or a Euclidean norm ball). When the feasible set is nonconvex and/or fails to have a simple structure, evaluating projections is a computationally daunting task. This motivates our formulation (see also Muehlebach and Jordan (2022)), which includes constraints on a velocity level and yields a sparse local and linear approximation of the feasible set. As a consequence, the updates for μみゅー𝜇\muitalic_μみゅー can often still be evaluated in closed-form (or there is an efficient way of computing them numerically) even though the underlying feasible set is nonconvex or fails to have a simple structure. We will highlight explicit examples of such situations in the remainder of this letter.

We define C:={μみゅー0E|g(μみゅー)0}assign𝐶conditional-set𝜇superscriptsubscriptabsent0𝐸𝑔𝜇0C:=\{\mu\in\mathbb{R}_{\geq 0}^{E}\ |\ g(\mu)\geq 0\}italic_C := { italic_μみゅー ∈ blackboard_R start_POSTSUBSCRIPT ≥ 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_E end_POSTSUPERSCRIPT | italic_g ( italic_μみゅー ) ≥ 0 } as the set of feasible conductivities μみゅー=(μみゅー1,,μみゅーE)𝜇subscript𝜇1subscript𝜇𝐸\mu=(\mu_{1},\dots,\mu_{E})italic_μみゅー = ( italic_μみゅー start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_μみゅー start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT ), with g𝑔gitalic_g a constraint function that we assume continuously differentiable and E𝐸Eitalic_E is the number of network edges. Interpreting μみゅー𝜇\muitalic_μみゅー as a “position” variable we can equivalently express the constraints in C𝐶Citalic_C in terms of a “velocity” variable by imposing μみゅー˙(t)Vαあるふぁ(μみゅー(t))˙𝜇𝑡subscript𝑉𝛼𝜇𝑡\dot{\mu}(t)\in V_{\alpha}(\mu(t))over˙ start_ARG italic_μみゅー end_ARG ( italic_t ) ∈ italic_V start_POSTSUBSCRIPT italic_αあるふぁ end_POSTSUBSCRIPT ( italic_μみゅー ( italic_t ) ), where Vαあるふぁ(μみゅー(t))subscript𝑉𝛼𝜇𝑡V_{\alpha}(\mu(t))italic_V start_POSTSUBSCRIPT italic_αあるふぁ end_POSTSUBSCRIPT ( italic_μみゅー ( italic_t ) ) is the set of feasible velocities and αあるふぁ0𝛼0\alpha\geq 0italic_αあるふぁ ≥ 0 is a constant typically referred to as a “restitution” parameter or “slackness”, see Appendix for details.

For μみゅー(t)C𝜇𝑡𝐶\mu(t)\not\in Citalic_μみゅー ( italic_t ) ∉ italic_C and an active constraint i𝑖iitalic_i, the constraint μみゅー˙(t)Vαあるふぁ(μみゅー(t))˙𝜇𝑡subscript𝑉𝛼𝜇𝑡\dot{\mu}(t)\in V_{\alpha}(\mu(t))over˙ start_ARG italic_μみゅー end_ARG ( italic_t ) ∈ italic_V start_POSTSUBSCRIPT italic_αあるふぁ end_POSTSUBSCRIPT ( italic_μみゅー ( italic_t ) ) is equivalent to dgi(μみゅー(t))/dtαあるふぁgi(μみゅー(t))dsubscript𝑔𝑖𝜇𝑡d𝑡𝛼subscript𝑔𝑖𝜇𝑡\text{d}g_{i}(\mu(t))/\text{d}t\geq-\alpha g_{i}(\mu(t))d italic_g start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( italic_μみゅー ( italic_t ) ) / d italic_t ≥ - italic_αあるふぁ italic_g start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( italic_μみゅー ( italic_t ) ), which ensures that potential constraint violations decay at the rate αあるふぁ>0𝛼0\alpha>0italic_αあるふぁ > 0. The situation is visualized in Fig. 1(A).

Refer to caption
Figure 1: (A) Visualization of the set C𝐶Citalic_C and the set of feasible velocities Vαあるふぁ(μみゅー1)subscript𝑉𝛼subscript𝜇1V_{\alpha}(\mu_{1})italic_V start_POSTSUBSCRIPT italic_αあるふぁ end_POSTSUBSCRIPT ( italic_μみゅー start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) and Vαあるふぁ(μみゅー2)subscript𝑉𝛼subscript𝜇2V_{\alpha}(\mu_{2})italic_V start_POSTSUBSCRIPT italic_αあるふぁ end_POSTSUBSCRIPT ( italic_μみゅー start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) at points μみゅー1subscript𝜇1\mu_{1}italic_μみゅー start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT and μみゅー2subscript𝜇2\mu_{2}italic_μみゅー start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT, respectively. Point μみゅー1subscript𝜇1\mu_{1}italic_μみゅー start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT lies on the boundary of C𝐶Citalic_C, while μみゅー2subscript𝜇2\mu_{2}italic_μみゅー start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT is infeasible; αあるふぁ𝛼\alphaitalic_αあるふぁ is a restitution parameter. (B) When the vector field f𝑓fitalic_f is pushing away from C𝐶Citalic_C, a force RNVαあるふぁ(μみゅー˙)𝑅subscript𝑁subscript𝑉𝛼˙𝜇-R\in N_{V_{\alpha}}(\dot{\mu})- italic_R ∈ italic_N start_POSTSUBSCRIPT italic_V start_POSTSUBSCRIPT italic_αあるふぁ end_POSTSUBSCRIPT end_POSTSUBSCRIPT ( over˙ start_ARG italic_μみゅー end_ARG ) is added to the dynamics to ensure μみゅー˙Vαあるふぁ(μみゅー)˙𝜇subscript𝑉𝛼𝜇\dot{\mu}\in V_{\alpha}(\mu)over˙ start_ARG italic_μみゅー end_ARG ∈ italic_V start_POSTSUBSCRIPT italic_αあるふぁ end_POSTSUBSCRIPT ( italic_μみゅー ) .

In order to account for the velocity constraint μみゅー˙Vαあるふぁ(μみゅー)˙𝜇subscript𝑉𝛼𝜇\dot{\mu}\in V_{\alpha}(\mu)over˙ start_ARG italic_μみゅー end_ARG ∈ italic_V start_POSTSUBSCRIPT italic_αあるふぁ end_POSTSUBSCRIPT ( italic_μみゅー ) we augment the dynamics μみゅー˙=f˙𝜇𝑓\dot{\mu}=fover˙ start_ARG italic_μみゅー end_ARG = italic_f with a reaction force R𝑅Ritalic_R that forces the solution to remain within the desired constraints:

μみゅー˙=f+R,withRNVαあるふぁ(μみゅー)(μみゅー˙),formulae-sequence˙𝜇𝑓𝑅with𝑅subscript𝑁subscript𝑉𝛼𝜇˙𝜇\dot{\mu}=f+R,\quad\text{with}\,\ -R\in N_{V_{\alpha}(\mu)}(\dot{\mu}),over˙ start_ARG italic_μみゅー end_ARG = italic_f + italic_R , with - italic_R ∈ italic_N start_POSTSUBSCRIPT italic_V start_POSTSUBSCRIPT italic_αあるふぁ end_POSTSUBSCRIPT ( italic_μみゅー ) end_POSTSUBSCRIPT ( over˙ start_ARG italic_μみゅー end_ARG ) , (2)

where NVαあるふぁ(μみゅー)(μみゅー˙)subscript𝑁subscript𝑉𝛼𝜇˙𝜇N_{V_{\alpha}(\mu)}(\dot{\mu})italic_N start_POSTSUBSCRIPT italic_V start_POSTSUBSCRIPT italic_αあるふぁ end_POSTSUBSCRIPT ( italic_μみゅー ) end_POSTSUBSCRIPT ( over˙ start_ARG italic_μみゅー end_ARG ) denotes the normal cone of the set Vαあるふぁ(μみゅー)subscript𝑉𝛼𝜇V_{\alpha}(\mu)italic_V start_POSTSUBSCRIPT italic_αあるふぁ end_POSTSUBSCRIPT ( italic_μみゅー ) at μみゅー˙˙𝜇\dot{\mu}over˙ start_ARG italic_μみゅー end_ARG. Due to the scaling of the gradient with S𝑆Sitalic_S, the normal cone is defined with respect to the inner product a,b=aTS1b𝑎𝑏superscript𝑎𝑇superscript𝑆1𝑏\langle a,b\rangle=a^{T}S^{-1}b⟨ italic_a , italic_b ⟩ = italic_a start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_S start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_b, where a,bE𝑎𝑏superscript𝐸a,b\in\mathbb{R}^{E}italic_a , italic_b ∈ blackboard_R start_POSTSUPERSCRIPT italic_E end_POSTSUPERSCRIPT are arbitrary vectors. This has the important effect of guaranteeing that 𝔏βべーたsubscript𝔏𝛽\mathfrak{L}_{\beta}fraktur_L start_POSTSUBSCRIPT italic_βべーた end_POSTSUBSCRIPT (of the unconstrained dynamics) is still a Lyapunov function also in the constrained setting and that 𝔏βべーた(μみゅー(t))subscript𝔏𝛽𝜇𝑡\mathfrak{L}_{\beta}(\mu(t))fraktur_L start_POSTSUBSCRIPT italic_βべーた end_POSTSUBSCRIPT ( italic_μみゅー ( italic_t ) ) is monotonically decreasing along the trajectories of Eq. 2. A detailed derivation is included in SM sup .

The addition of R𝑅Ritalic_R ensures that even if f𝑓fitalic_f pushes μみゅー𝜇\muitalic_μみゅー away from C𝐶Citalic_C, as shown in Fig. 1(B), the force R𝑅Ritalic_R, which is orthogonal to the set Vαあるふぁ(μみゅー)subscript𝑉𝛼𝜇V_{\alpha}(\mu)italic_V start_POSTSUBSCRIPT italic_αあるふぁ end_POSTSUBSCRIPT ( italic_μみゅー ), annihilates the component of f𝑓fitalic_f that would lead to a constraint violation and ensures that μみゅー˙Vαあるふぁ(μみゅー)˙𝜇subscript𝑉𝛼𝜇\dot{\mu}\in V_{\alpha}(\mu)over˙ start_ARG italic_μみゅー end_ARG ∈ italic_V start_POSTSUBSCRIPT italic_αあるふぁ end_POSTSUBSCRIPT ( italic_μみゅー ). As discussed above, we can therefore conclude that μみゅー(0)Cμみゅー(t)C𝜇0𝐶𝜇𝑡𝐶\mu(0)\in C\Rightarrow\mu(t)\in Citalic_μみゅー ( 0 ) ∈ italic_C ⇒ italic_μみゅー ( italic_t ) ∈ italic_C for all t0𝑡0t\geq 0italic_t ≥ 0 and μみゅー(0)Cμみゅー(t)C𝜇0𝐶𝜇𝑡𝐶\mu(0)\not\in C\Rightarrow\mu(t)\rightarrow Citalic_μみゅー ( 0 ) ∉ italic_C ⇒ italic_μみゅー ( italic_t ) → italic_C for t𝑡t\rightarrow\inftyitalic_t → ∞.

In addition, we infer from Fig. 1 that the resulting μみゅー˙˙𝜇\dot{\mu}over˙ start_ARG italic_μみゅー end_ARG in Eq. 2 is nothing but the projection of f𝑓fitalic_f onto the set Vαあるふぁ(μみゅー)subscript𝑉𝛼𝜇V_{\alpha}(\mu)italic_V start_POSTSUBSCRIPT italic_αあるふぁ end_POSTSUBSCRIPT ( italic_μみゅー ) and as a result, we can rewrite μみゅー˙˙𝜇\dot{\mu}over˙ start_ARG italic_μみゅー end_ARG in the following way:

μみゅー˙:=argminvVαあるふぁ(μみゅー)12vf,vf,assign˙𝜇subscriptargmin𝑣subscript𝑉𝛼𝜇12𝑣𝑓𝑣𝑓\dot{\mu}:=\operatorname*{arg\,min}_{v\in V_{\alpha}(\mu)}\frac{1}{2}\langle v% -f,v-f\rangle\quad,over˙ start_ARG italic_μみゅー end_ARG := start_OPERATOR roman_arg roman_min end_OPERATOR start_POSTSUBSCRIPT italic_v ∈ italic_V start_POSTSUBSCRIPT italic_αあるふぁ end_POSTSUBSCRIPT ( italic_μみゅー ) end_POSTSUBSCRIPT divide start_ARG 1 end_ARG start_ARG 2 end_ARG ⟨ italic_v - italic_f , italic_v - italic_f ⟩ , (3)

which can also be equivalently reformulated as the quadratic program (QP)

μみゅー˙˙𝜇\displaystyle\dot{\mu}over˙ start_ARG italic_μみゅー end_ARG :=argminvVαあるふぁ(μみゅー)12(vf)TS1(vf).assignabsentsubscriptargmin𝑣subscript𝑉𝛼𝜇12superscript𝑣𝑓𝑇superscript𝑆1𝑣𝑓\displaystyle:=\operatorname*{arg\,min}_{v\in V_{\alpha}(\mu)}\frac{1}{2}(v-f)% ^{T}S^{-1}(v-f)\quad.:= start_OPERATOR roman_arg roman_min end_OPERATOR start_POSTSUBSCRIPT italic_v ∈ italic_V start_POSTSUBSCRIPT italic_αあるふぁ end_POSTSUBSCRIPT ( italic_μみゅー ) end_POSTSUBSCRIPT divide start_ARG 1 end_ARG start_ARG 2 end_ARG ( italic_v - italic_f ) start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_S start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT ( italic_v - italic_f ) . (4)

This reformulation is not only useful for numerical computations, but also highlights that the velocity μみゅー˙˙𝜇\dot{\mu}over˙ start_ARG italic_μみゅー end_ARG is chosen, at each point in time, to match the unconstrained f𝑓fitalic_f. Fig. 1(A) visualizes the set C𝐶Citalic_C and the set of feasible velocities Vαあるふぁ(μみゅー1)subscript𝑉𝛼subscript𝜇1V_{\alpha}(\mu_{1})italic_V start_POSTSUBSCRIPT italic_αあるふぁ end_POSTSUBSCRIPT ( italic_μみゅー start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) and Vαあるふぁ(μみゅー2)subscript𝑉𝛼subscript𝜇2V_{\alpha}(\mu_{2})italic_V start_POSTSUBSCRIPT italic_αあるふぁ end_POSTSUBSCRIPT ( italic_μみゅー start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) at points μみゅー1subscript𝜇1\mu_{1}italic_μみゅー start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT and μみゅー2subscript𝜇2\mu_{2}italic_μみゅー start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT, respectively. Point μみゅー1subscript𝜇1\mu_{1}italic_μみゅー start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT lies on the boundary of C𝐶Citalic_C, while μみゅー2subscript𝜇2\mu_{2}italic_μみゅー start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT is infeasible. We note that the cone Vαあるふぁ(μみゅー2)subscript𝑉𝛼subscript𝜇2V_{\alpha}(\mu_{2})italic_V start_POSTSUBSCRIPT italic_αあるふぁ end_POSTSUBSCRIPT ( italic_μみゅー start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) includes an offset, which is controlled by the restitution parameter αあるふぁ𝛼\alphaitalic_αあるふぁ; this ensures that any vVαあるふぁ(μみゅー2)𝑣subscript𝑉𝛼subscript𝜇2v\in V_{\alpha}(\mu_{2})italic_v ∈ italic_V start_POSTSUBSCRIPT italic_αあるふぁ end_POSTSUBSCRIPT ( italic_μみゅー start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) leads to a decrease in constraint violation. Fig. 1 (B) shows that when the vector field f𝑓fitalic_f is pushing away from C𝐶Citalic_C, a force RNVαあるふぁ(μみゅー˙)𝑅subscript𝑁subscript𝑉𝛼˙𝜇-R\in N_{V_{\alpha}}(\dot{\mu})- italic_R ∈ italic_N start_POSTSUBSCRIPT italic_V start_POSTSUBSCRIPT italic_αあるふぁ end_POSTSUBSCRIPT end_POSTSUBSCRIPT ( over˙ start_ARG italic_μみゅー end_ARG ) is added to the dynamics. The force R𝑅Ritalic_R annihilates the component of f𝑓fitalic_f that would lead to a constraint violation and ensures μみゅー˙Vαあるふぁ(μみゅー)˙𝜇subscript𝑉𝛼𝜇\dot{\mu}\in V_{\alpha}(\mu)over˙ start_ARG italic_μみゅー end_ARG ∈ italic_V start_POSTSUBSCRIPT italic_αあるふぁ end_POSTSUBSCRIPT ( italic_μみゅー ), where μみゅー˙˙𝜇\dot{\mu}over˙ start_ARG italic_μみゅー end_ARG is chosen as close as possible to f𝑓fitalic_f. This can also be interpreted as Gauss’s principle of least constraint. It is important to note that Vαあるふぁ(μみゅー)subscript𝑉𝛼𝜇V_{\alpha}(\mu)italic_V start_POSTSUBSCRIPT italic_αあるふぁ end_POSTSUBSCRIPT ( italic_μみゅー ) is a polyhedral set that only includes the constraints Iμみゅーsubscript𝐼𝜇I_{\mu}italic_I start_POSTSUBSCRIPT italic_μみゅー end_POSTSUBSCRIPT, a subset of the original constraints g(μみゅー)0𝑔𝜇0g(\mu)\geq 0italic_g ( italic_μみゅー ) ≥ 0. The set Vαあるふぁ(μみゅー)subscript𝑉𝛼𝜇V_{\alpha}(\mu)italic_V start_POSTSUBSCRIPT italic_αあるふぁ end_POSTSUBSCRIPT ( italic_μみゅー ) represents therefore a sparse, local and linear approximation of the feasible set. The solution μみゅー˙˙𝜇\dot{\mu}over˙ start_ARG italic_μみゅー end_ARG of Eq. 3 can then be used to update the conductivity with a discrete-time algorithm:

μみゅーt+1=μみゅーt+τたうμみゅー˙,superscript𝜇𝑡1superscript𝜇𝑡𝜏˙𝜇\mu^{t+1}=\mu^{t}+\tau\dot{\mu}\quad,italic_μみゅー start_POSTSUPERSCRIPT italic_t + 1 end_POSTSUPERSCRIPT = italic_μみゅー start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT + italic_τたう over˙ start_ARG italic_μみゅー end_ARG , (5)

where τたう>0𝜏0\tau>0italic_τたう > 0 is the step size.

This general formalism can be applied to a variety of scenarios, provided one can compute g𝑔\nabla g∇ italic_g, which determines the set Vαあるふぁ(μみゅー)subscript𝑉𝛼𝜇V_{\alpha}(\mu)italic_V start_POSTSUBSCRIPT italic_αあるふぁ end_POSTSUBSCRIPT ( italic_μみゅー ). We can then solve Eq. 4 by using numerical solvers tailored to QP, which then yields the update Eq. 5. Additional details about the computational complexity for solving Eq. 5 are described in SM sup . However, in important special cases, the optimization Eq. 5 can be solved in closed-form, as we illustrate below with three relevant examples.

Capacity constraints.

In cases of structural constraints that strictly limit the amount of mass that can travel along any given edge, one can consider capacities ce0subscript𝑐𝑒0c_{e}\geq 0italic_c start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT ≥ 0 on edges and set constraints as ge(μみゅー)=ceμみゅーesubscript𝑔𝑒𝜇subscript𝑐𝑒subscript𝜇𝑒g_{e}(\mu)=c_{e}-\mu_{e}italic_g start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT ( italic_μみゅー ) = italic_c start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT - italic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT. The velocity constraint vVαあるふぁ(μみゅー)𝑣subscript𝑉𝛼𝜇v\in V_{\alpha}(\mu)italic_v ∈ italic_V start_POSTSUBSCRIPT italic_αあるふぁ end_POSTSUBSCRIPT ( italic_μみゅー ) in Eq. 3 reads as veαあるふぁge(μみゅーe)subscript𝑣𝑒𝛼subscript𝑔𝑒subscript𝜇𝑒v_{e}\leq\alpha g_{e}(\mu_{e})italic_v start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT ≤ italic_αあるふぁ italic_g start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT ( italic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT ), for eIμみゅー𝑒subscript𝐼𝜇e\in I_{\mu}italic_e ∈ italic_I start_POSTSUBSCRIPT italic_μみゅー end_POSTSUBSCRIPT, which is strictly negative, since αあるふぁ>0𝛼0\alpha>0italic_αあるふぁ > 0 (SM sup ). As previously discussed, αあるふぁ>0𝛼0\alpha>0italic_αあるふぁ > 0 is a restitution parameter that dictates the rate at which constraint violations decay. In discrete time, one should choose αあるふぁ>0𝛼0\alpha>0italic_αあるふぁ > 0 such that αあるふぁτたう1𝛼𝜏1\alpha\,\tau\leq 1italic_αあるふぁ italic_τたう ≤ 1 to guarantee convergence (see Muehlebach and Jordan (2022)). We can then solve Eq. 3 in closed-form for edges violating the constraint obtaining ve=min{αあるふぁ(ceμみゅーe),fe}subscript𝑣𝑒𝛼subscript𝑐𝑒subscript𝜇𝑒subscript𝑓𝑒v_{e}=\min\left\{\alpha\,(c_{e}-\mu_{e}),{f}_{e}\right\}italic_v start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT = roman_min { italic_αあるふぁ ( italic_c start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT - italic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT ) , italic_f start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT }. In summary, for each edge e𝑒eitalic_e, we have:

μみゅー˙e={αあるふぁ(ceμみゅーe),iffeαあるふぁ(ceμみゅーe)andμみゅーece,feotherwise.subscript˙𝜇𝑒cases𝛼subscript𝑐𝑒subscript𝜇𝑒ifsubscript𝑓𝑒𝛼subscript𝑐𝑒subscript𝜇𝑒andsubscript𝜇𝑒subscript𝑐𝑒subscript𝑓𝑒otherwise\dot{\mu}_{e}=\begin{cases}\alpha\,(c_{e}-\mu_{e}),&\text{if}\ \ {f}_{e}\geq% \alpha\,(c_{e}-\mu_{e})\ \text{and}\ \mu_{e}\geq c_{e},\\ \ {f}_{e}&\text{otherwise}\quad.\end{cases}over˙ start_ARG italic_μみゅー end_ARG start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT = { start_ROW start_CELL italic_αあるふぁ ( italic_c start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT - italic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT ) , end_CELL start_CELL if italic_f start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT ≥ italic_αあるふぁ ( italic_c start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT - italic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT ) and italic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT ≥ italic_c start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT , end_CELL end_ROW start_ROW start_CELL italic_f start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT end_CELL start_CELL otherwise . end_CELL end_ROW (6)
Refer to caption
Figure 2: Capacity constraint on synthetic networks. (A) Gini coefficient of the traffic distribution on edges. The edge capacity ce=csubscript𝑐𝑒𝑐c_{e}=citalic_c start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT = italic_c is selected as a percentile of the distribution of μみゅー𝜇\muitalic_μみゅー over edges obtained in the unconstrained case (Unconstrained). (B) Ratio of average total path length to that of Unconstrained, 𝚕fsubscriptdelimited-⟨⟩𝚕𝑓\langle\mathit{\mathtt{l}}\rangle_{f}⟨ typewriter_l ⟩ start_POSTSUBSCRIPT italic_f end_POSTSUBSCRIPT. Markers and shadows are averages and standard deviations over 20 network realizations, with 100 randomly selected origins. All passengers have the same central destination (square magenta marker). (c) Example trajectory of one passenger type (green color), whose origin is the green triangle marker. Edge widths are proportional to the amount of passengers traveling through an edge; βべーた=1.8𝛽1.8\beta=1.8italic_βべーた = 1.8.

Fig. 2 shows the path topologies with capacity constraints on synthetic data, compared against the unconstrained case. We generate random planar networks as the Delaunay triangulation (Guibas and Stolfi, 1985) of N=300𝑁300N=300italic_N = 300 points in the plane. We measure the Gini coefficient Gini(T)𝐺𝑖𝑛𝑖𝑇Gini(T)italic_G italic_i italic_n italic_i ( italic_T ) calculated on the traffic on edges, defined as the E𝐸Eitalic_E-dimensional vector T𝑇Titalic_T with entries Te=i|Fei|/nsubscript𝑇𝑒subscript𝑖superscriptsubscript𝐹𝑒𝑖𝑛T_{e}=\sum_{i}|F_{e}^{i}|/nitalic_T start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT = ∑ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT | italic_F start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT | / italic_n, where n𝑛nitalic_n is the number of passengers. The coefficient has value in [0,1]01[0,1][ 0 , 1 ] and it determines how traffic is distributed along network edges, with Gini(T)=0,1𝐺𝑖𝑛𝑖𝑇01Gini(T)=0,1italic_G italic_i italic_n italic_i ( italic_T ) = 0 , 1 meaning equally-balanced or highly unbalanced traffic on few edges, respectively. The choice of the edge capacity cesubscript𝑐𝑒c_{e}italic_c start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT influences this value, with lower cesubscript𝑐𝑒c_{e}italic_c start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT imposing stricter constraint and thus encouraging traffic to distribute more equally along the edge, i.e. lower Gini, as shown in Fig. 2(A). Conversely, this implies longer routes for passengers, as measured by an increasing average total path length l=e,ie|Fei|/ndelimited-⟨⟩𝑙subscript𝑒𝑖subscript𝑒superscriptsubscript𝐹𝑒𝑖𝑛\langle l\rangle=\sum_{e,i}\,\ell_{e}\,|F_{e}^{i}|/n⟨ italic_l ⟩ = ∑ start_POSTSUBSCRIPT italic_e , italic_i end_POSTSUBSCRIPT roman_ℓ start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT | italic_F start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT | / italic_n compared to the unconstrained solution, as shown in Fig. 2(B).

Budget constraint.

As a second example, we consider a global constraint that involves all the edges at once, a budget constraint gb(μみゅー)=beμみゅーesubscript𝑔𝑏𝜇𝑏subscript𝑒subscript𝜇𝑒g_{b}(\mu)=b\,-\,\sum_{e}\mu_{e}italic_g start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT ( italic_μみゅー ) = italic_b - ∑ start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT italic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT. This is relevant when a network manager has a fixed limited amount of resources b>0𝑏0b>0italic_b > 0 to invest. We note that, while the Lyapunov function 𝔏βべーたsubscript𝔏𝛽\mathfrak{L}_{\beta}fraktur_L start_POSTSUBSCRIPT italic_βべーた end_POSTSUBSCRIPT contains a similar budget term–the cost to build the infrastracture–this cost is not regarded as a constraint in standard approaches Hu and Cai (2013); Lonardi et al. (2021) but as part of the energy consumption, and the budget b𝑏bitalic_b is not a Lagrange multiplier but a measurable constant. Furthermore, unlike the previous case where including a positivity constraint μみゅーe0subscript𝜇𝑒0\mu_{e}\geq 0italic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT ≥ 0 is optional (but it can in principle be imposed as well, see SM sup ), here we need to include that explicitly. In the standard OT formalism positivity is ensured, provided μみゅーesubscript𝜇𝑒\mu_{e}italic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT is initialized as a positive quantity. Adding constraint may not preserve positivity anymore during the updates, this is the case for the budget constraint, as we observed empirically. Positivity is enforced by adding gp(μみゅー)=μみゅー0subscript𝑔𝑝𝜇𝜇0g_{p}(\mu)=\mu\geq 0italic_g start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ( italic_μみゅー ) = italic_μみゅー ≥ 0, i.e. μみゅーe0esubscript𝜇𝑒0for-all𝑒\mu_{e}\geq 0\,\forall eitalic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT ≥ 0 ∀ italic_e.

In this budget constraint setting, the conductivities violate the constraint whenever eμみゅーe>bsubscript𝑒subscript𝜇𝑒𝑏\sum_{e}\mu_{e}>b∑ start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT italic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT > italic_b. We derive a closed-form solution as: μみゅー˙e=feSeλらむだbsubscript˙𝜇𝑒subscript𝑓𝑒subscript𝑆𝑒subscript𝜆𝑏\dot{\mu}_{e}=f_{e}-S_{e}\lambda_{b}over˙ start_ARG italic_μみゅー end_ARG start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT = italic_f start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT - italic_S start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT italic_λらむだ start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT, if feSeλらむだbαあるふぁμみゅーesubscript𝑓𝑒subscript𝑆𝑒subscript𝜆𝑏𝛼subscript𝜇𝑒f_{e}-S_{e}\lambda_{b}\geq-\alpha\,\mu_{e}italic_f start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT - italic_S start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT italic_λらむだ start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT ≥ - italic_αあるふぁ italic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT, and μみゅー˙e=αあるふぁμみゅーesubscript˙𝜇𝑒𝛼subscript𝜇𝑒\dot{\mu}_{e}=-\alpha\,\mu_{e}over˙ start_ARG italic_μみゅー end_ARG start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT = - italic_αあるふぁ italic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT otherwise, where λらむだbsubscript𝜆𝑏\lambda_{b}\in\mathbb{R}italic_λらむだ start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT ∈ blackboard_R, a Lagrange multiplier for the budget constraint, can be numerically determined via fixed-point iteration (SM sup ).

Combining linear and non-linear constraints.

All the previous examples considered linear constraints, where it is simple to derive analytical solutions. In general, constraints can be more complicated and thus require numerical methods to solve the constrained QP in Eq. 3. In this scenario, we consider a non-linear budget constraint of the form: gδでるた(μみゅー)=beμみゅーeδでるた0subscript𝑔𝛿𝜇𝑏subscript𝑒superscriptsubscript𝜇𝑒𝛿0g_{\delta}(\mu)=b-\sum_{e}\mu_{e}^{\delta}\geq 0italic_g start_POSTSUBSCRIPT italic_δでるた end_POSTSUBSCRIPT ( italic_μみゅー ) = italic_b - ∑ start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT italic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_δでるた end_POSTSUPERSCRIPT ≥ 0, where δでるた>0𝛿0\delta>0italic_δでるた > 0 is a nonlinearity parameter. Setting δでるた=1𝛿1\delta=1italic_δでるた = 1 gives a linear budget constraint as the one discussed earlier. A non-linear example is a volume-preserving constraint where δでるた=1/2𝛿12\delta=1/2italic_δでるた = 1 / 2, this is relevant for biological processes such as leaf venation and vascular systems Takamatsu et al. (2017); Ronellenfitsch and Katifori (2016). This non-linear budget induces the velocity constraint eδでるたμみゅーeδでるた1veαあるふぁgδでるた(μみゅー)subscript𝑒𝛿superscriptsubscript𝜇𝑒𝛿1subscript𝑣𝑒𝛼subscript𝑔𝛿𝜇\sum_{e}\delta\mu_{e}^{\delta-1}v_{e}\leq\alpha\,g_{\delta}(\mu)∑ start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT italic_δでるた italic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_δでるた - 1 end_POSTSUPERSCRIPT italic_v start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT ≤ italic_αあるふぁ italic_g start_POSTSUBSCRIPT italic_δでるた end_POSTSUBSCRIPT ( italic_μみゅー ). In addition, we also consider a capacity constraint as in the first scenario studied above. Overall, three functions are required: i) gδでるた(μみゅー)subscript𝑔𝛿𝜇g_{\delta}(\mu)italic_g start_POSTSUBSCRIPT italic_δでるた end_POSTSUBSCRIPT ( italic_μみゅー ) to impose non-linear budget constraint; ii) ge(μみゅー)subscript𝑔𝑒𝜇g_{e}(\mu)italic_g start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT ( italic_μみゅー ) to impose edge capacity and iii) gp(μみゅー)subscript𝑔𝑝𝜇g_{p}(\mu)italic_g start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ( italic_μみゅー ) to ensure positivity. We derive the closed-form solution as

μみゅー˙e={αあるふぁ(ceμみゅーe)iffeSeλらむだδでるたheαあるふぁ(ceμみゅーe),μみゅーeceαあるふぁμみゅーeiffeSeλらむだδでるたheαあるふぁμみゅーe,μみゅーe0feSeλらむだδでるたheotherwise,subscript˙𝜇𝑒cases𝛼subscript𝑐𝑒subscript𝜇𝑒formulae-sequenceifsubscript𝑓𝑒subscript𝑆𝑒subscript𝜆𝛿subscript𝑒𝛼subscript𝑐𝑒subscript𝜇𝑒subscript𝜇𝑒subscript𝑐𝑒otherwiseotherwise𝛼subscript𝜇𝑒formulae-sequenceifsubscript𝑓𝑒subscript𝑆𝑒subscript𝜆𝛿subscript𝑒𝛼subscript𝜇𝑒subscript𝜇𝑒0otherwiseotherwisesubscript𝑓𝑒subscript𝑆𝑒subscript𝜆𝛿subscript𝑒otherwise\dot{\mu}_{e}=\begin{cases}\alpha\,(c_{e}-\mu_{e})&\text{if}\ {f}_{e}{-S_{e}% \lambda_{\delta}\,h_{e}}\geq\alpha\,(c_{e}-\mu_{e}),\,\mu_{e}\geq c_{e}\\ \\ -\alpha\,\mu_{e}&\text{if}\ {f}_{e}{-S_{e}\lambda_{\delta}\,h_{e}}\leq-\alpha% \,\mu_{e},\,\mu_{e}\leq 0\\ \\ f_{e}-S_{e}\lambda_{\delta}\,h_{e}&\text{otherwise}\quad,\end{cases}over˙ start_ARG italic_μみゅー end_ARG start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT = { start_ROW start_CELL italic_αあるふぁ ( italic_c start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT - italic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT ) end_CELL start_CELL if italic_f start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT - italic_S start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT italic_λらむだ start_POSTSUBSCRIPT italic_δでるた end_POSTSUBSCRIPT italic_h start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT ≥ italic_αあるふぁ ( italic_c start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT - italic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT ) , italic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT ≥ italic_c start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT end_CELL end_ROW start_ROW start_CELL end_CELL start_CELL end_CELL end_ROW start_ROW start_CELL - italic_αあるふぁ italic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT end_CELL start_CELL if italic_f start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT - italic_S start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT italic_λらむだ start_POSTSUBSCRIPT italic_δでるた end_POSTSUBSCRIPT italic_h start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT ≤ - italic_αあるふぁ italic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT , italic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT ≤ 0 end_CELL end_ROW start_ROW start_CELL end_CELL start_CELL end_CELL end_ROW start_ROW start_CELL italic_f start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT - italic_S start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT italic_λらむだ start_POSTSUBSCRIPT italic_δでるた end_POSTSUBSCRIPT italic_h start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT end_CELL start_CELL otherwise , end_CELL end_ROW (7)

where he=δでるたμみゅーeδでるた1subscript𝑒𝛿superscriptsubscript𝜇𝑒𝛿1h_{e}=\delta\,{\mu}_{e}^{\delta-1}italic_h start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT = italic_δでるた italic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_δでるた - 1 end_POSTSUPERSCRIPT and λらむだδでるた>0subscript𝜆𝛿0\lambda_{\delta}>0italic_λらむだ start_POSTSUBSCRIPT italic_δでるた end_POSTSUBSCRIPT > 0. The value of λらむだδでるたsubscript𝜆𝛿\lambda_{\delta}italic_λらむだ start_POSTSUBSCRIPT italic_δでるた end_POSTSUBSCRIPT can be determined numerically using fixed-point iteration (SM sup ). The value αあるふぁ(ceμみゅーe)𝛼subscript𝑐𝑒subscript𝜇𝑒\alpha\,(c_{e}-\mu_{e})italic_αあるふぁ ( italic_c start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT - italic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT ) ensures there is no violation on the edge capacity, αあるふぁμみゅーe𝛼subscript𝜇𝑒-\alpha\,\mu_{e}- italic_αあるふぁ italic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT imposes positivity constraint and feSeλらむだδでるたhesubscript𝑓𝑒subscript𝑆𝑒superscript𝜆𝛿subscript𝑒{f}_{e}{-S_{e}\lambda^{\delta}h_{e}}italic_f start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT - italic_S start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT italic_λらむだ start_POSTSUPERSCRIPT italic_δでるた end_POSTSUPERSCRIPT italic_h start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT captures budget violation. Overall, this scenario ensures that the velocity μみゅー˙esubscript˙𝜇𝑒\dot{\mu}_{e}over˙ start_ARG italic_μみゅー end_ARG start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT has an upper bound of αあるふぁ(ceμみゅーe)𝛼subscript𝑐𝑒subscript𝜇𝑒\alpha\,(c_{e}-\mu_{e})italic_αあるふぁ ( italic_c start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT - italic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT ) and lower bound of αあるふぁμみゅーe𝛼subscript𝜇𝑒-\alpha\,\mu_{e}- italic_αあるふぁ italic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT. The choice of δでるた𝛿\deltaitalic_δでるた impacts the topological properties of the resulting network, e.g., the total path length. In the numerical experiments, we set the nonlinearity parameter as δでるた(0,1)𝛿01\delta\in(0,1)italic_δでるた ∈ ( 0 , 1 ).

network.

We examine the topology of various constrained solutions on the road network of the city of Kujala et al. (2018), see Fig. 3(A). This has 640 nodes and 740 edges. As a relevant example, we set the central bus station as the destination node and select the remaining 639639639639 nodes as origins, but our method still applies to other choices of origin-destination pairs, e.g. peripheral nodes connecting to other peripheral nodes or to various hubs. This can be specified inside Kirchhoff’s law, see SM sup .

Refer to caption
Figure 3: Constrained OT on road network. (A) Path trajectories for the unconstrained OT (Unconstrained), budget constraint (Budget) and a non-linear budget plus capacity (Non-linear). We set b=12eμみゅーe𝑏12subscript𝑒subscript𝜇𝑒b=\frac{1}{2}\sum_{e}\mu_{e}italic_b = divide start_ARG 1 end_ARG start_ARG 2 end_ARG ∑ start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT italic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT, where μみゅーesubscript𝜇𝑒\mu_{e}italic_μみゅー start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT is that of unconstrained, δでるた=1/2𝛿12\delta=1/2italic_δでるた = 1 / 2 and ce=70subscript𝑐𝑒70c_{e}=70italic_c start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT = 70 for all edges. Example trajectories of two passenger types (green and orange), whose origin are the respective triangles. All passengers have the same central destination (magenta marker). Edge widths are proportional to the amount of passengers traveling through an edge. (B) Gini coefficient of the traffic distribution on edges. (C) Ratio of average total path length to that of Unconstrained. Markers and shadows are averages and standard deviations over 100 randomly-selected destinations, respectively.

Routes generated from the non-linear constraint scenario balance traffic more than the unconstrained case and result in longer routes, see Fig. 3(B-C). Adding a budget constraint for βべーた>1𝛽1\beta>1italic_βべーた > 1 results in more distributed traffic (lower Gini) without increasing much the total path length, compared to the unconstrained case. This could be used for instance to allocate to roads infrastructural works aimed at maintenance or upgrade when having a restricted budget.

Discussion.

Distributing flows in a transportation network is challenging. Approaches based on optimal transport theory are promising, but they are limited by the lack of a mechanism to incorporate realistic constraints. We show how to impose arbitrary constraints on OT problems in a principled and flexible way. The constraints are lifted from a position to a velocity level and are included in the corresponding mirror descent dynamics. This results in a scalable algorithm that solves constrained OT problems in a computationally efficient manner. The algorithm relies on a sparse local approximation of the feasible set at each iteration. Thus, closed-form updates can often be derived, even if the underlying feasible set is nonconvex or nonlinear. Otherwise, one can resort to efficient numerical methods to solve at most a quadratic program. Our physics-based approach is a change of paradigm with regard to how OT problems are modelled and solved numerically. This calls for a generalization of transportation problems in wider scenarios, e.g. in networks with multiple transport modes Ibrahim et al. (2021), with real-time traffic demands Lonardi et al. (2023b) or with noise-induced resonances Folz et al. (2023).

We provide an open source implementation McOpt (2023).

Acknowledgements.
Acknowledgments: The authors thank the International Max Planck Research School for Intelligent Systems (IMPRS-IS) for supporting AAI. MM thanks the German Research Foundation and the Branco Weiss Fellowship, administered by ETH Zurich, for the support.

Appendix A Appendix

Details about setting the constraints.

We define C:={μみゅー0E|g(μみゅー)0}assign𝐶conditional-set𝜇superscriptsubscriptabsent0𝐸𝑔𝜇0C:=\{\mu\in\mathbb{R}_{\geq 0}^{E}\ |\ g(\mu)\geq 0\}italic_C := { italic_μみゅー ∈ blackboard_R start_POSTSUBSCRIPT ≥ 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_E end_POSTSUPERSCRIPT | italic_g ( italic_μみゅー ) ≥ 0 } as the set of feasible conductivities μみゅー=(μみゅー1,,μみゅーE)𝜇subscript𝜇1subscript𝜇𝐸\mu=(\mu_{1},\dots,\mu_{E})italic_μみゅー = ( italic_μみゅー start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_μみゅー start_POSTSUBSCRIPT italic_E end_POSTSUBSCRIPT ), with g𝑔gitalic_g a constraint function that we assume continuously differentiable and E𝐸Eitalic_E is the number of network edges. We focus on those edges where constraints are not satisfied, and denote the set of active constraints for a given μみゅー𝜇\muitalic_μみゅー as Iμみゅー:={i|gi(μみゅー)0}assignsubscript𝐼𝜇conditional-set𝑖subscript𝑔𝑖𝜇0I_{\mu}:=\{i\in\mathbb{Z}\ |\ g_{i}(\mu)\leq 0\}italic_I start_POSTSUBSCRIPT italic_μみゅー end_POSTSUBSCRIPT := { italic_i ∈ blackboard_Z | italic_g start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( italic_μみゅー ) ≤ 0 }. Interpreting μみゅー𝜇\muitalic_μみゅー as a “position” variable, a constraint to ensure μみゅー(t)C,t0formulae-sequence𝜇𝑡𝐶for-all𝑡0\mu(t)\in C,\forall t\geq 0italic_μみゅー ( italic_t ) ∈ italic_C , ∀ italic_t ≥ 0, can be equivalently formulated as a constraint on its velocity μみゅー˙(t)TC(μみゅー(t)),t0formulae-sequence˙𝜇𝑡subscript𝑇𝐶𝜇𝑡for-all𝑡0\dot{\mu}(t)\in T_{C}(\mu(t)),\forall t\geq 0over˙ start_ARG italic_μみゅー end_ARG ( italic_t ) ∈ italic_T start_POSTSUBSCRIPT italic_C end_POSTSUBSCRIPT ( italic_μみゅー ( italic_t ) ) , ∀ italic_t ≥ 0, with μみゅー(0)C𝜇0𝐶\mu(0)\in Citalic_μみゅー ( 0 ) ∈ italic_C, where TC(μみゅー)subscript𝑇𝐶𝜇T_{C}(\mu)italic_T start_POSTSUBSCRIPT italic_C end_POSTSUBSCRIPT ( italic_μみゅー ) denotes the tangent cone of the feasible set at μみゅー𝜇\muitalic_μみゅー, see Rockafellar and Wets (1998). However, it will be convenient to slightly extend the notion of tangent cone to also account for infeasible initial conditions (this is particularly important for the discretization), which is achieved by imposing μみゅー˙(t)Vαあるふぁ(μみゅー(t))˙𝜇𝑡subscript𝑉𝛼𝜇𝑡\dot{\mu}(t)\in V_{\alpha}(\mu(t))over˙ start_ARG italic_μみゅー end_ARG ( italic_t ) ∈ italic_V start_POSTSUBSCRIPT italic_αあるふぁ end_POSTSUBSCRIPT ( italic_μみゅー ( italic_t ) ), where Vαあるふぁ(μみゅー):={vE|gi(μみゅー)Tvαあるふぁgi(μみゅー),iIμみゅー}assignsubscript𝑉𝛼𝜇conditional-set𝑣superscript𝐸formulae-sequencesubscript𝑔𝑖superscript𝜇𝑇𝑣𝛼subscript𝑔𝑖𝜇𝑖subscript𝐼𝜇V_{\alpha}(\mu):=\{v\in\mathbb{R}^{E}\ |\ \nabla g_{i}(\mu)^{T}v\geq-\alpha\,g% _{i}(\mu),i\in I_{\mu}\}italic_V start_POSTSUBSCRIPT italic_αあるふぁ end_POSTSUBSCRIPT ( italic_μみゅー ) := { italic_v ∈ blackboard_R start_POSTSUPERSCRIPT italic_E end_POSTSUPERSCRIPT | ∇ italic_g start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( italic_μみゅー ) start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_v ≥ - italic_αあるふぁ italic_g start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( italic_μみゅー ) , italic_i ∈ italic_I start_POSTSUBSCRIPT italic_μみゅー end_POSTSUBSCRIPT }, and αあるふぁ0𝛼0\alpha\geq 0italic_αあるふぁ ≥ 0 is a constant typically referred to as a “restitution” parameter or “slackness”. We note that Vαあるふぁ(μみゅー)subscript𝑉𝛼𝜇V_{\alpha}(\mu)italic_V start_POSTSUBSCRIPT italic_αあるふぁ end_POSTSUBSCRIPT ( italic_μみゅー ) generalizes the notion of the tangent cone, since for μみゅーC𝜇𝐶\mu\in Citalic_μみゅー ∈ italic_C, Vαあるふぁ(μみゅー)=TC(μみゅー)subscript𝑉𝛼𝜇subscript𝑇𝐶𝜇V_{\alpha}(\mu)=T_{C}(\mu)italic_V start_POSTSUBSCRIPT italic_αあるふぁ end_POSTSUBSCRIPT ( italic_μみゅー ) = italic_T start_POSTSUBSCRIPT italic_C end_POSTSUBSCRIPT ( italic_μみゅー ). We assume mild regularity conditions (constraint qualification). A sufficient condition is, for example, the existence of vE𝑣superscript𝐸v\in\mathbb{R}^{E}italic_v ∈ blackboard_R start_POSTSUPERSCRIPT italic_E end_POSTSUPERSCRIPT such that gi(μみゅー)Tv>0subscript𝑔𝑖superscript𝜇𝑇𝑣0\nabla g_{i}(\mu)^{T}v>0∇ italic_g start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( italic_μみゅー ) start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_v > 0 for all iIμみゅー𝑖subscript𝐼𝜇i\in I_{\mu}italic_i ∈ italic_I start_POSTSUBSCRIPT italic_μみゅー end_POSTSUBSCRIPT.

For μみゅー(t)C𝜇𝑡𝐶\mu(t)\not\in Citalic_μみゅー ( italic_t ) ∉ italic_C, the constraint μみゅー˙(t)Vαあるふぁ(μみゅー(t))˙𝜇𝑡subscript𝑉𝛼𝜇𝑡\dot{\mu}(t)\in V_{\alpha}(\mu(t))over˙ start_ARG italic_μみゅー end_ARG ( italic_t ) ∈ italic_V start_POSTSUBSCRIPT italic_αあるふぁ end_POSTSUBSCRIPT ( italic_μみゅー ( italic_t ) ) is equivalent to dgi(μみゅー(t))/dtαあるふぁgi(μみゅー(t))dsubscript𝑔𝑖𝜇𝑡d𝑡𝛼subscript𝑔𝑖𝜇𝑡\text{d}g_{i}(\mu(t))/\text{d}t\geq-\alpha g_{i}(\mu(t))d italic_g start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( italic_μみゅー ( italic_t ) ) / d italic_t ≥ - italic_αあるふぁ italic_g start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( italic_μみゅー ( italic_t ) ), iIμみゅー(t)𝑖subscript𝐼𝜇𝑡i\in I_{\mu(t)}italic_i ∈ italic_I start_POSTSUBSCRIPT italic_μみゅー ( italic_t ) end_POSTSUBSCRIPT, which ensures that potential constraint violations decay at the rate αあるふぁ>0𝛼0\alpha>0italic_αあるふぁ > 0.

Details about our method.

From a variational optimization perspective, our approach is related to successive linear and sequential quadratic programming Nocedal and Wright (2006); Bertsekas (1999); Luenberger and Ye (2016). The underlying idea of these methods is to linearize the objective function and the constraints about the current iterate and to solve a local linear and/or quadratic program. Our work improves upon these ideas and tailors these to optimal transport problems in the following way: i) we linearize a subset of constraints at every iteration, which means that the subproblem Eq. 3 typically includes very few constraints and can be solved efficiently; ii) we introduce a non-Euclidean inner product that is adapted to optimal transport problems and is used to show that 𝔏βべーたsubscript𝔏𝛽\mathfrak{L}_{\beta}fraktur_L start_POSTSUBSCRIPT italic_βべーた end_POSTSUBSCRIPT is a Lyapunov function; iii) we provide closed-form updates in various problem instances that are practically relevant.

References