EARTH: Epidemiology-Aware Neural ODE with Continuous Disease Transmission Graph


Guancheng Wan1, Zewen Liu1, Max S. Y. Lau3, B. Aditya Prakash2, Wei Jin1
Abstract

Effective epidemic forecasting is critical for public health strategies and efficient medical resource allocation, especially in the face of rapidly spreading infectious diseases. However, existing deep-learning methods often overlook the dynamic nature of epidemics and fail to account for the specific mechanisms of disease transmission. In response to these challenges, we introduce an innovative end-to-end framework called Epidemiology-Aware Neural ODE with Continuous Disease Transmission Graph (EARTH). To learn continuous and regional disease transmission patterns, we first propose EANO, which integrates the neural ODE approach with the epidemic mechanism and accounts for the complex spatial spread process during epidemic evolution. Additionally, we introduce GLTG to model global infection trends and leverage these signals to guide local transmission dynamically. To accommodate both the global coherence of epidemic trends and the local nuances of epidemic transmission patterns, we build a cross-attention approach to fuse the most meaningful information for forecasting. Through the synergy of both components, EARTH offers a more robust and flexible approach to understanding and predicting the spread of infectious diseases. Extensive experiments show that EARTH outperforms state-of-the-art methods in forecasting real-world epidemics. The code will be available at https://github.com/Emory-Melody/EpiLearn.

1 Introduction

Figure 1: Problem illustration. Because regional correlation signals evolve over time and observations arrive at irregular sampling intervals, we focus on the continuous-time epidemic system. Existing solutions fail to I) learn disease transmission patterns with the epidemic mechanism and II) address missing states. They also neglect to III) learn global trends caused by external factors (e.g., lockdowns) while developing dynamic regional transmission.

The COVID-19 pandemic has resulted in millions of deaths and significant economic losses worldwide, severely disrupting social and economic systems (Martin, Sánchez, and Wilkinson 2023; Pak et al. 2020). To address these challenges, there is a growing interest in epidemiological models, which are crucial for effective public health strategies and efficient medical resource allocation (Liu et al. 2024b; Cm 2020). Traditional models, such as the SIR model and its variants (Dehning et al. 2020), rely on mathematical differential equations to simulate disease spread but often depend on oversimplified assumptions (Funk et al. 2018; Kondratyev 2013; Yang et al. 2023). To enhance performance, deep learning models like Neural Networks (Madden et al. 2024; Rodríguez et al. 2023) or Graph Neural Networks (GNNs) (Zhang et al. 2024c) have been explored. These models effectively represent interactions between entities (e.g., regions) as graphs, capturing the spatial spread of disease through message-passing mechanisms.

Nevertheless, despite the dynamic nature of epidemic systems, existing work often neglects the dynamic evolution of regional interactions. For example, a change in people’s behavior (e.g., a lockdown) in one region at a specific time step can greatly reduce the spread to surrounding regions in the following period. Existing efforts generally predict future epidemic profiles by modeling regional interactions from the whole series, overlooking these dynamic changes during the evolution. Furthermore, irregular sampling intervals are not considered. For instance, some regions may be unable to conduct routine reporting in the early stages of an epidemic due to limited resources. Current work simplifies this scenario by learning only from regularly spaced observations, which is impractical in the real world. The problem is illustrated in Fig. 1.

To tackle the aforementioned issues, the neural ordinary differential equation (NODE) (Chen et al. 2018; Poli et al. 2021) stands out as a powerful approach to modeling continuous-time systems. Therefore, in this work, we take inspiration from NODE and focus on the continuous-time epidemic system, capturing its intricate dynamics more accurately. However, directly applying the neural ODE to the epidemic system faces nontrivial challenges. First, it does not explicitly learn epidemic mechanisms and fails to provide insights to decision-makers. This motivates us to ask: I) How can we generally combine neural ODE approaches with the epidemic mechanism? Some work (Arik et al. 2020; Mežnar, Lavrač, and Škrlj 2021) proposes hybrid models that try to combine the epidemic mechanism with deep learning methods. However, because not all disease states are observed in the real world (e.g., data on susceptible individuals is lacking), these models are inflexible and unable to learn inherent epidemic mechanisms. Meanwhile, existing neural ODE approaches from other fields (Luo et al. 2023) also fail to address this problem. Thus, the following question naturally emerges: II) How can we learn continuous disease transmission under limited observations more flexibly? Apart from the aforementioned locally subtle spreading patterns, epidemics also exhibit a global infection trend. From a more macroscopic perspective, the infection trend can be seen as a longer-range and often multi-regional overall direction. This global signal impacts and changes disease propagation, resulting in different spatial transmission patterns at different times. For instance, global trends in vaccination policy significantly alter local spatial transmission patterns. In regions with high vaccination rates, the spread of the epidemic slows down, and a "herd immunity" effect may even occur (Chauhan et al. 2023). However, previous work usually considers static geographic graphs or only learns the graph without accounting for the continuous evolution of global signals. This raises another intriguing question: III) How can we model the global infection trend and learn dynamic regional transmission patterns during continuous evolution?

To address these questions, we propose an innovative end-to-end framework for continuous-time epidemic modeling: Epidemiology-AwaRe ODE with Continuous Disease Transmission GrapH (EARTH). To address question I) and facilitate epidemiology-informed transparency, we revisit the classic compartmental models (i.e., SIR). To surpass previous efforts (Rodríguez et al. 2023) and fully leverage the expressive ability of neural networks, we propose a neural ODE-based Network SIR (Brede 2012) to implicitly capture the continuous evolution of the regional propagation graph. To overcome challenge II), we propose to initialize disease state features and feed them into an Epidemic-Aware Neural ODE (EANO) module that learns the inherent epidemic transmission pattern. For question III), we first obtain a long-range view of epidemic progression and establish a relationship with regions that share similar development patterns. To further consider dynamic regional transmission, we develop an innovative Global-guided Local Transmission Graph (GLTG). Specifically, we fuse global trend indicative features for different regions with a GNN. These features are then used to generate more fine-grained, locally dynamic transmission graphs, which guide disease spreading in EANO during the evolution. Finally, we develop a cross-attention mechanism to accommodate both the global coherence of epidemic trends and the local nuances of disease transmission patterns. We conjecture that these two components together make EARTH a competitive method for epidemic forecasting. Our principal contributions are summarized as follows:

  • We are the first to harmonize the neural ODE with the epidemic mechanism, developing an innovative framework considering the time-continuous nature of epidemic dynamics while learning inherent disease spreading patterns.

  • We further consider global epidemic trends and learn dynamic regional transmission patterns during continuous evolution within the end-to-end model.

  • By integrating global coherence and local dynamics via a cross-attention mechanism, we achieve superior results on multiple epidemic forecasting datasets including COVID-19 and influenza-like illness.

2 Related Work

2.1 Epidemic Forecasting

Epidemic forecasting plays a crucial role in predicting the spread and impact of infectious diseases, enabling timely and effective public health interventions (Emanuel et al. 2020; Fine 2015; Terris 1993). Traditional models like the SIR (Susceptible-Infectious-Recovered) model (Hethcote 2000) use differential equations to describe disease dynamics but often rely on oversimplified assumptions (Dehning et al. 2020; Caals, Saxena, and Ho 2017). Recent advances incorporate deep learning methods like Graph Neural Networks (GNNs) (Dai et al. 2022; Wan et al. 2024), which better capture the complex interactions and spatial dependencies in disease spread from data-driven perspectives (Deng et al. 2020; Yu et al. 2023). However, these deep learning methods neglect the dynamic nature of the epidemic system. The issues of regional correlation signals and irregular sampling observation intervals remain unresolved, hindering the accurate capture of real-world epidemics. Therefore, in this work, we propose a general framework by seamlessly integrating epidemic mechanisms into Neural ODE, capturing the complex evolution of continuous-time epidemics.

2.2 Graph Neural Networks

Graph Neural Networks (GNNs) (Hamilton, Ying, and Leskovec 2017; Veličković et al. 2017; Huang et al. 2023) are widely recognized for processing non-Euclidean data structures, such as traffic networks (Wu et al. 2019). They update node representations by aggregating information from neighbors via message-passing (Zhang et al. 2024b, a). Many studies have used GNNs for epidemic modeling (Sha, Al Hasan, and Mohler 2021; Wang et al. 2023), focusing on the spatial relationships in disease spread (Jhun 2021; La Gatta et al. 2021), but often overlook dynamic transmission patterns. Our approach addresses this by concentrating on continuous-time epidemic modeling, using GNNs to integrate multi-region global trends and create disease transmission graphs for regional propagation.

2.3 Neural Ordinary Differential Equation

Neural ODEs extend discrete neural networks to continuous-time scenarios, offering superior performance and flexibility (Chen et al. 2018). They have been widely adopted in various fields such as traffic flow forecasting (Fang et al. 2021; Choi et al. 2021), continuous dynamical systems (Chen et al. 2024; Huang et al. ), and recommendations (Qin et al. 2024). Recent advancements have integrated GNNs with Neural ODEs, enhancing the modeling of complex dependencies in graph-structured data (Luo et al. 2023; Wan, Huang, and Ye 2024). In contrast to prior work, we extend this concept to investigate important continuous-time epidemic modeling. Since the epidemic system is time-varying, we first attempt to associate each region with the time-corresponding latent variable $\mathbf{Z}_{v}(t)$ by a parameterized ODE $\dot{\mathbf{Z}}_{v}(t) := d\mathbf{Z}_{v}(t)/dt = \psi_{t}\big(\theta_{t}; t; \mathbf{Z}_{v}(t)\big)$, which depicts the region-specific dynamic trajectory for series. Thus we can derive temporal dynamics at $T$ for all regions $\mathbf{Z}(T)\in\mathbb{R}^{N\times d}$ as follows:

$$\mathbf{Z}(T)=\mathbf{Z}(0)+\int_{0}^{T}\psi_{t}\big(\theta_{t};t;\mathbf{Z}(t)\big)\,dt. \tag{1}$$

Here $\psi_{t}$ denotes the time manipulation function parameterized by $\theta_{t}$. In our framework, we leverage multi-layer perceptrons (MLP) for the time modeling module $\psi_{t}$ by default. Furthermore, for different regions, diverse and complex processes of infectious disease transmission exist. Inspired by Neural Controlled Differential Equations (NCDE) (Kidger et al. 2020), we exploit a continuous path $\mathbf{Q}_{v}$ for each region $v$, which reformulates Eq. 1 as:

$$\mathbf{Z}(T)=\mathbf{Z}(0)+\int_{0}^{T}\psi_{t}\big(\theta_{t};\mathbf{Z}(t)\big)\frac{d\mathbf{Q}(t)}{dt}\,dt. \tag{2}$$

Eq. 2 transforms the integral problem from a Riemann integral to a Riemann-Stieltjes integral. Specifically, $\mathbf{Q}(t)$ is created from $\{(t_{i},\mathbf{x}_{i})\}_{i=0}^{N}$ by an interpolation algorithm.
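As a rough illustration of Eq. 2, the sketch below builds a piecewise-linear control path $Q(t)$ from irregularly sampled observations and integrates a scalar controlled equation with a forward Euler scheme. Linear interpolation, the Euler solver, the toy vector field `psi`, and the sample values are all assumptions made for illustration; the text above only states that an interpolation algorithm is used and that $\psi_{t}$ is an MLP by default.

```python
from bisect import bisect_right


def linear_path(times, values):
    """Return Q(t) and dQ/dt for a piecewise-linear path through (t_i, x_i)."""
    def segment(t):
        # Index of the segment containing t, clamped to the valid range.
        return min(max(bisect_right(times, t) - 1, 0), len(times) - 2)

    def slope(k):
        return (values[k + 1] - values[k]) / (times[k + 1] - times[k])

    def q(t):
        k = segment(t)
        return values[k] + slope(k) * (t - times[k])

    def dq(t):
        return slope(segment(t))

    return q, dq


def integrate_cde(z0, psi, dq, t0, t1, steps=1000):
    """Forward Euler for dZ/dt = psi(Z) * dQ/dt (a scalar version of Eq. 2)."""
    z, h = z0, (t1 - t0) / steps
    for n in range(steps):
        z += h * psi(z) * dq(t0 + n * h)
    return z


# Irregularly sampled observations for one region (hypothetical values).
times = [0.0, 1.0, 2.5, 4.0]
cases = [10.0, 14.0, 11.0, 20.0]
q, dq = linear_path(times, cases)

# With psi(z) = 1 the solution reduces to Z(T) = Z(0) + Q(T) - Q(0).
z_T = integrate_cde(z0=0.0, psi=lambda z: 1.0, dq=dq, t0=0.0, t1=4.0)
```

Because the path is defined for every $t$, the solver can step through gaps between observations without resampling the data onto a regular grid, which is the property the continuous-time formulation relies on.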

3 Preliminaries

We approach the epidemic forecasting challenge by employing a graph-based prediction model. Let $\mathcal{G}=(\mathcal{V},\mathcal{E})$ represent the graph, where $\mathcal{V}$ denotes a set of nodes comprising $|\mathcal{V}|=N$ regions (e.g., cities or states). The edge set $\mathcal{E}\subseteq\mathcal{V}\times\mathcal{V}$ represents the geographic links between these regions. The adjacency matrix $\mathbf{A}\in\mathbb{R}^{N\times N}$ is defined such that $\mathbf{A}_{ij}=1$ if there is an edge $e_{i,j}\in\mathcal{E}$ and $\mathbf{A}_{ij}=0$ otherwise. The normalized adjacency matrix is given by $\hat{\mathbf{A}}=\mathbf{D}^{-1/2}\mathbf{A}\mathbf{D}^{-1/2}$, where the degree matrix $\mathbf{D}$ is a diagonal matrix with $\mathbf{D}_{ii}=\sum_{j}\mathbf{A}_{ij}$.
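The symmetric normalization defined above can be sketched in a few lines. The function name, the three-region chain graph, and the choice to map isolated nodes (degree zero) to a zero scaling factor are assumptions for illustration and are not taken from the paper.

```python
def normalize_adjacency(adj):
    """Return D^{-1/2} A D^{-1/2} for a square adjacency matrix given as nested lists."""
    n = len(adj)
    degree = [sum(row) for row in adj]
    # Assumption: isolated nodes get a zero factor to avoid dividing by zero.
    inv_sqrt = [d ** -0.5 if d > 0 else 0.0 for d in degree]
    return [[inv_sqrt[i] * adj[i][j] * inv_sqrt[j] for j in range(n)]
            for i in range(n)]


# Three regions connected in a chain: 0 - 1 - 2.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
A_hat = normalize_adjacency(A)
```

Each entry is scaled by the degrees of both endpoints, so the edge between region 0 (degree 1) and region 1 (degree 2) receives weight $1/\sqrt{2}$.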

Problem Formulation. Each node corresponds to a region with an associated time series input over a window $T$, such as infection counts for $T$ weeks. We represent the training data over this period as $\mathbf{X}=[\mathbf{x}_{1},\ldots,\mathbf{x}_{T}]\in\mathbb{R}^{N\times T}$. The goal is to construct a model capable of predicting an epidemiological profile $\mathbf{x}_{T+h}$ at a future time point $T+h$, where $h$ denotes the prediction horizon.
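To make the formulation concrete, the following sketch slices a longer multi-region series into input windows of length $T$ paired with the target $\mathbf{x}_{T+h}$. The sliding-window scheme, the helper name `make_samples`, and the toy data are assumptions; the paper's actual data pipeline may differ.

```python
def make_samples(series, window, horizon):
    """Split an N x T_total series into (X, x_{T+h}) pairs with a sliding window."""
    total = len(series[0])
    samples = []
    for start in range(total - window - horizon + 1):
        end = start + window                            # exclusive end of the input window
        x = [row[start:end] for row in series]          # N x window input matrix X
        y = [row[end + horizon - 1] for row in series]  # one target per region at T + h
        samples.append((x, y))
    return samples


# Two regions, six weekly counts each (hypothetical values).
series = [[1, 2, 3, 4, 5, 6],
          [10, 20, 30, 40, 50, 60]]
samples = make_samples(series, window=3, horizon=2)
```

With `window=3` and `horizon=2`, the first sample uses weeks 1 to 3 as input and week 5 as the target.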

4 Methodology

4.1 Overview

In Epidemic-Aware Neural ODE (EANO), we initialize disease states as region-specific features and build them upon time variables; they are then fed into the proposed Network SIR-inspired neural ODE functions to capture subtle local disease transmission patterns. Furthermore, in the Global-guided Local Transmission Graph (GLTG), we leverage a GNN to obtain global trends and evolutionary graphs. Dynamic graphs are then learned during the evolution of epidemics to guide EANO propagation. Finally, we propose a cross-attention mechanism to accommodate both local nuances and global coherence for final forecasting. The overall framework is illustrated in Fig. 2.

4.2 Epidemic-Aware Neural ODE

Motivation. Existing deep learning methods neglect the continuous evolution of epidemic systems and do not explicitly learn about epidemic development. Traditional mechanistic models attempt to understand spreading patterns through ODEs but fail to fully utilize available data sources and to model more complex epidemics. Therefore, we aim to combine the advantages of both approaches in an Epidemic-Aware Neural ODE framework.

SIR-inspired Neural ODE. In epidemiology, the standard SIR model (Hethcote 2000) categorizes the population into three distinct groups based on their disease states: susceptible (S) to infection, currently infectious (I), and recovered (R), with the latter group being immune to both contraction and transmission of the disease. The SIR model, formulated using ODEs (Grassly and Fraser 2008), describes the epidemic dynamics as follows:

$$\begin{gathered}\frac{dS}{dt}=-\beta\frac{S_{t}I_{t}}{P},\\ \frac{dI}{dt}=\beta\frac{S_{t}I_{t}}{P}-\gamma I_{t},\quad\frac{dR}{dt}=\gamma I_{t}.\end{gathered} \tag{3}$$

These equations distribute the total population $P$ across the aforementioned categories. Here, susceptible individuals become infectious upon contact with infectious ones, driven by the transmission rate $\beta$ ($S\to I$). Meanwhile, infectious individuals recover and gain immunity at the recovery rate $\gamma$ ($I\to R$). However, due to resource restrictions and observation limitations, these explicit cases may not always be available in real-world scenarios. To address this issue, we propose to utilize neural ODEs to automatically infer these ODE functions in Eq. 3 via neural networks in a data-driven manner. Specifically, we treat these disease states ($S$, $I$, and $R$) as latent high-dimensional variables. Each state is represented by a matrix $\mathbf{S}(t)$, $\mathbf{I}(t)$, and $\mathbf{R}(t)$ in $\mathbb{R}^{N\times d}$, with $d$ denoting the hidden dimension. In a continuous epidemic system, these states are intrinsically linked with the time variables. Thus, we utilize the NCDE approach in Eq. 2 and model the specific epidemic state $\mathbf{C}$ as:

$$\begin{aligned}\mathbf{C}(T)&=\mathbf{C}(0)+\int_{0}^{T}\phi_{c}\big(\theta_{c};\mathbf{C}(t)\big)\frac{d\mathbf{Z}(t)}{dt}\,dt\\ &=\mathbf{C}(0)+\int_{0}^{T}\phi_{c}\big(\theta_{c};\mathbf{C}(t)\big)\psi_{t}\big(\theta_{t};\mathbf{Z}(t)\big)\frac{d\mathbf{Q}(t)}{dt}\,dt.\end{aligned} \tag{4}$$

Here $\mathbf{Q}(t)$ are controlling paths for regions given by the interpolation algorithm. They are resilient against irregular cases (e.g., unpredictable outbreaks) when implemented in real-life epidemics, providing a more responsive model for predicting disease spread. Through the NCDE, we can model these disease states in continuous-time epidemics.
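The step between the two lines of Eq. 4 can be spelled out as a short derivation. It assumes that $\mathbf{Z}(t)$ satisfies Eq. 2 and that the integrands are continuous, so the integral form can be differentiated with respect to its upper limit; these regularity conditions are assumptions of this sketch rather than statements from the paper, and the products are read in the same order the paper writes them.

```latex
\begin{aligned}
\frac{d\mathbf{Z}(t)}{dt}
  &= \psi_{t}\big(\theta_{t};\mathbf{Z}(t)\big)\,\frac{d\mathbf{Q}(t)}{dt}
  && \text{(differentiating Eq. 2 in its upper limit)} \\
\frac{d\mathbf{C}(t)}{dt}
  &= \phi_{c}\big(\theta_{c};\mathbf{C}(t)\big)\,\frac{d\mathbf{Z}(t)}{dt}
  && \text{(first line of Eq. 4)} \\
  &= \phi_{c}\big(\theta_{c};\mathbf{C}(t)\big)\,
     \psi_{t}\big(\theta_{t};\mathbf{Z}(t)\big)\,\frac{d\mathbf{Q}(t)}{dt}
  && \text{(substituting the first identity)}
\end{aligned}
```

Integrating the last expression from $0$ to $T$ recovers the second line of Eq. 4, so the state $\mathbf{C}$ is ultimately driven by the observed path $\mathbf{Q}$ through the temporal variable $\mathbf{Z}$.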

Nevertheless, this formulation learns the states independently, without considering the high-level spatial spreading pattern of the disease. Therefore, we go further and designate a dedicated ODE function for each state. With well-crafted functions $\phi_{s}$, $\phi_{i}$, and $\phi_{r}$, we can learn not only intra-state development but also inter-state interactions within regions. Taking the SIR process in Eq. 3 into consideration, $\phi_{s}$ should take $\mathbf{S}(t)$ and $\mathbf{I}(t)$ as input, giving $\phi_{s}\big(\theta_{s};\mathbf{S}(t),\mathbf{I}(t)\big)$. Similarly, we have the other two functions $\phi_{i}\big(\theta_{i};\mathbf{S}(t),\mathbf{I}(t)\big)$ and $\phi_{r}\big(\theta_{r};\mathbf{I}(t)\big)$. Inspired by Network SIR applications (Balcan et al. 2009; Sha, Al Hasan, and Mohler 2021), we further incorporate regional correlations in disease transmission into the state’s updating process. Specifically, these functions are rewritten as:

$$\begin{gathered}\phi_{s}=\frac{d\mathbf{S}_{v}(t)}{dt}=-\mathbf{W}_{\text{trans}}\Big[\mathbf{S}_{v}(t)\,||\sum_{u\in\mathcal{N}_{v}}\mathbf{e}_{vu}\mathbf{I}_{u}(t)\Big],\\ \phi_{i}=\frac{d\mathbf{I}_{v}(t)}{dt}=\mathbf{W}_{\text{trans}}\Big[\mathbf{S}_{v}(t)\,||\sum_{u\in\mathcal{N}_{v}}\mathbf{e}_{vu}\mathbf{I}_{u}(t)\Big]-\mathbf{W}_{\text{recov}}\mathbf{I}_{v}(t),\\ \phi_{r}=\frac{d\mathbf{R}_{v}(t)}{dt}=\mathbf{W}_{\text{recov}}\mathbf{I}_{v}(t),\end{gathered} \tag{5}$$

where $\mathcal{N}_{v}$ denotes the neighboring nodes for node $v$ in the set $\mathcal{V}$, with $\mathbf{e}_{vu}$ representing the weight of disease transmission intensity between regions $v$ and $u$. The symbol $||$ signifies the concatenation operation. By substituting the traditional SIR’s two simple rates from Eq. 3 with the more flexible parameters $\mathbf{W}_{\text{trans}}\in\mathbb{R}^{2d\times d}$ and $\mathbf{W}_{\text{recov}}\in\mathbb{R}^{d\times d}$, our model can derive more detailed representations of the disease spread and recovery processes. This adaptability enables the model to adjust to diverse epidemic conditions, reflecting the intricate mechanisms of disease transmission.

Additionally, incorporating regional correlations through $\mathbf{e}_{vu}$ allows the model to account for spatial dependencies, thereby improving its ability to mirror real-world interactions. However, the approach still depends on static neighborhood relationships and lacks the capability to capture the dynamic nature of disease transmission as epidemics evolve.
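The right-hand side of Eq. 5 can be sketched for a small graph as follows. This is an illustrative reading, not the authors' implementation: it assumes the weights act on row vectors (which matches the stated shapes $2d\times d$ and $d\times d$), stores neighbor weights in a dictionary per node, and uses made-up numbers for all states and parameters.

```python
def vec_mat(vec, mat):
    """Row vector (length m) times matrix (m x n), returning a list of length n."""
    return [sum(vec[k] * mat[k][j] for k in range(len(vec)))
            for j in range(len(mat[0]))]


def network_sir_rhs(S, I, edges, W_trans, W_recov):
    """Right-hand side of Eq. 5 for every region.

    S, I: N x d latent states; edges[v] maps neighbor u to the weight e_vu.
    W_trans: 2d x d, W_recov: d x d (applied to row vectors, an assumption).
    """
    d = len(S[0])
    dS, dI, dR = [], [], []
    for v in range(len(S)):
        # Weighted sum of infectious states over the neighbors of v.
        neigh = [sum(w * I[u][k] for u, w in edges[v].items()) for k in range(d)]
        trans = vec_mat(S[v] + neigh, W_trans)  # list concat = [S_v || sum e_vu I_u]
        recov = vec_mat(I[v], W_recov)
        dS.append([-t for t in trans])
        dI.append([t - r for t, r in zip(trans, recov)])
        dR.append(recov)
    return dS, dI, dR


# Two regions with hidden dimension d = 2 (all values hypothetical).
S = [[0.9, 0.8], [0.7, 0.6]]
I = [[0.1, 0.2], [0.3, 0.4]]
edges = [{1: 0.5}, {0: 0.5}]
W_trans = [[0.1, 0.0], [0.0, 0.1], [0.2, 0.0], [0.0, 0.2]]
W_recov = [[0.3, 0.0], [0.0, 0.3]]
dS, dI, dR = network_sir_rhs(S, I, edges, W_trans, W_recov)
```

As in the classical SIR system of Eq. 3, the three derivatives cancel when summed, so the latent total $\mathbf{S}+\mathbf{I}+\mathbf{R}$ is conserved under this update.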

4.3 Global-guided Local Transmission Graph

Motivation. As previously noted, EANO solely accounts for the pre-defined neighborhood connections, neglecting the evolutionary disease interactions. This observation drives us to explore more effective approaches to represent local spatial transmission patterns during the evolution.

Figure 2: Architecture illustration of Epidemiology-Aware Neural ODE with Continuous Disease Transmission Graph. EARTH is a general and end-to-end framework that can flexibly capture the time-continuous epidemic mechanism. Best viewed in color.

Global Infection Trend. In addition to local transmission dynamics, epidemic systems also exhibit a global signal that represents the overall infection trend. This trend encapsulates the broader patterns of infection spread, influenced by factors such as international travel, global health policies, and widespread behavioral changes. For example, during a pandemic, international travel restrictions and lockdowns can significantly curtail cross-border disease transmission, resulting in differing regional infection rates and modifying the epidemic’s overall trajectory (Demey et al. 2020; Russell et al. 2021). To address this, we propose creating a global infection indicator feature for each region, correlated with the corresponding temporal features:

$$\mathbf{H}(T)=\mathbf{H}(0)+\int_{0}^{T}\varphi_{g}\big(\theta_{g};\mathbf{H}(t)\big)\psi_{t}\big(\theta_{t};\mathbf{Z}(t)\big)\frac{d\mathbf{Q}(t)}{dt}\,dt. \tag{6}$$

Since global here refers to the multi-regional overall direction, we employ Dynamic Time Warping to assess the similarity of historical case trends across different regions, thereby extending the original geographic connections:

A~uv={1,if Auv=1,1,if uv and uTopk(v),0,otherwise.subscript~A𝑢𝑣cases1subscriptif A𝑢𝑣11if 𝑢𝑣 and 𝑢𝑇𝑜subscript𝑝𝑘𝑣0otherwise.\tilde{\textbf{A}}_{uv}=\begin{cases}1,&\text{if }\textbf{A}_{uv}=1,\\ 1,&\text{if }u\neq v\text{ and }u\in Top_{k}(v),\\ 0,&\text{otherwise.}\end{cases}over~ start_ARG A end_ARG start_POSTSUBSCRIPT italic_u italic_v end_POSTSUBSCRIPT = { start_ROW start_CELL 1 , end_CELL start_CELL if bold_A start_POSTSUBSCRIPT italic_u italic_v end_POSTSUBSCRIPT = 1 , end_CELL end_ROW start_ROW start_CELL 1 , end_CELL start_CELL if italic_u ≠ italic_v and italic_u ∈ italic_T italic_o italic_p start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ( italic_v ) , end_CELL end_ROW start_ROW start_CELL 0 , end_CELL start_CELL otherwise. end_CELL end_ROW (7)

In this context, $Top_{k}(v)$ denotes the indices of the $k$ most similar nodes to node $v$. By doing this, we establish an epidemic-semantic spatial relationship that emphasizes regions with analogous epidemic progression patterns. To promote interactions between regions, we implement a residual GNN layer to update the global infection trend:

$$\varphi_{g}=\sigma\Big(\tilde{\mathbf{D}}^{-1/2}\tilde{\mathbf{A}}\tilde{\mathbf{D}}^{-1/2}\mathbf{H}(t)\mathbf{W}_{\text{g}}\Big)+\mathbf{H}(t) \tag{8}$$

Here, $\tilde{\mathbf{D}}$ represents the degree matrix corresponding to $\tilde{\mathbf{A}}$, $\mathbf{W}_{\text{g}}$ is the learnable weight matrix, and $\sigma$ denotes the activation function. The GNN layer $\varphi_{g}$ facilitates the aggregation and propagation of information across a broader regional scope, resulting in the fused global trend features $\mathbf{H}(t)$.
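The construction in Eq. 7 and Eq. 8 can be sketched in a few lines of NumPy. This is a minimal illustration rather than the released implementation: the function names (`dtw_distance`, `extend_adjacency`, `residual_gcn`) are hypothetical, the DTW cost is the plain absolute difference, $\sigma$ is assumed to be ReLU, and the degree is taken as the row sum of $\tilde{\mathbf{A}}$.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic-time-warping distance between two 1-D series."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

def extend_adjacency(A, cases, k):
    """Eq. 7: keep geographic edges, add edges from the k most DTW-similar regions.

    A: (N, N) binary geographic adjacency; cases: (N, T) historical case counts.
    """
    N = A.shape[0]
    dist = np.array([[dtw_distance(cases[u], cases[v]) for v in range(N)]
                     for u in range(N)])
    np.fill_diagonal(dist, np.inf)            # enforces u != v
    A_tilde = (A == 1).astype(float)
    for v in range(N):
        top_k = np.argsort(dist[:, v])[:k]    # indices u in Top_k(v)
        A_tilde[top_k, v] = 1.0
    return A_tilde

def residual_gcn(H, A_tilde, W_g):
    """Eq. 8 with ReLU standing in for sigma (an assumption)."""
    deg = A_tilde.sum(axis=1)
    d_inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.maximum(deg, 1e-12)), 0.0)
    A_norm = d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ H @ W_g, 0.0) + H

# Toy usage: 4 regions, 12 historical steps, 8 hidden features.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]])
cases = rng.random((4, 12))
A_tilde = extend_adjacency(A, cases, k=1)
H = rng.standard_normal((4, 8))
H_new = residual_gcn(H, A_tilde, 0.1 * rng.standard_normal((8, 8)))
```

Because the top-$k$ relation is not symmetric, the extended graph is in general directed; a symmetric normalization would need an extra symmetrization step that Eq. 7 does not state.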

Dynamic Regional Transmission. Having established the global infection trends, we leverage this comprehensive signal to guide local spatial transmission patterns by impacting the regional contexts in which local transmission occurs. For instance, high global vaccination rates can decrease the pool of susceptible individuals across multiple regions, thereby reducing local transmission opportunities. We employ a mapping function to learn these dynamically evolving patterns based on $\mathbf{H}(t)$:

$$\begin{gathered}\mathbf{M}_{1}(t)=\tanh\Big(\mathbf{H}(t)\mathbf{W}_{1}+\mathbf{b}_{1}\Big),\\ \mathbf{M}_{2}(t)=\tanh\Big(\mathbf{H}(t)\mathbf{W}_{2}+\mathbf{b}_{2}\Big),\\ \tilde{\mathcal{A}}(t)=\sigma\Big(\tanh\big(\mathbf{M}_{1}(t)\mathbf{M}_{2}(t)^{\top}-\mathbf{M}_{2}(t)\mathbf{M}_{1}(t)^{\top}\big)\Big).\end{gathered} \tag{9}$$

The concurrent local transmission relationship $\tilde{\mathcal{A}}(t)$ is guided by the global trend, while Eq. 9 ensures that the learned pattern does not form a completely bidirectional graph. This highlights that inter-regional dissemination is not perfectly symmetrical, capturing the asymmetric nature of spatial interactions in epidemic spread. Additionally, we utilize a masking technique to balance the weights of static and dynamic transmission patterns:

$$\begin{gathered}\mathbb{M}(t)=\sigma\Big(\mathbf{W}_{3}\tilde{\mathcal{A}}(t)+\mathbf{b}_{3}\mathbf{J}\Big),\\ \mathbf{E}(t)=\mathbb{M}(t)\odot\mathbf{A}+\Big(\mathbf{J}-\mathbb{M}(t)\Big)\odot\tilde{\mathcal{A}}(t).\end{gathered} \tag{10}$$

In Eq. 10, $\mathbb{M}(t)$ is a continuous mask matrix, and $\mathbf{J}=\mathbf{1}_{N}\mathbf{1}_{N}^{\top}$ represents the all-ones matrix. The matrix $\mathbf{E}(t)$ integrates the mask matrix $\mathbb{M}(t)$, the original adjacency matrix $\mathbf{A}$, and the globally guided pattern $\tilde{\mathcal{A}}(t)$ through element-wise Hadamard products, aiming to capture propagation patterns between regions in this dynamic system. Once the fused spatial transmission pattern $\mathbf{E}(t)$ is obtained, we use it to update the weights $\mathbf{e}_{uv}$ in Eq. 5, resulting in:

$$\begin{gathered}\phi_{s}=-\mathbf{W}_{\text{trans}}\Big[\mathbf{S}_{v}(t)\,||\sum_{u\in\tilde{\mathcal{N}}_{v}}\mathbf{e}_{vu}(t)\mathbf{I}_{u}(t)\Big],\\ \phi_{i}=\mathbf{W}_{\text{trans}}\Big[\mathbf{S}_{v}(t)\,||\sum_{u\in\tilde{\mathcal{N}}_{v}}\mathbf{e}_{vu}(t)\mathbf{I}_{u}(t)\Big]-\mathbf{W}_{\text{recov}}\mathbf{I}_{v}(t),\\ \phi_{r}=\mathbf{W}_{\text{recov}}\mathbf{I}_{v}(t).\end{gathered} \tag{11}$$

The continuous and regional correlation intensity is denoted by $\mathbf{e}_{uv}(t)$. The set $\tilde{\mathcal{N}}_{v}$ represents the neighborhood nodes with nonzero weights in $\tilde{\mathcal{A}}(t)$ for the dynamic regional transmission of region $v$. This approach enables GLTG to leverage the global infection trend signal to guide local spatial transmission patterns, considering diverse external factors in epidemics and extending beyond simple partial transfer.
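To make the data flow of Eq. 9 to Eq. 11 concrete, the following NumPy sketch evaluates them once for all regions. It is illustrative only and rests on several assumptions not fixed by the text: $\sigma$ is taken as ReLU in Eq. 9 (so that some edge weights are exactly zero) and as a sigmoid in Eq. 10 (so that the mask is continuous), states are stored as row vectors of shape `(N, d)`, and the helper names are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dynamic_graph(H, W1, b1, W2, b2):
    """Eq. 9: antisymmetric score, tanh, then sigma (assumed ReLU)."""
    M1 = np.tanh(H @ W1 + b1)
    M2 = np.tanh(H @ W2 + b2)
    P = M1 @ M2.T
    score = P - P.T                      # equals M1 M2^T - M2 M1^T
    return np.maximum(np.tanh(score), 0.0)

def fuse_graph(A, A_dyn, W3, b3):
    """Eq. 10: a continuous mask (assumed sigmoid) blends static and dynamic graphs."""
    J = np.ones_like(A_dyn)
    mask = sigmoid(W3 @ A_dyn + b3 * J)
    return mask * A + (J - mask) * A_dyn

def sir_derivatives(S, I, E, W_trans, W_recov):
    """Eq. 11 for all regions; row v of E @ I is sum_u e_vu I_u."""
    neighbour_I = E @ I                  # zero-weight edges contribute nothing
    trans = np.concatenate([S, neighbour_I], axis=1) @ W_trans
    recov = I @ W_recov
    return -trans, trans - recov, recov

# Toy usage: 5 regions, 4 hidden features.
rng = np.random.default_rng(0)
N, d = 5, 4
H = rng.standard_normal((N, d))
A = (rng.random((N, N)) < 0.4).astype(float)
A_dyn = dynamic_graph(H, rng.standard_normal((d, d)), np.zeros(d),
                      rng.standard_normal((d, d)), np.zeros(d))
E = fuse_graph(A, A_dyn, rng.standard_normal((N, N)), 0.0)
S, I = rng.random((N, d)), rng.random((N, d))
dS, dI, dR = sir_derivatives(S, I, E, rng.standard_normal((2 * d, d)),
                             rng.standard_normal((d, d)))
```

Two properties follow directly from the equations and hold in this sketch: for any pair of regions at most one direction of $\tilde{\mathcal{A}}(t)$ is nonzero, and $\phi_{s}+\phi_{i}+\phi_{r}=0$, so the three compartments exchange mass without creating or destroying it.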

| Methods | Australia-COVID, $h=5$, $\mathcal{R}$ | Australia-COVID, $h=5$, $\mathcal{P}$ | Australia-COVID, $h=10$, $\mathcal{R}$ | Australia-COVID, $h=10$, $\mathcal{P}$ | Australia-COVID, $h=15$, $\mathcal{R}$ | Australia-COVID, $h=15$, $\mathcal{P}$ | US-Region, $h=5$, $\mathcal{R}$ | US-Region, $h=5$, $\mathcal{P}$ | US-Region, $h=10$, $\mathcal{R}$ | US-Region, $h=10$, $\mathcal{P}$ | US-Region, $h=15$, $\mathcal{R}$ | US-Region, $h=15$, $\mathcal{P}$ | US-States, $h=5$, $\mathcal{R}$ | US-States, $h=5$, $\mathcal{P}$ | US-States, $h=10$, $\mathcal{R}$ | US-States, $h=10$, $\mathcal{P}$ | US-States, $h=15$, $\mathcal{R}$ | US-States, $h=15$, $\mathcal{P}$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| VAR | 665.3 | 83.47 | 575.2 | 92.41 | 502.9 | 95.39 | 1151 | 572.3 | 1396 | 701.6 | 1418 | 688.3 | 339.2 | 90.38 | 371.3 | 103.2 | 402.4 | 150.7 |
| LSTM | 228.0 | 39.78 | 433.6 | 108.4 | 432.6 | 98.09 | 1173 | 538.6 | 1475 | 736.5 | 1509 | 757.6 | 331.8 | 93.45 | 370.0 | 109.8 | 411.3 | 154.9 |
| DCRNN | 514.8 | 166.5 | 853.7 | 286.4 | 1186 | 404.6 | 1488 | 760.9 | 1443 | 732.7 | 1412 | 710.3 | 329.4 | 93.15 | 334.7 | 96.90 | 372.8 | 142.6 |
| STGCN | 833.7 | 232.6 | 787.8 | 227.7 | 802.1 | 248.3 | 1335 | 678.1 | 1522 | 819.2 | 1638 | 925.4 | 304.7 | 89.32 | 293.7 | 85.33 | 312.5 | 116.3 |
| ASTGCN | 821.5 | 221.9 | 765.9 | 201.3 | 804.1 | 254.8 | 1252 | 545.2 | 1478 | 801.1 | 1576 | 821.3 | 310.2 | 93.44 | 290.5 | 80.99 | 344.6 | 123.4 |
| STGODE | 310.5 | 66.32 | 392.2 | 91.05 | 571.3 | 159.2 | 1304 | 668.2 | 1403 | 732.1 | 1577 | 804.3 | 345.2 | 107.8 | 402.4 | 120.4 | 477.3 | 199.4 |
| STG-NCDE | 287.2 | 49.21 | 341.3 | 77.92 | 479.2 | 111.2 | 1284 | 643.1 | 1399 | 691.2 | 1421 | 732.1 | 319.2 | 94.39 | 377.6 | 101.5 | 421.3 | 176.7 |
| CNNRNN-Res | 1802 | 624.7 | 612.6 | 151.4 | 622.1 | 153.1 | 1190 | 588.3 | 1332 | 642.8 | 1374 | 652.1 | 303.3 | 86.78 | 292.1 | 79.33 | 333.6 | 105.4 |
| EpiGNN | 210.3 | 40.12 | 467.3 | 120.1 | 764.2 | 233.7 | 1136 | 534.2 | 1454 | 728.9 | 1444 | 764.2 | 288.5 | 84.32 | 297.6 | 84.32 | 391.6 | 157.4 |
| CAMul | 231.4 | 44.32 | 398.2 | 76.62 | 634.1 | 164.7 | 1145 | 557.3 | 1434 | 703.2 | 1402 | 699.2 | 294.6 | 88.16 | 312.8 | 86.71 | 325.2 | 107.5 |
| EINN | 206.2 | 38.19 | 312.4 | 64.21 | 456.9 | 98.72 | 1178 | 571.6 | 1432 | 729.1 | 1489 | 792.3 | 321.2 | 97.91 | 342.1 | 100.1 | 402.7 | 162.9 |
| ColaGNN | 224.2 | 55.23 | 544.8 | 161.6 | 795.8 | 258.0 | 1148 | 533.6 | 1524 | 846.6 | 1552 | 856.3 | 299.1 | 81.53 | 283.4 | 79.12 | 339.4 | 120.6 |
| EpiColaGNN | 204.3 | 36.86 | 345.4 | 68.39 | 886.0 | 296.5 | 1185 | 575.7 | 1341 | 648.1 | 1371 | 666.9 | 286.1 | 83.38 | 300.9 | 90.65 | 375.1 | 132.5 |
| EARTH | 156.8 | 30.12 | 177.6 | 38.62 | 225.3 | 56.32 | 1080 | 522.4 | 1244 | 605.3 | 1301 | 647.1 | 243.2 | 67.43 | 277.8 | 80.43 | 300.1 | 104.2 |

Table 1: Comparison with the state-of-the-art methods on three epidemic forecasting datasets. Best in bold and second with underline.

Global and Local Epidemic Fusion. With the differential and integral processes of epidemics established, we determine the global infection trend $\mathbf{H}(t)$ for continuous time $t$ using a designated ODE solver, such as Runge–Kutta:

$$\mathbf{H}(t)=\text{ODESolver}\left(\frac{d\mathbf{H}(t)}{dt},\mathbf{H}_{0},t\right). \tag{12}$$
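For readers unfamiliar with Eq. 12, the sketch below shows what a fixed-step Runge-Kutta solver does when the observation times are irregular. It is a generic fourth-order scheme written for illustration: `rk4_solve` is a hypothetical helper, the right-hand side `f` is a stand-in for the learned derivative, and the text does not state which solver or step-size policy EARTH actually uses.

```python
import numpy as np

def rk4_solve(f, H0, t_grid):
    """Integrate dH/dt = f(t, H) over t_grid with classic 4th-order Runge-Kutta.

    t_grid may be unevenly spaced; one RK4 step is taken per interval.
    """
    H = np.asarray(H0, dtype=float)
    trajectory = [H]
    for t0, t1 in zip(t_grid[:-1], t_grid[1:]):
        h = t1 - t0
        k1 = f(t0, H)
        k2 = f(t0 + h / 2, H + h / 2 * k1)
        k3 = f(t0 + h / 2, H + h / 2 * k2)
        k4 = f(t1, H + h * k3)
        H = H + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        trajectory.append(H)
    return np.stack(trajectory)

# Toy check on dH/dt = -H, whose exact solution is H0 * exp(-t).
t_grid = np.array([0.0, 0.1, 0.25, 0.5, 0.6, 1.0])   # irregular sampling times
traj = rk4_solve(lambda t, H: -H, np.ones((3, 2)), t_grid)
```

The solver returns the state at every requested time, which is what allows a continuous-time model to be queried at irregular intervals without resampling the input series.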

Given the overall historical time window $T$, we ultimately derive the global infection trend $\mathbf{H}(T)$ and local disease states $\mathcal{M}(T)=\{\mathbf{S}(T),\mathbf{I}(T),\mathbf{R}(T)\}$. To integrate both the global coherence of epidemic trends and the local intricacies of disease states, we design a multi-headed cross-attention mechanism to merge the global and local transmission information. Specifically, we use $\mathbf{H}(T)$ to guide the fusion of $\mathcal{M}(T)$. Given three common sets of inputs: query set $Q$, key set $K$, and value set $V$, we define the multi-head attention $\mathcal{H}$ as follows:

$$\begin{gathered}\mathcal{H}(Q,K,V)=(\Omega_{1}\oplus\Omega_{2}\oplus\cdots\oplus\Omega_{N_{\mathcal{T}}})\mathbf{W},\\ \mathcal{S}(Q,K,V)=\text{softmax}\left(\frac{QK^{T}}{\sqrt{d_{f}}}\right)V,\\ \Omega_{\mu}=\mathcal{S}(Q\mathbf{W}_{\mu}^{Q},K\mathbf{W}_{\mu}^{K},V\mathbf{W}_{\mu}^{V})\big|_{\mu=1}^{N_{\mathcal{T}}}.\end{gathered} \tag{13}$$

The $\mu$-th head is represented by $\Omega_{\mu}$, and the attention function is denoted as $\mathcal{S}$. The learnable linear mappings include $\mathbf{W}$, $\mathbf{W}^{Q}$, $\mathbf{W}^{K}$, and $\mathbf{W}^{V}$. The formulation for the global-local fusion is given by:

$$\mathbf{F}(T)=\mathcal{H}\Big(\mathbf{Z}(T),\mathcal{M}(T),\mathcal{M}(T)\Big). \tag{14}$$

In Eq. 14, $\mathbf{F}(T)$ represents the fused feature. Conceptually, the global trend feature serves as a query, calculating the similarity with each detailed disease state. This method aids in recognizing the semantic epidemic conditions and attentively integrating the disease features.
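The fusion in Eq. 13 and Eq. 14 can be illustrated with a small NumPy sketch. The tensor layout is an assumption made for this example: each region contributes one query vector and attends over its three stacked disease states, and every head projects to `d // n_heads` dimensions. The names `attention` and `multi_head` are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)     # subtract max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """S(Q, K, V) = softmax(Q K^T / sqrt(d_f)) V, batched over the leading axis."""
    d_f = Q.shape[-1]
    scores = Q @ np.swapaxes(K, -1, -2) / np.sqrt(d_f)
    return softmax(scores, axis=-1) @ V

def multi_head(Q, K, V, heads, W_out):
    """H(Q, K, V): concatenate the per-head outputs, then project with W."""
    outs = [attention(Q @ WQ, K @ WK, V @ WV) for WQ, WK, WV in heads]
    return np.concatenate(outs, axis=-1) @ W_out

# Toy usage: 6 regions, 8 features, 2 heads.
rng = np.random.default_rng(0)
N, d, n_heads = 6, 8, 2
d_h = d // n_heads
query = rng.standard_normal((N, 1, d))          # one query vector per region
states = rng.standard_normal((N, 3, d))         # stacked S, I, R per region (assumed layout)
heads = [tuple(rng.standard_normal((d, d_h)) for _ in range(3))
         for _ in range(n_heads)]
F = multi_head(query, states, states, heads, rng.standard_normal((d, d)))[:, 0, :]
```

With this layout the softmax yields, for every region and head, three weights over the susceptible, infected, and recovered representations, so the fused feature is a learned convex combination of projected disease states.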

4.4 Overall Objective

Ultimately, we derive the fused features $\mathbf{F}(T)$, which harmonize the global consistency of epidemic trends with the particularities of local health conditions. We then concatenate these features with the time-corresponding features $\mathbf{Z}(T)$ and employ an MLP parameterized by $\theta_{f}$ to pool the final prediction for region $v$:

$$y_{v}=f(\theta_{f};[\mathbf{F}_{v}(T)\,||\,\mathbf{Z}_{v}(T)]). \tag{15}$$

Following previous methods (Deng et al. 2020; Xie et al. 2022), we use the MSE loss to compare the predicted values with the ground truth:

$$\mathcal{L}_{mse}=\sum_{i=1}^{B}\sum_{v=1}^{N}|y_{i,v}-\hat{y}_{i,v}|, \tag{16}$$

where $B$ denotes the sample size, and $i$ is the sample index. $\hat{y}_{i,v}$ represents the true value for sample $i$ of region $v$.
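As a small worked example of the objective, the snippet below evaluates Eq. 16 exactly as printed, which accumulates absolute differences over samples and regions, next to the conventional mean-squared error, which squares and averages them. Both function names are hypothetical, and which form the released code optimizes is not stated here.

```python
import numpy as np

def summed_abs_error(y_pred, y_true):
    """Eq. 16 as printed: sum over samples i and regions v of |y - y_hat|."""
    return float(np.abs(y_pred - y_true).sum())

def mean_squared_error(y_pred, y_true):
    """Conventional MSE, shown only for comparison."""
    return float(np.mean((y_pred - y_true) ** 2))

# Toy batch: B = 2 samples, N = 2 regions.
y_pred = np.array([[1.0, 2.0], [3.0, 4.0]])
y_true = np.array([[1.0, 0.0], [2.0, 4.0]])
loss_eq16 = summed_abs_error(y_pred, y_true)     # 0 + 2 + 1 + 0 = 3.0
loss_mse = mean_squared_error(y_pred, y_true)    # (0 + 4 + 1 + 0) / 4 = 1.25
```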

5 Experiment

In this section, we comprehensively evaluate our proposed EARTH by answering the following questions:

  • Q1: Performance. Does EARTH outperform the existing state-of-the-art epidemic forecasting methods?

  • Q2: Resilience. Is EARTH stable under different settings?

  • Q3: Effectiveness. Are the two proposed key components, EANO and GLTG, both effective?

  • Q4: Sensitivity. How does the proposed method perform under different hyper-parameters?

The answers to Q1-Q4 are given below.

5.1 Experimental Setup

(a) Visualization of Region 4
(b) Visualization of Region 8
Figure 3: Visualization of predicted cases. We randomly pick two regions in the Australia-COVID dataset with horizon 10. EARTH fits the ground truth well and follows the developing trend of the epidemic. Best viewed enlarged.

Real-world Datasets. We evaluate EARTH on three real-world datasets covering COVID-19 and influenza-like illness: Australia-COVID, US-Regions, and US-States. Please see Appendix A for dataset details.

Implementation Details. Following Liu, Liu, and Liu (2023), we use two metrics: $\mathcal{R}$ denotes RMSE (Root Mean Square Error), while $\mathcal{P}$ denotes Peak Time Error, which computes the MAE (Mean Absolute Error) only over significant peaks in the epidemic, selected using a specified threshold. For more details, please refer to Appendix B.
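The two metrics can be sketched as follows. This is an illustration under an explicit assumption: a "significant peak" is taken to be any entry whose ground-truth value exceeds the threshold. The exact peak-selection rule is the one given in Appendix B, and `peak_time_error` is a hypothetical helper name.

```python
import numpy as np

def rmse(y_pred, y_true):
    """Root mean square error over all entries."""
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

def peak_time_error(y_pred, y_true, threshold):
    """MAE restricted to entries whose ground truth exceeds `threshold` (assumed rule)."""
    peak = y_true > threshold
    if not peak.any():
        return float("nan")              # no peak in this window
    return float(np.mean(np.abs(y_pred[peak] - y_true[peak])))

# Toy series with two values above the threshold of 100.
y_true = np.array([10.0, 50.0, 200.0, 300.0])
y_pred = np.array([12.0, 40.0, 180.0, 330.0])
r = rmse(y_pred, y_true)
p = peak_time_error(y_pred, y_true, threshold=100.0)   # (20 + 30) / 2 = 25.0
```

Restricting the error to peak entries means $\mathcal{P}$ penalizes a model that tracks low-incidence periods well but misses outbreaks, which RMSE alone can understate when peaks are rare.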

Counterparts. We compare EARTH against several SOTA epidemic forecasting methods using the computational epidemiology repository EpiLearn (Liu et al. 2024a): VAR (Song et al. 2020), LSTM (Sesti et al. 2021), DCRNN (Li et al. 2018), STGCN (Yu, Yin, and Zhu 2017), ASTGCN (Guo et al. 2019), STGODE (Fang et al. 2021), STG-NCDE (Choi et al. 2021), CNNRNN-Res (Wu et al. 2018), CAMul (Kamarthi et al. 2021), EINN (Rodríguez et al. 2023), ColaGNN (Deng et al. 2020), EpiGNN (Xie et al. 2022) and EpiColaGNN (Liu, Liu, and Liu 2023).

5.2 Performance

This section addresses Q1. To evaluate EARTH, we conducted comprehensive experiments on various epidemic datasets against multiple baselines, including general spatio-temporal and epidemic forecasting methods, as detailed in Tab. 1. Key observations include: 1) VAR and LSTM cannot capture complex spatial dependencies, which makes them less effective. 2) General spatio-temporal methods like STGCN or ASTGCN can capture some regional dependencies but struggle with temporal development and long-term predictions. 3) ODE-based methods like STGODE can learn complex dynamic systems but do not sufficiently consider epidemic mechanisms. 4) Some epidemic-specific methods achieve excellent results but still struggle to model the evolution of epidemics. 5) Mechanistic methods like EINN do not capture high-level spatial interactions between diseases in different regions. 6) EARTH achieves the lowest error in 17 of the 18 dataset, horizon, and metric combinations in Tab. 1 (the exception is Peak Time Error on US-States at $h=10$), which we attribute to its ability to learn complex epidemic evolution and dynamic regional propagation patterns.

Additionally, to visualize this advantage, we plot the cases predicted by EARTH and by different baselines on the Australia-COVID dataset with a horizon of 10. The results, shown in Fig. 3, indicate that our method fits the ground truth more accurately and follows the trend of epidemic development. We also show the graph learned by our GLTG component in Fig. 4, which demonstrates that our method goes beyond geographical connections and captures a global view as the epidemic evolves.

5.3 Resilience

This section addresses Q2. We conducted two key experiments to evaluate this aspect: 1) We examine the robustness of the method under different irregular conditions with a range of missing rates, as detailed in Tab. 3. The outcomes show that EARTH remains robust across different datasets with diverse intervals. Our proposed method consistently outperforms other baseline methods, demonstrating its resilience to variable intervals and missing data. 2) We also test the performance of EARTH across different prediction horizons, as shown in Appendix C. The results indicate that our method makes stable predictions over various horizons, and that learning epidemic mechanisms enables superior long-term prediction compared to other methods.
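One simple way to produce the irregular conditions of Tab. 3 is to remove a fixed fraction of time steps at random, which leaves uneven gaps between the remaining observations. The sketch below assumes uniform random removal shared across regions; this section does not specify the actual protocol, and `drop_time_steps` is a hypothetical helper.

```python
import numpy as np

def drop_time_steps(series, times, missing_rate, rng):
    """Remove a random `missing_rate` fraction of time steps (assumed protocol).

    series: (N, T) values per region; times: (T,) observation times.
    Returns the kept columns and their (now irregular) time stamps.
    """
    T = len(times)
    n_keep = T - int(round(missing_rate * T))
    keep = np.sort(rng.choice(T, size=n_keep, replace=False))
    return series[:, keep], times[keep]

# Toy usage: 2 regions, 20 daily observations, 40% missing.
rng = np.random.default_rng(0)
series = np.arange(40.0).reshape(2, 20)
times = np.arange(20.0)
kept_series, kept_times = drop_time_steps(series, times, 0.4, rng)
```

The retained time stamps can be passed directly as the evaluation grid of an ODE solver, so a continuous-time model needs no imputation step for the removed observations.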

| Variants | Australia-COVID, $h=5$, $\mathcal{R}$ | Australia-COVID, $h=5$, $\mathcal{P}$ | Australia-COVID, $h=10$, $\mathcal{R}$ | Australia-COVID, $h=10$, $\mathcal{P}$ | US-Region, $h=5$, $\mathcal{R}$ | US-Region, $h=5$, $\mathcal{P}$ | US-Region, $h=10$, $\mathcal{R}$ | US-Region, $h=10$, $\mathcal{P}$ |
|---|---|---|---|---|---|---|---|---|
| w/o Both | 267.4 | 43.65 | 322.7 | 63.04 | 1235 | 637.4 | 1367 | 675.3 |
| w EANO | 178.6 | 36.99 | 184.5 | 44.62 | 1120 | 538.3 | 1282 | 639.2 |
| w GLTG | 232.4 | 40.44 | 301.2 | 57.42 | 1204 | 579.3 | 1321 | 654.3 |
| w/o Dyna. Graph | 227.6 | 38.44 | 290.0 | 53.89 | 1184 | 572.5 | 1302 | 647.0 |
| w/o Glo. Trend | 172.4 | 33.75 | 182.0 | 44.39 | 1102 | 541.2 | 1267 | 621.3 |
| Fully Connected | 192.1 | 39.62 | 194.7 | 40.22 | 1192 | 564.3 | 1347 | 666.0 |
| Sparse Penalty | 160.9 | 32.60 | 182.4 | 59.37 | 1075 | 519.2 | 1271 | 635.4 |
| EARTH | 156.8 | 30.12 | 177.6 | 38.62 | 1080 | 522.4 | 1244 | 605.3 |

Table 2: Ablation Study of different variants on two datasets.
Figure 4: Learned Regional Graph in GLTG. We visualize top-3 weighted edges for each region in the US-States dataset, excluding states with no available data.

5.4 Effectiveness

The explanation for Q3 is presented in this section. Tab. 2 first examines the two key design elements of our method: EANO enhances performance by leveraging Neural ODEs while explicitly modeling the disease propagation mechanism, and GLTG yields further gains by learning the dynamic regional patterns during the evolution of epidemics.

In addition, we take a closer look at GLTG by removing the dynamic graph (w/o Dyna. Graph) or the global trend (w/o Glo. Trend). The results show that dynamic graphs play a crucial role in modeling spatial disease interactions, while global trends help to capture long-distance information. We also examine EARTH under a fully connected regional graph, which degrades performance and suggests that full connectivity introduces information redundancy. Finally, a variant that adds a sparsity penalty loss to the learned graph does not significantly change the final performance.

(a) Head of Attention $N_{\mathcal{T}}$
(b) Global Connection $k$
Figure 5: Analysis on hyper-parameter. Performance with hyper-parameter $N_{\mathcal{T}}$ and $k$, where red, yellow, and green represent the Australia-COVID, US-States, and US-Region respectively.
| Missing Rate | Australia-COVID, $h=5$, $\mathcal{R}$ | Australia-COVID, $h=5$, $\mathcal{P}$ | Australia-COVID, $h=10$, $\mathcal{R}$ | Australia-COVID, $h=10$, $\mathcal{P}$ | US-Region, $h=5$, $\mathcal{R}$ | US-Region, $h=5$, $\mathcal{P}$ | US-Region, $h=10$, $\mathcal{R}$ | US-Region, $h=10$, $\mathcal{P}$ |
|---|---|---|---|---|---|---|---|---|
| 40% | 173.1 | 38.02 | 190.2 | 45.68 | 1129 | 541.9 | 1267 | 629.9 |
| 30% | 168.5 | 36.42 | 187.2 | 44.50 | 1115 | 535.4 | 1265 | 631.8 |
| 20% | 162.4 | 33.44 | 184.7 | 42.97 | 1110 | 536.2 | 1262 | 618.4 |
| 10% | 158.4 | 30.95 | 180.4 | 40.77 | 1089 | 529.6 | 1259 | 613.1 |
| 0% | 156.8 | 30.12 | 177.6 | 38.62 | 1080 | 522.4 | 1244 | 605.3 |

Table 3: Analysis under irregular conditions on two datasets.

5.5 Sensitivity

This section provides an answer to Q4. As shown in Fig. 5, we first investigate the effect of varying the number of attention heads $N_{\mathcal{T}}$ in Eq. 13. The results demonstrate overall stability with different numbers of heads, although too few heads can impair the method’s ability to capture diverse information. We also test our method with different values of $k$ in Eq. 7. More connections can lead to redundancy in message passing for datasets with fewer regions (e.g., Australia-COVID). In contrast, larger datasets (e.g., US-States) can tolerate more connections. In all cases, global connections are essential for learning global infection trends.

6 Conclusion

In this paper, we propose a novel framework, EARTH, to improve epidemic forecasting performance. By integrating neural ODEs with traditional compartmental models, EANO captures the underlying disease propagation mechanisms. We also identify global infection trends and introduce GLTG to dynamically adjust local transmission patterns. Using a global-local cross-attention fusion approach, we extract representative features that account for both subtle disease states and broader trends. Extensive experiments on real-world epidemic datasets highlight the effectiveness of EARTH, offering valuable insights into combining mechanistic models with deep learning for future applications in epidemiology and data science.

References

  • Arik et al. (2020) Arik, S.; Li, C.-L.; Yoon, J.; Sinha, R.; Epshteyn, A.; Le, L.; Menon, V.; Singh, S.; Zhang, L.; Nikoltchev, M.; et al. 2020. Interpretable sequence learning for COVID-19 forecasting. Advances in Neural Information Processing Systems, 33: 18807–18818.
  • Balcan et al. (2009) Balcan, D.; Colizza, V.; Gonçalves, B.; Hu, H.; Ramasco, J. J.; and Vespignani, A. 2009. Multiscale Mobility Networks and the Spatial Spreading of Infectious Diseases. Proceedings of the National Academy of Sciences, 106(51): 21484–21489.
  • Brede (2012) Brede, M. 2012. Networks—An Introduction . Mark E. J. Newman. (2010, Oxford University Press.) $65.38, £35.96 (Hardcover), 772 Pages. ISBN-978-0-19-920665-0. Artificial Life, 18(2): 241–242.
  • Caals, Saxena, and Ho (2017) Caals, K.; Saxena, A.; and Ho, C. W.-L. 2017. Ethics of Epidemics, Research and Surveillance: A WHO Workshop Report. Asian Bioethics Review, 9(3): 265–271.
  • Chauhan et al. (2023) Chauhan, R.; Varma, G.; Yafi, E.; and Zuhairi, M. F. 2023. The Impact of Geo-Political Socio-Economic Factors on Vaccine Dissemination Trends: A Case-Study on COVID-19 Vaccination Strategies. BMC Public Health, 23(1): 2142.
  • Chen et al. (2018) Chen, R. T.; Rubanova, Y.; Bettencourt, J.; and Duvenaud, D. K. 2018. Neural ordinary differential equations. NeurIPS, 31.
  • Chen et al. (2024) Chen, Y.; Ren, K.; Wang, Y.; Fang, Y.; Sun, W.; and Li, D. 2024. ContiFormer: Continuous-Time Transformer for Irregular Time Series Modeling. arXiv:2402.10635.
  • Choi et al. (2021) Choi, J.; Choi, H.; Hwang, J.; and Park, N. 2021. Graph Neural Controlled Differential Equations for Traffic Forecasting. arXiv:2112.03558.
  • Cm (2020) Cm, J. 2020. Does the Inadequate Health Resources Aggravate Covid-19 Pandemic? Scholars Journal of Applied Medical Sciences, 8(7): 1646–1650.
  • Dai et al. (2022) Dai, E.; Zhao, T.; Zhu, H.; Xu, J.; Guo, Z.; Liu, H.; Tang, J.; and Wang, S. 2022. A comprehensive survey on trustworthy graph neural networks: Privacy, robustness, fairness, and explainability. arXiv preprint arXiv:2204.08570.
  • Dehning et al. (2020) Dehning, J.; Zierenberg, J.; Spitzner, F. P.; Wibral, M.; Neto, J. P.; Wilczek, M.; and Priesemann, V. 2020. Inferring Change Points in the Spread of COVID-19 Reveals the Effectiveness of Interventions. Science, 369(6500): eabb9789.
  • Demey et al. (2020) Demey, B.; Daher, N.; François, C.; Lanoix, J.-P.; Duverlie, G.; Castelain, S.; and Brochot, E. 2020. Dynamic Profile for the Detection of Anti-SARS-CoV-2 Antibodies Using Four Immunochromatographic Assays. Journal of Infection, 81(2): e6–e10.
  • Deng et al. (2020) Deng, S.; Wang, S.; Rangwala, H.; Wang, L.; and Ning, Y. 2020. Cola-GNN: Cross-location Attention Based Graph Neural Networks for Long-term ILI Prediction. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management, 245–254. ACM.
  • Emanuel et al. (2020) Emanuel, E. J.; Persad, G.; Upshur, R.; Thome, B.; Parker, M.; Glickman, A.; Zhang, C.; Boyle, C.; Smith, M.; and Phillips, J. P. 2020. Fair allocation of scarce medical resources in the time of Covid-19.
  • Fang et al. (2021) Fang, Z.; Long, Q.; Song, G.; and Xie, K. 2021. Spatial-Temporal Graph ODE Networks for Traffic Flow Forecasting. In ACM SIGKDD, 364–373.
  • Fine (2015) Fine, P. 2015. Another Defining Moment for Epidemiology. The Lancet, 385(9965): 319–320.
  • Funk et al. (2018) Funk, S.; Camacho, A.; Kucharski, A. J.; Eggo, R. M.; and Edmunds, W. J. 2018. Real-Time Forecasting of Infectious Disease Dynamics with a Stochastic Semi-Mechanistic Model. Epidemics, 22: 56–61.
  • Grassly and Fraser (2008) Grassly, N. C.; and Fraser, C. 2008. Mathematical Models of Infectious Disease Transmission. Nature Reviews Microbiology, 6(6): 477–487.
  • Guo et al. (2019) Guo, S.; Lin, Y.; Feng, N.; Song, C.; and Wan, H. 2019. Attention based spatial-temporal graph convolutional networks for traffic flow forecasting. In Proceedings of the AAAI conference on artificial intelligence, volume 33, 922–929.
  • Hamilton, Ying, and Leskovec (2017) Hamilton, W.; Ying, Z.; and Leskovec, J. 2017. Inductive representation learning on large graphs. In NeurIPS.
  • Hethcote (2000) Hethcote, H. W. 2000. The Mathematics of Infectious Diseases. SIAM Review, 42(4): 599–653.
  • Huang et al. (2023) Huang, W.; Wan, G.; Ye, M.; and Du, B. 2023. Federated Graph Semantic and Structural Learning.
  • (23) Huang, Z.; Zhao, W.; Gao, J.; Hu, Z.; Luo, X.; Cao, Y.; Chen, Y.; Sun, Y.; and Wang, W. ???? TANGO: Time-reversal Latent GraphODE for Multi-Agent Dynamical Systems.
  • Jhun (2021) Jhun, B. 2021. Effective Vaccination Strategy Using Graph Neural Network Ansatz. arXiv:2111.00920.
  • Kamarthi et al. (2021) Kamarthi, H.; Kong, L.; Rodríguez, A.; Zhang, C.; and Prakash, B. A. 2021. CAMul: Calibrated and Accurate Multi-View Time-Series Forecasting.
  • Kidger et al. (2020) Kidger, P.; Morrill, J.; Foster, J.; and Lyons, T. 2020. Neural controlled differential equations for irregular time series. NeurIPS, 33: 6696–6707.
  • Kondratyev (2013) Kondratyev, M. A. 2013. Forecasting Methods and Models of Disease Spread. Computer Research and Modeling, 5(5): 863–882.
  • La Gatta et al. (2021) La Gatta, V.; Moscato, V.; Postiglione, M.; and Sperli, G. 2021. An Epidemiological Neural Network Exploiting Dynamic Graph Structured Data Applied to the COVID-19 Outbreak. IEEE Transactions on Big Data, 7(1): 45–55.
  • Li et al. (2018) Li, Y.; Yu, R.; Shahabi, C.; and Liu, Y. 2018. Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting. arXiv:1707.01926.
  • Liu, Liu, and Liu (2023) Liu, M.; Liu, Y.; and Liu, J. 2023. Epidemiology-Aware Deep Learning for Infectious Disease Dynamics Prediction. In International Conference on Information and Knowledge Management, Proceedings, 4084–4088. Association for Computing Machinery.
  • Liu et al. (2024a) Liu, Z.; Li, Y.; Wei, M.; Wan, G.; Lau, M. S.; and Jin, W. 2024a. EpiLearn: A Python Library for Machine Learning in Epidemic Modeling. In Seventh epiDAMIK Workshop at ACM SIGKDD Conference on Knowledge Discovery and Data Mining.
  • Liu et al. (2024b) Liu, Z.; Wan, G.; Prakash, B. A.; Lau, M. S.; and Jin, W. 2024b. A Review of Graph Neural Networks in Epidemic Modeling. arXiv preprint arXiv:2403.19852.
  • Luo et al. (2023) Luo, X.; Yuan, J.; Huang, Z.; Jiang, H.; Qin, Y.; Ju, W.; Zhang, M.; and Sun, Y. 2023. HOPE: High-order Graph ODE For Modeling Interacting Dynamics. In Proceedings of the 40th International Conference on Machine Learning, 23124–23139. PMLR.
  • Madden et al. (2024) Madden, W.; Jin, W.; Lopman, B.; Zufle, A.; Dalziel, B. D.; Metcalf, J.; Grenfell, B. D.; and Lau, M. S. 2024. Neural networks for endemic measles dynamics: comparative analysis and integration with mechanistic models. medRxiv, 2024–05.
  • Martin, Sánchez, and Wilkinson (2023) Martin, F. M.; Sánchez, J. M.; and Wilkinson, O. 2023. The Economic Impact of COVID-19 around the World. Review, 105(2).
  • Mežnar, Lavrač, and Škrlj (2021) Mežnar, S.; Lavrač, N.; and Škrlj, B. 2021. Prediction of the Effects of Epidemic Spreading with Graph Neural Networks. In Benito, R. M.; Cherifi, C.; Cherifi, H.; Moro, E.; Rocha, L. M.; and Sales-Pardo, M., eds., Complex Networks & Their Applications IX, Studies in Computational Intelligence, 420–431. Springer International Publishing.
  • Pak et al. (2020) Pak, A.; Adegboye, O. A.; Adekunle, A. I.; Rahman, K. M.; McBryde, E. S.; and Eisen, D. P. 2020. Economic Consequences of the COVID-19 Outbreak: The Need for Epidemic Preparedness. Frontiers in Public Health, 8.
  • Poli et al. (2021) Poli, M.; Massaroli, S.; Park, J.; Yamashita, A.; Asama, H.; and Park, J. 2021. Graph Neural Ordinary Differential Equations. arXiv:1911.07532.
  • Qin et al. (2024) Qin, Y.; Ju, W.; Wu, H.; Luo, X.; and Zhang, M. 2024. Learning Graph ODE for Continuous-Time Sequential Recommendation. IEEE Transactions on Knowledge and Data Engineering, 1–14.
  • Robbins and Monro (1951) Robbins, H.; and Monro, S. 1951. A stochastic approximation method. The Annals of Mathematical Statistics, 400–407.
  • Rodríguez et al. (2023) Rodríguez, A.; Cui, J.; Ramakrishnan, N.; Adhikari, B.; and Prakash, B. A. 2023. EINNs: Epidemiologically-informed Neural Networks. arXiv:2202.10446.
  • Russell et al. (2021) Russell, T. W.; Wu, J. T.; Clifford, S.; Edmunds, W. J.; Kucharski, A. J.; and Jit, M. 2021. Effect of Internationally Imported Cases on Internal Spread of COVID-19: A Mathematical Modelling Study. The Lancet Public Health, 6(1): e12–e20.
  • Sesti et al. (2021) Sesti, N.; Garau-Luis, J. J.; Crawley, E.; and Cameron, B. 2021. Integrating LSTMs and GNNs for COVID-19 Forecasting. arXiv:2108.10052.
  • Sha, Al Hasan, and Mohler (2021) Sha, H.; Al Hasan, M.; and Mohler, G. 2021. Source Detection on Networks Using Spatial Temporal Graph Convolutional Networks. In 2021 IEEE 8th International Conference on Data Science and Advanced Analytics (DSAA), 1–11. IEEE.
  • Song et al. (2020) Song, C.; Lin, Y.; Guo, S.; and Wan, H. 2020. Spatial-temporal synchronous graph convolutional networks: A new framework for spatial-temporal network data forecasting. In Proceedings of the AAAI conference on artificial intelligence, volume 34, 914–921.
  • Terris (1993) Terris, M. 1993. The Society for Epidemiologic Research and the Future of Epidemiology. Journal of Public Health Policy, 14(2): 137.
  • Veličković et al. (2017) Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Lio, P.; and Bengio, Y. 2017. Graph attention networks. arXiv preprint arXiv:1710.10903.
  • Wan, Huang, and Ye (2024) Wan, G.; Huang, W.; and Ye, M. 2024. Federated Graph Learning under Domain Shift with Generalizable Prototypes. In AAAI.
  • Wan et al. (2024) Wan, G.; Tian, Y.; Huang, W.; Chawla, N. V.; and Ye, M. 2024. S3GCL: Spectral, Swift, Spatial Graph Contrastive Learning. In Forty-first International Conference on Machine Learning.
  • Wang et al. (2023) Wang, S.; Zhao, X.; Qiu, J.; Wang, H.; and Tao, C. 2023. WDCIP: Spatio-Temporal AI-driven Disease Control Intelligent Platform for Combating COVID-19 Pandemic. Geo-spatial Information Science, 0(0): 1–25.
  • Wu et al. (2018) Wu, Y.; Yang, Y.; Nishiura, H.; and Saitoh, M. 2018. Deep learning for epidemiological predictions. In The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, 1085–1088.
  • Wu et al. (2019) Wu, Z.; Pan, S.; Long, G.; Jiang, J.; and Zhang, C. 2019. Graph WaveNet for Deep Spatial-Temporal Graph Modeling. arXiv:1906.00121.
  • Xie et al. (2022) Xie, F.; Zhang, Z.; Li, L.; and Tan, Y. 2022. EpiGNN: Exploring Spatial Transmission with Graph Neural Network for Regional Epidemic Forecasting. Technical report.
  • Yang et al. (2023) Yang, C.; Zhang, Z.; Fan, Z.; Jiang, R.; Chen, Q.; Song, X.; and Shibasaki, R. 2023. EpiMob: Interactive Visual Analytics of Citywide Human Mobility Restrictions for Epidemic Control. IEEE Transactions on Visualization and Computer Graphics, 29(8): 3586–3601.
  • Yu, Yin, and Zhu (2017) Yu, B.; Yin, H.; and Zhu, Z. 2017. Spatio-temporal graph convolutional networks: A deep learning framework for traffic forecasting. arXiv preprint arXiv:1709.04875.
  • Yu et al. (2023) Yu, S.; Xia, F.; Li, S.; Hou, M.; and Sheng, Q. Z. 2023. Spatio-Temporal Graph Learning for Epidemic Prediction. ACM Transactions on Intelligent Systems and Technology, 14(2).
  • Zhang et al. (2024a) Zhang, G.; Sun, X.; Yue, Y.; Wang, K.; Chen, T.; and Pan, S. 2024a. Graph Sparsification via Mixture of Graphs. arXiv preprint arXiv:2405.14260.
  • Zhang et al. (2024b) Zhang, G.; Wang, K.; Huang, W.; Yue, Y.; Wang, Y.; Zimmermann, R.; Zhou, A.; Cheng, D.; Zeng, J.; and Liang, Y. 2024b. Graph lottery ticket automated. In The Twelfth International Conference on Learning Representations.
  • Zhang et al. (2024c) Zhang, T.; Zhang, Y.; Wang, K.; Wang, K.; Yang, B.; Zhang, K.; Shao, W.; Liu, P.; Zhou, J. T.; and You, Y. 2024c. Two trades is not baffled: Condense graph via crafting rational gradient matching. arXiv preprint arXiv:2402.04924.
  • (60) Zhang, Y.; Zhang, T.; Wang, K.; Guo, Z.; Liang, Y.; Bresson, X.; Jin, W.; and You, Y. ???? Navigating Complexity: Toward Lossless Graph Condensation via Expanding Window Matching. In Forty-first International Conference on Machine Learning.

Appendix A Dataset Details

Following previous epidemic forecasting work (Deng et al. 2020; Liu, Liu, and Liu 2023), we use three widely used datasets covering COVID-19 and influenza-like illness:

  • Australia-COVID. Provided by JHU-CSSE (https://github.com/CSSEGISandData/COVID-19), this dataset records daily new COVID-19 cases for 6 states and 2 territories from January 2020 to August 2021.

  • US-Regions. This dataset comprises weekly influenza activity levels for 10 Health and Human Services (HHS) regions, spanning from 2002 to 2017, and offers data on regional influenza patterns over time.

  • US-States. This dataset contains weekly counts of patient visits for influenza-like illness (ILI) in 49 states of the United States (all states except Florida) from 2010 to 2017.

Appendix B Implementation Details

In all experimental setups, we set the learning rate to $1e{-3}$ and use SGD (Robbins and Monro 1951) as the optimizer with a momentum of 0.9 and weight decay of $1e{-5}$. The default hidden size is 64, and the window size $T$ is 20. Considering that decision-makers need time to allocate prevention resources in epidemic modeling, we set the horizon $h$ to 5, 10, and 15. We repeat each experiment five times for each dataset and record the average results.
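To make the roles of the window size and the horizon concrete, the following is a minimal Python sketch of how a single region's series could be cut into training pairs and how a metric could be averaged over the five runs. It is illustrative only and is not taken from the EARTH code base: the helper names `make_windows` and `average_over_runs`, the toy series, and the convention that the target is the single value exactly `horizon` steps after the last observation in the window are all assumptions.

```python
from statistics import mean

WINDOW_SIZE = 20        # window size T, as stated in the text
HORIZONS = (5, 10, 15)  # horizons h, as stated in the text
NUM_RUNS = 5            # repetitions per dataset, as stated in the text


def make_windows(series, window=WINDOW_SIZE, horizon=5):
    """Split one region's series into (history, target) pairs.

    Assumption (not confirmed by the paper): the target is the single
    value `horizon` steps after the last observation in the window.
    """
    pairs = []
    last_start = len(series) - window - horizon
    for start in range(last_start + 1):
        history = series[start:start + window]
        target = series[start + window + horizon - 1]
        pairs.append((history, target))
    return pairs


def average_over_runs(run_scores):
    """Average one metric (e.g. an error score) over repeated runs."""
    return mean(run_scores)


# Hypothetical toy data: 40 time steps whose values equal their index.
toy_series = list(range(40))
pairs = make_windows(toy_series, horizon=5)
```

With this convention, a series shorter than `window + horizon` yields no pairs, so a larger horizon leaves fewer training samples for the same series length.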

Appendix C Ablation Study

We test the performance of EARTH across different prediction horizons, ranging from 1 to 20, as shown in Fig. 6. The results indicate that our method can make stable predictions over various horizons. Better performance is observed for larger horizons, demonstrating EARTH’s ability to learn epidemic mechanisms for long-term prediction.

Figure 6: Analysis of different horizons with four methods.