
Abstract

This paper presents a comparative analysis of the performance of Equivariant Quantum Neural Networks (EQNNs) and Quantum Neural Networks (QNNs), juxtaposed against their classical counterparts: Equivariant Neural Networks (ENNs) and Deep Neural Networks (DNNs). We evaluate the performance of each network with three two-dimensional toy examples for a binary classification task, focusing on model complexity (measured by the number of parameters) and the size of the training dataset. Our results show that the $\mathbb{Z}_{2}\times\mathbb{Z}_{2}$ EQNN and the QNN provide superior performance for smaller parameter sets and modest training data samples.

keywords:

quantum computing; deep learning; quantum machine learning; equivariance; invariance; supervised learning; classification; particle physics; Large Hadron Collider

Title: $\mathbb{Z}_{2}\times\mathbb{Z}_{2}$ Equivariant Quantum Neural Networks: Benchmarking against Classical Neural Networks

Authors: Zhongtian Dong ${}^{1,*,\dagger}$, Marçal Comajoan Cara ${}^{2,\dagger}$, Gopal Ramesh Dahale ${}^{3,\dagger}$, Roy T. Forestano ${}^{4,\dagger}$, Sergei Gleyzer ${}^{5,\dagger}$, Daniel Justice ${}^{6,\dagger}$, Kyoungchul Kong ${}^{1,\dagger}$, Tom Magorsch ${}^{7,\dagger}$, Konstantin T. Matchev ${}^{4,\dagger}$, Katia Matcheva ${}^{4,\dagger}$ and Eyup B. Unlu ${}^{4,\dagger}$

Correspondence: cdong@ku.edu (Z.D.)

${}^{\dagger}$ These authors contributed equally to this work.

Academic Editor: Mariam Zomorodi

Received: 25 January 2024; Revised: 2 March 2024; Accepted: 8 March 2024; Published: 13 March 2024

Volume 13, Issue 3, Article 188; https://doi.org/10.3390/axioms13030188

MSC: 81P68; 68Q12

1 Introduction

The rapidly evolving convergence of machine learning (ML) and high-energy physics (HEP) offers a range of opportunities and challenges for the HEP community. Beyond simply applying traditional ML methods to HEP issues, a fresh cohort of experts skilled in both areas is pioneering innovative and potentially groundbreaking approaches. ML methods based on symmetries play a crucial role in improving data analysis as well as expediting the discovery of new physics Shanahan et al. (2022); Feickert and Nachman (2021). In particular, classical Equivariant Neural Networks (ENNs) exploit the underlying symmetry structure of the data, ensuring that the input and output transform consistently under the symmetry Cohen and Welling (2016). ENNs have been widely used in various applications including deep convolutional neural networks for computer vision Krizhevsky et al. (2012), AlphaFold for protein structure prediction Jumper et al. (2021), Lorentz equivariant neural networks for particle physics Bogatskiy et al. (2020), and many other HEP applications Cohen et al. (2019); Boyda et al. (2021); Favoni et al. (2022); Dolan and Ore (2021); Bulusu et al. (2021).

Meanwhile, the rise of readily available noisy intermediate-scale quantum computers Preskill (2018) has sparked considerable interest in using quantum algorithms to tackle high-energy physics problems. Modern quantum computers boast impressive quantum volume and are capable of executing highly complex computations, driving a collaborative effort within the community Feynman (1982); Georgescu et al. (2014) to explore their applications in quantum physics, particularly in addressing theoretical challenges in particle physics. Recent research on quantum algorithms for particle physics at the Large Hadron Collider (LHC) covers a range of tasks, including the evaluation of Feynman loop integrals Ramírez-Uribe et al. (2022), simulation of parton showers Bepari et al. (2022) and structure Li et al. (2022), development of quantum algorithms for helicity amplitude assessments Bepari et al. (2021), and simulation of quantum field theories Jordan et al. (2014); Preskill (2018); Bauer et al. (2021); Abel et al. (2021); Abel and Spannowsky (2021); Davoudi et al. (2021).

An intriguing prospect in this realm is the emerging field of quantum machine learning (QML), which harnesses the computational capabilities of quantum devices for machine learning tasks. With classical machine learning algorithms already proving effective for various applications at the LHC, it is natural to explore whether QML can enhance these classical approaches Mott et al. (2017); Blance and Spannowsky (2020); Wu et al. (2021); Blance and Spannowsky (2021); Abel et al. (2022); Wu et al. (2021); Chen et al. (2021); Terashi et al. (2021); Araz and Spannowsky (2022); Ngairangbam et al. (2022). In recent years, significant progress has also been made on the quantum counterparts of ENNs, namely Equivariant Quantum Neural Networks (EQNNs) Nguyen et al. (2022); Meyer et al. (2023); West et al. (2023); Skolik et al. (2023); Chang et al. (2023).

In this paper, we benchmark the performance of EQNNs against various classical and/or non-equivariant alternatives for three two-dimensional toy datasets, which exhibit a $\mathbb{Z}_{2}\times\mathbb{Z}_{2}$ symmetry structure. Such patterns often appear in high-energy physics data, e.g., as kinematic boundaries in the high-dimensional phase space describing the final state Kim (2010); Franceschini et al. (2022). By a clever choice of the kinematic variables for the analysis, these boundaries can be preserved in projections onto a lower-dimensional feature space Kersting (2009); Bisset et al. (2011); Burns et al. (2009); Debnath et al. (2016, 2017). For example, one can form various invariant mass combinations for the generic decay chain considered in Ref. Burns et al. (2009), $D\to jC\to j\ell^{\pm}_{n}B\to j\ell_{n}^{\pm}\ell_{f}^{\mp}A$, where $A$, $B$, $C$, and $D$ are hypothetical particles in new physics beyond the standard model with masses $\{m_{A},m_{B},m_{C},m_{D}\}$, while the corresponding standard model decay products consist of a jet $j$, a “near” lepton $\ell_{n}^{\pm}$, and a “far” lepton $\ell_{f}^{\pm}$. The two-dimensional (bivariate) distribution $\frac{d^{2}\Gamma}{dR_{ij}dR_{kl}}$ shows structures similar to those in Figures 1–3, where $R_{ij}=\frac{m_{i}^{2}}{m_{j}^{2}}$ is the ratio of squared masses. Symmetric, anti-symmetric, or non-symmetric structures provide information about the particle masses involved in the cascade decays.

Figure 1: Pictorial illustration of the first dataset used in this study, the symmetric case of Equation (1).

In this study, we consider simplified two-dimensional datasets that mimic the data arising in such projections. This setup allows us to focus on the comparison between different methods, avoiding complications that arise when dealing with actual particle physics simulation data, such as sampling statistics, parton distribution functions, unknown particle mass spectra, unknown widths, detector effects, etc. We explore EQNNs and benchmark them against classical neural network models. We find that the variational quantum circuits learn the data better than their classical counterparts when the number of parameters and the size of the training dataset are small.

2 Dataset Description

In all three examples, we consider two-dimensional data $(x_{1},x_{2})$ on the square $-1\leq x_{i}\leq 1$. The data points belong to two classes: $y=+1$ (blue points) and $y=-1$ (red points).

(i)

Symmetric case:

In the first example (Figure 1), the labels are generated by the function

\begin{aligned}
y(x_{1},x_{2}) &= 2H\left(R-\sqrt{(x_{1}+1)^{2}+(x_{2}-1)^{2}}\right)\\
&\quad+2H\left(R-\sqrt{(x_{1}-1)^{2}+(x_{2}+1)^{2}}\right)-1,
\end{aligned}

(1)

where $H(x)$ is the Heaviside step function and for definiteness we choose $R=1.1$ . The function (1) respects a $\mathbb{Z}_{2}\times\mathbb{Z}_{2}$ symmetry, where the first $\mathbb{Z}_{2}$ is given by a reflection about the $x_{1}=x_{2}$ diagonal

x_{1}\to x_{2},\qquad x_{2}\to x_{1},\qquad y\to y,

(2)

while the second $\mathbb{Z}_{2}$ corresponds to a reflection about the $x_{1}=-x_{2}$ diagonal

x_{1}\to-x_{2},\qquad x_{2}\to-x_{1},\qquad y\to y.

(3)

This $\mathbb{Z}_{2}\times\mathbb{Z}_{2}$ example was studied in Ref. Meyer et al. (2023) and we shall refer to it as the symmetric case since the $y$ label is invariant.
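As a minimal sketch of how such a dataset could be generated, the snippet below samples points uniformly on the square and labels them with Equation (1), then checks numerically that the labels are unchanged under the reflections (2) and (3). The uniform sampling, the random seed, the sample size, and the helper names `heaviside` and `label_symmetric` are assumptions made for illustration; they are not taken from the released code.

```python
import numpy as np

R = 1.1  # radius quoted in the text for the symmetric case


def heaviside(z):
    # Heaviside step function; the value at exactly z = 0 is a convention (assumed 1 here).
    return np.where(z >= 0, 1.0, 0.0)


def label_symmetric(x1, x2, radius=R):
    """Label function of Equation (1): +1 inside either corner disk, -1 elsewhere."""
    d_upper_left = np.sqrt((x1 + 1) ** 2 + (x2 - 1) ** 2)
    d_lower_right = np.sqrt((x1 - 1) ** 2 + (x2 + 1) ** 2)
    return 2 * heaviside(radius - d_upper_left) + 2 * heaviside(radius - d_lower_right) - 1


rng = np.random.default_rng(0)  # fixed seed, only to make the sketch reproducible
x = rng.uniform(-1.0, 1.0, size=(200, 2))  # assumed: uniform sampling on the square
y = label_symmetric(x[:, 0], x[:, 1])

# Reflection (2): (x1, x2) -> (x2, x1); reflection (3): (x1, x2) -> (-x2, -x1).
y_reflect_2 = label_symmetric(x[:, 1], x[:, 0])
y_reflect_3 = label_symmetric(-x[:, 1], -x[:, 0])

print("invariant under (2):", np.array_equal(y, y_reflect_2))
print("invariant under (3):", np.array_equal(y, y_reflect_3))
```

The two disks of radius $R=1.1$ are centered at opposite corners of the square and do not overlap, so the labels take only the values $\pm 1$.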

(ii)

Anti-symmetric case:

The second example is illustrated in Figure 2. The labels are generated by the function

\begin{aligned}
y(x_{1},x_{2}) &= H\left(-x_{1}\right)H\left(-x_{2}\right)+H\left(-x_{1}\right)H\left(x_{2}\right)H(x_{1}+x_{2})\\
&\quad-H\left(x_{1}\right)H\left(x_{2}\right)+H\left(x_{1}\right)H\left(-x_{2}\right)H(x_{1}+x_{2}).
\end{aligned}

(4)

The first $\mathbb{Z}_{2}$ is still realized as in (2). However, this time, the labels are flipped under a reflection along the $x_{1}=-x_{2}$ diagonal:

x_{1}\to-x_{2},\qquad x_{2}\to-x_{1},\qquad y\to-y,

(5)

which is why we shall refer to this case as anti-symmetric.

(iii)

Fully anti-symmetric case:

The last example is depicted in Figure 3. The labels are generated by the function

\begin{aligned}
y(x_{1},x_{2}) &= H\left(x_{1}\right)H\left(x_{2}\right)(2H\left(x_{1}-x_{2}\right)-1)+H\left(-x_{1}\right)H\left(-x_{2}\right)(2H\left(x_{2}-x_{1}\right)-1)\\
&\quad+H\left(-x_{1}\right)H\left(x_{2}\right)H\left(x_{1}+x_{2}\right)(2H\left(R-\sqrt{(x_{1}+1)^{2}+(x_{2}-1)^{2}}\right)-1)\\
&\quad+H\left(-x_{1}\right)H\left(x_{2}\right)H\left(-x_{1}-x_{2}\right)(1-2H\left(R-\sqrt{(x_{1}+1)^{2}+(x_{2}-1)^{2}}\right))\\
&\quad+H\left(x_{1}\right)H\left(-x_{2}\right)H\left(x_{1}+x_{2}\right)(1-2H\left(R-\sqrt{(x_{1}-1)^{2}+(x_{2}+1)^{2}}\right))\\
&\quad+H\left(x_{1}\right)H\left(-x_{2}\right)H\left(-x_{1}-x_{2}\right)(2H\left(R-\sqrt{(x_{1}-1)^{2}+(x_{2}+1)^{2}}\right)-1),
\end{aligned}

(6)

where $H(x)$ is the Heaviside step function and for definiteness we choose $R=1$. In this case, the labels are flipped under reflections about both the $x_{1}=-x_{2}$ diagonal and the $x_{1}=x_{2}$ diagonal, which is why we shall refer to this case as fully anti-symmetric. As we will see later, it is straightforward to incorporate both symmetric and anti-symmetric properties in variational quantum circuits, while it is not obvious how to implement the anti-symmetric case in classical neural networks.

3 Network Architectures

To assess the importance of embedding the symmetry in the network, and to compare the classical and quantum versions of the networks, we study the performance of the following four different architectures: (i) Deep Neural Network (DNN), (ii) Equivariant Neural Network (ENN), (iii) Quantum Neural Network (QNN), and (iv) Equivariant Quantum Neural Network (EQNN). In each case, we adjust the hyperparameters to ensure that the number of network parameters is roughly the same.

(i)

Deep Neural Networks:

In our DNN, for the symmetric (anti-symmetric) case, we use one (two) hidden layer(s) with four neurons. For both types of classical networks (DNN and ENN), we use the softmax activation function, the Adam optimizer with a learning rate of $0.1$, and the binary cross-entropy loss.

(ii)

Equivariant Neural Networks:

A given map $f:x\in X\to f(x)\in Y$ between an input space $X$ and an output space $Y$ is said to be equivariant under a group $G$ if it satisfies the following relation:

f(g_{\text{in}}(x))=g_{\text{out}}(f(x)),

(7)

where $g_{\text{in}}$ ( $g_{\text{out}}$ ) is a representation of a group element $g\in G$ acting on the input (output) space. In the special case when $g_{\text{out}}$ is the trivial representation, the map is called invariant under the group $G$ , i.e., a symmetry transformation acting on the input data $x$ does not change the output of the map. The goal of ENNs, or equivariant learning models in general, is to design a trainable map $f$ which would always satisfy Equation (7). In tasks where the symmetry is known, such equivariant models are believed to have an advantage in terms of the number of parameters and training complexity. Several studies in high-energy physics have attempted to use classical equivariant neural networks Bogatskiy et al. (2020, 2022); Hao et al. (2023); Buhmann et al. (2023); Batatia et al. (2023). Our ENN model utilizes four $\mathbb{Z}_{2}\times\mathbb{Z}_{2}$ symmetric copies for each data point, which are fed into the input layer, followed by one equivariant layer with three (two) neurons and one dense layer with four (four) neurons in the symmetric (anti-symmetric) case.
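To make the invariant special case of Equation (7) concrete, the following sketch builds the four $\mathbb{Z}_{2}\times\mathbb{Z}_{2}$ copies of a data point from the reflections (2) and (3) and averages an arbitrary function over them. Orbit averaging is used here only as a simple, assumed stand-in for an equivariant layer; it is not claimed to be the architecture of the ENN described above, and `toy_model` is a hypothetical placeholder for a trained network.

```python
import numpy as np


def z2xz2_orbit(x):
    """Four copies of a point (x1, x2) under the group generated by reflections (2) and (3)."""
    x1, x2 = x
    return np.array([
        [x1, x2],    # identity
        [x2, x1],    # reflection about x1 = x2, Equation (2)
        [-x2, -x1],  # reflection about x1 = -x2, Equation (3)
        [-x1, -x2],  # composition of the two reflections
    ])


def symmetrize(f):
    """Turn any function f(x) into an invariant one by averaging over the orbit."""
    return lambda x: np.mean([f(p) for p in z2xz2_orbit(x)])


def toy_model(p):
    # Hypothetical, deliberately non-symmetric stand-in for a trained network.
    return np.tanh(0.7 * p[0] - 0.2 * p[1] + 0.1)


invariant_model = symmetrize(toy_model)
point = np.array([0.3, -0.8])

# The symmetrized model returns the same value on all four symmetric copies.
outputs = [invariant_model(p) for p in z2xz2_orbit(point)]
print(outputs)
```

Because every element of the orbit generates the same set of four points, the averaged output satisfies $f(g_{\text{in}}(x))=f(x)$ up to floating-point rounding, which is Equation (7) with a trivial $g_{\text{out}}$.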

(iii)

Quantum Neural Networks:

For the QNN, we utilize the one-qubit data-reuploading model Pérez-Salinas et al. (2020), as shown in Figure 4, with depth four (eight) for the symmetric (anti-symmetric and fully anti-symmetric) case, using the angle embedding and three parameters at each depth. This choice leads to a similar number of parameters as in the classical networks. We use the Adam optimizer and the loss

L_{QNN}=y(1-|\mathinner{\langle{\psi}|}O_{1}\mathinner{|{\psi}\rangle}|)^{2}+(1-y)(1-|\mathinner{\langle{\psi}|}O_{2}\mathinner{|{\psi}\rangle}|)^{2}

(8)

for any choice of two orthogonal operators $O_{1}$ and $O_{2}$ (see Ref. Ahmed (2019) for more details). For all three datasets considered in this paper, we use

O_{1}=\begin{pmatrix}1&0\\ 0&0\end{pmatrix},\quad\quad O_{2}=\begin{pmatrix}0&0\\ 0&1\end{pmatrix}.

(9)
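The loss of Equation (8) with the projectors of Equation (9) can be evaluated directly on a state vector, as in the sketch below. It assumes labels $y\in\{0,1\}$ and a normalized one-qubit state; the function names `expectation` and `qnn_loss` and the example states are chosen for illustration and do not come from the released code.

```python
import numpy as np

# Projectors of Equation (9) onto the computational basis states |0> and |1>.
O1 = np.array([[1, 0], [0, 0]], dtype=complex)
O2 = np.array([[0, 0], [0, 1]], dtype=complex)


def expectation(psi, op):
    """<psi| op |psi> for a normalized state vector psi."""
    return np.vdot(psi, op @ psi)


def qnn_loss(psi, y):
    """Loss of Equation (8) for a single sample with label y in {0, 1} (assumed encoding)."""
    f1 = abs(expectation(psi, O1))
    f2 = abs(expectation(psi, O2))
    return y * (1 - f1) ** 2 + (1 - y) * (1 - f2) ** 2


ket0 = np.array([1, 0], dtype=complex)
ket_plus = np.array([1, 1], dtype=complex) / np.sqrt(2)

loss_correct = qnn_loss(ket0, y=1)        # state entirely on |0>, label 1
loss_wrong = qnn_loss(ket0, y=0)          # state entirely on |0>, label 0
loss_undecided = qnn_loss(ket_plus, y=1)  # equal superposition

print(loss_correct, loss_wrong, loss_undecided)
```

The loss vanishes when the state lies entirely in the subspace of the observable associated with the true label and grows as the overlap with that subspace shrinks.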

(iv)

Equivariant Quantum Neural Networks:

In EQNN models, symmetry transformations acting on the embedding space of input features are realized as finite-dimensional unitary transformations $U_{g}$, $g\in G$. Consider the simplest case where one trainable operator $U(\theta,x)$ acts on a state $\mathinner{|{\psi}\rangle}$: $U(\theta,x)\mathinner{|{\psi}\rangle}$. If for a symmetry transformation $U_{g}$, the condition

U(\theta,x)\,U_{g}\mathinner{|{\psi}\rangle}=U_{g}\,U(\theta,x)\mathinner{|{\psi}\rangle},

(10)

is satisfied, then the operator $U$ is equivariant, i.e., the equivariant gate should commute with the symmetry. In general, the $U_{g}$ operators on the two sides of Equation (10) do not necessarily have to be in the same representation but are often assumed so for simplicity. The output of a QNN is the measurement of the expectation value of the state with respect to some observable $O$ . If the gates are equivariant and we apply some symmetry transformation $U_{g}$ , then this is equivalent to measuring the observable $U_{g}^{\dagger}OU_{g}$ . Hence, if $O$ commutes with the symmetry $U_{g}$ , the model as a whole would be invariant under $U_{g}$ , which is the case in our symmetric example. Otherwise the model is equivariant, as in our anti-symmetric example.

Our EQNN uses the two-qubit quantum circuit depicted in Figure 5 for depth 1. This circuit is repeated five (ten) times with different parameters for the symmetric (anti-symmetric and fully anti-symmetric) case. The two $R_{Z}$ gates embed $x_{1}$ and $x_{2}$, respectively. The $R_{X}$ gates share the same parameter ($\theta_{1}$) and the $R_{ZZ}$ gate uses another parameter ($\theta_{2}$). The invariant model (for the symmetric case) uses the same observable ${O}$ for both classes in the data. In the anti-symmetric case, we use two different observables ${O}_{1}$ and ${O}_{2}$ that correspond to each label. They transform into one another under the reflection $g_{r}$, i.e., $U^{\dagger}_{g_{r}}{O}_{1}U_{g_{r}}={O}_{2}$.

In the symmetric case, we use binary cross-entropy loss, assuming the true label $y$ is either $0$ or $1$ ,

L_{EQNN}^{\rm symm}=y\log\Big{(}|\mathinner{\langle{\psi}|}O\mathinner{|{\psi}\rangle}|\Big{)}+(1-y)\log\Big{(}1-|\mathinner{\langle{\psi}|}O\mathinner{|{\psi}\rangle}|\Big{)}\,.

(11)

The observable $O$ and the reflection $U_{g_{r}}$ along $x_{1}=-x_{2}$ are defined as follows:

O=\dfrac{1}{4}\begin{pmatrix}1&1&1&1\\ 1&1&1&1\\ 1&1&1&1\\ 1&1&1&1\end{pmatrix},\quad\quad U_{g_{r}}=\begin{pmatrix}0&0&0&1\\ 0&0&1&0\\ 0&1&0&0\\ 1&0&0&0\end{pmatrix}.

(12)
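The commutation condition of Equation (10) can be checked numerically for the trainable gates described above. The sketch below assumes the standard conventions $R_{X}(\theta)=e^{-i\theta X/2}$ and $R_{ZZ}(\theta)=e^{-i\theta Z\otimes Z/2}$, an assumed gate order within one layer, and a SWAP matrix as an assumed representation of the qubit exchange; only $U_{g_{r}}$ and $O$ are taken from Equation (12).

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def rot(pauli, angle):
    """exp(-i * angle / 2 * pauli), valid for any matrix with pauli @ pauli = identity."""
    dim = pauli.shape[0]
    return np.cos(angle / 2) * np.eye(dim) - 1j * np.sin(angle / 2) * pauli


def trainable_layer(theta1, theta2):
    """R_X(theta1) on both qubits followed by R_ZZ(theta2); the gate order is an assumption."""
    rx_both = np.kron(rot(X, theta1), rot(X, theta1))
    rzz = rot(np.kron(Z, Z), theta2)
    return rzz @ rx_both


U_gr = np.kron(X, X)         # reproduces the anti-diagonal matrix of Equation (12)
O = np.full((4, 4), 0.25)    # observable of Equation (12)
SWAP = np.array([[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]], dtype=complex)

U = trainable_layer(0.37, -1.2)  # arbitrary illustrative parameter values
commutes_with_U_gr = np.allclose(U @ U_gr, U_gr @ U)
commutes_with_swap = np.allclose(U @ SWAP, SWAP @ U)
observable_invariant = np.allclose(U_gr.conj().T @ O @ U_gr, O)

# Giving the two R_X gates independent angles breaks the qubit-exchange symmetry.
U_untied = rot(np.kron(Z, Z), -1.2) @ np.kron(rot(X, 0.37), rot(X, 0.9))
untied_commutes_with_swap = np.allclose(U_untied @ SWAP, SWAP @ U_untied)

print(commutes_with_U_gr, commutes_with_swap, observable_invariant, untied_commutes_with_swap)
```

Under these assumptions, sharing $\theta_{1}$ between the two $R_{X}$ gates is what keeps the layer symmetric under qubit exchange, and each repetition of the layer contributes two trainable parameters.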

In the anti-symmetric and fully anti-symmetric cases, we use the same loss as for the QNN,

L_{EQNN}^{\rm anti-symm}=y(1-|\mathinner{\langle{\psi}|}O_{1}\mathinner{|{\psi}\rangle}|)^{2}+(1-y)(1-|\mathinner{\langle{\psi}|}O_{2}\mathinner{|{\psi}\rangle}|)^{2}\,.

(13)

For the anti-symmetric case, $O_{1}$ ($O_{2}$) is the observable corresponding to $y=1$ ($y=0$):

O_{1}=\dfrac{1}{4}\begin{pmatrix}1&1&1&-1\\ 1&1&1&-1\\ 1&1&1&-1\\ -1&-1&-1&1\end{pmatrix},\quad\quad O_{2}=\dfrac{1}{4}\begin{pmatrix}1&-1&-1&-1\\ -1&1&1&1\\ -1&1&1&1\\ -1&1&1&1\end{pmatrix}.

(14)
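The stated relation $U^{\dagger}_{g_{r}}O_{1}U_{g_{r}}=O_{2}$ can be verified numerically from the matrices in Equations (12) and (14), as in this short sketch. The random test state and its seed are arbitrary choices made for illustration.

```python
import numpy as np

# Reflection of Equation (12) and observables of Equation (14).
U_gr = np.array([[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]], dtype=float)
O1 = 0.25 * np.array([[1, 1, 1, -1], [1, 1, 1, -1], [1, 1, 1, -1], [-1, -1, -1, 1]], dtype=float)
O2 = 0.25 * np.array([[1, -1, -1, -1], [-1, 1, 1, 1], [-1, 1, 1, 1], [-1, 1, 1, 1]], dtype=float)

# The reflection exchanges the two observables.
maps_O1_to_O2 = np.allclose(U_gr.T @ O1 @ U_gr, O2)
maps_O2_to_O1 = np.allclose(U_gr.T @ O2 @ U_gr, O1)

# Both are projectors, and they are mutually orthogonal.
are_projectors = np.allclose(O1 @ O1, O1) and np.allclose(O2 @ O2, O2)
are_orthogonal = np.allclose(O1 @ O2, 0)

# Measuring O1 on a reflected state equals measuring O2 on the original state.
rng = np.random.default_rng(1)  # arbitrary seed for an illustrative random state
psi = rng.normal(size=4) + 1j * rng.normal(size=4)
psi /= np.linalg.norm(psi)
lhs = np.vdot(U_gr @ psi, O1 @ (U_gr @ psi)).real
rhs = np.vdot(psi, O2 @ psi).real

print(maps_O1_to_O2, maps_O2_to_O1, are_projectors, are_orthogonal, np.isclose(lhs, rhs))
```

Since the two observables are swapped by the reflection, the roles of the two terms in the loss of Equation (13) are exchanged when the input is reflected, which matches the label flip of Equation (5).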

For the fully anti-symmetric case, we use another set of observables, chosen so that each transforms into the other under a reflection about either of the two diagonals. They are given as follows:

O_{1}=\dfrac{1}{4}\begin{pmatrix}1&-1&1&1\\ -1&1&-1&1\\ 1&-1&1&-1\\ 1&1&-1&1\end{pmatrix},\quad\quad O_{2}=\dfrac{1}{4}\begin{pmatrix}1&1&-1&1\\ 1&1&-1&-1\\ -1&-1&1&1\\ 1&-1&1&1\end{pmatrix}.

(15)

Since the labels are anti-symmetric with respect to each of the diagonals, the result is invariant if both reflections are applied. It is difficult to build a classical equivariant neural network using these anti-symmetries, since classical equivariant models are typically built on the assumption that the target is invariant under certain transformations. In the theory of classical equivariant machine learning, models that transform non-trivially under the symmetry group are often discussed mathematically but rarely implemented in code. For our classical model on the partially anti-symmetric data, we only implemented the invariant part of the symmetry ($\mathbb{Z}_{2}$) and ignored the anti-symmetric part. While it may not be impossible to treat such asymmetric cases in classical neural networks, the implementation can be quite involved.

On the other hand, it is straightforward to build quantum equivariant models. For this purpose, we only need to exploit the transformation properties of the observables. If one observable transforms into the other under the transformation of interest (reflection along the diagonal in this case), then a measurement of one observable is equivalent to a measurement of the other observable on the transformed input.

We can consider equivariant quantum models with anti-symmetric transformations from the point of view of representation theory. The fully invariant (symmetric) case can be viewed as a model that transforms under the trivial representation of the group, where none of the transformations defined by the group change the output of the model. The asymmetric (either anti-symmetric or fully anti-symmetric) cases that we considered here can be interpreted as models that transform under some other (one-dimensional) representation of the group, where some transformations change the output of the model to its opposite value, while other transformations do not change the output.

4 Results

The left panels in Figure 6 show the receiver operating characteristic (ROC) curves for each network with $N_{\rm train}=200$ and $N_{\rm test}=2000$ samples for the symmetric (top), anti-symmetric (middle), and fully anti-symmetric (bottom) datasets. The results for the DNN, ENN, QNN, and EQNN are shown in (green, dotted), (yellow, dotdashed), (red, dashed), and (blue, solid), respectively. As expected, networks with an equivariance structure (EQNN and ENN) outperform the corresponding networks (QNN and DNN) without the symmetry. We also observe that the quantum networks perform better than their classical analogs. In the legends, numerical values followed by network acronyms represent the number of parameters used for each network. For the symmetric example, the EQNN uses only 10 parameters; thus, for a fair comparison, we constructed the other networks with ${\cal O}$(10) parameters as well. For the anti-symmetric example, we use 20 parameters for the EQNN.

The evolution of the accuracy during training and testing is shown in the right panels of Figure 6. The accuracy converges faster (after only 5 epochs) for the QNN and EQNN in comparison to their classical counterparts (10–20 epochs). The same color-scheme is used, but this time, solid curves represent training accuracy, while dashed curves show test accuracy.

To further quantify the performance of our quantum networks, in Figure 7, we show the AUC (area under the ROC curve) as a function of the number of parameters (left panels) with a fixed size of the training data ($N_{\rm train}=200$), and as a function of the number of training samples (right panels) with a fixed number of parameters ($N_{\rm params}=20$). The top, middle, and bottom panels show results for the symmetric, anti-symmetric, and fully anti-symmetric datasets, respectively. As the number of parameters increases, the performance of all networks improves. All AUC values become similar when $N_{\rm params}\approx 20$ ($N_{\rm params}\approx 40$) for the symmetric (anti-symmetric) case. As shown in the right panels, the performances of all networks become comparable to each other for the symmetric and anti-symmetric examples once the size of the training data reaches $\sim 400$; the fully anti-symmetric case is the exception. We observe that from the top panel to the bottom panel, the relative improvement from the QNN to the EQNN grows, indicating the importance of implementing the symmetry in the network. A similar relative improvement exists from the DNN to the QNN, emphasizing the importance of quantum algorithms. Note that the ENN curves are missing in the bottom panels of both Figures 6 and 7. This is due to the non-trivial implementation of the anti-symmetric property in classical ENNs.

Finally, Table 1 shows the accuracy of the DNN for the fully anti-symmetric dataset. The different rows and columns represent different choices of the number of parameters and the number of training samples, respectively. These numbers can be compared against those in the bottom-right panel of Figure 7. The EQNN achieves 0.95 accuracy with 20 parameters and 200 training samples, while the DNN requires more parameters and/or more training samples.

Table 1: Accuracy of DNN for the fully anti-symmetric dataset. The different rows (columns) represent different choices of the number of parameters (the number of training samples).

$N_{\rm params}$ \ $N_{\rm train}$	100	200	300	400	500	600	700	800	900
105	0.764	0.855	0.879	0.963	0.973	0.981	0.981	0.982	0.988
85	0.669	0.743	0.804	0.953	0.951	0.978	0.986	0.946	0.981
67	0.587	0.722	0.695	0.946	0.886	0.9632	0.975	0.944	0.980
51	0.624	0.655	0.856	0.926	0.908	0.876	0.846	0.974	0.986
37	0.596	0.696	0.639	0.782	0.747	0.816	0.849	0.922	0.952

5 Conclusions

In this paper, we examined the performance of Equivariant Quantum Neural Networks and Quantum Neural Networks, compared against their classical counterparts, Equivariant Neural Networks and Deep Neural Networks, considering three toy examples for a binary classification task. Our study demonstrates that EQNNs and QNNs outperform their classical counterparts, particularly in scenarios with fewer parameters and smaller training datasets. This highlights the potential of quantum-inspired architectures in resource-constrained settings. This point has been emphasized in a similar recent study, Ref. Chang et al. (2023), which showed that an EQNN outperforms the non-equivariant one in terms of generalization power, especially with a small training set size. We note a more significant enhancement in the performance of the EQNN and QNN compared to the ENN and DNN, particularly evident in the anti-symmetric example rather than the symmetric one. This underscores the robustness of quantum algorithms. The code used for this study is publicly available at https://github.com/ZhongtianD/EQNN/tree/main (accessed on 9 March 2024).

While our current study has primarily focused on an EQNN with discrete symmetries, it is crucial to acknowledge the significant role that continuous symmetries, such as Lorentz symmetry or gauge symmetries, play in particle physics. In our future research, we aim to compare an EQNN with continuous symmetries against classical neural networks. Exploring more complex datasets with high-dimensional features is another direction we plan to pursue. However, handling such examples would necessitate an increase in the number of network parameters, prompting an investigation into related issues like overparameterization, barren plateaus, and others.

Author Contributions

Conceptualization, Z.D.; methodology, M.C.C., G.R.D., Z.D., R.T.F., S.G., D.J., K.K., T.M., K.T.M., K.M., and E.B.U.; software, Z.D.; validation, M.C.C., G.R.D., Z.D., R.T.F., T.M., and E.B.U.; formal analysis, Z.D.; investigation, M.C.C., G.R.D., Z.D., R.T.F., T.M., and E.B.U.; resources, Z.D., K.T.M., and K.M.; data curation, G.R.D., S.G., and T.M.; writing—original draft preparation, Z.D.; writing—review and editing, S.G., D.J., K.K., K.T.M., and K.M.; visualization, Z.D.; supervision, S.G., D.J., K.K., K.T.M., and K.M.; project administration, S.G., D.J., K.K., K.T.M., and K.M.; funding acquisition, S.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research used resources of the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231 using NERSC award NERSC DDR-ERCAP0025759. SG is supported in part by the U.S. Department of Energy (DOE) under Award No. DE-SC0012447. KM is supported in part by the U.S. Department of Energy award number DE-SC0022148. KK is supported in part by the US DOE DE-SC0024407. CD is supported in part by the College of Liberal Arts and Sciences Research Fund at the University of Kansas. CD, RF, EU, MCC, and TM were participants in the 2023 Google Summer of Code.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The dataset used in this analysis was sampled from Equations (1), (4) and (6).

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:

API	Application Programming Interface
AUC	Area Under the Curve
DNN	Deep Neural Network
ENN	Equivariant Neural Network
EQNN	Equivariant Quantum Neural Network
HEP	High-Energy Physics
LHC	Large Hadron Collider
MDPI	Multidisciplinary Digital Publishing Institute
ML	Machine Learning
NN	Neural Network
QML	Quantum Machine Learning
QNN	Quantum Neural Network
ROC	Receiver Operating Characteristic


References

Shanahan et al. (2022) Shanahan, P.; Terao, K.; Whiteson, D. Snowmass 2021 Computational Frontier CompF03 Topical Group Report: Machine Learning. arXiv, 2022. arXiv:2209.07559.
Feickert and Nachman (2021) Feickert, M.; Nachman, B. A Living Review of Machine Learning for Particle Physics. arXiv, 2021. arXiv:2102.02770.
Cohen and Welling (2016) Cohen, T.; Welling, M. Group Equivariant Convolutional Networks. In Proceedings of the 33rd International Conference on Machine Learning, New York, NY, USA, 20–22 Jun 2016; Balcan, M.F.; Weinberger, K.Q., Eds.; Volume 48, pp. 2990–2999.
Krizhevsky et al. (2012) Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. In Proceedings of the Advances in Neural Information Processing Systems, Lake Tahoe, NV, USA, 3–6 December 2012; Pereira, F.; Burges, C.; Bottou, L.; Weinberger, K., Eds.; Curran Associates, Inc.: Sydney, New South Wales, 2012, Volume 25.
Jumper et al. (2021) Jumper, J.M.; Evans, R.; Pritzel, A.; Green, T.; Figurnov, M.; Ronneberger, O.; Tunyasuvunakool, K.; Bates, R.; Zídek, A.; Potapenko, A.; et al. Highly accurate protein structure prediction with AlphaFold. Nature 2021, 596, 583–589.
Bogatskiy et al. (2020) Bogatskiy, A.; Anderson, B.; Offermann, J.; Roussi, M.; Miller, D.; Kondor, R. Lorentz Group Equivariant Neural Network for Particle Physics. In Proceedings of the 37th International Conference on Machine Learning, Virtual, 13–18 July 2020; III, H.D.; Singh, A., Eds.; Volume 119, pp. 992–1002.
Cohen et al. (2019) Cohen, T.S.; Weiler, M.; Kicanaoglu, B.; Welling, M. Gauge Equivariant Convolutional Networks and the Icosahedral CNN. arXiv, 2019, 2. arXiv:1902.04615.
Boyda et al. (2021) Boyda, D.; Kanwar, G.; Racanière, S.; Rezende, D.J.; Albergo, M.S.; Cranmer, K.; Hackett, D.C.; Shanahan, P.E. Sampling using $SU(N)$ gauge equivariant flows. Phys. Rev. D 2021, 103, 074504. https://doi.org/10.1103/PhysRevD.103.074504.
Favoni et al. (2022) Favoni, M.; Ipp, A.; Müller, D.I.; Schuh, D. Lattice Gauge Equivariant Convolutional Neural Networks. Phys. Rev. Lett. 2022, 128, 032003. https://doi.org/10.1103/PhysRevLett.128.032003.
Dolan and Ore (2021) Dolan, M.J.; Ore, A. Equivariant Energy Flow Networks for Jet Tagging. Phys. Rev. D 2021, 103, 074022. https://doi.org/10.1103/PhysRevD.103.074022.
Bulusu et al. (2021) Bulusu, S.; Favoni, M.; Ipp, A.; Müller, D.I.; Schuh, D. Generalization capabilities of translationally equivariant neural networks. Phys. Rev. D 2021, 104, 074504. https://doi.org/10.1103/PhysRevD.104.074504.
Preskill (2018) Preskill, J. Quantum Computing in the NISQ era and beyond. Quantum 2018, 2, 79. https://doi.org/10.22331/q-2018-08-06-79.
Feynman (1982) Feynman, R.P. Simulating physics with computers. Int. J. Theor. Phys. 1982, 21, 467–488. https://doi.org/10.1007/BF02650179.
Georgescu et al. (2014) Georgescu, I.M.; Ashhab, S.; Nori, F. Quantum Simulation. Rev. Mod. Phys. 2014, 86, 153. https://doi.org/10.1103/RevModPhys.86.153.
Ramírez-Uribe et al. (2022) Ramírez-Uribe, S.; Rentería-Olivo, A.E.; Rodrigo, G.; Sborlini, G.F.R.; Vale Silva, L. Quantum algorithm for Feynman loop integrals. JHEP 2022, 5, 100. https://doi.org/10.1007/JHEP05(2022)100.
Bepari et al. (2022) Bepari, K.; Malik, S.; Spannowsky, M.; Williams, S. Quantum walk approach to simulating parton showers. Phys. Rev. D 2022, 106, 056002. https://doi.org/10.1103/PhysRevD.106.056002.
Li et al. (2022) Li, T.; Guo, X.; Lai, W.K.; Liu, X.; Wang, E.; Xing, H.; Zhang, D.B.; Zhu, S.L. Partonic collinear structure by quantum computing. Phys. Rev. D 2022, 105, L111502. https://doi.org/10.1103/PhysRevD.105.L111502.
Bepari et al. (2021) Bepari, K.; Malik, S.; Spannowsky, M.; Williams, S. Towards a quantum computing algorithm for helicity amplitudes and parton showers. Phys. Rev. D 2021, 103, 076020. https://doi.org/10.1103/PhysRevD.103.076020.
Jordan et al. (2014) Jordan, S.P.; Lee, K.S.M.; Preskill, J. Quantum Algorithms for Fermionic Quantum Field Theories. arXiv, 2014. arXiv:1404.7115.
Preskill (2018) Preskill, J. Simulating quantum field theory with a quantum computer. PoS 2018, LATTICE2018, 024. https://doi.org/10.22323/1.334.0024.
Bauer et al. (2021) Bauer, C.W.; de Jong, W.A.; Nachman, B.; Provasoli, D. Quantum Algorithm for High Energy Physics Simulations. Phys. Rev. Lett. 2021, 126, 062001. https://doi.org/10.1103/PhysRevLett.126.062001.
Abel et al. (2021) Abel, S.; Chancellor, N.; Spannowsky, M. Quantum computing for quantum tunneling. Phys. Rev. D 2021, 103, 016008. https://doi.org/10.1103/PhysRevD.103.016008.
Abel and Spannowsky (2021) Abel, S.; Spannowsky, M. Quantum-Field-Theoretic Simulation Platform for Observing the Fate of the False Vacuum. PRX Quantum 2021, 2, 010349. https://doi.org/10.1103/PRXQuantum.2.010349.
Davoudi et al. (2021) Davoudi, Z.; Linke, N.M.; Pagano, G. Toward simulating quantum field theories with controlled phonon-ion dynamics: A hybrid analog-digital approach. Phys. Rev. Res. 2021, 3, 043072. https://doi.org/10.1103/PhysRevResearch.3.043072.
Mott et al. (2017) Mott, A.; Job, J.; Vlimant, J.R.; Lidar, D.; Spiropulu, M. Solving a Higgs optimization problem with quantum annealing for machine learning. Nature 2017, 550, 375–379. https://doi.org/10.1038/nature24047.
Blance and Spannowsky (2021) Blance, A.; Spannowsky, M. Unsupervised event classification with graphs on classical and photonic quantum computers. JHEP 2021, 8, 170. https://doi.org/10.1007/JHEP08(2021)170.
Wu et al. (2021) Wu, S.L.; Chan, J.; Guan, W.; Sun, S.; Wang, A.; Zhou, C.; Livny, M.; Carminati, F.; Di Meglio, A.; Li, A.C.; et al. Application of quantum machine learning using the quantum variational classifier method to high energy physics analysis at the LHC on IBM quantum computer simulator and hardware with 10 qubits. J. Phys. G 2021, 48, 125003. https://doi.org/10.1088/1361-6471/ac1391.
Blance and Spannowsky (2021) Blance, A.; Spannowsky, M. Quantum Machine Learning for Particle Physics using a Variational Quantum Classifier. JHEP 2021, 2, 212. https://doi.org/10.1007/JHEP02(2021)212.
Abel et al. (2022) Abel, S.; Blance, A.; Spannowsky, M. Quantum optimization of complex systems with a quantum annealer. Phys. Rev. A 2022, 106, 042607. https://doi.org/10.1103/PhysRevA.106.042607.
Wu et al. (2021) Wu, S.L.; Sun, S.; Guan, W.; Zhou, C.; Chan, J.; Cheng, C.L.; Pham, T.; Qian, Y.; Wang, A.Z.; Zhang, R.; et al. Application of quantum machine learning using the quantum kernel algorithm on high energy physics analysis at the LHC. Phys. Rev. Res. 2021, 3, 033221. https://doi.org/10.1103/PhysRevResearch.3.033221.
Chen et al. (2021) Chen, S.Y.C.; Wei, T.C.; Zhang, C.; Yu, H.; Yoo, S. Hybrid Quantum-Classical Graph Convolutional Network. arXiv, 2021. arXiv:2101.06189.
Terashi et al. (2021) Terashi, K.; Kaneda, M.; Kishimoto, T.; Saito, M.; Sawada, R.; Tanaka, J. Event Classification with Quantum Machine Learning in High-Energy Physics. Comput. Softw. Big Sci. 2021, 5, 2. https://doi.org/10.1007/s41781-020-00047-7.
Araz and Spannowsky (2022) Araz, J.Y.; Spannowsky, M. Classical versus quantum: Comparing tensor-network-based quantum circuits on Large Hadron Collider data. Phys. Rev. A 2022, 106, 062423. https://doi.org/10.1103/PhysRevA.106.062423.
Ngairangbam et al. (2022) Ngairangbam, V.S.; Spannowsky, M.; Takeuchi, M. Anomaly detection in high-energy physics using a quantum autoencoder. Phys. Rev. D 2022, 105, 095004. https://doi.org/10.1103/PhysRevD.105.095004.
Chang et al. (2023) Chang, S.Y.; Grossi, M.; Le Saux, B.; Vallecorsa, S. Approximately Equivariant Quantum Neural Network for $p4m$ Group Symmetries in Images. In Proceedings of the 2023 International Conference on Quantum Computing and Engineering, Bellevue, WA, USA, 17–22 September 2023.
Nguyen et al. (2022) Nguyen, Q.T.; Schatzki, L.; Braccia, P.; Ragone, M.; Coles, P.J.; Sauvage, F.; Larocca, M.; Cerezo, M. Theory for Equivariant Quantum Neural Networks. arXiv, 2022. arXiv:2210.08566.
Meyer et al. (2023) Meyer, J.J.; Mularski, M.; Gil-Fuster, E.; Mele, A.A.; Arzani, F.; Wilms, A.; Eisert, J. Exploiting Symmetry in Variational Quantum Machine Learning. PRX Quantum 2023, 4, 010328. https://doi.org/10.1103/PRXQuantum.4.010328.
West et al. (2023) West, M.T.; Sevior, M.; Usman, M. Reflection equivariant quantum neural networks for enhanced image classification. Mach. Learn. Sci. Technol. 2023, 4, 035027. https://doi.org/10.1088/2632-2153/acf096.
Skolik et al. (2023) Skolik, A.; Cattelan, M.; Yarkoni, S.; Bäck, T.; Dunjko, V. Equivariant quantum circuits for learning on weighted graphs. npj Quantum Inf. 2023, 9, 47.
Kim (2010) Kim, I.W. Algebraic Singularity Method for Mass Measurement with Missing Energy. Phys. Rev. Lett. 2010, 104, 081601. https://doi.org/10.1103/PhysRevLett.104.081601.
Franceschini et al. (2022) Franceschini, R.; Kim, D.; Kong, K.; Matchev, K.T.; Park, M.; Shyamsundar, P. Kinematic Variables and Feature Engineering for Particle Phenomenology. Rev. Mod. Phys. 2022, 95, 045004.
Kersting (2009) Kersting, N. On Measuring Split-SUSY Gaugino Masses at the LHC. Eur. Phys. J. C 2009, 63, 23–32. https://doi.org/10.1140/epjc/s10052-009-1063-6.
Bisset et al. (2011) Bisset, M.; Lu, R.; Kersting, N. Improving SUSY Spectrum Determinations at the LHC with Wedgebox Technique. JHEP 2011, 5, 095. https://doi.org/10.1007/JHEP05(2011)095.
Burns et al. (2009) Burns, M.; Matchev, K.T.; Park, M. Using kinematic boundary lines for particle mass measurements and disambiguation in SUSY-like events with missing energy. JHEP 2009, 5, 094. https://doi.org/10.1088/1126-6708/2009/05/094.
Debnath et al. (2016) Debnath, D.; Gainer, J.S.; Kim, D.; Matchev, K.T. Edge Detecting New Physics the Voronoi Way. EPL 2016, 114, 41001. https://doi.org/10.1209/0295-5075/114/41001.
Debnath et al. (2017) Debnath, D.; Gainer, J.S.; Kilic, C.; Kim, D.; Matchev, K.T.; Yang, Y.P. Detecting kinematic boundary surfaces in phase space: Particle mass measurements in SUSY-like events. JHEP 2017, 6, 092. https://doi.org/10.1007/JHEP06(2017)092.
Bogatskiy et al. (2022) Bogatskiy, A.; Hoffman, T.; Miller, D.W.; Offermann, J.T. PELICAN: Permutation Equivariant and Lorentz Invariant or Covariant Aggregator Network for Particle Physics. arXiv, 2022. arXiv:2211.00454.
Hao et al. (2023) Hao, Z.; Kansal, R.; Duarte, J.; Chernyavskaya, N. Lorentz group equivariant autoencoders. Eur. Phys. J. C 2023, 83, 485. https://doi.org/10.1140/epjc/s10052-023-11633-5.
Buhmann et al. (2023) Buhmann, E.; Kasieczka, G.; Thaler, J. EPiC-GAN: Equivariant point cloud generation for particle jets. SciPost Phys. 2023, 15, 130. https://doi.org/10.21468/SciPostPhys.15.4.130.
Batatia et al. (2023) Batatia, I.; Geiger, M.; Munoz, J.; Smidt, T.; Silberman, L.; Ortner, C. A General Framework for Equivariant Neural Networks on Reductive Lie Groups. arXiv, 2023. arXiv:2306.00091.
Pérez-Salinas et al. (2020) Pérez-Salinas, A.; Cervera-Lierta, A.; Gil-Fuster, E.; Latorre, J.I. Data re-uploading for a universal quantum classifier. Quantum 2020, 4, 226. https://doi.org/10.22331/q-2020-02-06-226.
Ahmed (2019) Ahmed, S. Data-Reuploading Classifier. Available online: https://pennylane.ai/qml/demos/tutorial_data_reuploading_classifier (accessed on 9 March 2024).
