Sybil-Proof Mechanism for Information Propagation with Budgets

Junjie Zheng¹, Xu Ge¹, Bin Li², Dengji Zhao¹∗
¹ShanghaiTech University
²Nanjing University of Science and Technology
{zhengjj, gexu, zhaodj}@shanghaitech.edu.cn, cs.libin@njust.edu.cn

Abstract

This paper examines the problem of distributing rewards on social networks to improve the efficiency of crowdsourcing tasks for sponsors. To complete the tasks efficiently, we aim to design reward mechanisms that incentivize early-joining agents to invite more participants to the tasks. Nonetheless, participants could potentially engage in strategic behaviors, e.g., not inviting others to the tasks, misreporting their capacity for the tasks, or creating fake identities (aka Sybil attacks), to maximize their own rewards. The focus of this study is to address these challenges by designing effective reward mechanisms. To this end, we propose a novel reward mechanism, called Propagation Reward Distribution Mechanism (PRDM), for the general information propagation model with limited budgets. We prove that PRDM not only incentivizes all agents to contribute their full efforts to the tasks and share the task information with all their neighbors in the social network, but also deters them from launching Sybil attacks.

1 Introduction

The widespread availability of mobile Internet devices has fostered greater interconnectedness among individuals via social networks and amplified the impact of information spread through social connections. In certain fields, including viral marketing Leskovec et al. (2006), crowdsourcing distribution Singer and Mittal (2011); Doan et al. (2011), and answer querying Kleinberg and Raghavan (2005), sponsors frequently incentivize participants with monetary rewards to gather as much data or sell as many products as possible. In 2005, Amazon launched a crowdsourcing platform called Amazon Mechanical Turk (MTurk) to gather data from non-professionals. On the MTurk platform, sponsors post tasks and rewards, and workers then claim the tasks and receive payments based on the quantity and quality of their completed tasks. Many studies requiring extensive data started collecting data through MTurk Sorokin and Forsyth (2008). One study in 2019 showed that more than 250,000 people have completed at least one task on MTurk Robinson et al. (2019). However, a large percentage of these workers are fixed, mainly because inviting new people to join brings existing workers no benefit. Making existing workers invite more people to participate can significantly improve efficiency.

∗Corresponding Author.

In this paper, we aim to adequately utilize people’s connections in the network to design a reward distribution mechanism Zhang and Zhao (2022). This mechanism incentivizes agents to invite more people to participate by the reward distribution, which eventually improves the overall completion efficiency. The first difficulty is distributing the rewards within a constrained budget. The mechanism should motivate agents to spread the information in their social network as much as possible. In the DARPA network challenge Pickard et al. (2010); Tang et al. (2011), the winning team from MIT used a pioneering mechanism to effectively motivate people to spread information and quickly found all ten red balloons. In multi-level marketing Emek et al. (2011a); Drucker and Fleischer (2012), the seller expects to sell more products by attracting more people to purchase. In our problem setting, we also need to properly allocate the limited budget to participants.

Another difficulty is resolving Sybil attacks in social networks. In a Sybil attack, a participant creates multiple false identities to accomplish specific purposes. Sybil attacks are widespread and easily performed, affecting eventual results and harming others Alothali et al. (2018); Yu et al. (2006); Zhang et al. (2014). Traditional defense approaches are mainly focused on the communication domain Chen et al. (2021); Jamshidi et al. (2019); Zhang and Lee (2019). Scholars have extensively studied this phenomenon in various domains. For example, the Vickrey-Clarke-Groves process in auction theory is vulnerable to Sybil attacks Yokoo et al. (2004), and Yokoo et al. (2001) developed a new protocol against false-name bids. In Bitcoin transactions, Babaioff et al. (2012) devised a scheme that rewards information propagation while preventing agents from using Sybil attacks to gain more revenue. Emek et al. (2011b) solved the problem of Sybil attacks in viral marketing by rewarding propagation behavior based on the size of a maximum perfect binary tree. In crowdsourcing, individuals have different abilities, such as computing power, purchasing advertising, or providing data. We aim to use this authentic contribution information to design an information propagation mechanism that defends against Sybil attacks.

In this paper, our mechanism drives improvements in the following dimensions.

• We propose a model that quantifies an agent’s contribution by introducing the concept of capacity. The model considers the general setting of Sybil attacks.

• We propose a novel natural mechanism to allocate rewards that maximizes information propagation within a limited budget while resisting Sybil attacks.

Related work. With a fixed budget, Shi et al. (2020) devised a mechanism that maximizes information propagation but is not resistant to Sybil attacks. Chen and Li (2021) designed a special scenario of a free market with lotteries, where participants have a strong incentive to maximize the diffusion of information, and false-name manipulations fail to yield excessive rewards. In the answer querying problem, Zhang et al. (2020) designed a mechanism that incentivizes the agents to propagate the requestor’s query information while ensuring that a Sybil attack yields no additional gain. However, their mechanism only handles the scenario of a single query in a tree. Hong et al. Chen et al. (2022) solved the problem of Sybil attacks in diffusion auctions by removing possible fake agents with graph-structured methods, providing a new approach to tackle similar issues.

The remainder of this paper is organized as follows. Section 2 describes the fundamental setup and definition of the model. Section 3 shows our mechanism and an example of running the mechanism. Section 4 shows the properties of our mechanism. In Section 5, we discuss these properties. In Section 6, we summarize our work and discuss possible future directions.

2 The Model

We consider the crowdsourcing problem powered by social networks, where a sponsor expects to leverage social connections to recruit more participants (or agents) to some crowdsourcing task, e.g., data collecting. For convenience, we model the social connections of all agents as a directed graph $G=(V,E)$, where $V$ represents the set of vertices and $E$ denotes the edge set. Except for the sponsor $s$, the graph $G$ consists of a set $N=\{1,\ldots,n\}$ of agents who can contribute to the task, i.e., $V=\{s\}\cup N$. For each agent $i\in N$, we denote by $c_{i}$ the maximum contribution capacity (or simply, capacity) of $i$ for the task, e.g., $c_{i}$ can denote the number of pictures that $i$ can afford to label. For any two agents $i,j\in V$, there is an edge $(i,j)\in E$ if and only if agent $i$ can invite agent $j$. Given an edge $(i,j)\in E$, we say $j$ is a child of $i$ and use $n_{i}$ to denote the set of $i$’s children in $G$. Without promotions, the sponsor can only recruit her direct children $n_{s}$ to the task, and with such a small number of participants the task may fail to be accomplished. To attract more agents, the sponsor plans to reward the participants, under a total budget of $B$, to incentivize them to further spread the task information to their children. The amount of each participant’s reward is determined by her reports, including her performance on the task and her diffusion efforts.

As usual, let $t_{i}=(n_{i},c_{i})$ be agent $i$ ’s private type, where $n_{i}$ denotes the set of her children and $c_{i}>0$ is her capacity. In addition, denote by $\mathbf{t}=(t_{1},\ldots,t_{n})$ the type profile of all agents, and $\mathbf{t}_{-i}$ the type profile of all agents except agent $i$ , i.e., $\mathbf{t}=(t_{i},\mathbf{t}_{-i})$ . For convenience’s sake, we use $\mathcal{T}_{i}=\mathcal{P}(N)\times\mathbb{R}^{+}$ to denote the type space of agent $i$ where $\mathcal{P}(N)$ is the power set of the set $N$ , and $\mathcal{T}=\times\mathcal{T}_{i}$ to denote the space of all type profiles. As $t_{i}$ is private information, agent $i$ can cheat the sponsor to benefit herself. Let $t^{\prime}_{i}=(n_{i}^{\prime},c^{\prime}_{i})$ be the type reported by agent $i$ , i.e., $i$ diffused information to $n_{i}^{\prime}$ and contributed $c^{\prime}_{i}$ to the task. Since agent $i$ is unaware of other agents in the graph who are not her children and cannot contribute more than her capacity, we require that $n_{i}^{\prime}\subseteq n_{i}$ and $c^{\prime}_{i}\in(0,c_{i}]$ . Similarly, let $\mathbf{t}^{\prime}=(t^{\prime}_{i},\mathbf{t}^{\prime}_{-i})$ denote the report profile of all agents, where $\mathbf{t}^{\prime}_{-i}$ represents the report profile of all agents except agent $i$ . Accordingly, we use $\mathcal{T}^{\prime}_{i}=\mathcal{P}(n_{i})\times(0,c_{i}]$ to denote the space of $t^{\prime}_{i}$ , $\mathcal{T}^{\prime}=\times\mathcal{T}^{\prime}_{i}$ the space of $\mathbf{t}^{\prime}$ , and $\mathcal{T}^{\prime}_{-i}=\times_{j\neq i}\mathcal{T}^{\prime}_{j}$ the space of $\mathbf{t}^{\prime}_{-i}$ .

Definition 1.

Given a report profile $\mathbf{t}^{\prime}$ , we say agent $i$ is active if there exists a sequence of agents $\{i_{1},i_{2},\ldots,i_{k}\}$ , where $i_{1}\in n_{s},i\in n_{i_{k}}^{\prime}$ and $i_{j}\in n_{i_{j-1}}^{\prime}$ holds for any $1<j\leq k$ .

That is, an agent is an active agent if there is a “diffusion path” from the sponsor to her. Note that only active agents are real participants of the crowdsourcing task. Based on the definition of active agents, we next introduce the concept of active network.

Definition 2.

Given a report profile $\mathbf{t}^{\prime}$ , we use $G(\mathbf{t}^{\prime})=(V(\mathbf{t}^{\prime}),E(\mathbf{t}^{\prime}))$ (or $G^{\prime}=(V^{\prime},E^{\prime})$ for short) to denote the active network generated by $\mathbf{t}^{\prime}$ , where $V^{\prime}$ is the set of all active agents and $E^{\prime}=\{(i,j)|(i\in V^{\prime},j\in n_{i}^{\prime})\vee(i=s,j\in n_{s})\}$ .

The active network represents all agents that actually participate in the task. Given any report profile $\mathbf{t}^{\prime}$, the sponsor only needs to reward agents in the active network.
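Definitions 1 and 2 amount to a reachability computation from the sponsor over the reported invitations. The following is a minimal sketch of that computation, not part of the original paper. The function names and the dictionary-based graph encoding are assumptions made for illustration, and the sponsor's children $n_{s}$ are treated as active, as in the example of Section 3.1.

```python
from collections import deque


def active_agents(sponsor_children, reported_children):
    """Return the set of agents reachable from the sponsor.

    sponsor_children: iterable with the sponsor's children n_s.
    reported_children: dict mapping agent i to her reported child set n'_i.
    (Both encodings are illustrative assumptions.)
    """
    active = set(sponsor_children)
    queue = deque(active)
    while queue:
        i = queue.popleft()
        for j in reported_children.get(i, ()):
            if j not in active:
                active.add(j)
                queue.append(j)
    return active


def active_network(sponsor, sponsor_children, reported_children):
    """Return (V', E') following Definition 2."""
    active = active_agents(sponsor_children, reported_children)
    edges = {(sponsor, j) for j in sponsor_children}
    edges |= {(i, j) for i in active for j in reported_children.get(i, ())}
    return active, edges
```

An agent whose inviter is never reached from the sponsor stays outside the returned vertex set, which matches the requirement that inactive agents receive zero reward.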

Definition 3.

A reward distribution mechanism $M=(r_{i})_{i\in N}$ on the social network consists of a set of reward functions, where $r_{i}:\mathcal{T}^{\prime}\to\mathbb{R}$ is the reward function for $i$ and $r_{i}(\mathbf{t}^{\prime})=0$ for an inactive agent $i$ .

Given any report profile $\mathbf{t}^{\prime}\in\mathcal{T}^{\prime}$ , $r_{i}(\mathbf{t}^{\prime})$ outputs the reward to $i$ . If an agent is not in the active network, her reward is always zero as she does not participate in the task and contributes nothing. When $\mathbf{t}^{\prime}$ is clear from the context, we write as $\mathbf{r}$ and $r_{i}$ for short. In the following, we define some desirable properties that a reward mechanism should satisfy. First, the reward mechanism should be individually rational, which guarantees that each participant is willing to stay in the mechanism.

Definition 4.

A reward distribution mechanism $M$ is individually rational (IR) if $r_{i}(\mathbf{t}^{\prime})\geq 0$ for all graphs $G$, all $i\in N$ and all report profiles $\mathbf{t}^{\prime}\in\mathcal{T}^{\prime}$.

If a reward mechanism is not individually rational, then in certain cases some participants would have to pay the sponsor, and their best response is to leave the mechanism. Therefore, the individually rational property is also known as the participation constraint. Besides the IR property, the sponsor also expects each agent to truthfully contribute her full capacity and invite all her children to the task.

Definition 5.

A reward distribution mechanism $M$ is incentive compatible (IC) if the following inequality

$$r_{i}(t_{i},\mathbf{t}^{\prime}_{-i})\geq r_{i}(t^{\prime}_{i},\mathbf{t}^{\prime}_{-i})\tag{1}$$

holds for all graphs $G$, all $i\in N$, all $t_{i}\in\mathcal{T}_{i}$, all $t^{\prime}_{i}\in\mathcal{T}^{\prime}_{i}$ and all $\mathbf{t}^{\prime}_{-i}\in\mathcal{T}^{\prime}_{-i}$.

Incentive compatibility implies that diffusing the task information to all children and contributing all her efforts to the task is a dominant strategy for all agents. As the sponsor is endowed with a fixed budget, the total rewards to agents are limited.

Definition 6.

A reward distribution mechanism $M$ is budget balanced (BB) if

$$\sum^{n}_{i=1}{r_{i}(\mathbf{t}^{\prime})}=B\tag{2}$$

for all graphs $G$ and all report profiles $\mathbf{t}^{\prime}\in\mathcal{T}^{\prime}$.

Definition 7.

A reward distribution mechanism $M$ is asymptotically budget balanced (ABB) if

$$\lim_{\sum_{i\in N}{c_{i}^{\prime}}\to\infty}\sum_{i\in N}{r_{i}(\mathbf{t}^{\prime})}=B\tag{3}$$

for all graphs $G$ and all report profiles $\mathbf{t}^{\prime}\in\mathcal{T}^{\prime}$.

The ABB property requires the sponsor’s budget to be fully distributed to agents when the sum of all agents’ contributions goes to infinity. If a reward mechanism is IR and IC, then agents are motivated to contribute their full capacities and propagate the task information to all their children. However, as the agents are individuals distributed in the network, they can easily create fake identities or even fake social networks to gain more reward. Such behavior is called a Sybil attack or false-name attack, and a good reward mechanism should prevent it. Next, we give a formal definition of Sybil attacks.

Definition 8.

A Sybil attack of agent $i$ is denoted by an attacking type report $a_{i}=(\nu_{i},\tau_{i})\in\mathcal{A}_{i}$, where $\nu_{i}=\{i,i_{1},\ldots,i_{m}\}$ is a set of fake identities and accordingly $\tau_{i}=\{t^{\prime}_{i},t^{\prime}_{i_{1}},\ldots,t^{\prime}_{i_{m}}\}$ are their reports, where

• $\sum_{j\in\nu_{i}}{c^{\prime}_{j}}\leq c_{i}$;

• $n_{j}^{\prime}\subseteq n_{i}\cup\nu_{i}$ for all $j\in\nu_{i}$.

In other words, agent $i$ can create an arbitrary number of fake identities and arbitrary social connections between these identities. Let us consider a special case of Sybil attack: all the fake nodes are invited by the inviters of node $i$.

Definition 9.

A parallel Sybil attack of agent $i$ is a special kind of Sybil attack, where $\nu_{i}=\{i,i_{1},\ldots,i_{m}\}$ is a set of fake identities invited by the parents of $i$ .

In a parallel Sybil attack, fake identities are created only in parallel with the attacker: every fake participant is invited by at least one inviter of the agent committing the attack. With the definition of Sybil attacks, we intend to design reward mechanisms that can defend against Sybil attacks.

Definition 10.

A reward distribution mechanism $M$ is Sybil-proof (SP), if the inequality

$$\sum_{j\in\nu_{i}}{r_{j}(a_{i},\mathbf{t}^{\prime}_{-i})}\leq r_{i}(t_{i},\mathbf{t}^{\prime}_{-i})\tag{4}$$

holds for all graphs $G$, all $i\in N$, all $t_{i}\in\mathcal{T}_{i}$, all $\mathbf{t}^{\prime}_{-i}\in\mathcal{T}^{\prime}_{-i}$ and all $a_{i}\in\mathcal{A}_{i}$, where $(a_{i},\mathbf{t}^{\prime}_{-i})=(t^{\prime}_{i},t^{\prime}_{i_{1}},\ldots,t^{\prime}_{i_{m}},\mathbf{t}^{\prime}_{-i})$ is the report profile of all agents under Sybil attack $a_{i}$. The mechanism is parallel Sybil-proof (PSP) if the inequality holds for all parallel Sybil attacks.

The SP property may be too strong to hold, so we next introduce a milder condition for Sybil-proofness, called $\gamma$-SP.

Definition 11.

A reward distribution mechanism $M$ is $\gamma$ -Sybil-proof ( $\gamma$ -SP), if the inequality

$$\sum_{j\in\nu_{i}}{r_{j}(a_{i},\mathbf{t}^{\prime}_{-i})}\leq\gamma r_{i}(t_{i},\mathbf{t}^{\prime}_{-i})\tag{5}$$

holds for all graphs $G=(V,E)$, all $i\in N$, all $t_{i}\in\mathcal{T}_{i}$, all $\mathbf{t}^{\prime}_{-i}\in\mathcal{T}^{\prime}_{-i}$ and all $a_{i}\in\mathcal{A}_{i}$.

In what follows, we focus on designing reward mechanisms that satisfy IR, IC and other expected properties.

3 Propagation Reward Distribution Mechanism

This section introduces a novel reward distribution mechanism called Propagation Reward Distribution Mechanism (PRDM). PRDM starts by layering a given network and then determines the final rewards for each agent by the contribution phase and propagation phase.

Every agent aims to obtain as much reward as possible, whereas the sponsor wants to maximize information propagation rather than receive a reward. Sponsor $s$ will always diffuse the information to all her children. For a given report profile $\mathbf{t}^{\prime}$, we generate the active network $G(\mathbf{t}^{\prime})=(V(\mathbf{t}^{\prime}),E(\mathbf{t}^{\prime}))$. In $G^{\prime}$, define the depth of agent $i$ as the length of the shortest path from $s$ to $i$, written as $dep(i)$. Agents can therefore be divided into layers based on their depths, and we define the $k$-th layer $l_{k}=\{i\in V^{\prime}|dep(i)=k\}$ as the set of all agents with depth $k$.

Since we only allow information to be propagated from the previous layer to the next layer, for all $i\in l_{k}$ , only the edges from agent $i$ to the agents in the $(k+1)$ -th layer are retained. By the above processing, we construct a layered directed graph based on $\mathbf{t}^{\prime}$ . Figure 1 shows an example of how to get the corresponding layered graph from an active network. In the obtained layered graph, for any $i\in l_{k}$ , define $p_{i}$ as the set of all parents of $i$ in $(k-1)$ -th layer.

Figure 1: An example of transforming an active network (a) into a layered graph (b).
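The layering step can be sketched with a breadth-first search, which yields shortest-path depths from the sponsor. This is an illustrative sketch rather than the authors' implementation. The function name `layer_graph` and the dictionary encoding of reported children are assumptions introduced here.

```python
from collections import deque


def layer_graph(sponsor_children, reported_children):
    """Compute dep(i), the layers l_k, and the parent sets p_i.

    sponsor_children: iterable with the sponsor's children n_s.
    reported_children: dict mapping agent i to her reported child set n'_i.
    """
    # Breadth-first search gives the shortest-path depth from the sponsor.
    depth = {j: 1 for j in sponsor_children}
    queue = deque(sponsor_children)
    while queue:
        i = queue.popleft()
        for j in reported_children.get(i, ()):
            if j not in depth:
                depth[j] = depth[i] + 1
                queue.append(j)

    # Group agents by depth to obtain the layers l_1, ..., l_d.
    layers = {}
    for i, k in depth.items():
        layers.setdefault(k, set()).add(i)

    # Keep only edges from layer k to layer k+1; these define p_i.
    parents = {i: set() for i in depth}
    for i, k in depth.items():
        for j in reported_children.get(i, ()):
            if depth.get(j) == k + 1:
                parents[j].add(i)
    return depth, layers, parents
```

Edges inside a layer or pointing back to an earlier layer are dropped, so every agent in layer $k\geq 2$ ends up with at least one parent in layer $k-1$.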

Input: A report profile $\mathbf{t}^{\prime}$, a fixed budget $B$ and parameters $c_{s}>0$ and $\beta\in[0,1/2]$

Construct the active network $G(\mathbf{t}^{\prime})=(V(\mathbf{t}^{\prime}),E(\mathbf{t}^{\prime}))$;
Compute the depth of each agent in the graph $G(\mathbf{t}^{\prime})$ to obtain the layer sets $l_{1},l_{2},\ldots,l_{d}$;
For $k=1,2,\ldots,d$, let $C_{k}^{\prime}=c_{s}+\sum_{i\in V(\mathbf{t}^{\prime}),dep(i)\leq k}{c_{i}^{\prime}}$ be the total contribution of $s$ and layers $l_{1},l_{2},\ldots,l_{k}$;
Contribution phase: Initialize each agent’s weight $w_{i}=0$ for $i\in N$, and set the initial budget of the first layer to $B_{1}=B$;
for $k=1,2,\ldots,d$ do
    for each agent $i\in l_{k}$ do
        $w_{i}=\frac{c_{i}^{\prime}}{C_{k}^{\prime}}B_{k}$;
    $B_{k+1}=B_{k}-\sum_{i\in l_{k}}{w_{i}}$;
Propagation phase: Initialize each agent’s reward $r_{i}=w_{i}$ for all $i\in l_{1}$, and $r_{i}=(1-\beta)w_{i}$ for $i\in N\setminus l_{1}$;
for $k=2,3,\ldots,d$ do
    for each agent $i\in l_{k}$ do
        for each agent $j\in p_{i}$ do
            $r_{j}=r_{j}+\frac{c_{j}^{\prime}}{\sum_{m\in p_{i}}{c_{m}^{\prime}}}\beta w_{i}$;

Output: the reward vector $\mathbf{r}(\mathbf{t}^{\prime})$

Algorithm 1: Propagation Reward Distribution Mechanism

PRDM is divided into a contribution phase and a propagation phase. In the contribution phase, each agent’s weight is determined by her depth and contribution. In the propagation phase, the weights are redistributed according to the agents’ propagation, which yields the agents’ final rewards. In PRDM, the parameter $c_{s}$ is a virtual capacity of the sponsor, which is used to pass part of the budget on to the following layers. The parameter $\beta$ is the proportion of her weight that an agent passes on to her inviters. With the above definitions, the general procedure of PRDM is shown in Algorithm 1.
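To make the two phases concrete, here is a minimal Python sketch of Algorithm 1 under stated assumptions. It is not the authors' code: the function name `prdm`, its signature, and the dictionary encoding of reports are illustrative choices, and the sponsor's children are passed in explicitly.

```python
from collections import deque


def prdm(sponsor_children, reported_children, capacity, budget, c_s, beta):
    """Sketch of PRDM; returns (rewards, undistributed budget B_{d+1})."""
    # Layering: shortest-path depth from the sponsor over reported invitations.
    depth = {j: 1 for j in sponsor_children}
    queue = deque(sponsor_children)
    while queue:
        i = queue.popleft()
        for j in reported_children.get(i, ()):
            if j not in depth:
                depth[j] = depth[i] + 1
                queue.append(j)
    d = max(depth.values(), default=0)
    layers = [[i for i in depth if depth[i] == k] for k in range(1, d + 1)]

    # Contribution phase: w_i = c'_i / C'_k * B_k, then B_{k+1} = B_k - sum(w_i).
    weight = {}
    cumulative = c_s      # C'_k, starting from the sponsor's virtual capacity
    remaining = budget    # B_k
    for layer in layers:
        cumulative += sum(capacity[i] for i in layer)
        for i in layer:
            weight[i] = capacity[i] / cumulative * remaining
        remaining -= sum(weight[i] for i in layer)

    # Propagation phase: first-layer agents keep w_i, all others keep (1 - beta) w_i
    # and split beta * w_i among their parents in proportion to reported capacity.
    reward = {i: weight[i] if depth[i] == 1 else (1 - beta) * weight[i]
              for i in depth}
    for i in depth:
        if depth[i] == 1:
            continue
        parents = [j for j in depth
                   if depth[j] == depth[i] - 1
                   and i in reported_children.get(j, ())]
        total = sum(capacity[j] for j in parents)
        for j in parents:
            reward[j] += capacity[j] / total * beta * weight[i]
    return reward, remaining
```

Calling `prdm([1, 2, 3], {1: [4, 5], 2: [6], 3: [6], 4: [7], 6: [8]}, {i: 10 for i in range(1, 9)}, 100, 20, 0.2)` runs the sketch on the instance of Section 3.1. The edge set here is inferred from the transfers described in that example; the original Figure 2(a) may contain further edges that the layering step removes.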

3.1 An Example of PRDM

In this subsection, we show an example of the mechanism in operation. An instance is shown in Figure 2 to give an illustration of PRDM. The sponsor transmits the information to the first layer $l_{1}=\{1,2,3\}$ . After that, $l_{2}=\{4,5,6\}$ and $l_{3}=\{7,8\}$ . The invitation relationships among all the agents are presented in Figure 2(a).

Assume a budget $B=100$, set $\beta=0.2$ and $c_{s}=20$, and let every agent report a contribution of $10$. The process of distributing rewards using PRDM is as follows.

Contribution phase:

• Step 1: $C_{1}^{\prime}$ is the total contribution of sponsor $s$ and agents $1$, $2$, and $3$. We calculate $C_{1}^{\prime}=20+3\times 10=50$ and the budget $B_{1}=B=100$, so that each of these agents has weight

$$w_{1}=w_{2}=w_{3}=\frac{10}{50}\times 100=20$$

• Step 2: Calculate the budget $B_{2}=B_{1}-w_{1}-w_{2}-w_{3}=40$ and $C_{2}^{\prime}=C_{1}^{\prime}+3\times 10=80$. Then we obtain the weight of agents $4$, $5$, and $6$ as

$$w_{4}=w_{5}=w_{6}=\frac{10}{80}\times 40=5$$

• Step 3: Similarly, $B_{3}=B_{2}-w_{4}-w_{5}-w_{6}=25$ and $C_{3}^{\prime}=C_{2}^{\prime}+2\times 10=100$, so the weight of agents $7$ and $8$ is

$$w_{7}=w_{8}=\frac{10}{100}\times 25=2.5$$

Propagation phase:

• Step 4: The initial reward of each first-layer agent is her weight from the contribution phase; every other agent initially keeps a $(1-\beta)$ fraction of her weight

$$r_{1}=r_{2}=r_{3}=20;\quad r_{4}=r_{5}=r_{6}=(1-\beta)\times 5=4;\quad r_{7}=r_{8}=(1-\beta)\times 2.5=2$$

• Step 5: Agent $4$ and agent $5$ each transfer a fraction $\beta=0.2$ of their weight to agent $1$ as a reward; agent $6$ transfers $\frac{\beta}{2}=\frac{0.2}{2}=0.1$ of her weight to each of agent $2$ and agent $3$, since both parents report the same contribution

$$\begin{aligned} r_{1}&=r_{1}+\beta w_{4}=21;\\ r_{1}&=r_{1}+\beta w_{5}=22;\\ r_{2}&=r_{2}+\tfrac{\beta}{2}w_{6}=20.5,\quad r_{3}=r_{3}+\tfrac{\beta}{2}w_{6}=20.5 \end{aligned}$$

• Step 6: Similarly, agent $7$ transfers to her parent, agent $4$, and agent $8$ transfers to her parent, agent $6$

$$\begin{aligned} r_{4}&=r_{4}+\beta w_{7}=4.5;\\ r_{6}&=r_{6}+\beta w_{8}=4.5 \end{aligned}$$

The final reward is $\mathbf{r}=(22,20.5,20.5,4.5,4,4.5,2,2)$ according to PRDM. Each component of $\mathbf{r}$ represents the reward of the corresponding agent. Note that we still have $B_{4}=B_{3}-w_{7}-w_{8}=20$ available for further propagation.
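As a quick consistency check on the example (a worked calculation added for illustration, using only the numbers above), the rewards and the undistributed budget add up to $B$:

$$\sum_{i=1}^{8}r_{i}=22+2\times 20.5+2\times 4.5+4+2\times 2=80=B-B_{4}=100-20.$$

The remaining budget also agrees with the closed form $B_{k+1}=\frac{c_{s}}{C_{k}^{\prime}}B$ obtained in the proof of Theorem 1, since $\frac{c_{s}}{C_{3}^{\prime}}B=\frac{20}{100}\times 100=20$.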

4 Properties of PRDM

In this section, we show several properties of PRDM. We start by discussing the straightforward properties of PRDM, and then we illustrate how PRDM maximizes information propagation and defends against Sybil attacks.

For convenience in the following formulas, denote by $C_{S}^{\prime}$ the sum of the contributions of the agents in a set $S$, e.g., $C_{l_{k}}^{\prime}$ is the total contribution of the $k$-th layer. Recall that when $k$ is an integer, $C_{k}^{\prime}$ denotes the total contribution of the sponsor and the first $k$ layers.

Theorem 1.

The Propagation Reward Distribution Mechanism is asymptotically budget balanced.

Proof.

In PRDM, the division of the initial budget $B$ is performed only in the contribution phase, which implies $\sum_{i\in N}{r_{i}}=\sum_{i\in N}{w_{i}}$. Recall that for an active network $G^{\prime}=(V^{\prime},E^{\prime})$, the sponsor $s$ has a virtual contribution $c_{s}>0$ and $C_{k}^{\prime}=c_{s}+\sum_{i\in V(\mathbf{t}^{\prime}),dep(i)\leq k}{c_{i}^{\prime}}$ is the total contribution of $s$ and all the agents in layers $l_{1},\ldots,l_{k}$.

According to PRDM, each layer can only divide a part of the budget remaining from the previous layer. Suppose that there are $d$ layers. We focus on $B_{k}$, the residual budget that layer $l_{k}$ inherits from the layer above. For $k=1,\ldots,d-1$, we have $B_{k+1}=B_{k}-\sum_{i\in l_{k}}{w_{i}}$. In particular, let $B_{d+1}=B_{d}-\sum_{i\in l_{d}}{w_{i}}$ be the budget that has not been distributed. Then, we can infer that

$$\begin{aligned} \sum_{i=1}^{n}{r_{i}}&=\sum_{i=1}^{n}{w_{i}}=\sum_{k=1}^{d}{\sum_{i\in l_{k}}{w_{i}}}\\ &=\sum_{k=1}^{d}{(B_{k}-B_{k+1})}=B-B_{d+1} \end{aligned}$$

Next, we show that $B_{d+1}$ converges to 0 when the total contribution goes to infinity. Starting from the first layer, we can get

$$\begin{aligned} B_{1}&=B\\ B_{2}&=B_{1}-\sum_{i\in l_{1}}{w_{i}}=B_{1}-\sum_{i\in l_{1}}{\frac{c_{i}^{\prime}}{C_{1}^{\prime}}}B_{1}=\frac{c_{s}}{C_{1}^{\prime}}B\\ B_{3}&=B_{2}-\sum_{i\in l_{2}}{w_{i}}=B_{2}-\sum_{i\in l_{2}}{\frac{c_{i}^{\prime}}{C_{2}^{\prime}}}B_{2}=\frac{c_{s}}{C_{2}^{\prime}}B \end{aligned}$$

Similarly, for $k=2,\ldots,d+1$, we have $B_{k}=\frac{c_{s}}{C_{k-1}^{\prime}}B$. Then, when the total contribution goes to infinity, $C_{d}^{\prime}=c_{s}+\sum_{i\in V^{\prime}}{c_{i}^{\prime}}\to\infty$, hence $B_{d+1}=\frac{c_{s}}{C_{d}^{\prime}}B\to 0$.

∎
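The step labelled "similarly" in the proof can be written out as a general recursion. The following derivation is a sketch added for clarity; the convention $C_{0}^{\prime}=c_{s}$ is introduced here and does not appear in the original text. Since $C_{k}^{\prime}=C_{k-1}^{\prime}+C_{l_{k}}^{\prime}$,

$$B_{k+1}=B_{k}-\sum_{i\in l_{k}}{\frac{c_{i}^{\prime}}{C_{k}^{\prime}}}B_{k}=\frac{C_{k}^{\prime}-C_{l_{k}}^{\prime}}{C_{k}^{\prime}}B_{k}=\frac{C_{k-1}^{\prime}}{C_{k}^{\prime}}B_{k},$$

and the product telescopes:

$$B_{d+1}=B\prod_{k=1}^{d}{\frac{C_{k-1}^{\prime}}{C_{k}^{\prime}}}=\frac{C_{0}^{\prime}}{C_{d}^{\prime}}B=\frac{c_{s}}{C_{d}^{\prime}}B.$$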

The above theorem indicates that PRDM will allocate all of the sponsor’s budget to the agents when the total contribution is large enough. Meanwhile, the sponsor does not need to pay extra budgets for the contributions of extra participants.

Theorem 2.

The Propagation Reward Distribution Mechanism is individually rational.

Proof.

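Every weight $w_{i}$ is a nonnegative share of a nonnegative residual budget. In the propagation phase, an agent $i$ keeps $w_{i}$ (in the first layer) or $(1-\beta)w_{i}\geq 0$ (in any other layer) and can only receive additional transfers from her children. Hence no agent $i$ in a social network $G$ pays a fee at any stage of PRDM, so $r_{i}\geq 0$ holds.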

∎

In fact, every agent $i\in V(\mathbf{t}^{\prime})$ of the active network always receives a positive reward $r_{i}>0$. Furthermore, Theorem 3 shows that an agent maximizes her reward when she truthfully reports her type.

Theorem 3.

The Propagation Reward Distribution Mechanism is incentive compatible.

Proof.

By the definition of incentive compatibility, PRDM needs to satisfy that for any agent $i\in N$ and any report profile $\mathbf{t}^{\prime}_{-i}$ of the others, truthfully reporting her private type $t_{i}$ is a dominant strategy. The report $t^{\prime}_{i}$ of agent $i$ consists of the contribution $c_{i}^{\prime}$ and the set of children $n_{i}^{\prime}$. Hence for any agent $i\in N$, we need to prove the following two parts

• Agent $i$ maximizes her reward by contributing as much as she is capable of, i.e., $c_{i}^{\prime}=c_{i}$.

• Agent $i$ maximizes her reward by inviting all her children, i.e., $n_{i}^{\prime}=n_{i}$.

Part 1: if agent $i$ is not in the active network $G(\mathbf{t}^{\prime})=(V(\mathbf{t}^{\prime}),E(\mathbf{t}^{\prime}))$, the reward is zero regardless of how much she contributes, so $c_{i}^{\prime}=c_{i}$ maximizes her reward. For any $i\in V(\mathbf{t}^{\prime})$, assume that agent $i$ is in the $k$-th layer ($1<k<d$) in the layered graph with $d$ layers and agent $i$ is the only parent of her children in the $(k+1)$-th layer. Thus for any $0<c_{i}^{\prime}\leq c_{i}$, any $n_{i}^{\prime}\subseteq n_{i}$ and $0\leq\beta\leq\frac{1}{2}$, we have

$$\begin{aligned} r_{i}(t^{\prime}_{i},\mathbf{t}^{\prime}_{-i})={}&(1-\beta)\frac{c_{i}^{\prime}}{C_{k-1}^{\prime}+c_{i}^{\prime}+C_{l_{k}\setminus\{i\}}^{\prime}}B_{k}\\ &+\beta\frac{C_{l_{k+1}\cap n_{i}^{\prime}}^{\prime}}{C_{k-1}^{\prime}+c_{i}^{\prime}+C_{l_{k}\setminus\{i\}}^{\prime}+C_{l_{k+1}}^{\prime}}\cdot\frac{C_{k-1}^{\prime}}{C_{k-1}^{\prime}+c_{i}^{\prime}+C_{l_{k}\setminus\{i\}}^{\prime}}B_{k} \end{aligned}\tag{6}$$

where $C_{l_{k}\setminus\{i\}}^{\prime}$ is the total contribution in the $k$-th layer except $i$, and $C_{l_{k+1}\cap n_{i}^{\prime}}^{\prime}$ is the total contribution of $i$’s children in the $(k+1)$-th layer. The first term of $r_{i}(t^{\prime}_{i},\mathbf{t}^{\prime}_{-i})$ in Equation (6) is the reward reserved by $i$. The second term is the reward coming from the next layer. All quantities except $c_{i}^{\prime}$ are fixed, so the first term increases as $c_{i}^{\prime}$ increases and the second term decreases as $c_{i}^{\prime}$ increases. Consider the worst case $C_{l_{k}\setminus\{i\}}^{\prime}=0$, $C_{l_{k+1}\cap n_{i}^{\prime}}^{\prime}=C_{l_{k+1}}^{\prime}$, $\beta=\frac{1}{2}$, in which the first term increases the slowest while the second term decreases the fastest. In this case, $r_{i}(t^{\prime}_{i},\mathbf{t}^{\prime}_{-i})$ reduces to

$$\begin{aligned} r_{i}(t^{\prime}_{i},\mathbf{t}^{\prime}_{-i})&=\frac{1}{2}\cdot\frac{1}{C_{k-1}^{\prime}+c_{i}^{\prime}}\left(c_{i}^{\prime}+\frac{C_{k-1}^{\prime}C_{l_{k+1}}^{\prime}}{C_{k-1}^{\prime}+c_{i}^{\prime}+C_{l_{k+1}}^{\prime}}\right)B_{k}\\ &=\frac{1}{2}\cdot\frac{1}{C_{k-1}^{\prime}+c_{i}^{\prime}}\cdot\frac{c_{i}^{\prime}C_{k-1}^{\prime}+c_{i}^{\prime}c_{i}^{\prime}+c_{i}^{\prime}C_{l_{k+1}}^{\prime}+C_{k-1}^{\prime}C_{l_{k+1}}^{\prime}}{C_{k-1}^{\prime}+c_{i}^{\prime}+C_{l_{k+1}}^{\prime}}B_{k}\\ &=\frac{1}{2}\cdot\frac{(C_{k-1}^{\prime}+c_{i}^{\prime})(c_{i}^{\prime}+C_{l_{k+1}}^{\prime})}{(C_{k-1}^{\prime}+c_{i}^{\prime})(C_{k-1}^{\prime}+c_{i}^{\prime}+C_{l_{k+1}}^{\prime})}B_{k}\\ &=\frac{1}{2}\cdot\frac{c_{i}^{\prime}+C_{l_{k+1}}^{\prime}}{C_{k-1}^{\prime}+c_{i}^{\prime}+C_{l_{k+1}}^{\prime}}B_{k} \end{aligned}\tag{7}$$

Since $r_{i}(t^{\prime}_{i},\mathbf{t}^{\prime}_{-i})$ in Equation (7) is a monotonically increasing function of $c_{i}^{\prime}$, agent $i$ receives the highest reward when $c_{i}^{\prime}=c_{i}$. Furthermore, if $k=1$, agent $i$ is in the first layer and is not required to pass rewards to the previous layer, so the first term in Equation (6) is larger. If $k=d$, agent $i$ is in the last layer and receives no rewards from the next layer, so the second term in Equation (6) is $0$. If agent $i$ is not the only parent of her children in the $(k+1)$-th layer, the second term in Equation (6) decreases more slowly. All of these cases are more favorable than the worst case discussed in Equation (7). Therefore $c_{i}^{\prime}=c_{i}$ maximizes the reward of agent $i$.
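The monotonicity claimed for Equation (7) can be checked by differentiating. The shorthand $A=C_{k-1}^{\prime}$ and $L=C_{l_{k+1}}^{\prime}$ is introduced here only for readability, and $B_{k}$ is treated as fixed because it depends only on the layers above $i$. Writing $f(c_{i}^{\prime})=\frac{1}{2}\cdot\frac{c_{i}^{\prime}+L}{A+c_{i}^{\prime}+L}B_{k}$, we get

$$\frac{\partial f}{\partial c_{i}^{\prime}}=\frac{1}{2}\cdot\frac{(A+c_{i}^{\prime}+L)-(c_{i}^{\prime}+L)}{(A+c_{i}^{\prime}+L)^{2}}B_{k}=\frac{1}{2}\cdot\frac{A}{(A+c_{i}^{\prime}+L)^{2}}B_{k}>0,$$

which is positive because $A\geq c_{s}>0$ and $B_{k}>0$.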

Part 2: if agent $i$ is not in the active network $G(\mathbf{t}^{\prime})=(V(\mathbf{t}^{\prime}),E(\mathbf{t}^{\prime}))$, again her reward is always equal to $0$. If $i\in V(\mathbf{t}^{\prime})$, consider any $n_{i}^{\prime}\subset n_{i}$ and suppose she adds one more child $j\in n_{i}$ to $n_{i}^{\prime}$. Suppose first that agent $j$ is already in $V(\mathbf{t}^{\prime})$. If $j$ is in the layer directly below $i$, then $i$ gets an additional reward without affecting her existing reward; if $j$ is in any other layer, $i$’s reward remains unchanged. Otherwise, $j$ is a new agent in the active network. Then $j$ must be in the layer directly below $i$, and the reward of $i$ changes from $(1-\beta)\frac{c_{i}^{\prime}}{C_{k}^{\prime}}B_{k}+\beta\frac{C_{l_{k+1}\cap n_{i}^{\prime}}^{\prime}}{C_{k+1}^{\prime}}B_{k+1}$ to $(1-\beta)\frac{c_{i}^{\prime}}{C_{k}^{\prime}}B_{k}+\beta\frac{c_{j}^{\prime}+C_{l_{k+1}\cap n_{i}^{\prime}}^{\prime}}{c_{j}^{\prime}+C_{k+1}^{\prime}}B_{k+1}$. This is an increase, because $C_{l_{k+1}\cap n_{i}^{\prime}}^{\prime}<C_{k+1}^{\prime}$ and adding the same positive amount $c_{j}^{\prime}$ to the numerator and denominator of a fraction below $1$ makes it larger. Hence agent $i$ maximizes her reward when she invites all her children.

In conclusion, PRDM is incentive compatible, which means that truthful reporting is a dominant strategy for all agents. In other words, all agents will maximize information propagation while making the largest contributions within their capacity.

∎

Next, we will discuss the property of Sybil-proofness.

Theorem 4.

The Propagation Reward Distribution Mechanism is parallel Sybil-proof.

Proof.

Suppose agent $i\in l_{k}$ ($1\leq k\leq d$) commits a parallel Sybil attack with identity set $\nu_{i}=\{i,i_{1},\ldots,i_{m}\}$. It follows from the proof of incentive compatibility that, for every node in the set $\nu_{i}$, the dominant strategy is to make the largest contribution within its capacity and to invite all its children. However, the total capacity of these nodes is limited by $\sum_{j\in\nu_{i}}{c^{\prime}_{j}}\leq c_{i}$, which means that reporting truthfully without creating fake nodes maximizes the reward of agent $i$.

∎

We now discuss the more general situation of Sybil attacks. Before giving the main conclusion, we first present two lemmas. Lemma 1 states that an agent cannot increase her weight in the contribution phase by making fake nodes.

Lemma 1.

Each agent $i\in V(\mathbf{t}^{\prime})$ cannot increase her total weight in the contribution phase by committing a Sybil attack $a_{i}=(\nu_{i},\tau_{i})$.

Proof.

Suppose agent $i\in l_{k}$ ($1\leq k\leq d$). When agent $i$ does not commit a Sybil attack, the network is as shown in Figure 3(a), and the weight of $i$ is $w_{i}=\frac{c_{i}^{\prime}}{C_{k}^{\prime}}B_{k}$. Let us first show that an agent cannot increase her weight by making several fake nodes as her own children. For convenience, we denote $\nu_{-i}=\nu_{i}\setminus\{i\}$.

Without loss of generality, let $c_{i}^{\prime}=c_{i}$ . After committing Sybil attack $a_{i}=(\nu_{i},\tau_{i})$ , agent $i$ can transfer part of her contribution $\delta$ to her fake nodes ( $0<\delta<c_{i}$ ) and $\sum_{j\in\nu_{-i}}{c_{j}^{\prime}}=\delta$ . Let $\mathcal{W}_{i}(\delta)=\sum_{j\in\nu_{i}}{w_{j}}$ be the total weight of $i$ and all her fake nodes. According to PRDM, as shown in Figure 3(b), when all the fake nodes are in the next layer of $i$ , we have

$$\begin{aligned} \mathcal{W}_{i}(0)&=\frac{c_{i}}{C_{k-1}^{\prime}+c_{i}+C_{l_{k}\setminus\{i\}}^{\prime}}B_{k}\\ \mathcal{W}_{i}(\delta)&=\frac{c_{i}-\delta}{C_{k-1}^{\prime}+c_{i}+C_{l_{k}\setminus\{i\}}^{\prime}-\delta}B_{k}\\ &\quad+\frac{\delta}{C_{k-1}^{\prime}+c_{i}+C_{l_{k}\setminus\{i\}}^{\prime}+C_{l_{k+1}\setminus\nu_{-i}}^{\prime}}\cdot\frac{C_{k-1}^{\prime}}{C_{k-1}^{\prime}+c_{i}+C_{l_{k}\setminus\{i\}}^{\prime}-\delta}B_{k} \end{aligned}$$

It can be shown that for any $\delta$ , there is $\mathcal{W}_{i}(0)-\mathcal{W}_{i}(\delta)=\frac{P}{Q}$ , where

	$\displaystyle P=$	$\displaystyle\ \delta C_{l_{k}\setminus\{i\}}^{\prime}\left(C_{l_{k}\setminus\{i\}}^{\prime}+C_{l_{k+1}\setminus\nu_{-i}}^{\prime}+C_{k-1}^{\prime}+c_{i}\right)$
		$\displaystyle+\delta C_{k-1}^{\prime}C_{l_{k+1}\setminus\nu_{-i}}^{\prime}\geq 0$
	$\displaystyle Q=$	$\displaystyle\left(C_{k-1}^{\prime}+c_{i}+C_{l_{k}\setminus\{i\}}^{\prime}\right)\left(C_{k-1}^{\prime}+c_{i}+C_{l_{k}\setminus\{i\}}^{\prime}-\delta\right)$