Improved discouragement article further

Vitalik Buterin 2017-07-09 07:58:11 -04:00
parent a66f3766e7
commit 58c1ea6574
2 changed files with 21 additions and 11 deletions

Binary file not shown.

@@ -93,33 +93,43 @@ Here we evaluate the feasibility of attackers with a two-step plan. First, run a
This kind of attack is difficult to economically model because under certain assumptions the cost is zero: if an attacker can credibly announce that they will grief with $h > 1$, then all other validators will leave, and the attacker will then be free to join with one single validator and perform a censorship attack at infinitesimal cost. This result is true in \textit{any} game where the net profit of a validator can be made to drop below zero through no fault of their own, which is itself true of any consensus algorithm where a censorship attack has nonzero cost, because of the fundamental fault inattributability of censorship versus a minority going offline.
What we \textit{can} do is model the game in various ways that add realistic ``friction" to non-attacking validators' economic reasoning, and see how the parameters of the game can be optimized so as to maximize the cost of attack given these frictions. One possibility is to model it as a three-phase game, where in phase 1 the attacker griefs with some $h$, all validators get their due rewards and penalties ($1 - \frac{h}{r}$ for the attacker, $1 - h$ for everyone else), then in phase 2 both the attacker and other validators make choices about how to allocate their resources and finally in phase 3 the attacker decides whether or not to attack.
What we \textit{can} do is model the game in various ways that add realistic ``friction" to non-attacking validators' economic reasoning, and see how the parameters of the game can be optimized so as to maximize the cost of attack given these frictions. To more clearly illustrate the difference between losses on the order of security deposits and losses on the order of rewards, we now assume that all rewards and penalties are multiplied by some base interest rate $y_0$; that is, the victims earn $y_0 * \frac{1 - h}{D^p}$ and the attacker earns $y_0 * \frac{1 - \frac{h}{r}}{D^p}$.
Let us first consider finality reversion attacks. In a finality reversion attack, if the deposit size is $D$, the cost of an attack is $\frac{D}{3}$. An attacker's strategy is easy: grief with $h = 1$ in phase 1, drive all other validators away as their revenue drops to zero, and then attack in phase 2. The attacker's cost here, assuming the attacker had $50\%$ of the validator set in phase 1, is $\frac{1}{2} * (1 - \frac{1}{r})$.
One possibility is to model it as a three-phase game, where in phase 1 the attacker griefs with some $h$, all validators get their due rewards and penalties, then in phase 2 both the attacker and other validators make choices about how to allocate their resources and finally in phase 3 the attacker decides whether or not to attack.
Now, let us modify the game slightly: suppose that of the $\frac{x}{3}$ penalized, half goes to all other validators. Suppose the total deposit size is $x_1$ in phase 1, with base interest rate $y_1 = \frac{1}{x_1^p}$. The attacker griefs with some $h$ in phase 1, and as a result in phase 2 the total deposit size drops to $x_2$, with base interest rate $y_2 = \frac{2}{x_2^p}$. The attacker then attacks with probability $P_{attack}$.
Let us first consider finality reversion attacks. In a finality reversion attack, if the deposit size is $D$, the cost of an attack is $\frac{D}{3}$. An attacker's strategy is easy: grief with $h = 1$ in phase 1, drive all other validators away as their revenue drops to zero, and then attack in phase 2. The attacker's cost here, assuming the attacker had $50\%$ of the validator set in phase 1, is $\frac{1}{2} * y_0 * (1 - \frac{1}{r})$.
Now, let us modify the game slightly: suppose that of the $\frac{D}{3}$ penalized, half goes to all other validators. The attacker griefs with some $h$ in phase 1, and as a result in phase 2 the total deposit size drops from 1 to $D_2$, with base interest rate $y_2 = \frac{y_0}{D_2^p}$. The attacker then attacks with probability $P_{attack}$.
The attacker's cost is:
$\frac{1}{2} * x_1 * y_1 * h + P_{attack} * \frac{1}{3} * x_2$
$\frac{1}{2} * y_0 * h + P_{attack} * \frac{1}{3} * D_2$
The first term in the sum is the cost in phase 1, and the second term is the expected cost in phase 2.
Supply-demand equilibrium tells us that in phase 2 we have:
$y_2 * (1-h) + \frac{1}{4} * P_{attack} = x_2^d$
$y_2 * (1-h) + \frac{1}{4} * P_{attack} = y_0 * D_2^d$
The $\frac{1}{4}$ fraction comes from the fact that during an attack, non-attackers' deposits would increase by 25\%. Let us assume $d = p = 1$. We can simplify:
The $\frac{1}{4}$ fraction comes from the fact that during an attack, non-attackers' deposits would increase by 25\% (half of the $\frac{D}{3}$ penalty, that is $\frac{D}{6}$, is spread over the remaining $\frac{2D}{3}$); and because the original intersection was $(1, y_0)$, the demand curve must also be multiplied by $y_0$. Let us assume $d = p = 1$. We can simplify:
$\frac{1}{x_2} * (1-h) + \frac{1}{4} * P_{attack} = x_2$
$\frac{y_0}{D_2} * (1-h) + \frac{1}{4} * P_{attack} = y_0 * D_2$
This gives us $x_2$ out of $P_{attack}$ and $h$ through a quadratic equation, which we can then plug into the attacker's cost. If we normalize $x_1 = y_1 = 1$, then this gives the cost as a function of $h$ and $P_{attack}$. The quadratic equation is:
Or:
$x_2 = \frac{\frac{P_{attack}}{4} + (\frac{P_{attack}^2}{16} - 4 * (h-1))^{\frac{1}{2}}}{2}$
$(h-1) - \frac{P_{attack}}{y_0 * 4} * D_2 + D_2^2 = 0$
Let's look at the case $P_{attack} = 1$. Then, $x_2 = \frac{\frac{1}{4} + (\frac{1}{16} - 4 * (h-1))^{\frac{1}{2}}}{2}$. The discriminant equals zero when $h = \frac{65}{64}$, and for higher values of $h$ the discriminant goes negative. Hence, for any value of $h$ substantially above 1, there is no solution, suggesting that there is no $x_2$ at which validators would find it profitable to stay. This can be seen as a negative feedback loop: the lower $x_1$ goes, the more highly negative the interest rate can go, and so $x_1$ goes further down. This suggests a rationale for designing the mechanism so that as the total deposit size decreases the maximum possible $h$ also decreases, until below some critical size it is not much lower than 1. Additionally, it is also an argument for selecting lower values of $p$, though the benefit is fairly marginal.
This gives us $D_2$ out of $P_{attack}$ and $h$ through a quadratic equation, which we can then plug into the attacker's cost. This gives the cost as a function of $h$ and $P_{attack}$. The quadratic equation is:
In general, what this analysis suggests is that (i) discouragement attacks for consensus breaking are difficult to fully defeat, (ii) setting lower values of $p$ is a good idea, and (iii) perhaps the best mitigation is to increase friction for validators in the consensus game looking to drop out.
$D_2 = \frac{\frac{P_{attack}}{4 * y_0} + (\frac{P_{attack}^2}{16 * y_0^2} - 4 * (h-1))^{\frac{1}{2}}}{2}$
The discriminant equals zero when $\frac{P_{attack}^2}{16 * y_0^2} = 4 * (h-1)$, or $h = 1 + (\frac{P_{attack}}{8 * y_0})^2$; if $h$ is higher than this value then there is no intersection between the new de-facto supply curve and the demand curve, meaning that non-attacking validators will lose money regardless of what happens, and so $D_2 = 0$.
Because the benefits to the attacker of removing validators from the validator set are so high, we find that the optimal $h$ for any given $P_{attack}$ is generally precisely the one which sets $D_2 = 0$, i.e. $h = 1 + (\frac{P_{attack}}{8 * y_0})^2 + \epsilon$.
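The relationship between $h$, $P_{attack}$, $y_0$ and $D_2$ can be explored numerically. The following is a minimal Python sketch under the same assumption $d = p = 1$; the function names are hypothetical and not part of the article. It takes the larger root of the quadratic, as the closed form above does, and treats a negative discriminant as $D_2 = 0$.

```python
import math

def critical_h(p_attack, y0):
    """Griefing factor at which the discriminant of the D_2 quadratic is zero."""
    return 1 + (p_attack / (8 * y0)) ** 2

def phase2_deposits(h, p_attack, y0):
    """Larger root of D_2^2 - (P_attack / (4 * y0)) * D_2 + (h - 1) = 0.

    Returns 0.0 when no real root exists, i.e. when non-attacking
    validators lose money at every deposit level and all leave.
    """
    b = p_attack / (4 * y0)
    disc = b * b - 4 * (h - 1)
    if disc < 0:
        return 0.0
    return (b + math.sqrt(disc)) / 2

def attacker_cost(h, p_attack, y0):
    """Phase 1 griefing cost plus expected phase 2 attack cost."""
    d2 = phase2_deposits(h, p_attack, y0)
    return 0.5 * y0 * h + p_attack * d2 / 3
```

With no griefing and no attack (`h = 0`, `p_attack = 0`) the sketch returns the original equilibrium $D_2 = 1$, and any `h` above `critical_h(p_attack, y0)` yields $D_2 = 0$, so only the phase 1 term of the cost remains.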
One possible mitigation to this kind of attack is to simply make it more difficult to grief with $h$ much higher than $1$ in the specific case where $D$ is low. That is, suppose that there exists some behavior in the network that causes some given amount of harm to the protocol, and one cannot determine whether it is caused by offline validators or censoring validators. Instead of setting punishments proportional to $\frac{y_0}{D^p}$, set them proportional to $y_0$, or perhaps as a compromise $\frac{y_0}{D^\frac{p}{2}}$, or a piecewise function. This means that if $D$ is low, attackers will be able to cause more disruption of performance to the network at lower cost to themselves, but in return it creates a scenario where it is more difficult to engage in a discouragement attack, because causing enough damage to the network for $h$ to exceed $1$ will take a longer time.
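To get a feel for how the three punishment scalings differ when $D$ is low, they can be compared directly. This is only an illustrative sketch: the function name and the scheme labels (`baseline`, `flat`, `compromise`) are made up here, and the piecewise option is omitted because the text does not specify one.

```python
def penalty_scale(D, y0, p=1.0, scheme="baseline"):
    """Factor that punishments are proportional to under each scheme."""
    if scheme == "baseline":
        # Punishments proportional to y0 / D^p.
        return y0 / D ** p
    if scheme == "flat":
        # Punishments proportional to y0, independent of D.
        return y0
    if scheme == "compromise":
        # Punishments proportional to y0 / D^(p/2).
        return y0 / D ** (p / 2)
    raise ValueError(f"unknown scheme: {scheme}")

# Compare the schemes as the total deposit size D shrinks.
for D in (1.0, 0.25, 0.04):
    row = [penalty_scale(D, 1.0, 1.0, s) for s in ("baseline", "flat", "compromise")]
    print(D, row)
```

At $D = 1$ all three schemes coincide; as $D$ falls, the baseline grows fastest, the compromise grows more slowly, and the flat scheme does not grow at all.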
The second case that we can analyze is the case where the attacker engages in a discouragement attack, and then in the second stage engages in a censorship attack. Here, there is no counter-pressure where validators are encouraged to stay because of the possibility they will get a windfall from the attack, as in a censorship attack all validators, including the attacker and victims, must be penalized. This case is even worse than the above, as the $h$ required to drive out other validators will be \textit{less} than $1$. However, the mitigation strategy is broadly similar. Because this kind of attack is strictly worse than a finality reversion attack, it may not be worth the complexity to implement a scheme where malicious validators' rewards are distributed to other validators, as we can expect that malicious attackers will nearly always opt for a censorship attack instead of a finality reversion attack in any case.
\section{Bribing to counter-grief}