logos.co/technology/blockchain.md

This document describes the Glacier binary Byzantine agreement algorithm and its extension to a DAG structure, in order to achieve a leaderless, Byzantine-resistant consensus algorithm applicable in the context of permissionless blockchains.

Objective

The first and foremost objective of this research project is to bypass the Avalanche algorithm patent. Only minimal modifications are required to achieve this, and they must not introduce any detrimental effects relative to the original algorithm. However, we take this as an opportunity to improve the algorithm in some meaningful aspect.

A challenge presented by this research is that the algorithm is very simple and hard to beat when it comes to scalability. It also performs very well in terms of decentralization and security. Improving any of these aspects is therefore genuinely challenging.

  • Improving scalability would entail either fundamentally improving the number of tx/s or reducing the communication costs (which generally translates into the ability to support a larger number of nodes).
  • Improving security would imply better resistance to Byzantine behavior, or perhaps the ability to reduce time to finalization, meaning that the security parameters are reached faster in the new algorithm. This could also include parameterization, or dynamics in the parameters across the network for various types of decisions.
  • Improving decentralization would entail the participation of a larger number of nodes, and/or the reduction of the power conferred by concentration of voting power.

Of these vertices of the so-called blockchain trilemma, scalability and decentralization are optimal or nearly optimal in Avalanche, given the sampling-based, probabilistic nature of the algorithm. That is not to say that it is impossible to improve in those directions, but with high probability only marginal improvements are possible.

Use cases

The main use case of the blockchain infrastructure is to provide support for bootstrapping and maintaining Common Interest Communities (CICs). Common Interest Communities have the following key characteristics:

  • We expect the range of network sizes to be wide. Some CICs may be on the scale of ~10 nodes, while others could (potentially) scale up to current blockchain limits.
  • CICs will have varying degrees of inter-communication. Some will exist in relative isolation (i.e., no communication), while others could be very interdependent.
  • Some CICs will exist as part of the Status Network, while others could exist completely independently, on separate infrastructure. In other words, unlike Avalanche, which requires an $AVAX fee to create a subnetwork, we would lift this requirement.

These properties lead to two main research lines that, if successfully accomplished, could result in a differentiated algorithm with a strong value proposition:

  1. DAG construction flexible enough to adapt to the different communication requirements between CICs. A stream or subgraph would provide fast finality within the CIC, while a secondary protocol or a specific node selection step would occasionally “tie together” two subgraphs. This secondary protocol should provide enough flexibility to accommodate different levels of communication frequency and finality requirements between subgraphs (i.e., CICs). The general idea would be to have DAGs growing separately that merge at certain intervals or under specific conditions.
  2. Explore the space between permissioned committee-based blockchains and permissionless blockchains. This could be explored as a hybrid reputation- and staking-based approach, under the framework of Asymmetric Trust. It is specifically aimed at small communities that don't have large enough economic backing to secure the network through staking (or that face extremely powerful adversaries), while being able to rely partially on meatspace reputation (PoA, Stellar) or dynamic systems of reputation (IOTA's Mana).

Requirements

We define the set of initial requirements as follows:

  • Must-have. These requirements are non-negotiable, fundamental to satisfying the use cases.
    • Sybil Resistant. This is a fundamental requirement of any blockchain design, but we list it here because our case is somewhat unique: we potentially need to support different models of Sybil resistance depending on the scale of the network, and for bootstrapping purposes.
    • Crash & Byzantine Fault Tolerant. These are fundamental requirements in a public blockchain.
    • Fast Finality (ideally sub-second, or in the 2 sec range, like Avalanche). It is fundamental to the usability of the blockchain, and something within our reach considering the starting point given by Avalanche.
    • Leaderless / Weak Leadership. In order to keep the network as decentralized as possible, we aim to mitigate the centralization effects of strong leadership (such as leader selection algorithms).
    • Individual participant load ~ constant with respect to network size. In a similar spirit to the leaderless/weak-leadership requirement, we establish the need for participants to have the same impact regardless of the scale of the network. In other words, the expectation of being queried by other nodes should not vary with network size.
    • Suitable for “social applications” (fast and efficient, modest computational resources). It is key that we steer away from the centralization effects of high computational requirements.
    • Ability to transition from a low node count to a large node count. This is an important technical requirement for fulfilling the use cases. The general idea is to be able to support networks from their inception at small scale through their growth and expansion phase.
    • Ability to bootstrap small networks, either from scratch or as a spin-off of a larger network. Closely related to the previous requirement, this refers to the process of growing from small- to large-scale networks in a smooth manner.
  • Desirable or to be explored. These requirements are initially considered relevant to matching the use cases, but they are not required per se.
    • Round-less. This is usually associated with the lack of leader election and of the distinct protocol phases usually seen in classical BFT variants.
    • Probabilistic. Not a requirement per se, but a likely condition to achieve substantial improvements over classical BFT. The choice of a probabilistic approach to consensus is key in the Avalanche algorithm to achieve state of the art performance.
    • Strong liveness (at the expense of consistency under partitions). A strongly live system keeps making progress under network partitioning conditions and later reconciles the resulting forks.
    • Asynchronous. This is also associated with partition tolerance, in the sense that a chain can make progress even if a substantial part of the network is out of reach.
    • Decouple the execution layer as much as possible, so that execution models are interchangeable and/or extensible.
    • Explore the possibility of highly partitioned blockchains with local views. This takes inspiration from the Tangle (IOTA) and Hashgraph. The idea would be to explore these key points:
      • Could clients participate in consensus as well, in a very lightweight manner, by synchronizing only a small part of the blockchain?
      • By the same token, this would lead to many partial local chains, perhaps only updated on demand. Client nodes could potentially do this differently from verifier nodes, which hold larger portions of the blockchain.

Reference Work

Prior to designing a new consensus algorithm meeting the requirements described above, extensive research was carried out. The approach was to start from surveys and taxonomies in order to cover the ground as systematically as possible, while also identifying papers and algorithms that do not fit those taxonomies (primarily via references in other papers).

  • Snowball is the binary Byzantine agreement algorithm that serves as the base for Avalanche to resolve conflicts in the DAG. In short, Avalanche combines a DAG structure, which employs the parent paths as a series of indirect votes for previous transactions, with voting performed by the Snowball algorithm, the difference being that it runs a single iteration per DAG vertex. More details here:

    Link to Avalanche Paper.

  • Lachesis is a family of algorithms that combine a DAG structure with classical BFT algorithms. The interest of this family is that it leverages the DAG so that the BFT rounds are embedded in it. It draws some inspiration from HotStuff, one of the top-performing BFT algorithms, specifically in the consolidation of the multiple rounds into a continuous algorithm that interweaves them seamlessly. Link to Lachesis Paper. Note: as mentioned, Lachesis is a family of algorithms, so there is a series of papers; see Zotero for more.

  • Evidence/Confidence. In our research, one of the most influential papers for the development of the new algorithm was the paper Confidence as Higher-Order Uncertainty, which elaborates the notion of confidence defined in terms of the amount of available evidence, and interpreted and processed as the relative stability of the first-order uncertainty evaluation.

    Link to Paper

  • Other references. As part of the initial research, many other algorithms were reviewed, such as Hashgraph, HoneyBadger, Dfinity, etc. More conclusions and information can be found in the checkpoint documents, but since they have been less influential for our purposes, we leave the details out of this document.

Approach

As pointed out in the Reference Work section, the Snowball and Lachesis algorithms have been selected as the starting points for the development of the new algorithm. An extensive comparison between the two has been detailed in the following document.

Comparison diagram between Snowball and Lachesis

The comparison is somewhat complex, as both algorithms are state of the art, but relatively unproven in production (MainNet). Here, we would like to highlight the following key differences:

  • Avalanche is probabilistic, which is directly related to its superior scalability. Conversely, Lachesis' finality is deterministic, which simplifies many things (from algorithm analysis to inter-chain communication).
  • Avalanche's DAG is partially ordered, whereas Lachesis' is totally ordered. The direct consequence is that Lachesis supports the EVM without modifications, while Avalanche is better suited to a UTXO model.

The remaining question is which of the two serves best as a basis for the new algorithm. We choose to start from Snowball because of its superior fundamental features (namely scalability), and we embrace the need to research specific, innovative solutions for the execution model and the multi-DAG problem. In other words, while going the Lachesis route would simplify both the execution model (providing direct support for the EVM) and the multi-DAG design (given its deterministic finality), we see higher potential in embracing the current limitations of Avalanche and innovating in those directions.

Snowball

Our starting point is Avalanche's binary Byzantine agreement algorithm, Snowball. As long as our modifications still allow a DAG to be built on top later on, this simplifies the design significantly. The DAG stays the same in principle: it supports confidence, but the core algorithm can be modeled without it.

The concept of the Snowball algorithm is relatively simple. Following is a simplified description (lacking some details, but giving an overview). For further details, please refer to the Avalanche paper.

  1. The objective is to vote yes/no on a decision (this decision could be a single bit, or, in our DAG use case, whether a vertex should be included or not).
  2. Every node has an eventually-consistent, complete view of the network. It selects k nodes at random and asks their opinion on the decision (yes/no).
  3. After this sampling is finished, if one of the opinions reaches the threshold \alpha, the node accumulates one count for this opinion and adopts it as its own. However, if the winning opinion differs from the previously held one, the counter is reset to 1. If no opinion reaches the threshold \alpha, the counter is reset to 0 instead.
  4. After several iterations of this algorithm, the counter reaches a threshold \beta, and the node decides on that opinion as final.
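
A minimal sketch of this loop in Python is given below. It is illustrative only: the sample_k_opinions helper (standing in for the network query) and the values of k, \alpha and \beta are assumptions for the sake of the example, not the parameters of an actual Avalanche deployment.

```python
import random

def snowball(initial_opinion, sample_k_opinions, k=20, alpha=14, beta=15):
    """Simplified binary Snowball loop (illustrative sketch, not the full spec).

    sample_k_opinions(k) is assumed to return a list of k booleans, one per
    randomly sampled peer, giving that peer's current yes/no opinion.
    """
    opinion = initial_opinion
    last_winner = None
    counter = 0

    while counter < beta:
        votes = sample_k_opinions(k)
        yes_votes = sum(votes)

        if yes_votes >= alpha:
            winner = True
        elif (k - yes_votes) >= alpha:
            winner = False
        else:
            # No alpha-majority in this query: reset the counter to 0.
            counter = 0
            last_winner = None
            continue

        opinion = winner  # adopt the majority opinion of this query
        if winner == last_winner:
            counter += 1  # one more consecutive success for the same opinion
        else:
            counter = 1   # the winning opinion changed: restart the count at 1
            last_winner = winner

    return opinion  # decided as final after beta consecutive successes

# Toy usage: a static population in which 80% of peers answer "yes".
peers = [True] * 80 + [False] * 20
decision = snowball(True, lambda k: random.sample(peers, k))
```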

Next, we will proceed to describe our new algorithm, based on Snowball. There are 3 main paths to achieving the initial target of bypassing the Avalanche patent:

  • Find ways to extend the Snowball algorithm.
  • Find ways to modify the Snowball algorithm.
  • Find ways to strengthen the Snowball algorithm, either in particular scenarios or in as general a way as possible.

In our design, all 3 are attempted.

Glacier

Background

We have identified a shortcoming of the Snowball algorithm that was a perfect starting point for devising improvements. The scenario is as follows:

  • There is a powerful adversary in the network that controls a large percentage of the node population: from 10% up to ~50%.
  • This adversary follows a strategy that allows it to rapidly change the decision bit (possibly even in a coordinated way) so as to maximally confuse the honest nodes.
  • Under normal conditions, honest nodes will accumulate supermajorities soon enough and reach the \beta threshold. However, when an honest node performs a query and does not reach the threshold \alpha of responses, the counter is reset to 0.
  • The highest threat to Snowball is therefore an adversary that keeps it from reaching the \beta threshold by continuously resetting the counter, steering Snowball away from ever making a decision.

Concept

Glacier is an evolution of the Snowball BBA algorithm in which we specifically tackle the weakness described above. The main focus is the counter and the triggering of its reset. In the following, we elaborate on the modifications and features that have been added to the reference algorithm:

  1. Instead of allowing the latest evidence to change the opinion completely, we take into account all accumulated evidence, to reduce the impact of high variability once a large amount of evidence has already been collected.
  2. Eliminate the counter and threshold scheme, and introduce instead two regimes of operation:
    • One focused on grabbing opinions and reacting as soon as possible. This part is somewhat closer conceptually to the reference algorithm.
    • Another one focused on interpreting the accumulated data instead of reacting to the latest information gathered.
  3. Finally, combine those two regimes via a transition function. This avoids a step function, i.e., a sudden change in behavior that could complicate the analysis and understanding of the dynamics. Instead, we have a single algorithm that transfers weight from one regime to the other as more evidence is gathered (a sketch of such a blending is given after this list).
  4. Additionally, we introduce a function for weighted sampling. This will allow the combination of different forms of weighting:
    • Staking
    • Heuristic reputation
    • Manual reputation.
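
To illustrate point 3, the sketch below blends the two regimes through a confidence weight that grows with the amount of evidence collected. The constant look_ahead, the function name, and the exact blending rule are assumptions made for illustration; they are not the final Glacier specification.

```python
def blended_yes_ratio(acc_yes, acc_total, round_yes, round_total, look_ahead=20):
    """Combine the two regimes of operation (illustrative sketch only).

    - 'latest' regime: react to the yes-ratio observed in the most recent query.
    - 'accumulated' regime: trust the yes-ratio over all evidence gathered so far.
    The transition function shifts weight from the former to the latter as the
    total amount of evidence grows.
    """
    # Confidence in the accumulated evidence: 0 with no evidence, tending to 1
    # as evidence grows (see the evidence/confidence discussion below).
    confidence = acc_total / (acc_total + look_ahead)

    latest = round_yes / round_total if round_total else 0.5
    accumulated = acc_yes / acc_total if acc_total else 0.5

    # Transition function: a convex combination weighted by confidence.
    return (1 - confidence) * latest + confidence * accumulated
```

With little accumulated evidence the result tracks the latest query closely; once many votes have been collected, a single adversarial round barely moves it.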

It's worth delving a bit into the way the data is interpreted in order to reach a decision. Our approach is based conceptually on the paper Confidence as Higher-Order Uncertainty, which describes a frequentist approach to decision certainty: the first-order certainty, measured by frequency, is caused by known positive evidence, while the higher-order certainty is caused by potential positive evidence. Because confidence is a relative measurement defined on evidence, it naturally follows to compare the amount of evidence the system currently knows with the amount it will know in the near future (defining “near” as a constant).

Intuitively, we are looking for a function of the amount of evidence w, call it c for confidence, that satisfies the following conditions:

  1. Confidence c is a continuous and monotonically increasing function of w. (More evidence, higher confidence.)
  2. When w = 0, c = 0. (Without any evidence, confidence is minimum.)
  3. When w goes to infinity, c converges to 1. (With infinite evidence, confidence is maximum.)
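
One simple function that satisfies all three conditions, and, to our reading, the form proposed in the referenced paper, is

c = \frac{w}{w + s}

where s is a positive constant representing the amount of additional evidence expected in the “near future” (the paper writes this constant as k; we rename it here to avoid a clash with the sample size k used elsewhere in this document). For example, with s = 20, observing w = 20 pieces of evidence gives c = 0.5, while w = 180 gives c = 0.9.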

The paper also describes a set of operations on evidence/confidence pairs, so that different sources of knowledge can be combined. However, we leave here the suggestion of a possible future research line: combining an algebra of evidence/confidence pairs with a swarm-propagation algorithm like the one described in this paper.

Algorithm

The algorithm is divided into 4 phases:

  1. Querying. We select k nodes from the complete pool of peers in the network. The query is weighted, so the probability of selecting a node is proportional to its weight:

P(i) = \frac{w_i}{\sum_{j=1}^{N} w_j}

The list of nodes is maintained by a separate protocol (the network layer), and eventual consistency of this knowledge across the network suffices: even if different nodes have slightly divergent views of the network, the algorithm is resilient to those divergences.
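
A minimal sketch of this weighted selection, using only the Python standard library, is shown below; the peer identifiers and weights are hypothetical placeholders.

```python
import random

def weighted_sample(peers, weights, k):
    """Pick k distinct peers with probability proportional to their weight,
    i.e. P(i) = w_i / sum_j(w_j) at each draw, renormalised over the peers
    that remain (sampling without replacement). Illustrative sketch.
    """
    peers, weights = list(peers), list(weights)
    chosen = []
    for _ in range(min(k, len(peers))):
        idx = random.choices(range(len(peers)), weights=weights, k=1)[0]
        chosen.append(peers.pop(idx))
        weights.pop(idx)
    return chosen

# Hypothetical example: stake-weighted selection of a query committee.
committee = weighted_sample(["n1", "n2", "n3", "n4"], [10, 5, 1, 1], k=2)
```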

Adaptive querying. An additional optimization of the query consists of adaptively growing the constant k in the event of high confusion. We define high confusion as the situation in which neither opinion is strongly held in a query (i.e., a threshold is not reached for either yes or no). For this, we will use the \alpha threshold defined below. This adaptive g