Merge branch 'master' into todos
commit
ed014c4fbc
|
@ -1,71 +1,74 @@
|
|||
# Casper+Sharding chain v2.1
|
||||
# Ethereum 2.0 spec—Casper and sharding
|
||||
|
||||
###### tags: `spec`, `eth2.0`, `casper`, `sharding`
|
||||
###### spec version: 2.2 (October 2018)
|
||||
|
||||
## WORK IN PROGRESS!!!!!!!
|
||||
**NOTICE**: This document is a work-in-progress for researchers and implementers. It reflects recent spec changes and takes precedence over the [Python proof-of-concept implementation](https://github.com/ethereum/beacon_chain).
|
||||
|
||||
This is the work-in-progress document describing the specification for the Casper+Sharding (shasper) chain, version 2.1.
|
||||
### Introduction
|
||||
|
||||
In this protocol, there is a central PoS "beacon chain" which stores and manages the current set of active PoS validators. The only mechanism available to become a validator initially is to send a transaction on the existing PoW chain containing 32 ETH. When you do so, as soon as the beacon chain processes that block, you will be queued, and eventually inducted as an active validator until you either voluntarily deregister or you are forcibly deregistered as a penalty for misbehavior.
|
||||
At the center of Ethereum 2.0 is a system chain called the "beacon chain". The beacon chain stores and manages the set of active proof-of-stake validators. In the initial deployment phases of Ethereum 2.0 the only mechanism to become a validator is to make a fixed-size one-way ETH deposit to a registration contract on the Ethereum 1.0 PoW chain. Induction as a validator happens after registration transaction receipts are processed by the beacon chain and after a queuing process. Deregistration is either voluntary or done forcibly as a penalty for misbehavior.
|
||||
|
||||
The primary source of load on the beacon chain is **attestations**. An attestation has a double role:
|
||||
The primary source of load on the beacon chain is "attestations". Attestations simultaneously attest to a shard block and a corresponding beacon chain block. A sufficient number of attestations for the same shard block create a "crosslink", confirming the shard segment up to that shard block into the beacon chain. Crosslinks also serve as infrastructure for asynchronous cross-shard communication.
|
||||
|
||||
1. It attests to some parent block in the beacon chain
|
||||
2. It attests to a block hash in a shard (a sufficient number of such attestations create a "crosslink", confirming that shard block into the beacon chain).
|
||||
### Terminology
|
||||
|
||||
Every shard (e.g. there might be 1024 shards in total) is itself a PoS chain, and the shard chains are where the transactions and accounts will be stored. The crosslinks serve to "confirm" segments of the shard chains into the beacon chain, and are also the primary way through which the different shards will be able to talk to each other.
|
||||
|
||||
Note that one can also consider a simpler "minimal sharding algorithm" where crosslinks are simply hashes of proposed blocks of data that are not themselves chained to each other in any way.
|
||||
|
||||
Note: the python code at https://github.com/ethereum/beacon_chain and [an ethresear.ch post](https://ethresear.ch/t/convenience-link-to-full-casper-chain-v2-spec/2332) do not reflect all of the latest changes. If there is a discrepancy, this document is likely to reflect the more recent changes.
|
||||
|
||||
### Glossary
|
||||
|
||||
* **Validator**—a participant in the Ethereum 2.0 consensus system with the right to produce blocks, attestations, and other consensus objects.
|
||||
* **Committee**—a statistically representative validator subset, sampled pseudo-randomly.
|
||||
* **Proposer**—a validator with the right to create a block at a given slot.
|
||||
* **Attester**—a validator in an attestation committee with the right to attest to a block.
|
||||
* **Beacon chain**—the central proof-of-stake chain of Ethereum 2.0.
|
||||
* **Shard**—one of the chains on which user transactions take place and contract state is stored.
|
||||
* **Crosslink**—sufficient signatures from an attestation committee attesting to a given block.
|
||||
* **Slot**—a period of `SLOT_DURATION` seconds, during which one proposer has the ability to create a block and some attesters have the ability to make attestations
|
||||
* **Dynasty transition**—a beacon chain state transition where the validator set may change.
|
||||
* **Dynasty height**—the number of dynasty transitions that have happened in a given chain since genesis.
|
||||
* **Cycle**—a span of slots during which all validators get exactly one chance to make an attestation.
|
||||
* **Finalized**, **justified**—see the [Casper FFG paper](https://arxiv.org/abs/1710.09437). [TODO: flesh out definitions]
|
||||
* **Validator** - a participant in the Casper/sharding consensus system. You can become one by depositing 32 ETH into the Casper mechanism.
|
||||
* **Active validator set** - those validators who are currently participating, and which the Casper mechanism looks to produce and attest to blocks, crosslinks and other consensus objects.
|
||||
* **Committee** - a (pseudo-) randomly sampled subset of the active validator set. When a committee is referred to collectively, as in "this committee attests to X", this is assumed to mean "some subset of that committee that contains enough validators that the protocol recognizes it as representing the committee".
|
||||
* **Proposer** - the validator that creates a block
|
||||
* **Attester** - a validator that is part of a committee that needs to sign off on a block.
|
||||
* **Beacon chain** - the central PoS chain that is the base of the sharding system.
|
||||
* **Shard chain** - one of the chains on which user transactions take place and account data is stored.
|
||||
* **Crosslink** - a set of signatures from a committee attesting to a block in a shard chain, which can be included into the beacon chain. Crosslinks are the main means by which the beacon chain "learns about" the updated state of shard chains.
|
||||
* **Slot** - a period of `SLOT_DURATION` seconds, during which one proposer has the ability to create a block and some attesters have the ability to make attestations
|
||||
* **Dynasty transition** - a change of the validator set
|
||||
* **Dynasty** - the number of dynasty transitions that have happened in a given chain since genesis
|
||||
* **Cycle** - a span of blocks during which all validators get exactly one chance to make an attestation (unless a dynasty transition happens inside of one)
|
||||
* **Finalized**, **justified** - see Casper FFG finalization here: https://arxiv.org/abs/1710.09437
|
||||
* **Withdrawal period** - number of slots between a validator exit and the validator balance being withdrawable
|
||||
* **Genesis time** - the Unix time of the genesis beacon chain block at slot 0
|
||||
|
||||
### Constants
|
||||
|
||||
* **SHARD_COUNT** - a constant referring to the number of shards. Currently set to 1024.
|
||||
* **DEPOSIT_SIZE** - 32 ETH, or 32 * 10\*\*18 wei
|
||||
* **MAX_VALIDATOR_COUNT** - 2<sup>22</sup> = 4194304 # Note: this means that up to ~134 million ETH can stake at the same time
|
||||
* **GENESIS_TIME** - time of beacon chain startup (slot 0) in seconds since the Unix epoch
|
||||
* **SLOT_DURATION** - 16 seconds
|
||||
* **CYCLE_LENGTH** - 64 slots
|
||||
* **MIN_DYNASTY_LENGTH** - 256 slots
|
||||
* **MIN_COMMITTEE_SIZE** - 128 (rationale: see recommended minimum 111 here https://vitalik.ca/files/Ithaca201807_Sharding.pdf)
|
||||
* **SQRT\_E\_DROP\_TIME** - a constant set to reflect the amount of time it will take for the quadratic leak to cut nonparticipating validators' deposits by ~39.4%. Currently set to 2**20 seconds (~12 days).
|
||||
* **BASE\_REWARD\_QUOTIENT** - 1/this is the per-slot interest rate assuming all validators are participating, assuming total deposits of 1 ETH. Currently set to `2**15 = 32768`, corresponding to ~3.88% annual interest assuming 10 million participating ETH.
|
||||
* **WITHDRAWAL_PERIOD** - number of slots between a validator exit and the validator slot being withdrawable. Currently set to `2**19 = 524288` slots, or `2**23` seconds ~= 97 days.
|
||||
* **MAX\_VALIDATOR\_CHANGE\_QUOTIENT** - a maximum of 1/x validators can change during each dynasty. Currently set to 32.
|
||||
* **PENDING\_LOG\_IN** = 0 (status code)
|
||||
* **LOGGED\_IN** = 1 (status code)
|
||||
* **PENDING\_EXIT** = 2 (status code)
|
||||
* **PENDING\_WITHDRAW** = 3 (status code)
|
||||
* **PENALIZED** = 128 (status code)
|
||||
* **WITHDRAWN** = 4 (status code)
|
||||
| Constant | Value | Unit | Approximation |
|
||||
| --- | --- | :---: | - |
|
||||
| `SHARD_COUNT` | 2**10 (= 1,024)| shards |
|
||||
| `DEPOSIT_SIZE` | 2**5 (= 32) | ETH |
|
||||
| `MIN_COMMITTEE_SIZE` | 2**7 (= 128) | validators |
|
||||
| `GENESIS_TIME` | **TBD** | seconds |
|
||||
| `SLOT_DURATION` | 2**4 (= 16) | seconds |
|
||||
| `CYCLE_LENGTH` | 2**6 (= 64) | slots | ~17 minutes |
|
||||
| `MIN_DYNASTY_LENGTH` | 2**8 (= 256) | slots | ~1.1 hours |
|
||||
| `SQRT_E_DROP_TIME` | 2**16 (= 65,536) | slots | ~12 days |
|
||||
| `WITHDRAWAL_PERIOD` | 2**19 (= 524,288) | slots | ~97 days |
|
||||
| `BASE_REWARD_QUOTIENT` | 2**15 (= 32,768) | — |
|
||||
| `MAX_VALIDATOR_CHURN_QUOTIENT` | 2**5 (= 32) | — |
|
||||
|
||||
**Notes**
|
||||
|
||||
* The `SQRT_E_DROP_TIME` constant is the amount of time it takes for the quadratic leak to cut deposits of non-participating validators by ~39.4%.
|
||||
* `1/BASE_REWARD_QUOTIENT` is the per-slot interest rate assuming all validators are participating and total deposits of 1 ETH. The current value corresponds to ~3.88% annual interest assuming 10 million participating ETH.
|
||||
* At most `1/MAX_VALIDATOR_CHURN_QUOTIENT` of the validators can change during each dynasty.
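The values in the Approximation column of the constants table follow from multiplying each slot count by `SLOT_DURATION`. The following is a minimal sketch of that arithmetic; the helper name `slots_to_seconds` is chosen here for illustration only and is not part of the spec.

```python
# Constants copied from the table above (slot counts and seconds per slot).
SLOT_DURATION = 2**4        # seconds
CYCLE_LENGTH = 2**6         # slots
MIN_DYNASTY_LENGTH = 2**8   # slots
SQRT_E_DROP_TIME = 2**16    # slots
WITHDRAWAL_PERIOD = 2**19   # slots


def slots_to_seconds(slots):
    """Convert a number of slots to wall-clock seconds (illustrative helper)."""
    return slots * SLOT_DURATION


cycle_minutes = slots_to_seconds(CYCLE_LENGTH) / 60
dynasty_hours = slots_to_seconds(MIN_DYNASTY_LENGTH) / 3600
drop_days = slots_to_seconds(SQRT_E_DROP_TIME) / 86400
withdrawal_days = slots_to_seconds(WITHDRAWAL_PERIOD) / 86400

print(round(cycle_minutes), round(dynasty_hours, 1),
      round(drop_days), round(withdrawal_days))
```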
|
||||
|
||||
**Status codes**
|
||||
|
||||
| Status code | Value |
|
||||
| - | :-: |
|
||||
| `PENDING_LOG_IN` | `0` |
|
||||
| `LOGGED_IN` | `1` |
|
||||
| `PENDING_EXIT` | `2` |
|
||||
| `PENDING_WITHDRAW` | `3` |
|
||||
| `WITHDRAWN` | `4` |
|
||||
| `PENALIZED` | `128` |
|
||||
| `ENTRY` | `1` |
|
||||
| `EXIT` | `2` |
|
||||
|
||||
### PoW chain registration contract
|
||||
|
||||
The initial deployment phases of Ethereum 2.0 are implemented without consensus changes to the PoW chain. A registration contract is added to the PoW chain to deposit ETH. This contract has a `registration` function which takes the following arguments:
|
||||
The initial deployment phases of Ethereum 2.0 are implemented without consensus changes to the PoW chain. A registration contract is added to the PoW chain to deposit ETH. This contract has a `registration` function which takes as arguments `pubkey`, `withdrawal_shard`, `withdrawal_address`, `randao_commitment` as defined in a `ValidatorRecord` below. A BLS `proof_of_possession` of type `bytes` is given as a final argument.
|
||||
|
||||
1) `pubkey` (bytes)
|
||||
2) `withdrawal_shard_id` (int)
|
||||
3) `withdrawal_address` (address)
|
||||
4) `randao_commitment` (bytes32)
|
||||
5) `bls_proof_of_possession` (bytes)
|
||||
|
||||
The registration contract does minimal validation, pushing most of the registration logic to the beacon chain. In particular, the BLS proof of possession (based on the BLS12-381 curve) is not verified by the registration contract.
|
||||
The registration contract emits a log with the various arguments for consumption by the beacon chain. It does not do validation, pushing the registration logic to the beacon chain. In particular, the proof of possession (based on the BLS12-381 curve) is not verified by the registration contract.
|
||||
|
||||
## Data Structures
|
||||
|
||||
|
@ -75,11 +78,11 @@ Beacon chain block structure:
|
|||
|
||||
```python
|
||||
fields = {
|
||||
# Hash of ancestor blocks (32 items, i'th is 2**i'th ancestor or zero bytes)
|
||||
# Skip list of ancestor block hashes. The i'th item is 2**i'th ancestor (or zero bytes) for i = 0, ..., 31
|
||||
'ancestor_hashes': ['hash32'],
|
||||
# Slot number (for the PoS mechanism)
|
||||
# Slot number
|
||||
'slot': 'int64',
|
||||
# Randao commitment reveal
|
||||
# RANDAO commitment reveal
|
||||
'randao_reveal': 'hash32',
|
||||
# Attestations
|
||||
'attestations': [AttestationRecord],
|
||||
|
@ -163,14 +166,16 @@ fields = {
|
|||
'last_finalized_slot': 'int64',
|
||||
# The current dynasty
|
||||
'current_dynasty': 'int64',
|
||||
# Records about the most recent crosslink `for each shard
|
||||
# Records about the most recent crosslink for each shard
|
||||
'crosslink_records': [CrosslinkRecord],
|
||||
# Used to select the committees for each shard
|
||||
'dynasty_seed': 'hash32',
|
||||
# Start of the current dynasty
|
||||
'dynasty_start': 'int64',
|
||||
# Total deposits penalized in the given withdrawal period
|
||||
'deposits_penalized_in_period': ['int32']
|
||||
'deposits_penalized_in_period': ['int32'],
|
||||
# Hash chain of validator set changes, allows light clients to track deltas more easily
|
||||
'validator_set_delta_hash_chain': 'hash32'
|
||||
}
|
||||
```
|
||||
|
||||
|
@ -269,29 +274,29 @@ We start off by defining some helper algorithms. First, the function that select
|
|||
|
||||
```python
|
||||
def get_active_validator_indices(validators):
|
||||
o = []
|
||||
for i in range(len(validators)):
|
||||
if validators[i].status == LOGGED_IN:
|
||||
o.append(i)
|
||||
return o
|
||||
return [i for i, v in enumerate(validators) if v.status == LOGGED_IN]
|
||||
```
|
||||
|
||||
Now, a function that shuffles this list:
|
||||
|
||||
```python
|
||||
def shuffle(lst, seed):
|
||||
assert len(lst) <= 16777216
|
||||
# entropy is consumed in 3 byte chunks
|
||||
# rand_max is defined to remove the modulo bias from this entropy source
|
||||
rand_max = 2**24
|
||||
assert len(lst) <= rand_max
|
||||
|
||||
o = [x for x in lst]
|
||||
source = seed
|
||||
i = 0
|
||||
while i < len(lst):
|
||||
source = blake(source)
|
||||
source = hash(source)
|
||||
for pos in range(0, 30, 3):
|
||||
m = int.from_bytes(source[pos:pos+3], 'big')
|
||||
remaining = len(lst) - i
|
||||
if remaining == 0:
|
||||
break
|
||||
rand_max = 16777216 - 16777216 % remaining
|
||||
rand_max = rand_max - rand_max % remaining
|
||||
if m < rand_max:
|
||||
replacement_pos = (m % remaining) + i
|
||||
o[i], o[replacement_pos] = o[replacement_pos], o[i]
|
||||
|
@ -350,7 +355,16 @@ def get_block_hash(active_state, curblock, slot):
|
|||
return active_state.recent_block_hashes[slot - earliest_slot_in_array]
|
||||
```
|
||||
|
||||
`get_block_hash(_, _, h)` should always return the block in the chain at slot `h`, and `get_shards_and_committees_for_slot(_, h)` should not change unless the dynasty changes.
|
||||
`get_block_hash(_, _, s)` should always return the block in the chain at slot `s`, and `get_shards_and_committees_for_slot(_, s)` should not change unless the dynasty changes.
|
||||
|
||||
We define a function to "add a link" to the validator hash chain, used when a validator is added or removed:
|
||||
|
||||
```python
|
||||
def add_validator_set_change_record(crystallized_state, index, pubkey, flag):
    crystallized_state.validator_set_delta_hash_chain = \
        hash(crystallized_state.validator_set_delta_hash_chain +
             bytes1(flag) + bytes3(index) + bytes32(pubkey))
|
||||
```
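As a rough illustration of how such a hash chain evolves, here is a self-contained sketch. It rests on several assumptions that the text does not confirm: `bytes1`, `bytes3` and `bytes32` are taken to be big-endian fixed-width integer encodings, `pubkey` is treated as an integer that fits in 32 bytes, the chain starts from 32 zero bytes, and the names `spec_hash` and `extend_delta_hash_chain` are hypothetical.

```python
import hashlib

ENTRY, EXIT = 1, 2  # flag values from the status code table


def spec_hash(x: bytes) -> bytes:
    """Placeholder hash from Appendix A: first 32 bytes of BLAKE2b-512."""
    return hashlib.blake2b(x).digest()[:32]


def extend_delta_hash_chain(chain_tip: bytes, index: int, pubkey: int, flag: int) -> bytes:
    """Return the new chain tip after recording one validator set change.

    Assumes big-endian encodings for the flag (1 byte), index (3 bytes)
    and pubkey (32 bytes); the spec text does not pin these down here.
    """
    record = (flag.to_bytes(1, 'big')
              + index.to_bytes(3, 'big')
              + pubkey.to_bytes(32, 'big'))
    return spec_hash(chain_tip + record)


tip = b'\x00' * 32  # assumed initial value, for illustration only
tip_after_entry = extend_delta_hash_chain(tip, 0, 12345, ENTRY)
tip_after_exit = extend_delta_hash_chain(tip_after_entry, 0, 12345, EXIT)
```

Because each tip commits to the previous one, a light client that knows an old tip and the list of changes since then can recompute the current tip without the full validator set.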
|
||||
|
||||
Finally, we abstractly define `int_sqrt(n)` for use in reward/penalty calculations as the largest integer `k` such that `k**2 <= n`. Here is one possible implementation, though clients are free to use their own including standard libraries for [integer square root](https://en.wikipedia.org/wiki/Integer_square_root) if available and meet the specification.
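For readers who want something concrete, the sketch below uses integer Newton iteration. It is offered as one way to satisfy the definition (largest `k` with `k**2 <= n`), not necessarily the implementation the spec itself lists.

```python
def int_sqrt(n: int) -> int:
    """Largest integer k such that k**2 <= n, via integer Newton iteration."""
    assert n >= 0
    x = n
    y = (x + 1) // 2
    # y decreases monotonically until it reaches floor(sqrt(n)).
    while y < x:
        x = y
        y = (x + n // x) // 2
    return x
```

For example, with 10 million ETH deposited, `int_sqrt(10**7)` gives the factor that multiplies `BASE_REWARD_QUOTIENT` in the reward calculations.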
|
||||
|
||||
|
@ -480,8 +494,8 @@ For all (`shard_id`, `shard_block_hash`) tuples, compute the total deposit size
|
|||
Let `time_since_finality = block.slot - last_finalized_slot`, and let `B` be the balance of any given validator whose balance we are adjusting, not including any balance changes from this round of state recalculation. Let:
|
||||
|
||||
* `total_deposits = sum([v.balance for i, v in enumerate(validators) if i in get_active_validator_indices(validators)])` and `total_deposits_in_ETH = total_deposits // 10**18`
|
||||
* `reward_quotient = BASE_REWARD_QUOTIENT * int_sqrt(total_deposits_in_ETH)` (1/this is the per-slot max interest rate)
|
||||
* `quadratic_penalty_quotient = (SQRT_E_DROP_TIME / SLOT_DURATION)**2` (after D slots, ~D<sup>2</sup>/2 divided by this is the portion lost by offline validators)
|
||||
* `reward_quotient = BASE_REWARD_QUOTIENT * int_sqrt(total_deposits_in_ETH)` (`1/reward_quotient` is the per-slot max interest rate)
|
||||
* `quadratic_penalty_quotient = SQRT_E_DROP_TIME**2` (after `D` slots about `D*D/2/quadratic_penalty_quotient` is the portion lost by offline validators)
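To see where the "roughly 39%" figure for the quadratic leak comes from, the sketch below compares the first-order estimate `D*D/2/quadratic_penalty_quotient` with a slot-by-slot simulation. It makes simplifying assumptions: only the quadratic term `B * time_since_finality / quadratic_penalty_quotient` is applied each slot, and floating-point arithmetic stands in for the integer arithmetic of the spec. The function names are illustrative.

```python
import math

SQRT_E_DROP_TIME = 2**16  # slots
quadratic_penalty_quotient = SQRT_E_DROP_TIME**2


def first_order_fraction_lost(d):
    """Approximate portion lost after d slots, ignoring compounding."""
    return d * d / 2 / quadratic_penalty_quotient


def simulated_fraction_lost(d):
    """Apply the per-slot quadratic penalty for d slots to a unit balance."""
    balance = 1.0
    for t in range(1, d + 1):
        balance -= balance * t / quadratic_penalty_quotient
    return 1.0 - balance


half = first_order_fraction_lost(SQRT_E_DROP_TIME)
compounded = simulated_fraction_lost(SQRT_E_DROP_TIME)
print(half, compounded, 1 - math.exp(-0.5))
```

After `SQRT_E_DROP_TIME` slots the first-order estimate is one half, and with compounding the remaining balance is close to `1/sqrt(e)` of the original, which is what the constant's name refers to.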
|
||||
|
||||
For each slot `S` in the range `last_state_recalculation - CYCLE_LENGTH ... last_state_recalculation - 1`:
|
||||
|
||||
|
@ -496,13 +510,14 @@ Validators with `status == PENALIZED` also lose `B // reward_quotient + B * time
|
|||
|
||||
#### Balance recalculations related to crosslink rewards
|
||||
|
||||
For each shard S for which a crosslink committee exists in the cycle prior to the most recent cycle (`last_state_recalculation - CYCLE_LENGTH ... last_state_recalculation - 1`), let V be the corresponding validator set. Let `B` be the balance of any given validator whose balance we are adjusting, not including any balance changes from this round of state recalculation. For each S, V do the following:
|
||||
For each shard `S` for which a crosslink committee exists in the cycle prior to the most recent cycle (`last_state_recalculation - CYCLE_LENGTH ... last_state_recalculation - 1`), let `V` be the corresponding validator set. Let `B` be the balance of any given validator whose balance we are adjusting, not including any balance changes from this round of state recalculation. For each `S`, `V`:
|
||||
|
||||
* Let `total_v_deposits` be the total balance of V, and `total_participated_v_deposits` be the total balance of the subset of V that participated (note: it's always true that `total_participated_v_deposits <= total_v_deposits`)
|
||||
* Let `total_v_deposits` be the total balance of `V`
|
||||
* Let `total_participated_v_deposits` be the total balance of the subset of `V` that participated (note that `total_participated_v_deposits <= total_v_deposits`)
|
||||
* Let `time_since_last_confirmation` be `block.slot - crosslink_records[S].slot`
|
||||
* Adjust balances as follows:
|
||||
* If `crosslink_records[S].dynasty == current_dynasty`, no reward adjustments
|
||||
* Otherwise, participating validators' balances are increased by `B // reward_quotient * (2 * total_participated_v_deposits - total_v_deposits) // total_v_deposits`, and non-participating validators' balances are decreased by `B // reward_quotient + B * time_since_last_confirmation // quadratic_penalty_quotient`
|
||||
* Otherwise, participating validators' balances are increased by `B // reward_quotient * (2 * total_participated_v_deposits - total_v_deposits) // total_v_deposits`, and the balances of non-participating validators are decreased by `B // reward_quotient + B * time_since_last_confirmation // quadratic_penalty_quotient`
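The balance adjustment in the "Otherwise" case can be sketched as a single function. This is an illustration under assumptions: the function name and the boolean `participated` argument are hypothetical, and the caller is assumed to have already checked that `crosslink_records[S].dynasty != current_dynasty`.

```python
def crosslink_balance_delta(B, participated, reward_quotient,
                            total_participated_v_deposits, total_v_deposits,
                            time_since_last_confirmation,
                            quadratic_penalty_quotient):
    """Signed change to a validator balance B for one crosslink committee."""
    if participated:
        # Reward scales with how much of the committee's deposit took part;
        # it turns negative when less than half participated.
        return (B // reward_quotient
                * (2 * total_participated_v_deposits - total_v_deposits)
                // total_v_deposits)
    # Non-participants pay the base amount plus a term that grows with the
    # time since the shard was last crosslinked.
    return -(B // reward_quotient
             + B * time_since_last_confirmation // quadratic_penalty_quotient)


# Small made-up numbers, chosen only to make the arithmetic easy to follow.
reward = crosslink_balance_delta(1000, True, 10, 75, 100, 4, 16)
penalty = crosslink_balance_delta(1000, False, 10, 75, 100, 4, 16)
```

With 75% of the committee's deposit participating, a participant earns half of the maximum `B // reward_quotient`, since `2 * 75 - 100 = 50` out of `100`.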
|
||||
|
||||
Let `committees` be the set of committees processed and `time_since_last_confirmation(c)` be the value of `time_since_last_confirmation` in that committee. Validators with `status == PENALIZED` lose `B // reward_quotient + B * sum([time_since_last_confirmation(c) for c in committees]) // len(committees) // quadratic_penalty_quotient`.
|
||||
|
||||
|
@ -510,8 +525,13 @@ Let `committees` be the set of committees processed and `time_since_last_confirm
|
|||
|
||||
For each `SpecialObject` `obj` in `active_state.pending_specials`:
|
||||
|
||||
* **[coverts logouts]**: If `obj.type == 0`, interpret `data[0]` as a validator index as an `int32` and `data[1]` as a signature. If `BLSVerify(pubkey=validators[data[0]].pubkey, msg=hash("bye bye"), sig=data[1])`, and `validators[i].status == LOGGED_IN`, set `validators[i].status = PENDING_EXIT` and `validators[i].exit_slot = current_slot`
|
||||
* **[covers NO\_DBL\_VOTE, NO\_SURROUND, NO\_DBL\_PROPOSE slashing conditions]:** If `obj.type == 1`, interpret `data[0]` as a list of concatenated `int32` values where each value represents an index into `validators`, `data[1]` as the data being signed and `data[2]` as an aggregate signature. Interpret `data[3:6]` similarly. Verify that both signatures are valid, that the two signatures are signing distinct data, and that they are either signing the same slot number, or that one surrounds the other (ie. `source1 < source2 < target2 < target1`). Let `inds` be the list of indices in both signatures; verify that its length is at least 1. For each validator index `v` in `inds`, set their end dynasty to equal the current dynasty + 1, and if its `status` does not equal `PENALIZED`, then (i) set its `exit_slot` to equal the current `slot`, (ii) set its `status` to `PENALIZED`, and (iii) set `crystallized_state.deposits_penalized_in_period[slot // WITHDRAWAL_PERIOD] += validators[v].balance`, extending the array if needed.
|
||||
* **[covers logouts]**: If `obj.type == 0`, interpret `data[0]` as a validator index as an `int32` and `data[1]` as a signature. If `BLSVerify(pubkey=validators[data[0]].pubkey, msg=hash("bye bye"), sig=data[1])`, and `validators[i].status == LOGGED_IN`, set `validators[i].status = PENDING_EXIT` and `validators[i].exit_slot = current_slot`
|
||||
* **[covers `NO_DBL_VOTE`, `NO_SURROUND`, `NO_DBL_PROPOSE` slashing conditions]:** If `obj.type == 1`, interpret `data[0]` as a list of concatenated `int32` values where each value represents an index into `validators`, `data[1]` as the data being signed and `data[2]` as an aggregate signature. Interpret `data[3:6]` similarly. Verify that both signatures are valid, that the two signatures are signing distinct data, and that they are either signing the same slot number, or that one surrounds the other (ie. `source1 < source2 < target2 < target1`). Let `inds` be the list of indices in both signatures; verify that its length is at least 1. For each validator index `v` in `inds`, set their end dynasty to equal the current dynasty plus 1, and if its `status` does not equal `PENALIZED`, then:
|
||||
|
||||
1. Set its `exit_slot` to equal the current `slot`
|
||||
2. Set its `status` to `PENALIZED`
|
||||
3. Set `crystallized_state.deposits_penalized_in_period[slot // WITHDRAWAL_PERIOD] += validators[v].balance`, extending the array if needed
|
||||
4. Run `add_validator_set_change_record(crystallized_state, v, validators[v].pubkey, EXIT)`
|
||||
|
||||
#### Finally...
|
||||
|
||||
|
@ -539,7 +559,7 @@ def change_validators(validators):
|
|||
# The maximum total wei that can deposit+withdraw
|
||||
max_allowable_change = max(
|
||||
DEPOSIT_SIZE * 2,
|
||||
total_deposits // MAX_VALIDATOR_CHANGE_QUOTIENT
|
||||
total_deposits // MAX_VALIDATOR_CHURN_QUOTIENT
|
||||
)
|
||||
# Go through the list start to end depositing+withdrawing as many as possible
|
||||
total_changed = 0
|
||||
|
@ -547,10 +567,12 @@ def change_validators(validators):
|
|||
if validators[i].status == PENDING_LOG_IN:
|
||||
validators[i].status = LOGGED_IN
|
||||
total_changed += DEPOSIT_SIZE
|
||||
add_validator_set_change_record(crystallized_state, i, validators[i].pubkey, ENTRY)
|
||||
if validators[i].status == PENDING_EXIT:
|
||||
validators[i].status = PENDING_WITHDRAW
|
||||
validators[i].exit_slot = current_slot
|
||||
total_changed += validators[i].balance
|
||||
add_validator_set_change_record(crystallized_state, i, validators[i].pubkey, EXIT)
|
||||
if total_changed >= max_allowable_change:
|
||||
break
|
||||
|
||||
|
@ -622,12 +644,8 @@ Note: This spec is ~60% complete.
|
|||
|
||||
# Appendix
|
||||
## Appendix A - Hash function
|
||||
The general hash function `hash(x)` in this specification is defined as:
|
||||
|
||||
`hash(x) := BLAKE2b-512(x)[0:32]`, where `BLAKE2b-512` (`blake2b512`) algorithm is defined in [RFC 7693](https://tools.ietf.org/html/rfc7693) and input `x` is bytes type.
|
||||
|
||||
* `BLAKE2b-512` is the *default* `BLAKE2b` algorithm with 64-byte digest size. To get a 32-byte result, the general hash function output is defined as the leftmost `32` bytes of `BLAKE2b-512` hash output.
|
||||
* The design rationale is keeping using the default algorithm and avoiding too much dependency on external hash function libraries.
|
||||
We aim to have a STARK-friendly hash function `hash(x)` for the production launch of the beacon chain. While the standardisation process for a STARK-friendly hash function takes place—led by STARKware, who will produce a detailed report with recommendations—we use `BLAKE2b-512` as a placeholder. Specifically, we set `hash(x) := BLAKE2b-512(x)[0:32]` where the `BLAKE2b-512` algorithm is defined in [RFC 7693](https://tools.ietf.org/html/rfc7693) and the input `x` is of type `bytes`.
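A minimal sketch of the placeholder hash using Python's standard library follows. The name `spec_hash` is chosen here to avoid shadowing Python's built-in `hash` and is not taken from the spec.

```python
import hashlib


def spec_hash(x: bytes) -> bytes:
    """hash(x) := BLAKE2b-512(x)[0:32], the placeholder from Appendix A."""
    # hashlib.blake2b defaults to a 64-byte (512-bit) digest.
    return hashlib.blake2b(x).digest()[:32]


digest = spec_hash(b'abc')
# Truncating BLAKE2b-512 is not the same as asking for a 32-byte digest,
# because the digest length is a parameter of the BLAKE2b computation.
native_256 = hashlib.blake2b(b'abc', digest_size=32).digest()
```

The distinction matters for implementers: a library call that requests a 32-byte BLAKE2b digest directly will not interoperate with the truncated 64-byte digest defined here.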
|
||||
|
||||
## Copyright
|
||||
Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).
|
|
@ -0,0 +1,348 @@
|
|||
# [WIP] SimpleSerialize (SSZ) Spec
|
||||
|
||||
This is the **work in progress** document to describe `simpleserialize`, the
|
||||
currently selected serialization method for the Ethereum 2.0 Beacon Chain.
|
||||
|
||||
This document specifies the general information for serializing and
|
||||
deserializing objects and data types.
|
||||
|
||||
## ToC
|
||||
|
||||
* [About](#about)
|
||||
* [Terminology](#terminology)
|
||||
* [Constants](#constants)
|
||||
* [Overview](#overview)
|
||||
+ [Serialize/Encode](#serializeencode)
|
||||
- [uint: 8/16/24/32/64/256](#uint-816243264256)
|
||||
- [Address](#address)
|
||||
- [Hash](#hash)
|
||||
* [Hash32](#hash32)
|
||||
* [Hash96](#hash96)
|
||||
* [Hash97](#hash97)
|
||||
- [Bytes](#bytes)
|
||||
- [List/Vectors](#listvectors)
|
||||
- [Container (TODO)](#container)
|
||||
+ [Deserialize/Decode](#deserializedecode)
|
||||
- [uint: 8/16/24/32/64/256](#uint-816243264256-1)
|
||||
- [Address](#address-1)
|
||||
- [Hash](#hash-1)
|
||||
* [Hash32](#hash32-1)
|
||||
* [Hash96](#hash96-1)
|
||||
* [Hash97](#hash97-1)
|
||||
- [Bytes](#bytes-1)
|
||||
- [List/Vectors](#listvectors-1)
|
||||
- [Container (TODO)](#container-1)
|
||||
* [Implementations](#implementations)
|
||||
|
||||
## About
|
||||
|
||||
`SimpleSerialize` was first proposed by Vitalik Buterin as the serialization
|
||||
protocol for use in the Ethereum 2.0 Beacon Chain.
|
||||
|
||||
The core feature of `ssz` is the simplicity of the serialization with low
|
||||
overhead.
|
||||
|
||||
## Terminology
|
||||
|
||||
| Term | Definition |
|
||||
|:-------------|:-----------------------------------------------------------------------------------------------|
|
||||
| `big` | Big Endian |
|
||||
| `byte_order` | Specifies [endianness](https://en.wikipedia.org/wiki/Endianness): Big Endian or Little Endian. |
|
||||
| `len` | Length/Number of Bytes. |
|
||||
| `to_bytes` | Convert to bytes. Should take parameters ``size`` and ``byte_order``. |
|
||||
| `from_bytes` | Convert from bytes to object. Should take ``bytes`` and ``byte_order``. |
|
||||
| `value` | The value to serialize. |
|
||||
| `rawbytes` | Raw serialized bytes. |
|
||||
|
||||
## Constants
|
||||
|
||||
| Constant | Value | Definition |
|
||||
|:---------------|:-----:|:--------------------------------------------------------------------------------------|
|
||||
| `LENGTH_BYTES` | 4 | Number of bytes used for the length added before a variable-length serialized object. |
|
||||
|
||||
|
||||
## Overview
|
||||
|
||||
### Serialize/Encode
|
||||
|
||||
#### uint: 8/16/24/32/64/256
|
||||
|
||||
Convert the integer directly to bytes, using as many bytes as the integer size requires (e.g. ``uint16 = 2 bytes``).
|
||||
|
||||
All integers are serialized as **big endian**.
|
||||
|
||||
| Check to perform | Code |
|
||||
|:-----------------------|:----------------------|
|
||||
| Size is a byte integer | ``int_size % 8 == 0`` |
|
||||
|
||||
```python
|
||||
assert(int_size % 8 == 0)
|
||||
buffer_size = int_size // 8  # integer division: to_bytes() requires an int
|
||||
return value.to_bytes(buffer_size, 'big')
|
||||
```
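Wrapped into a function, the steps above might look as follows. The function name `serialize_uint` and the explicit range check are additions made for this sketch.

```python
def serialize_uint(value: int, int_size: int) -> bytes:
    """Serialize an unsigned integer of int_size bits as big-endian bytes."""
    assert int_size % 8 == 0
    # Range check added for illustration; to_bytes would raise OverflowError.
    assert 0 <= value < 2**int_size
    return value.to_bytes(int_size // 8, 'big')


encoded_u16 = serialize_uint(5, 16)
encoded_u24 = serialize_uint(2**24 - 1, 24)
```

Here `encoded_u16` is two bytes long with the most significant byte first, so the value `5` appears in the last byte.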
|
||||
|
||||
#### Address
|
||||
|
||||
The address should already be in byte format. Ensure that its length is **20** bytes.
|
||||
|
||||
| Check to perform | Code |
|
||||
|:-----------------------|:---------------------|
|
||||
| Length is correct (20) | ``len(value) == 20`` |
|
||||
|
||||
```python
|
||||
assert( len(value) == 20 )
|
||||
return value
|
||||
```
|
||||
|
||||
#### Hash
|
||||
|
||||
| Hash Type | Usage |
|
||||
|:---------:|:------------------------------------------------|
|
||||
| `hash32` | Hash size of ``keccak`` or `blake2b[0.. < 32]`. |
|
||||
| `hash96` | BLS Public Key Size. |
|
||||
| `hash97` | BLS Public Key Size with recovery bit. |
|
||||
|
||||
|
||||
| Checks to perform | Code |
|
||||
|:-----------------------------------|:---------------------|
|
||||
| Length is correct (32) if `hash32` | ``len(value) == 32`` |
|
||||
| Length is correct (96) if `hash96` | ``len(value) == 96`` |
|
||||
| Length is correct (97) if `hash97` | ``len(value) == 97`` |
|
||||
|
||||
|
||||
**Example all together**
|
||||
|
||||
```python
|
||||
if (type(value) == 'hash32'):
    assert(len(value) == 32)
elif (type(value) == 'hash96'):
    assert(len(value) == 96)
elif (type(value) == 'hash97'):
    assert(len(value) == 97)
else:
    raise TypeError('Invalid hash type supplied')

return value
|
||||
```
|
||||
|
||||
##### Hash32
|
||||
|
||||
Ensure 32 byte length and return the bytes.
|
||||
|
||||
```python
|
||||
assert(len(value) == 32)
|
||||
return value
|
||||
```
|
||||
|
||||
##### Hash96
|
||||
|
||||
Ensure 96 byte length and return the bytes.
|
||||
|
||||
```python
|
||||
assert(len(value) == 96)
|
||||
return value
|
||||
```
|
||||
|
||||
##### Hash97
|
||||
|
||||
Ensure 97 byte length and return the bytes.
|
||||
|
||||
```python
|
||||
assert(len(value) == 97)
|
||||
return value
|
||||
```
|
||||
|
||||
#### Bytes
|
||||
|
||||
For general `byte` type:
|
||||
1. Get the length/number of bytes; Encode into a `4-byte` integer.
|
||||
2. Append the value to the length and return: ``[ length_bytes ] + [ value_bytes ]``
|
||||
|
||||
| Check to perform | Code |
|
||||
|:-------------------------------------|:-----------------------|
|
||||
| Length of bytes can fit into 4 bytes | ``len(value) < 2**32`` |
|
||||
|
||||
```python
|
||||
assert(len(value) < 2**32)
|
||||
byte_length = (len(value)).to_bytes(LENGTH_BYTES, 'big')
|
||||
return byte_length + value
|
||||
```
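A self-contained version of the same steps, with one worked input, is sketched below. The function name `serialize_bytes` is illustrative.

```python
LENGTH_BYTES = 4  # from the Constants table


def serialize_bytes(value: bytes) -> bytes:
    """Prefix raw bytes with their length as a 4-byte big-endian integer."""
    assert len(value) < 2**32
    byte_length = len(value).to_bytes(LENGTH_BYTES, 'big')
    return byte_length + value


encoded = serialize_bytes(b'abc')
```

For the three-byte input `b'abc'`, the result is seven bytes: a four-byte length prefix holding the value 3, followed by the payload unchanged.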
|
||||
|
||||
#### List/Vectors
|
||||
|
||||
| Check to perform | Code |
|
||||
|:--------------------------------------------|:----------------------------|
|
||||
| Length of serialized list fits into 4 bytes | ``len(serialized) < 2**32`` |
|
||||
|
||||
|
||||
1. Get the number of raw bytes to serialize: it is ``len(list) * sizeof(element)``.
|
||||
* Encode that as a `4-byte` **big endian** `uint32`.
|
||||
2. Append the elements in a packed manner.
|
||||
|
||||
* *Note on efficiency*: consider using a container that does not need to iterate over all elements to get its length. For example Python lists, C++ vectors or Rust Vec.
|
||||
|
||||
**Example in Python**
|
||||
|
||||
```python
|
||||
|
||||
serialized_list_string = b''

for item in value:
    serialized_list_string += serialize(item)

assert(len(serialized_list_string) < 2**32)

serialized_len = len(serialized_list_string).to_bytes(LENGTH_BYTES, 'big')

return serialized_len + serialized_list_string
|
||||
```
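To make the byte layout concrete, here is a sketch restricted to a list of `uint16` values. The restriction to a single element type and the name `serialize_uint16_list` are simplifications for this example; the general `serialize(item)` dispatch is not reproduced.

```python
LENGTH_BYTES = 4  # from the Constants table


def serialize_uint16_list(values) -> bytes:
    """Serialize a list of uint16 values: 4-byte length prefix, then packed items."""
    packed = b''.join(v.to_bytes(2, 'big') for v in values)
    assert len(packed) < 2**32
    # The prefix counts serialized bytes, not list elements.
    return len(packed).to_bytes(LENGTH_BYTES, 'big') + packed


encoded_list = serialize_uint16_list([1, 2, 3])
```

Three `uint16` elements occupy six bytes, so the prefix encodes 6 rather than 3.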

#### Container

```
########################################
                 TODO
########################################
```

### Deserialize/Decode

Decoding requires knowledge of the type of the item to be decoded. When decoding an entire serialized string, it also requires knowledge of the order in which the objects were serialized.

Note: Each deserialization step returns ``deserialized_object, new_index``, where ``new_index`` is the position in the raw bytes at which the next item starts.

At each step, the following checks should be made:

| Check to perform         | Code                                                    |
|:-------------------------|:--------------------------------------------------------|
| Ensure sufficient length | ``len(rawbytes) >= current_index + deserialize_length`` |
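
The index-tracking convention can be sketched as follows: each reader performs the length check, returns the decoded object together with the new index, and the caller feeds that index into the next reader in the known serialization order. The names `check_length`, `read_uint16`, `read_address` and the sample `stream` are hypothetical and exist only for this illustration.

```python
def check_length(rawbytes: bytes, current_index: int, deserialize_length: int) -> None:
    """The per-step check from the table: enough bytes must remain."""
    assert len(rawbytes) >= current_index + deserialize_length


def read_uint16(rawbytes: bytes, current_index: int):
    check_length(rawbytes, current_index, 2)
    new_index = current_index + 2
    return int.from_bytes(rawbytes[current_index:new_index], 'big'), new_index


def read_address(rawbytes: bytes, current_index: int):
    check_length(rawbytes, current_index, 20)
    new_index = current_index + 20
    return rawbytes[current_index:new_index], new_index


# A stream known to contain a uint16 followed by a 20-byte address.
stream = (5).to_bytes(2, 'big') + b'\xaa' * 20

number, index = read_uint16(stream, 0)
address, index = read_address(stream, index)
print(number, index)  # 5 22
```

Decoding in a different order would still pass the length checks here but yield meaningless values, which is why the order must be known in advance.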

#### uint: 8/16/24/32/64/256

Convert the bytes directly into an integer, reading as many bytes as the integer's size (e.g. ``uint16 == 2 bytes``).

All integers are interpreted as **big endian**.

```python
# int_size is the integer size in bits (8, 16, 24, 32, 64 or 256)
byte_length = int_size // 8
assert(len(rawbytes) >= current_index + byte_length)
new_index = current_index + byte_length
return int.from_bytes(rawbytes[current_index:new_index], 'big'), new_index
```
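
A runnable version of the integer decoding step might look like this. It is a sketch under the assumption that `int_size` is given in bits, so the number of bytes to read is `int_size // 8`; the function name `deserialize_uint` is hypothetical.

```python
def deserialize_uint(rawbytes: bytes, current_index: int, int_size: int):
    """Decode a big-endian unsigned integer of `int_size` bits."""
    assert int_size in (8, 16, 24, 32, 64, 256)
    byte_length = int_size // 8  # bits -> bytes
    assert len(rawbytes) >= current_index + byte_length
    new_index = current_index + byte_length
    return int.from_bytes(rawbytes[current_index:new_index], 'big'), new_index


value, new_index = deserialize_uint(b'\x01\x00\xff', 0, 16)
print(value, new_index)  # 256 2
```

The trailing `0xff` byte is left untouched and can be read by a following call starting at `new_index`.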

#### Address

Return the 20 bytes.

```python
assert(len(rawbytes) >= current_index + 20)
new_index = current_index + 20
return rawbytes[current_index:current_index+20], new_index
```

#### Hash

##### Hash32

Return the 32 bytes.

```python
assert(len(rawbytes) >= current_index + 32)
new_index = current_index + 32
return rawbytes[current_index:current_index+32], new_index
```

##### Hash96

Return the 96 bytes.

```python
assert(len(rawbytes) >= current_index + 96)
new_index = current_index + 96
return rawbytes[current_index:current_index+96], new_index
```

##### Hash97

Return the 97 bytes.

```python
assert(len(rawbytes) >= current_index + 97)
new_index = current_index + 97
return rawbytes[current_index:current_index+97], new_index
```
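
The address and hash decoders differ only in the number of bytes they read, so an implementation could share one helper. The following is a sketch of that idea, not part of the spec: the `FIXED_SIZES` table and the `deserialize_fixed` function are hypothetical names.

```python
# Hypothetical lookup of fixed-size types to their byte lengths.
FIXED_SIZES = {'address': 20, 'hash32': 32, 'hash96': 96, 'hash97': 97}


def deserialize_fixed(rawbytes: bytes, current_index: int, type_name: str):
    """Return the fixed-size slice for `type_name` and the index after it."""
    size = FIXED_SIZES[type_name]
    assert len(rawbytes) >= current_index + size
    new_index = current_index + size
    return rawbytes[current_index:new_index], new_index


data = b'\x11' * 32 + b'\x22' * 20
block_hash, index = deserialize_fixed(data, 0, 'hash32')
sender, index = deserialize_fixed(data, index, 'address')
print(len(block_hash), len(sender), index)  # 32 20 52
```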

#### Bytes

Read the length prefix, then return that many bytes.

| Check to perform                                  | Code                                              |
|:--------------------------------------------------|:--------------------------------------------------|
| rawbytes has enough left for length               | ``len(rawbytes) >= current_index + LENGTH_BYTES`` |
| bytes to return not greater than serialized bytes | ``len(rawbytes) >= bytes_end``                    |

```python
# >= (rather than >) so that an empty value, serialized as its length prefix only, can be decoded
assert(len(rawbytes) >= current_index + LENGTH_BYTES)
bytes_length = int.from_bytes(rawbytes[current_index:current_index + LENGTH_BYTES], 'big')

bytes_start = current_index + LENGTH_BYTES
bytes_end = bytes_start + bytes_length
new_index = bytes_end

assert(len(rawbytes) >= bytes_end)

return rawbytes[bytes_start:bytes_end], new_index
```
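
For illustration, the same steps as a standalone function applied to a buffer with trailing data. This sketch assumes `LENGTH_BYTES = 4` and uses `>=` in the first check so that an empty value (a length prefix with nothing after it) can still be decoded; the name `deserialize_bytes` is hypothetical.

```python
LENGTH_BYTES = 4  # assumed: size of the big-endian length prefix


def deserialize_bytes(rawbytes: bytes, current_index: int):
    """Read a 4-byte big-endian length, then return that many bytes."""
    assert len(rawbytes) >= current_index + LENGTH_BYTES
    bytes_length = int.from_bytes(
        rawbytes[current_index:current_index + LENGTH_BYTES], 'big')
    bytes_start = current_index + LENGTH_BYTES
    bytes_end = bytes_start + bytes_length
    # The declared length must not run past the end of the input.
    assert len(rawbytes) >= bytes_end
    return rawbytes[bytes_start:bytes_end], bytes_end


payload = b'\x00\x00\x00\x03abc' + b'trailing'
value, new_index = deserialize_bytes(payload, 0)
print(value, new_index)  # b'abc' 7
```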

#### List/Vectors

Deserialize each object in the list.

1. Get the length of the serialized list.
2. Loop through deserializing each item in the list until you reach the entire length of the list.

| Check to perform                          | Code                                                             |
|:------------------------------------------|:-----------------------------------------------------------------|
| rawbytes has enough left for length       | ``len(rawbytes) >= current_index + LENGTH_BYTES``                |
| list is not greater than serialized bytes | ``len(rawbytes) >= current_index + LENGTH_BYTES + total_length`` |

```python
# >= (rather than >) so that an empty list, serialized as its length prefix only, can be decoded
assert(len(rawbytes) >= current_index + LENGTH_BYTES)
total_length = int.from_bytes(rawbytes[current_index:current_index + LENGTH_BYTES], 'big')
new_index = current_index + LENGTH_BYTES + total_length
assert(len(rawbytes) >= new_index)
item_index = current_index + LENGTH_BYTES
deserialized_list = []

while item_index < new_index:
    # `item` rather than `object`, to avoid shadowing the Python builtin
    item, item_index = deserialize(rawbytes, item_index, item_type)
    deserialized_list.append(item)

return deserialized_list, new_index
```
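
A self-contained sketch of the loop, decoding a list of `uint16` values. The names `deserialize_uint16` and `deserialize_list` are hypothetical, `LENGTH_BYTES = 4` is assumed, and the final `item_index == new_index` assertion is an extra sanity check added for this example rather than something the text above requires.

```python
LENGTH_BYTES = 4  # assumed: size of the big-endian length prefix


def deserialize_uint16(rawbytes: bytes, current_index: int):
    """Hypothetical element decoder for 2-byte big-endian integers."""
    assert len(rawbytes) >= current_index + 2
    new_index = current_index + 2
    return int.from_bytes(rawbytes[current_index:new_index], 'big'), new_index


def deserialize_list(rawbytes: bytes, current_index: int, deserialize_item):
    """Read the total byte length, then decode items until it is consumed."""
    assert len(rawbytes) >= current_index + LENGTH_BYTES
    total_length = int.from_bytes(
        rawbytes[current_index:current_index + LENGTH_BYTES], 'big')
    new_index = current_index + LENGTH_BYTES + total_length
    assert len(rawbytes) >= new_index
    item_index = current_index + LENGTH_BYTES
    deserialized_list = []
    while item_index < new_index:
        item, item_index = deserialize_item(rawbytes, item_index)
        deserialized_list.append(item)
    # Extra check (an assumption of this sketch): items must end exactly
    # at the declared list boundary.
    assert item_index == new_index
    return deserialized_list, new_index


encoded = b'\x00\x00\x00\x06\x00\x01\x00\x02\x00\x03'
values, new_index = deserialize_list(encoded, 0, deserialize_uint16)
print(values, new_index)  # [1, 2, 3] 10
```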

#### Container

```
########################################
                 TODO
########################################
```

## Implementations

| Language | Implementation | Description |
|:--------:|----------------|-------------|
| Python | [https://github.com/ethereum/beacon_chain/blob/master/ssz/ssz.py](https://github.com/ethereum/beacon_chain/blob/master/ssz/ssz.py) | Beacon chain reference implementation written in Python. |
| Rust | [https://github.com/sigp/lighthouse/tree/master/ssz](https://github.com/sigp/lighthouse/tree/master/ssz) | Lighthouse (Rust Ethereum 2.0 Node) maintained SSZ. |
| Nim | [https://github.com/status-im/nim-beacon-chain/blob/master/beacon_chain/ssz.nim](https://github.com/status-im/nim-beacon-chain/blob/master/beacon_chain/ssz.nim) | Nim Implementation maintained SSZ. |
| Rust | [https://github.com/paritytech/shasper/tree/master/util/ssz](https://github.com/paritytech/shasper/tree/master/util/ssz) | Shasper implementation of SSZ maintained by ParityTech. |