Merge pull request #766 from ethereum/vitalik81

Added light client related files
Danny Ryan 2019-04-03 00:23:21 -06:00 committed by GitHub
commit afdfb2a5de
GPG Key ID: 4AEE18F83AFDEB23
3 changed files with 64 additions and 30 deletions

@ -18,6 +18,8 @@ Accompanying documents can be found in [specs](specs) and include
* [BLS signature verification](specs/bls_signature.md) * [BLS signature verification](specs/bls_signature.md)
* [General test format](specs/test-format.md) * [General test format](specs/test-format.md)
* [Honest validator implementation doc](specs/validator/0_beacon-chain-validator.md) * [Honest validator implementation doc](specs/validator/0_beacon-chain-validator.md)
* [Merkle proof formats](specs/light_client/merkle_proofs.md)
* [Light client syncing protocol](specs/light_client/sync_protocol.md)
## Design goals ## Design goals
The following are the broad design goals for Ethereum 2.0: The following are the broad design goals for Ethereum 2.0:

@ -1,3 +1,11 @@
**NOTICE**: This document is a work-in-progress for researchers and implementers.
### Constants
| Name | Value |
| - | - |
| `LENGTH_FLAG` | `2**64 - 1` |
### Generalized Merkle tree index ### Generalized Merkle tree index
In a binary Merkle tree, we define a "generalized index" of a node as `2**depth + index`. Visually, this looks as follows: In a binary Merkle tree, we define a "generalized index" of a node as `2**depth + index`. Visually, this looks as follows:
@ -36,17 +44,34 @@ y_data_root len(y)
....... .......
``` ```
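As a rough illustration of the generalized index defined above (`2**depth + index`), the following minimal sketch shows how parent, child, and sibling positions follow from simple integer arithmetic. The helper names are hypothetical and are not part of the spec.

```python
def generalized_index(depth: int, index: int) -> int:
    """Generalized index of the node at position `index` on level `depth` (root is depth 0)."""
    assert 0 <= index < 2**depth
    return 2**depth + index


def parent(gindex: int) -> int:
    """The parent is found by dropping the lowest bit."""
    return gindex // 2


def children(gindex: int) -> tuple:
    """Left and right children of a node."""
    return (2 * gindex, 2 * gindex + 1)


def sibling(gindex: int) -> int:
    """The sibling differs only in the lowest bit."""
    return gindex ^ 1


def depth_of(gindex: int) -> int:
    """Level of a node, recovered from the position of its highest set bit."""
    return gindex.bit_length() - 1


print(generalized_index(2, 3))  # 7
print(parent(7), children(3), sibling(6), depth_of(7))  # 3 (6, 7) 7 2
```

The root always has generalized index 1, which is why the recursive functions later in this document start from `root=1`.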
We can now define a concept of a "path", a way of describing a function that takes as input an SSZ object and outputs some specific (possibly deeply nested) member. For example, `foo -> foo.x` is a path, as are `foo -> len(foo.y)` and `foo -> foo[5]`. We'll describe paths as lists: in these three cases they are `["x"]`, `["y", "len"]` and `["y", 5]` respectively. We can now define a function `get_generalized_indices(object: Any, path: List[str OR int], root=1: int) -> int` that converts an object and a path to a set of generalized indices (note that for constant-sized objects, there is only one generalized index and it only depends on the path, but for dynamically sized objects the indices may depend on the object itself too). For dynamically-sized objects, the set of indices will have more than one member because of the need to access an array's length to determine the correct generalized index for some array access. We can now define a concept of a "path", a way of describing a function that takes as input an SSZ object and outputs some specific (possibly deeply nested) member. For example, `foo -> foo.x` is a path, as are `foo -> len(foo.y)` and `foo -> foo.y[5].w`. We'll describe paths as lists, which can have two representations. In "human-readable form", they are `["x"]`, `["y", "__len__"]` and `["y", 5, "w"]` respectively. In "encoded form", they are lists of `uint64` values, in these cases (assuming the fields of `foo` in order are `x` then `y`, and `w` is the first field of `y[i]`) `[0]`, `[1, 2**64-1]`, `[1, 5, 0]`.
```python ```python
def get_generalized_indices(obj: Any, path: List[str or int], root=1) -> List[int]: def path_to_encoded_form(obj: Any, path: List[str or int]) -> List[int]:
    if len(path) == 0:
        return []
    # "__len__" must be the final path element; it is encoded as LENGTH_FLAG.
    if path[0] == "__len__":
        assert len(path) == 1
        return [LENGTH_FLAG]
    # Container field: encode the field name as its position among the fields.
    elif isinstance(path[0], str) and hasattr(obj, "fields"):
        return [list(obj.fields.keys()).index(path[0])] + path_to_encoded_form(getattr(obj, path[0]), path[1:])
    # List element: the integer index is used as-is.
    elif isinstance(obj, (StaticList, DynamicList)):
        return [path[0]] + path_to_encoded_form(obj[path[0]], path[1:])
    else:
        raise Exception("Unknown type / path")
```
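To make the two path representations concrete, here is a self-contained sketch that reproduces the three example encodings from the paragraph above. It is an assumption-laden simplification: the `Container` class is a hypothetical stand-in for an SSZ container, plain Python lists stand in for `StaticList` / `DynamicList`, and `encode_path` is an illustrative name rather than the spec's function.

```python
from typing import Any, List, Union

LENGTH_FLAG = 2**64 - 1


class Container:
    """Hypothetical stand-in for an SSZ container; `fields` keeps declaration order."""

    def __init__(self, **fields: Any) -> None:
        self.fields = dict(fields)
        for name, value in fields.items():
            setattr(self, name, value)


def encode_path(obj: Any, path: List[Union[str, int]]) -> List[int]:
    """Convert a human-readable path into a list of uint64-sized integers."""
    if not path:
        return []
    head, rest = path[0], path[1:]
    if head == "__len__":
        # A length lookup must be the last step of a path.
        assert not rest
        return [LENGTH_FLAG]
    if isinstance(head, str) and hasattr(obj, "fields"):
        # Field names are encoded as their position among the container's fields.
        position = list(obj.fields.keys()).index(head)
        return [position] + encode_path(getattr(obj, head), rest)
    if isinstance(head, int) and isinstance(obj, list):
        # List indices are used as-is.
        return [head] + encode_path(obj[head], rest)
    raise ValueError("Unknown type / path")


# `foo` has fields `x` then `y`; each element of `y` has `w` as its first field.
foo = Container(x=7, y=[Container(w=i, v=0) for i in range(8)])
print(encode_path(foo, ["x"]))             # [0]
print(encode_path(foo, ["y", "__len__"]))  # [1, 18446744073709551615]
print(encode_path(foo, ["y", 5, "w"]))     # [1, 5, 0]
```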
We can now define a function `get_generalized_indices(obj: Any, path: List[int], root: int = 1) -> List[int]` that converts an object and a path to a set of generalized indices. For constant-sized objects there is only one generalized index, and it depends only on the path. For dynamically sized objects the indices may also depend on the object itself, and the set will have more than one member, because an array's length must be accessed to determine the correct generalized index for an array access.
```python
def get_generalized_indices(obj: Any, path: List[int], root=1) -> List[int]:
if len(path) == 0: if len(path) == 0:
return [root] return [root]
elif isinstance(obj, StaticList): elif isinstance(obj, StaticList):
items_per_chunk = (32 // len(serialize(x))) if isinstance(x, int) else 1 items_per_chunk = (32 // len(serialize(x))) if isinstance(x, int) else 1
new_root = root * next_power_of_2(len(obj) // items_per_chunk) + path[0] // items_per_chunk new_root = root * next_power_of_2(len(obj) // items_per_chunk) + path[0] // items_per_chunk
return get_generalized_indices(obj[path[0]], path[1:], new_root) return get_generalized_indices(obj[path[0]], path[1:], new_root)
elif isinstance(obj, DynamicList) and path[0] == "len": elif isinstance(obj, DynamicList) and path[0] == LENGTH_FLAG:
return [root * 2 + 1] return [root * 2 + 1]
elif isinstance(obj, DynamicList) and isinstance(path[0], int): elif isinstance(obj, DynamicList) and isinstance(path[0], int):
assert path[0] < len(obj) assert path[0] < len(obj)
@ -54,9 +79,9 @@ def get_generalized_indices(obj: Any, path: List[str or int], root=1) -> List[in
new_root = root * 2 * next_power_of_2(len(obj) // items_per_chunk) + path[0] // items_per_chunk new_root = root * 2 * next_power_of_2(len(obj) // items_per_chunk) + path[0] // items_per_chunk
return [root * 2 + 1] + get_generalized_indices(obj[path[0]], path[1:], new_root) return [root * 2 + 1] + get_generalized_indices(obj[path[0]], path[1:], new_root)
elif hasattr(obj, "fields"): elif hasattr(obj, "fields"):
index = list(fields.keys()).index(path[0]) field = list(fields.keys())[path[0]]
new_root = root * next_power_of_2(len(fields)) + index new_root = root * next_power_of_2(len(fields)) + path[0]
return get_generalized_indices(getattr(obj, path[0]), path[1:], new_root) return get_generalized_indices(getattr(obj, field), path[1:], new_root)
else: else:
raise Exception("Unknown type / path") raise Exception("Unknown type / path")
``` ```
@ -109,6 +134,8 @@ def get_proof_indices(tree_indices: List[int]) -> List[int]:
Generating a proof is simply a matter of taking the node of the SSZ hash tree with the union of the given generalized indices for each index given by `get_proof_indices`, and outputting the list of nodes in the same order. Generating a proof is simply a matter of taking the node of the SSZ hash tree with the union of the given generalized indices for each index given by `get_proof_indices`, and outputting the list of nodes in the same order.
Here is the verification function:
```python ```python
def verify_multi_proof(root, indices, leaves, proof): def verify_multi_proof(root, indices, leaves, proof):
tree = {} tree = {}
@ -127,8 +154,24 @@ def verify_multi_proof(root, indices, leaves, proof):
return (indices == []) or (1 in tree and tree[1] == root) return (indices == []) or (1 in tree and tree[1] == root)
``` ```
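For intuition, the single-leaf case of the verification above can be sketched as follows. This is a simplified illustration, not the spec's multiproof algorithm: `verify_branch` and `hash_pair` are hypothetical names, and SHA-256 is assumed as the hash function purely for the example.

```python
import hashlib
from typing import List


def hash_pair(left: bytes, right: bytes) -> bytes:
    """Hash two 32-byte nodes into their parent (SHA-256 is an assumption here)."""
    return hashlib.sha256(left + right).digest()


def verify_branch(root: bytes, gindex: int, leaf: bytes, branch: List[bytes]) -> bool:
    """Check one leaf at generalized index `gindex` against `root`.

    `branch` lists the sibling nodes from the leaf's level up to just below the root.
    """
    node = leaf
    for sibling_node in branch:
        if gindex % 2 == 0:
            # Even generalized index: the current node is a left child.
            node = hash_pair(node, sibling_node)
        else:
            node = hash_pair(sibling_node, node)
        gindex //= 2
    return gindex == 1 and node == root


# Build a 4-leaf tree by hand: leaves sit at generalized indices 4..7.
leaves = [bytes([i]) * 32 for i in range(4)]
node2 = hash_pair(leaves[0], leaves[1])
node3 = hash_pair(leaves[2], leaves[3])
tree_root = hash_pair(node2, node3)

# Prove the leaf at generalized index 6 (depth 2, index 2).
print(verify_branch(tree_root, 6, leaves[2], [leaves[3], node2]))  # True
print(verify_branch(tree_root, 6, leaves[0], [leaves[3], node2]))  # False
```

A multiproof generalizes this by sharing sibling nodes between several leaves, so each internal node only needs to be supplied or computed once.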
### MerklePartial
We define:
#### `MerklePartial`
```python
{
"root": "bytes32",
"indices": ["uint64"],
"values": ["bytes32"],
"proof": ["bytes32"]
}
```
#### Proofs for execution #### Proofs for execution
We define `MerklePartial(f, arg1, arg2...)` as being a list of Merkle multiproofs of the sets of nodes in the hash trees of the SSZ objects that are needed to authenticate the values needed to compute some function `f(arg1, arg2...)`. An individual Merkle multiproof is given as a dynamic sized list of `bytes32` values, a `MerklePartial` is a fixed-size list of objects `{proof: ["bytes32"], value: "bytes32"}`, one for each `arg` to `f` (if some `arg` is a base type, then the multiproof is empty). We define `MerklePartial(f, arg1, arg2..., focus=0)` as being a `MerklePartial` object wrapping a Merkle multiproof of the set of nodes in the hash tree of the SSZ object `arg[focus]` that is needed to authenticate the parts of the object needed to compute `f(arg1, arg2...)`.
Ideally, any function which accepts an SSZ object should also be able to accept a `MerklePartial` object as a substitute. Ideally, any function which accepts an SSZ object should also be able to accept a `MerklePartial` object as a substitute.

@ -51,22 +51,11 @@ def get_earlier_start_epoch(slot: Slot) -> int:
def get_later_start_epoch(slot: Slot) -> int: def get_later_start_epoch(slot: Slot) -> int:
return slot - slot % PERSISTENT_COMMITTEE_PERIOD - PERSISTENT_COMMITTEE_PERIOD return slot - slot % PERSISTENT_COMMITTEE_PERIOD - PERSISTENT_COMMITTEE_PERIOD
def get_earlier_period_data(block: ExtendedBeaconBlock, shard_id: Shard) -> PeriodData: def get_period_data(block: ExtendedBeaconBlock, shard_id: Shard, later: bool) -> PeriodData:
period_start = get_earlier_start_epoch(block.slot) period_start = get_later_start_epoch(block.slot) if later else get_earlier_start_epoch(block.slot)
validator_count = len(get_active_validator_indices(block.state, period_start)) validator_count = len(get_active_validator_indices(block.state, period_start))
committee_count = validator_count // (SHARD_COUNT * TARGET_COMMITTEE_SIZE) + 1 committee_count = validator_count // (SHARD_COUNT * TARGET_COMMITTEE_SIZE) + 1
indices = get_shuffled_committee(block.state, shard_id, period_start, 0, committee_count) indices = get_period_committee(block.state, shard_id, period_start, 0, committee_count)
return PeriodData(
validator_count,
generate_seed(block.state, period_start),
[block.state.validator_registry[i] for i in indices]
)
def get_later_period_data(block: ExtendedBeaconBlock, shard_id: Shard) -> PeriodData:
period_start = get_later_start_epoch(block.slot)
validator_count = len(get_active_validator_indices(block.state, period_start))
committee_count = validator_count // (SHARD_COUNT * TARGET_COMMITTEE_SIZE) + 1
indices = get_shuffled_committee(block.state, shard_id, period_start, 0, committee_count)
return PeriodData( return PeriodData(
validator_count, validator_count,
generate_seed(block.state, period_start), generate_seed(block.state, period_start),
@ -80,18 +69,18 @@ A light client will keep track of:
* A random `shard_id` in `[0...SHARD_COUNT-1]` (selected once and retained forever) * A random `shard_id` in `[0...SHARD_COUNT-1]` (selected once and retained forever)
* A block header that they consider to be finalized (`finalized_header`) and do not expect to revert. * A block header that they consider to be finalized (`finalized_header`) and do not expect to revert.
* `later_period_data = get_maximal_later_committee(finalized_header, shard_id)` * `later_period_data = get_period_data(finalized_header, shard_id, later=True)`
* `earlier_period_data = get_maximal_earlier_committee(finalized_header, shard_id)` * `earlier_period_data = get_period_data(finalized_header, shard_id, later=False)`
We use the struct `validator_memory` to keep track of these variables. We use the struct `validator_memory` to keep track of these variables.
### Updating the shuffled committee ### Updating the shuffled committee
If a client's `validator_memory.finalized_header` changes so that `header.slot // PERSISTENT_COMMITTEE_PERIOD` increases, then the client can ask the network for a `new_committee_proof = MerklePartial(get_maximal_later_committee, validator_memory.finalized_header, shard_id)`. It can then compute: If a client's `validator_memory.finalized_header` changes so that `header.slot // PERSISTENT_COMMITTEE_PERIOD` increases, then the client can ask the network for a `new_committee_proof = MerklePartial(get_period_data, validator_memory.finalized_header, shard_id, later=True)`. It can then compute:
```python ```python
earlier_period_data = later_period_data earlier_period_data = later_period_data
later_period_data = get_later_period_data(new_committee_proof, finalized_header, shard_id) later_period_data = get_period_data(new_committee_proof, finalized_header, shard_id, later=True)
``` ```
The maximum size of a proof is `128 * ((22-7) * 32 + 110) = 75520` bytes for validator records and `(22-7) * 32 + 128 * 8 = 1504` for the active index proof (much smaller because the relevant active indices are all beside each other in the Merkle tree). This needs to be done once per `PERSISTENT_COMMITTEE_PERIOD` epochs (2048 epochs / 9 days), or ~38 bytes per epoch. The maximum size of a proof is `128 * ((22-7) * 32 + 110) = 75520` bytes for validator records and `(22-7) * 32 + 128 * 8 = 1504` for the active index proof (much smaller because the relevant active indices are all beside each other in the Merkle tree). This needs to be done once per `PERSISTENT_COMMITTEE_PERIOD` epochs (2048 epochs / 9 days), or ~38 bytes per epoch.
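The arithmetic behind these figures can be checked with a short script. The constant names below are hypothetical, and the interpretation of each number (a depth-22 tree, 7 levels shared among 128 committee members, 110-byte validator records, 8-byte indices) is one reading of the formula rather than something the text states explicitly.

```python
# Assumed meanings of the numbers in the formula; names are illustrative only.
HASH_SIZE = 32                # bytes per Merkle tree node
TREE_DEPTH = 22               # assumed depth of the validator registry tree
SHARED_LEVELS = 7             # assumed levels not needing separate siblings (log2 of 128)
COMMITTEE_SIZE = 128          # validators per committee
VALIDATOR_RECORD_SIZE = 110   # assumed bytes per validator record
INDEX_SIZE = 8                # bytes per uint64 validator index
PERIOD_EPOCHS = 2048          # epochs between committee updates

branch_bytes = (TREE_DEPTH - SHARED_LEVELS) * HASH_SIZE
validator_records_proof = COMMITTEE_SIZE * (branch_bytes + VALIDATOR_RECORD_SIZE)
active_index_proof = branch_bytes + COMMITTEE_SIZE * INDEX_SIZE
total_bytes = validator_records_proof + active_index_proof
bytes_per_epoch = total_bytes / PERIOD_EPOCHS

print(validator_records_proof)    # 75520
print(active_index_proof)         # 1504
print(total_bytes)                # 77024
print(round(bytes_per_epoch, 1))  # 37.6
```

The per-epoch figure of roughly 38 bytes is the combined size of both proofs amortized over one 2048-epoch period.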
@ -106,13 +95,13 @@ def compute_committee(header: BeaconBlockHeader,
earlier_validator_count = validator_memory.earlier_period_data.validator_count earlier_validator_count = validator_memory.earlier_period_data.validator_count
later_validator_count = validator_memory.later_period_data.validator_count later_validator_count = validator_memory.later_period_data.validator_count
earlier_committee = validator_memory.earlier_period_data.committee maximal_earlier_committee = validator_memory.earlier_period_data.committee
later_committee = validator_memory.later_period_data.committee maximal_later_committee = validator_memory.later_period_data.committee
earlier_start_epoch = get_earlier_start_epoch(header.slot) earlier_start_epoch = get_earlier_start_epoch(header.slot)
later_start_epoch = get_later_start_epoch(header.slot) later_start_epoch = get_later_start_epoch(header.slot)
epoch = slot_to_epoch(header.slot) epoch = slot_to_epoch(header.slot)
actual_committee_count = max( committee_count = max(
earlier_validator_count // (SHARD_COUNT * TARGET_COMMITTEE_SIZE), earlier_validator_count // (SHARD_COUNT * TARGET_COMMITTEE_SIZE),
later_validator_count // (SHARD_COUNT * TARGET_COMMITTEE_SIZE), later_validator_count // (SHARD_COUNT * TARGET_COMMITTEE_SIZE),
) + 1 ) + 1