clean up merkleization text in SSZ spec

This commit is contained in:
protolambda 2019-07-12 21:15:28 +02:00
parent 65b0311582
commit a8dc9157b8
No known key found for this signature in database
GPG Key ID: EC89FDBB2B4C7623
1 changed files with 18 additions and 22 deletions

View File

@ -25,8 +25,6 @@
- [Vectors, containers, lists, unions](#vectors-containers-lists-unions)
- [Deserialization](#deserialization)
- [Merkleization](#merkleization)
- [`Bitvector[N]`](#bitvectorn-1)
- [`Bitlist[N]`](#bitlistn-1)
- [Self-signed containers](#self-signed-containers)
- [Implementations](#implementations)
@ -177,38 +175,36 @@ Note that deserialization requires hardening against invalid inputs. A non-exhau
We first define helper functions:
* `chunk_count(type)`: calculate the amount of leafs for merkleization of the type.
* all basic types: `1`
* bitlists and bitvectors: `(N + 255) // 256` (dividing by chunk size, rounding up)
* lists and vectors of basic types: `N * item_length(elem_type) + 31) // 32` (dividing by chunk size, rounding up)
* lists and vectors of composite types: `N`
* containers: `len(fields)`
* `bitfield_bytes(bits)`: return the bits of the bitlist or bitvector, packed in bytes, aligned to the start. Exclusive length-delimiting bit for bitlists.
* `pack`: Given ordered objects of the same basic type, serialize them, pack them into `BYTES_PER_CHUNK`-byte chunks, right-pad the last chunk with zero bytes, and return the chunks.
* `next_pow_of_two(i)`: get the next power of 2 of `i`, if not already a power of 2, with 0 mapping to 1. Examples: `0->1, 1->1, 2->2, 3->4, 4->4, 6->8, 9->16`
* `merkleize(data, pad_for=1)`: Given ordered `BYTES_PER_CHUNK`-byte chunks, if necessary append zero chunks so that the number of chunks is a power of two, Merkleize the chunks, and return the root.
* The merkleization depends on the effective input, which can be padded: if `pad_for=L`, then pad the `data` with zeroed chunks to `next_pow_of_two(L)` (virtually for memory efficiency).
* `merkleize(chunks, limit=None)`: Given ordered `BYTES_PER_CHUNK`-byte chunks, merkleize the chunks, and return the root:
* The merkleization depends on the effective input, which can be padded/limited:
- if no limit: pad the `chunks` with zeroed chunks to `next_pow_of_two(len(chunks))` (virtually for memory efficiency).
- if `limit > len(chunks)`, pad the `chunks` with zeroed chunks to `next_pow_of_two(limit)` (virtually for memory efficiency).
- if `limit < len(chunks)`: do not merkleize, input exceeds limit. Raise an error instead.
* Then, merkleize the chunks (empty input is padded to 1 zero chunk):
- If `1` chunk: A single chunk is simply that chunk, i.e. the identity when the number of chunks is one.
- If `> 1` chunks: pad to `next_pow_of_two(len(chunks))`, merkleize as binary tree.
- If `1` chunk: the root is the chunk itself.
- If `> 1` chunks: merkleize as binary tree.
* `mix_in_length`: Given a Merkle root `root` and a length `length` (`"uint256"` little-endian serialization) return `hash(root + length)`.
* `mix_in_type`: Given a Merkle root `root` and a type_index `type_index` (`"uint256"` little-endian serialization) return `hash(root + type_index)`.
We now define Merkleization `hash_tree_root(value)` of an object `value` recursively:
* `merkleize(pack(value))` if `value` is a basic object or a vector of basic objects.
* `mix_in_length(merkleize(pack(value), pad_for=(N * elem_size / BYTES_PER_CHUNK)), len(value))` if `value` is a list of basic objects.
* `merkleize(bitfield_bytes(value), limit=chunk_count(type))` if `value` is a bitvector.
* `mix_in_length(merkleize(pack(value), limit=chunk_count(type)), len(value))` if `value` is a list of basic objects.
* `mix_in_length(merkleize(bitfield_bytes(value), limit=chunk_count(type)), len(value))` if `value` is a bitlist.
* `merkleize([hash_tree_root(element) for element in value])` if `value` is a vector of composite objects or a container.
* `mix_in_length(merkleize([hash_tree_root(element) for element in value], pad_for=N), len(value))` if `value` is a list of composite objects.
* `mix_in_length(merkleize([hash_tree_root(element) for element in value], limit=chunk_count(type)), len(value))` if `value` is a list of composite objects.
* `mix_in_type(merkleize(value.value), value.type_index)` if `value` is of union type.
### `Bitvector[N]`
```python
as_integer = sum([value[i] << i for i in range(len(value))])
return merkleize(pack(as_integer.to_bytes((N + 7) // 8, "little")))
```
### `Bitlist[N]`
```python
as_integer = sum([value[i] << i for i in range(len(value))])
return mix_in_length(merkleize(pack(as_integer.to_bytes((N + 7) // 8, "little"))), len(value))
```
## Self-signed containers
Let `value` be a self-signed container object. The convention is that the signature (e.g. a `"bytes96"` BLS12-381 signature) be the last field of `value`. Further, the signed message for `value` is `signing_root(value) = hash_tree_root(truncate_last(value))` where `truncate_last` truncates the last element of `value`.