# SimpleSerialiZe (SSZ)
This is a **work in progress** describing typing, serialization and Merkleization of Ethereum 2.0 objects.
## Table of contents
- [Typing](#typing)
    - [Basic types](#basic-types)
    - [Composite types](#composite-types)
    - [Aliases](#aliases)
- [Serialization](#serialization)
    - [`uintN`](#uintn)
    - [`bool`](#bool)
    - [Containers, tuples, lists](#containers-tuples-lists)
- [Deserialization](#deserialization)
- [Merkleization](#merkleization)
- [Self-signed containers](#self-signed-containers)
- [Implementations](#implementations)
## Typing
### Basic types
* `uintN`: `N`-bit unsigned integer (where `N in [8, 16, 32, 64, 128, 256]`)
* `bool`: 1-bit unsigned integer
### Composite types
* **container**: ordered heterogeneous collection of values
    * key-value pair curly bracket notation `{}`, e.g. `{'foo': uint64, 'bar': bool}`
* **tuple**: ordered fixed-length homogeneous collection of values
    * square bracket notation `[N]`, e.g. `uint64[N]`
* **list**: ordered variable-length homogeneous collection of values
    * square bracket notation `[]`, e.g. `uint64[]`
### Aliases
For convenience we alias:
* `byte` to `uint8`
* `bytes` to `byte[]`
* `bytesN` to `byte[N]`
## Serialization
We recursively define the `serialize` function which consumes an object `value` (of the type specified) and returns a byte string of type `bytes`.
### `uintN`
```python
assert N in [8, 16, 32, 64, 128, 256]
return value.to_bytes(N // 8, 'little')
```
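
As a quick illustration of the little-endian layout, the sketch below wraps the rule above in a helper. The name `serialize_uint` and the explicit `n_bits` parameter are assumptions made for this example and are not part of the spec.

```python
def serialize_uint(value: int, n_bits: int) -> bytes:
    """Serialize an unsigned integer as n_bits // 8 little-endian bytes."""
    assert n_bits in [8, 16, 32, 64, 128, 256]
    # int.to_bytes raises OverflowError if value is negative or does not fit.
    return value.to_bytes(n_bits // 8, 'little')


# The least significant byte comes first.
print(serialize_uint(0x0102, 16).hex())  # 0201
print(serialize_uint(1, 32).hex())       # 01000000
```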
### `bool`
```python
assert value in (True, False)
return b'\x01' if value is True else b'\x00'
```
### Containers, tuples, lists
```python
serialized_bytes = b''.join([serialize(element) for element in value])
LENGTH_BYTES = 4
assert len(serialized_bytes) < 2**(8 * LENGTH_BYTES)
serialized_length = len(serialized_bytes).to_bytes(LENGTH_BYTES, 'little')
return serialized_length + serialized_bytes
```
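
The three rules can be combined into one runnable function. The following is a minimal sketch, not a reference implementation. It assumes a hypothetical way of describing types in Python that the spec does not define: a string such as `'uint64'` or `'bool'` for basic types, a one-element Python list `[element_type]` for tuples and lists, and a `dict` mapping field names to types (in field order) for containers.

```python
LENGTH_BYTES = 4


def serialize(value, typ) -> bytes:
    """Serialize value according to the (hypothetical) type descriptor typ."""
    if typ == 'bool':
        assert isinstance(value, bool)
        return b'\x01' if value else b'\x00'
    if isinstance(typ, str) and typ.startswith('uint'):
        n_bits = int(typ[4:])
        assert n_bits in [8, 16, 32, 64, 128, 256]
        return value.to_bytes(n_bits // 8, 'little')
    if isinstance(typ, dict):
        # Container: serialize the fields in declaration order.
        parts = [serialize(value[name], field_typ)
                 for name, field_typ in typ.items()]
    elif isinstance(typ, list):
        # Tuple or list: every element has the same type typ[0].
        parts = [serialize(element, typ[0]) for element in value]
    else:
        raise TypeError(f'unsupported type descriptor: {typ!r}')
    serialized_bytes = b''.join(parts)
    assert len(serialized_bytes) < 2**(8 * LENGTH_BYTES)
    # Composite values are prefixed with their byte length.
    serialized_length = len(serialized_bytes).to_bytes(LENGTH_BYTES, 'little')
    return serialized_length + serialized_bytes


print(serialize([1, 2], ['uint16']).hex())
# 0400000001000200
print(serialize({'foo': 5, 'bar': True}, {'foo': 'uint64', 'bar': 'bool'}).hex())
# 09000000050000000000000001
```

Note that the 4-byte prefix counts bytes, not elements: the list of two `uint16` values is prefixed with 4, and the container with 9 (8 bytes for `foo` plus 1 for `bar`).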
## Deserialization
Given a type, serialization is an injective function from objects of that type to byte strings. That is, deserialization—the inverse function—is well-defined.
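
To make the inverse concrete, here is a minimal deserialization sketch covering `uintN`, `bool`, lists and containers. It is an illustration under stated assumptions rather than a spec definition: the type descriptors (`'uint64'`, `'bool'`, `[element_type]` for a list, a `dict` of field names to types for a container) and the function names `decode` and `deserialize` are hypothetical.

```python
LENGTH_BYTES = 4


def decode(data: bytes, typ, offset: int = 0):
    """Decode one value of type typ at offset; return (value, next_offset)."""
    if typ == 'bool':
        byte = data[offset:offset + 1]
        assert byte in (b'\x00', b'\x01')
        return byte == b'\x01', offset + 1
    if isinstance(typ, str) and typ.startswith('uint'):
        size = int(typ[4:]) // 8
        chunk = data[offset:offset + size]
        assert len(chunk) == size, 'input too short'
        return int.from_bytes(chunk, 'little'), offset + size
    # Composite values start with a little-endian byte-length prefix.
    assert len(data) >= offset + LENGTH_BYTES, 'input too short'
    length = int.from_bytes(data[offset:offset + LENGTH_BYTES], 'little')
    offset += LENGTH_BYTES
    end = offset + length
    assert end <= len(data), 'length prefix exceeds input'
    if isinstance(typ, dict):
        value = {}
        for name, field_typ in typ.items():
            value[name], offset = decode(data, field_typ, offset)
    elif isinstance(typ, list):
        value = []
        while offset < end:
            element, offset = decode(data, typ[0], offset)
            value.append(element)
    else:
        raise TypeError(f'unsupported type descriptor: {typ!r}')
    # The elements must consume exactly the announced number of bytes.
    assert offset == end, 'length prefix does not match content'
    return value, offset


def deserialize(data: bytes, typ):
    """Decode data as a single value of type typ, rejecting trailing bytes."""
    value, offset = decode(data, typ)
    assert offset == len(data), 'trailing bytes'
    return value


print(deserialize(bytes.fromhex('0400000001000200'), ['uint16']))
# [1, 2]
```

Rejecting trailing bytes and mismatched length prefixes keeps the mapping one-to-one: each accepted byte string corresponds to exactly one object of the given type.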
## Merkleization
We first define helper functions:
* `pack`: Given ordered objects of the same basic type, serialize them, pack them into 32-byte chunks, right-pad the last chunk with zero bytes, and return the chunks.
* `merkleize`: Given ordered 32-byte chunks, right-pad them with zero chunks to the next power of two, Merkleize the chunks, and return the root.
* `mix_in_length`: Given a Merkle root `root` and a length `length` (`uint256` little-endian serialization), return `hash(root + length)`.
We now define Merkleization `hash_tree_root(value)` of an object `value` recursively:
* `merkleize(pack(value))` if `value` is a basic object or a tuple of basic objects
* `mix_in_length(merkleize(pack(value)), len(value))` if `value` is a list of basic objects
* `merkleize([hash_tree_root(element) for element in value])` if `value` is a tuple of composite objects or a container
* `mix_in_length(merkleize([hash_tree_root(element) for element in value]), len(value))` if `value` is a list of composite objects
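
The helper functions and the four cases above can be sketched in Python as follows. Several details are assumptions made for illustration and are not fixed by this document: SHA-256 stands in for `hash` (named `hash_fn` here), the Merkle tree is a plain binary tree in which a single chunk is its own root, an empty chunk list is treated as one zero chunk, and types are described by hypothetical descriptors (`'uint64'` or `'bool'`, `[element_type]` for a list, `[element_type, N]` for a tuple, and a `dict` of field names to types for a container).

```python
import hashlib

ZERO_CHUNK = b'\x00' * 32


def hash_fn(data: bytes) -> bytes:
    # Assumption: SHA-256 is used as a stand-in for the spec's `hash`.
    return hashlib.sha256(data).digest()


def serialize_basic(value, typ: str) -> bytes:
    if typ == 'bool':
        return b'\x01' if value else b'\x00'
    return value.to_bytes(int(typ[4:]) // 8, 'little')


def pack(values, typ: str):
    """Serialize basic values and split them into zero-padded 32-byte chunks."""
    data = b''.join(serialize_basic(v, typ) for v in values)
    data += b'\x00' * (-len(data) % 32)  # right-pad the last chunk
    return [data[i:i + 32] for i in range(0, len(data), 32)]


def merkleize(chunks):
    """Pad to a power-of-two number of chunks and hash pairwise to a root."""
    layer = list(chunks) or [ZERO_CHUNK]  # assumption for the empty case
    size = 1
    while size < len(layer):
        size *= 2
    layer += [ZERO_CHUNK] * (size - len(layer))
    while len(layer) > 1:
        layer = [hash_fn(layer[i] + layer[i + 1])
                 for i in range(0, len(layer), 2)]
    return layer[0]


def mix_in_length(root: bytes, length: int) -> bytes:
    return hash_fn(root + length.to_bytes(32, 'little'))


def hash_tree_root(value, typ) -> bytes:
    if isinstance(typ, str):                      # basic object
        return merkleize(pack([value], typ))
    if isinstance(typ, dict):                     # container
        return merkleize([hash_tree_root(value[name], field_typ)
                          for name, field_typ in typ.items()])
    elem_typ = typ[0]
    if isinstance(elem_typ, str):                 # elements are basic
        root = merkleize(pack(value, elem_typ))
    else:                                         # elements are composite
        root = merkleize([hash_tree_root(e, elem_typ) for e in value])
    # [elem_typ] describes a list, [elem_typ, N] a tuple (hypothetical notation).
    return mix_in_length(root, len(value)) if len(typ) == 1 else root


print(hash_tree_root([1, 2, 3], ['uint64']).hex())
```

Under these assumptions, a value that fits in one chunk (for example a single `uint64`) has the zero-padded chunk itself as its root, while a list with the same contents as a tuple gets a different root because its length is mixed in.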
## Self-signed containers
Let `container` be a self-signed container object. The convention is that the signature (e.g. a `bytes96` BLS12-381 signature) be the last field of `container`. Further, the signed message for `container` is `signed_root(container) = hash_tree_root(truncate_last(container))` where `truncate_last` truncates the last element of `container`.
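
A minimal sketch of this convention follows. It assumes, purely for illustration, that a container is modelled as a Python `dict` whose keys are in field order, and that a `hash_tree_root` implementation is supplied by the caller as a function argument.

```python
def truncate_last(container: dict) -> dict:
    """Return a copy of container without its last field (the signature)."""
    names = list(container)
    return {name: container[name] for name in names[:-1]}


def signed_root(container: dict, hash_tree_root) -> bytes:
    """Message to sign: the hash tree root of everything except the signature."""
    return hash_tree_root(truncate_last(container))


# Hypothetical container with made-up field names, for illustration only.
example = {'slot': 7, 'signature': b'\x00' * 96}
print(truncate_last(example))  # {'slot': 7}
```

Because the signature field is dropped before hashing, the signed message does not depend on the signature itself.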
## Implementations
| Language | Project | Maintainer | Implementation |
|-|-|-|-|
| Python | Ethereum 2.0 | Ethereum Foundation | [https://github.com/ethereum/py-ssz](https://github.com/ethereum/py-ssz) |
| Rust | Lighthouse | Sigma Prime | [https://github.com/sigp/lighthouse/tree/master/beacon_chain/utils/ssz](https://github.com/sigp/lighthouse/tree/master/beacon_chain/utils/ssz) |
| Nim | Nimbus | Status | [https://github.com/status-im/nim-beacon-chain/blob/master/beacon_chain/ssz.nim](https://github.com/status-im/nim-beacon-chain/blob/master/beacon_chain/ssz.nim) |
| Rust | Shasper | ParityTech | [https://github.com/paritytech/shasper/tree/master/util/ssz](https://github.com/paritytech/shasper/tree/master/util/ssz) |
| JavaScript | Lodestar | ChainSafe Systems | [https://github.com/ChainSafeSystems/ssz-js/blob/master/src/index.js](https://github.com/ChainSafeSystems/ssz-js/blob/master/src/index.js) |
| Java | Cava | ConsenSys | [https://www.github.com/ConsenSys/cava/tree/master/ssz](https://www.github.com/ConsenSys/cava/tree/master/ssz) |
| Go | Prysm | Prysmatic Labs | [https://github.com/prysmaticlabs/prysm/tree/master/shared/ssz](https://github.com/prysmaticlabs/prysm/tree/master/shared/ssz) |
| Swift | Yeeth | Dean Eigenmann | [https://github.com/yeeth/SimpleSerialize.swift](https://github.com/yeeth/SimpleSerialize.swift) |
| C# | | Jordan Andrews | [https://github.com/codingupastorm/csharp-ssz](https://github.com/codingupastorm/csharp-ssz) |
| C++ | | | [https://github.com/NAKsir-melody/cpp_ssz](https://github.com/NAKsir-melody/cpp_ssz) |