# [WIP] SimpleSerialiZe (SSZ)

This is a **work in progress** describing typing, serialization and Merkleization of Ethereum 2.0 objects.

## Table of contents

- [Constants](#constants)
- [Types](#types)
    - [Primitive types](#primitive-types)
    - [Composite types](#composite-types)
    - [Notation](#notation)
    - [Aliases](#aliases)
- [Serialization](#serialization)
    - [`uintN`](#uintn)
    - [`bool`](#bool)
    - [Containers](#containers)
    - [Tuples](#tuples)
    - [Lists](#lists)
- [Deserialization](#deserialization)
- [Merkleization](#merkleization)
- [Self-signed containers](#self-signed-containers)
- [Implementations](#implementations)

## Constants

| Name | Value | Definition |
|-|:-:|-|
| `LENGTH_BYTES` | `4` | Number of bytes for the length of variable-length serialized objects. |
| `MAX_LENGTH` | `2**(8 * LENGTH_BYTES)` | Maximum serialization length. |

## Types

### Primitive types

* `uintN`: `N`-bit unsigned integer (where `N in [8, 16, 32, 64, 128, 256]`)
* `bool`: 1-bit unsigned integer

### Composite types

* **Container**: ordered heterogeneous collection of values
* **Tuple**: ordered fixed-length homogeneous collection of values
* **List**: ordered variable-length homogeneous collection of values

### Notation

* **Container**: key-value notation `{}`, e.g. `{'key1': uint64, 'key2': bool}`
* **Tuple**: square-bracket notation `[N]`, e.g. `uint64[N]`
* **List**: square-bracket notation `[]`, e.g. `uint64[]`

### Aliases

For convenience we alias:

* `byte` to `uint8`
* `bytes` to `byte[]`
* `bytesN` to `byte[N]`
* `bit` to `bool`

## Serialization

We recursively define the `serialize` function, which consumes an object `o` (of the type specified) and returns a byte string of type `bytes`.

### `uintN`

```python
assert N in [8, 16, 32, 64, 128, 256]
# Integer division: `to_bytes` requires an int byte count.
return o.to_bytes(N // 8, 'little')
```
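
As a worked example of the rule above, the following sketch wraps it in a function. The name `serialize_uint`, its signature, and the range check are illustrative assumptions, not part of the specification.

```python
def serialize_uint(o: int, N: int) -> bytes:
    """Little-endian, fixed-width encoding of an N-bit unsigned integer."""
    assert N in [8, 16, 32, 64, 128, 256]
    assert 0 <= o < 2**N  # assumed range check; not stated in the text
    return o.to_bytes(N // 8, 'little')


print(serialize_uint(1, 16).hex())           # 0100
print(serialize_uint(0x01020304, 32).hex())  # 04030201
```

The least significant byte comes first, so `0x01020304` as a `uint32` is emitted as `04 03 02 01`.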

### `bool`

```python
assert o in (True, False)
# Test truthiness rather than identity, so that an `o` equal to `True` is encoded as 0x01.
return b'\x01' if o else b'\x00'
```

### Containers

```python
# Serialize each field in order, then concatenate.
serialized_elements = [serialize(element) for element in o]
serialized_bytes = b''.join(serialized_elements)
assert len(serialized_bytes) < MAX_LENGTH
# Prefix with the byte length of the concatenated fields.
serialized_length = len(serialized_bytes).to_bytes(LENGTH_BYTES, 'little')
return serialized_length + serialized_bytes
```
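
The container rule can be sketched end to end for a container whose fields are all `uintN` or `bool`. Modelling a container as an ordered list of `(type, value)` pairs, and the helper names `serialize_basic` and `serialize_container`, are assumptions made for illustration only.

```python
LENGTH_BYTES = 4
MAX_LENGTH = 2**(8 * LENGTH_BYTES)


def serialize_basic(value, typ: str) -> bytes:
    """Serialize a `bool` or `uintN` value; `typ` is e.g. 'uint64' or 'bool'."""
    if typ == 'bool':
        return b'\x01' if value else b'\x00'
    n_bits = int(typ[len('uint'):])
    return value.to_bytes(n_bits // 8, 'little')


def serialize_container(fields) -> bytes:
    """`fields` is an ordered list of (type, value) pairs (hypothetical model)."""
    serialized_bytes = b''.join(serialize_basic(v, t) for t, v in fields)
    assert len(serialized_bytes) < MAX_LENGTH
    # Byte-length prefix, then the concatenated fields.
    return len(serialized_bytes).to_bytes(LENGTH_BYTES, 'little') + serialized_bytes


# {'key1': uint64, 'key2': bool} with key1 = 5 and key2 = True
encoded = serialize_container([('uint64', 5), ('bool', True)])
print(encoded.hex())  # 09000000050000000000000001
```

The first four bytes (`09 00 00 00`) are the length prefix: eight bytes for the `uint64` plus one for the `bool`.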

### Tuples

```python
# Fixed-length: concatenate the serialized elements without a length prefix.
serialized_elements = [serialize(element) for element in o]
serialized_bytes = b''.join(serialized_elements)
return serialized_bytes
```

### Lists

```python
serialized_elements = [serialize(element) for element in o]
serialized_bytes = b''.join(serialized_elements)
# The prefix here encodes the number of elements, not the byte length.
assert len(serialized_elements) < MAX_LENGTH
serialized_length = len(serialized_elements).to_bytes(LENGTH_BYTES, 'little')
return serialized_length + serialized_bytes
```
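
To contrast tuples and lists, here is a minimal sketch restricted to `uintN` elements. It follows the list rule exactly as written in this draft, where the prefix is the element count; the function names are hypothetical.

```python
LENGTH_BYTES = 4
MAX_LENGTH = 2**(8 * LENGTH_BYTES)


def serialize_uint(o: int, N: int) -> bytes:
    return o.to_bytes(N // 8, 'little')


def serialize_uint_tuple(values, N: int) -> bytes:
    """Tuple `uintN[len(values)]`: concatenation only, no prefix."""
    return b''.join(serialize_uint(v, N) for v in values)


def serialize_uint_list(values, N: int) -> bytes:
    """List `uintN[]`: element-count prefix followed by the elements."""
    serialized_elements = [serialize_uint(v, N) for v in values]
    assert len(serialized_elements) < MAX_LENGTH
    prefix = len(serialized_elements).to_bytes(LENGTH_BYTES, 'little')
    return prefix + b''.join(serialized_elements)


print(serialize_uint_tuple([1, 2, 3], 16).hex())  # 010002000300
print(serialize_uint_list([1, 2, 3], 16).hex())   # 03000000010002000300
print(serialize_uint_list([], 16).hex())          # 00000000
```

A tuple needs no prefix because its length is part of the type, whereas a list must carry its length in the encoding.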

## Deserialization

Given a type, serialization is an injective function from objects of that type to byte strings. Deserialization, the inverse function, is therefore well-defined on every byte string that `serialize` can produce.
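
As an illustration of inverting `serialize`, the sketch below decodes `uintN` values and `uintN[]` lists. It assumes the list prefix is the element count, as in the Lists section; the function names and the strict length check are hypothetical choices, not part of the specification.

```python
LENGTH_BYTES = 4


def deserialize_uint(data: bytes, N: int) -> int:
    """Inverse of the `uintN` rule: exactly N / 8 little-endian bytes."""
    assert len(data) == N // 8
    return int.from_bytes(data, 'little')


def deserialize_uint_list(data: bytes, N: int) -> list:
    """Inverse of the list rule for `uintN[]` (element-count prefix assumed)."""
    size = N // 8
    count = int.from_bytes(data[:LENGTH_BYTES], 'little')
    body = data[LENGTH_BYTES:]
    # Reject inputs that `serialize` could not have produced.
    assert len(body) == count * size
    return [deserialize_uint(body[i:i + size], N) for i in range(0, len(body), size)]


data = bytes.fromhex('03000000010002000300')
print(deserialize_uint_list(data, 16))  # [1, 2, 3]
```

The type must be supplied by the caller: the same bytes decode differently as `uint16[]` and as `uint8[]`, which is why injectivity is stated per type.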

## Merkleization

We first define helper functions:

* `pack`: Given ordered objects of the same basic (primitive) type, serialize them, pack them into 32-byte chunks, right-pad the last chunk with zero bytes, and return the chunks.
* `merkleize`: Given ordered 32-byte chunks, right-pad them with zero chunks until the number of chunks is a power of two, Merkleize the chunks, and return the root.
* `mix_in_length`: Given a Merkle root `root` and a length `length` (32-byte little-endian serialization), return `hash(root + length)`.

Let `o` be an object. We now define object Merkleization `hash_tree_root(o)` recursively:

* `merkleize(pack(o))` if `o` is a basic object or a tuple of basic objects
* `mix_in_length(merkleize(pack(o)), len(o))` if `o` is a list of basic objects
* `merkleize([hash_tree_root(element) for element in o])` if `o` is a tuple of composite objects or a container
* `mix_in_length(merkleize([hash_tree_root(element) for element in o]), len(o))` if `o` is a list of composite objects
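
The four cases above can be put together in one recursive sketch. Everything about the type model is an assumption for illustration: types are described by small tuples such as `('uint', 64)`, `('tuple', elem)`, `('list', elem)` and `('container', [field_types])`, only `uintN` is supported as a basic type, SHA-256 stands in for `hash`, and an empty chunk list is treated as one zero chunk.

```python
import hashlib

ZERO_CHUNK = b'\x00' * 32


def _hash(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()  # assumption: SHA-256 as `hash`


def pack(blobs) -> list:
    data = b''.join(blobs)
    data += b'\x00' * (-len(data) % 32)  # right-pad the last chunk
    return [data[i:i + 32] for i in range(0, len(data), 32)]


def merkleize(chunks) -> bytes:
    layer = list(chunks) or [ZERO_CHUNK]
    while len(layer) & (len(layer) - 1):
        layer.append(ZERO_CHUNK)
    while len(layer) > 1:
        layer = [_hash(layer[i] + layer[i + 1]) for i in range(0, len(layer), 2)]
    return layer[0]


def mix_in_length(root: bytes, length: int) -> bytes:
    return _hash(root + length.to_bytes(32, 'little'))


def hash_tree_root(o, typ) -> bytes:
    """`typ` is a hypothetical type descriptor; see the lead-in."""
    kind = typ[0]
    if kind == 'uint':  # basic object
        return merkleize(pack([o.to_bytes(typ[1] // 8, 'little')]))
    if kind == 'container':  # one root per field
        return merkleize([hash_tree_root(v, t) for v, t in zip(o, typ[1])])
    elem = typ[1]  # 'tuple' or 'list'
    if elem[0] == 'uint':  # elements are basic: pack them
        root = merkleize(pack([v.to_bytes(elem[1] // 8, 'little') for v in o]))
    else:  # elements are composite: recurse
        root = merkleize([hash_tree_root(v, elem) for v in o])
    return mix_in_length(root, len(o)) if kind == 'list' else root


uint64 = ('uint', 64)
print(hash_tree_root(5, uint64) == (5).to_bytes(32, 'little'))  # True
print(hash_tree_root([1, 2], ('list', uint64)) != hash_tree_root([1, 2], ('tuple', uint64)))  # True
```

The second line shows the effect of `mix_in_length`: a list and a tuple with the same elements get different roots.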

## Self-signed containers

Let `container` be a self-signed container object. The convention is that the signature (e.g. a `bytes96` BLS12-381 signature) is the last field of `container`. Further, the signed message for `container` is `signed_root(container) = hash_tree_root(truncate_last(container))`, where `truncate_last` removes the last field of `container`.
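
A minimal sketch of this convention, assuming a container is modelled as a tuple of field values with the signature last. The `hash_tree_root` function is passed in as a parameter, and `toy_root` below is only a stand-in used to run the example, not the real Merkleization.

```python
import hashlib
from typing import Callable, Sequence


def truncate_last(container: Sequence) -> Sequence:
    """Drop the last field (by convention, the signature)."""
    return container[:-1]


def signed_root(container: Sequence, hash_tree_root: Callable[[Sequence], bytes]) -> bytes:
    """Root of the container with its signature field removed."""
    return hash_tree_root(truncate_last(container))


def toy_root(fields: Sequence) -> bytes:
    # Stand-in only: NOT the real `hash_tree_root`.
    return hashlib.sha256(repr(tuple(fields)).encode()).digest()


a = signed_root((7, b'\x01' * 96), toy_root)
b = signed_root((7, b'\x02' * 96), toy_root)
print(a == b)  # True
```

Because the signature field is excluded, the signed message does not depend on the signature itself, which is what makes it possible to sign the container and then store the signature inside it.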

## Implementations

| Language | Project | Maintainer | Implementation |
|-|-|-|-|
| Python | Ethereum 2.0 | Ethereum Foundation | [https://github.com/ethereum/py-ssz](https://github.com/ethereum/py-ssz) |
| Rust | Lighthouse | Sigma Prime | [https://github.com/sigp/lighthouse/tree/master/beacon_chain/utils/ssz](https://github.com/sigp/lighthouse/tree/master/beacon_chain/utils/ssz) |
| Nim | Nimbus | Status | [https://github.com/status-im/nim-beacon-chain/blob/master/beacon_chain/ssz.nim](https://github.com/status-im/nim-beacon-chain/blob/master/beacon_chain/ssz.nim) |
| Rust | Shasper | ParityTech | [https://github.com/paritytech/shasper/tree/master/util/ssz](https://github.com/paritytech/shasper/tree/master/util/ssz) |
| JavaScript | Lodestar | ChainSafe Systems | [https://github.com/ChainSafeSystems/ssz-js/blob/master/src/index.js](https://github.com/ChainSafeSystems/ssz-js/blob/master/src/index.js) |
| Java | Cava | ConsenSys | [https://www.github.com/ConsenSys/cava/tree/master/ssz](https://www.github.com/ConsenSys/cava/tree/master/ssz) |
| Go | Prysm | Prysmatic Labs | [https://github.com/prysmaticlabs/prysm/tree/master/shared/ssz](https://github.com/prysmaticlabs/prysm/tree/master/shared/ssz) |
| Swift | Yeeth | Dean Eigenmann | [https://github.com/yeeth/SimpleSerialize.swift](https://github.com/yeeth/SimpleSerialize.swift) |
| C# | | Jordan Andrews | [https://github.com/codingupastorm/csharp-ssz](https://github.com/codingupastorm/csharp-ssz) |
| C++ | | | [https://github.com/NAKsir-melody/cpp_ssz](https://github.com/NAKsir-melody/cpp_ssz) |