nimbus-eth2/ncli/e2store.py
Jacek Sieka aabdd34704
e2store: add era format (#2382)
Era files contain 8192 blocks and a state. 8192 is the length of the
array holding block roots in the state, meaning that each block is
verifiable using the pubkeys and block roots from the state. Of course,
one would need to know the root of the state as well, which is available
in the first block of the _next_ file - or known from outside.

This PR also adds an implementation to write e2s, e2i and era files, as
well as a python script to inspect them.
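
As a rough illustration of the record framing that the inspection script
parses (2 bytes of type, 6 bytes of little-endian length, then the data),
here is a minimal round-trip sketch. `write_record` and `read_records` are
hypothetical helper names, the `b"\x01\x00"` record type is a made-up
placeholder, and this is not the writer added by this PR:

```python
import io
import struct


def write_record(f, typ, data=b""):
    """Write one record: 2-byte type, 6-byte little-endian length, data."""
    if len(typ) != 2:
        raise ValueError("record type must be exactly 2 bytes")
    if len(data) >= 1 << 48:
        raise ValueError("data does not fit in a 6-byte length field")
    # Pack the length as 8 little-endian bytes and keep the low 6.
    f.write(typ + struct.pack("<q", len(data))[:6] + data)


def read_records(f):
    """Yield (type, data) for every record in the stream."""
    while True:
        header = f.read(8)
        if not header:
            break
        if len(header) != 8:
            raise RuntimeError("Truncated record header")
        typ = header[0:2]
        # Pad the 6-byte length back to 8 bytes before unpacking.
        dlen = struct.unpack("<q", header[2:8] + b"\0\0")[0]
        data = f.read(dlen)
        if len(data) != dlen:
            raise RuntimeError("File is missing data")
        yield typ, data


buf = io.BytesIO()
write_record(buf, b"e2")                  # empty leading record, as the script expects
write_record(buf, b"\x01\x00", b"hello")  # hypothetical record type, for illustration
buf.seek(0)
records = list(read_records(buf))
```

Truncating the packed 64-bit length to its low 6 bytes mirrors how the
script pads the length field with two zero bytes before unpacking it.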

All in all, the format is very similar to what goes on in the network
requests, meaning it can trivially serve as a backing format for serving
said requests.

Mainnet, up to the first 671k slots, takes up 3.5 GB - in each era file,
the BeaconState contributes about 9 MB at current validator set sizes, up
from ~3 MB in the early blocks, for a grand total of ~558 MB for the 82 eras
tested - this overhead could potentially be calculated instead of stored,
but one would lose the ability to verify individual blocks (eras could
still be verified using historical roots).

```
-rw-rw-r--. 1 arnetheduck arnetheduck   16  5 mar 11.47 ethereum2-mainnet-00000000-00000001.e2i
-rw-rw-r--. 1 arnetheduck arnetheduck 1,8M  5 mar 11.47 ethereum2-mainnet-00000000-00000001.e2s
-rw-rw-r--. 1 arnetheduck arnetheduck  65K  5 mar 11.47 ethereum2-mainnet-00000001-00000001.e2i
-rw-rw-r--. 1 arnetheduck arnetheduck  18M  5 mar 11.47 ethereum2-mainnet-00000001-00000001.e2s
...
-rw-rw-r--. 1 arnetheduck arnetheduck  65K  5 mar 11.52 ethereum2-mainnet-00000051-00000001.e2i
-rw-rw-r--. 1 arnetheduck arnetheduck  68M  5 mar 11.52 ethereum2-mainnet-00000051-00000001.e2s
-rw-rw-r--. 1 arnetheduck arnetheduck  61K  5 mar 11.11 ethereum2-mainnet-00000052-00000001.e2i
-rw-rw-r--. 1 arnetheduck arnetheduck  62M  5 mar 11.11 ethereum2-mainnet-00000052-00000001.e2s
```
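
The figures quoted above can be cross-checked with some quick arithmetic.
This is only a sanity check on the numbers in this message. Reading the
first number in the file names as a hexadecimal era number is an
assumption, made because `00000052` then matches the 82 eras mentioned:

```python
SLOTS_PER_ERA = 8192      # blocks per era file, from the message above
slots = 671_000           # "671k slots", an approximate figure

# 671000 / 8192 is about 81.9, which rounds to the 82 eras tested.
eras = round(slots / SLOTS_PER_ERA)

# Average BeaconState size per era, given ~558 MB of state in total.
avg_state_mb = 558 / eras

# Share of the 3.5 GB total taken up by states (using 1 GB = 1000 MB).
state_share = 558 / 3500

# Assumption: the era number in the file name is hexadecimal.
last_era = int("00000052", 16)

print(eras, round(avg_state_mb, 1), round(state_share, 2), last_era)
```

The average of roughly 6.8 MB per state sits between the ~3 MB and ~9 MB
endpoints quoted for the early and current validator set sizes.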
2021-03-15 11:31:39 +01:00


import sys, struct


def read_e2store(name):
    # Yield (type, data) for every record after the leading e2 record
    with open(name, "rb") as f:
        header = f.read(8)
        typ = header[0:2] # First 2 bytes for type

        if typ != b"e2":
            raise RuntimeError("this is not an e2store file")

        while True:
            header = f.read(8) # Header is 8 bytes
            if not header:
                break

            typ = header[0:2] # First 2 bytes for type
            dlen = struct.unpack("<q", header[2:8] + b"\0\0")[0] # 6 bytes of little-endian length

            data = f.read(dlen)
            if len(data) != dlen: # Don't trust the given length, especially when pre-allocating
                raise RuntimeError("File is missing data")

            if typ == b"i2":
                raise RuntimeError("Cannot switch to index mode")
            elif typ == b"e2":
                pass # Ignore extra headers

            yield (typ, data)


def find_offset(name, slot):
    # Find the offset of a given slot
    with open(name, "rb") as f:
        header = f.read(8)
        typ = header[0:2] # First 2 bytes for type

        if typ != b"i2":
            raise RuntimeError("this is not an e2store index file")

        # The start slot follows the header, then one 8-byte offset per slot
        start_slot = struct.unpack("<q", f.read(8))[0]

        f.seek(8 * (slot - start_slot) + 16)

        return struct.unpack("<q", f.read(8))[0]


name = sys.argv[1]
if name.endswith(".e2i"):
    print(find_offset(name, int(sys.argv[2])))
else:
    for typ, data in read_e2store(name):
        print("typ", typ, "data", len(data))
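
The index layout that `find_offset` relies on (an 8-byte header, an 8-byte
little-endian start slot, then one 8-byte offset per slot) can be exercised
in memory as follows. This is a sketch under assumptions: `build_index` and
`lookup` are hypothetical helpers, the six header bytes after `i2` are
zero-filled placeholders because the script ignores them, and the range
check is an addition that `find_offset` does not perform:

```python
import io
import struct


def build_index(start_slot, offsets):
    """Build index bytes: header, start slot, then one offset per slot."""
    out = b"i2" + b"\0" * 6  # assumption: remaining header bytes are unused
    out += struct.pack("<q", start_slot)
    out += b"".join(struct.pack("<q", o) for o in offsets)
    return out


def lookup(f, slot):
    """Return the stored offset for slot, checking that it is in range."""
    header = f.read(8)
    if header[0:2] != b"i2":
        raise RuntimeError("this is not an e2store index file")
    start_slot = struct.unpack("<q", f.read(8))[0]

    # Number of entries, derived from the stream size.
    f.seek(0, io.SEEK_END)
    count = (f.tell() - 16) // 8

    index = slot - start_slot
    if not 0 <= index < count:
        raise IndexError("slot %d is not covered by this index" % slot)

    f.seek(16 + 8 * index)
    return struct.unpack("<q", f.read(8))[0]


index = build_index(8192, [100, 200, 300])
offset = lookup(io.BytesIO(index), 8193)

# A full era holds 8192 entries: 16 + 8 * 8192 bytes in total.
full_index_size = len(build_index(0, [0] * 8192))
print(offset, full_index_size)
```

A full 8192-entry index comes to 65552 bytes under this layout, which is
consistent with the 65K `.e2i` files in the listing in the commit message.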