* EIP 2124: Zero RTT netsplit on chain mismatch * Assign number to EIP 2124 * Address review comments on EIP 2124. * Rename EIP 2124 to better reflect content * Fix cardinality * Fix email address in 2124 * Update fork identifier EIP based on review discussions. * Fixup EIP name for the fork identifier. * Remove stale mention of ENR from forkid EIP. * Add RLP encoding test cases for forkid EIP. * Add Felix too to the forkid EIP author list. * Get rid of mentioning the MTU stuffin the forkid EIP.
18 KiB
eip | title | author | discussions-to | status | type | category | created |
---|---|---|---|---|---|---|---|
2124 | Fork identifier for chain compatibility checks | Péter Szilágyi <peterke@gmail.com>, Felix Lange <fjl@ethereum.org> | https://github.com/ethereum/EIPs/issues/2125 | Draft | Standards Track | Networking | 2019-05-03 |
Simple Summary
Currently nodes in the Ethereum network try to find each other by establishing random connections to remote machines "looking" like an Ethereum node (public networks, private networks, test networks, etc), hoping that they found a useful peer (same genesis, same forks). This wastes time and resources, especially for smaller networks.
To avoid this overhead, Ethereum needs a mechanism that can precisely identify whether a node will be useful, as early as possible. Such a mechanism requires a way to summarize chain configurations, as well as a way to disseminate said summaries in the network.
This proposal focuses only on the definition of said summary - a generally useful fork identifier - and it's validation rules, allowing it to be embedded into arbitrary network protocols (e.g. discovery ENRs or eth/6x
handshakes).
Abstract
There are many public and private Ethereum networks, but the discovery protocol doesn't differentiate between them. The only way to check if a peer is good or bad (same chain or not), is to establish a TCP/IP connection, wrap it with RLPx cryptography, then execute an eth
handshake. This is an extreme cost to bear if it turns out that the remote peer is on a different network and it's not even precise enough to differentiate Ethereum and Ethereum Classic. This cost is magnified for small networks, where a lot more trial and errors are needed to find good nodes.
Even if the peer is on the same chain, during non-controversial consensus upgrades, not everybody updates their nodes in time (developer nodes, leftovers, etc). These stale nodes put a meaningless burden on the peer-to-peer network, since they just latch on to good nodes, but don't accept upgraded blocks. This causes valuable peer slots and bandwidth to be lost until the stale nodes finally update. This is a serious issue for test networks, where leftovers can linger for months.
This EIP proposes a new identity scheme to both precisely and concisely summarize the chain's current status (genesis and all applied forks). The conciseness is particularly important to make the identity useful across datagram protocols too. The EIP solves a number of issues:
- If two nodes are on different networks, they should never even consider connecting.
- If a hard fork passes, upgraded nodes should reject non-upgraded ones, but NOT before.
- If two chains share the same genesis, but not forks (ETH / ETC), they should reject each other.
This EIP does not attempt to solve the clean separation of 3-way-forks! If at the same future block number, the network splits into three (non-fork, fork-A and fork-B), separating the forkers from each another will need case-by-case special handling. Not handling this keeps the proposal pragmatic, simple and also avoids making it too easy to fork off mainnet.
To keep the scope limited, this EIP only defines the identity scheme and validation rules. The same scheme and algorithm can be embedded into various networking protocols, allowing both the eth/6x
handshake to be more precise (Ethereum vs. Ethereum Classic); as well as the discovery to be more useful (reject surely peers without ever connecting).
Motivation
Peer-to-peer networking is messy and hard due to firewalls and network address translation (NAT). Generally only a small fraction of nodes have publicly routed addresses and P2P networks rely mainly on these for forwarding data for everyone else. The best way to maximize the utility of the public nodes is to ensure their resources aren't wasted on tasks that are worthless to the network.
By aggressively cutting off incompatible nodes from each other we can extract a lot more value from the public nodes, making the entire P2P network much more robust and reliable. Supporting this network partitioning at a discovery layer can further enhance performance as we avoid the costly crypto and latency/bandwidth hit associated with establishing a stream connection in the first place.
Specification
Each node maintains the following values:
FORK_HASH
: IEEE CRC32 checksum ([4]byte
) of the genesis hash and fork blocks numbers that already passed.- The fork block numbers are fed into the CRC32 checksum in ascending order.
- If multiple forks are applied at the same block, the block number is checksummed only once.
- Block numbers are regarded as
uint64
integers, encoded in big endian format when checksumming. - If a chain is configured to start with a non-Frontier ruleset already in its genesis, that is NOT considered a fork.
FORK_NEXT
: Block number (uint64
) of the next upcoming fork, or0
if no next fork is known.
E.g. FORK_HASH
for mainnet would be:
- forkhash₀ =
0xfc64ec04
(Genesis) =CRC32(<genesis-hash>)
- forkhash₁ =
0x97c2c34c
(Homestead) =CRC32(<genesis-hash> || uint64(1150000))
- forkhash₂ =
0x91d1f948
(DAO fork) =CRC32(<genesis-hash> || uint64(1150000) || uint64(1920000))
The fork identifier is defined as RLP([FORK_HASH, FORK_NEXT])
. This forkid
is cross validated (NOT naively compared) to assess a remote chain's compatibility. Irrespective of fork state, both parties must come to the same conclusion to avoid indefinite reconnect attempts from one side.
Validation rules
- If local and remote
FORK_HASH
matches, connect.- The two nodes are in the same fork state currently. They might know of differing future forks, but that's not relevant until the fork triggers (might be postponed, nodes might be updated to match).
- If the remote
FORK_HASH
is a subset of the local past forks and the remoteFORK_NEXT
matches with the locally following fork block number, connect.- Remote node is currently syncing. It might eventually diverge from us, but at this current point in time we don't have enough information.
- If the remote
FORK_HASH
is a superset of the local past forks and can be completed with locally known future forks, connect.- Local node is currently syncing. It might eventually diverge from the remote, but at this current point in time we don't have enough information.
- Reject in all other cases.
Stale software examples
The examples below try to exhaust the fork combination possibilities that arise when nodes do not run matching software versions, but otherwise follow the same chain (mainnet nodes, testnet nodes, etc).
Past forks | Future forks | Remote FORK_HASH |
Remote FORK_NEXT |
Connect | Reason |
---|---|---|---|---|---|
A | A | Yes (1) | Same forks, same sync state. | ||
A | A | B | Yes (1) | Remote is advertising a future fork, but that is uncertain. | |
A | B | A | Yes (1) | Local knows about a future fork, but that is uncertain. | |
A | B | A | B | Yes (1) | Both know about a future fork, but that is uncertain. |
A | B1 | A | B2 | Yes (1) | Both know about differing future forks, but those are uncertain. |
[A,B] | A | B | Yes (2) | Remote out of sync. | |
[A,B,C] | A | B | Yes¹ (2) | Remote out of sync. Remote will need a software update, but we don't know it yet. | |
A | B | A ⊕ B | Yes (3) | Local out of sync. | |
A | B,C | A ⊕ B | Yes (3) | Local out of sync. Local also knows about a future fork, but that is uncertain yet. | |
A | A ⊕ B | No (4) | Local needs software update. | ||
A | B | A ⊕ B ⊕ C | No² (4) | Local needs software update. | |
[A,B] | A | No (4) | Remote needs software update. |
Note, there's one asymmetry in the table, marked with ¹ and ². Since we don't have access to a remote node's future fork list (just the next one), we can't detect that it's software is stale until it syncs up. This is acceptable as 1) the remote node will disconnect from us anyway, and 2) this is a temporary fluke during sync, not permanent with a leftover node.
Mismatching chain examples (local perspective)
TODO: Give some examples as to what happens if the nodes follow different forks.
Rationale
Why flatten FORK_HASH
into 4 bytes? Why not share the entire genesis and fork list?
Whilst the eth
devp2p protocol permits arbitrarily much data to be transmitted, the discovery protocol's total space allowance for all ENR entries is 300 bytes.
Reducing the FORK_HASH
into a 4 bytes checksum ensures that we leave ample room in the ENR for future extensions; and 4 bytes is more than enough for arbitrarily many Ethereum networks from a (practical) collision perspective.
Why use IEEE CRC32 as the checksum instead of Keccak256?
We need a mechanism that can flatten arbitrary data into 4 bytes, without ignoring any of the input. Any other checksum or hashing algorithm would work, but since nodes can lie at any time, there's no value in cryptographic hash functions.
Instead of just taking the first 4 bytes of a Keccak256 hash (seems odd) or XOR-ing all the 4-byte groups (messy), CRC32 is a better alternative, as this is exactly what it was designed for. IEEE CRC32 is also used by ethernet, gzip, zip, png, etc, so every programming language support should not be a problem.
We're not using FORK_NEXT
for much, can't we get rid of it somehow?
We need to be able to differentiate whether a remote node is out of sync or whether its software is stale. Sharing only the past forks cannot tell us if the node is legitimately behind or stuck.
Why advertise only one next fork, instead of "hashing" all known future ones like the FORK_HASH
?
Opposed to past forks that have already passed (for us locally) and can be considered immutable, we don't know anything about future ones. Maybe we're out of sync or maybe the fork didn't pass yet. If it didn't pass yet, it might be postponed, so enforcing it would split the network apart. It could also happen that we're not yet aware of all future forks (haven't updated our software in a while).
Backwards Compatibility
This EIP only defines an identity scheme, it does not define functional changes.
Test Cases
Here's a full suite of tests for all possible fork IDs that Mainnet, Ropsten, Rinkeby and Görli can advertise given the Petersburg fork cap (time of writing).
type testcase struct {
head uint64
want ID
}
tests := []struct {
config *params.ChainConfig
genesis common.Hash
cases []testcase
}{
// Mainnet test cases
{
params.MainnetChainConfig,
params.MainnetGenesisHash,
[]testcase{
{0, ID{Hash: 0xfc64ec04, Next: 1150000}}, // Unsynced
{1149999, ID{Hash: 0xfc64ec04, Next: 1150000}}, // Last Frontier block
{1150000, ID{Hash: 0x97c2c34c, Next: 1920000}}, // First Homestead block
{1919999, ID{Hash: 0x97c2c34c, Next: 1920000}}, // Last Homestead block
{1920000, ID{Hash: 0x91d1f948, Next: 2463000}}, // First DAO block
{2462999, ID{Hash: 0x91d1f948, Next: 2463000}}, // Last DAO block
{2463000, ID{Hash: 0x7a64da13, Next: 2675000}}, // First Tangerine block
{2674999, ID{Hash: 0x7a64da13, Next: 2675000}}, // Last Tangerine block
{2675000, ID{Hash: 0x3edd5b10, Next: 4370000}}, // First Spurious block
{4369999, ID{Hash: 0x3edd5b10, Next: 4370000}}, // Last Spurious block
{4370000, ID{Hash: 0xa00bc324, Next: 7280000}}, // First Byzantium block
{7279999, ID{Hash: 0xa00bc324, Next: 7280000}}, // Last Byzantium block
{7280000, ID{Hash: 0x668db0af, Next: 0}}, // First and last Constantinople, first Petersburg block
{7987396, ID{Hash: 0x668db0af, Next: 0}}, // Today Petersburg block
},
},
// Ropsten test cases
{
params.TestnetChainConfig,
params.TestnetGenesisHash,
[]testcase{
{0, ID{Hash: 0x30c7ddbc, Next: 10}}, // Unsynced, last Frontier, Homestead and first Tangerine block
{9, ID{Hash: 0x30c7ddbc, Next: 10}}, // Last Tangerine block
{10, ID{Hash: 0x63760190, Next: 1700000}}, // First Spurious block
{1699999, ID{Hash: 0x63760190, Next: 1700000}}, // Last Spurious block
{1700000, ID{Hash: 0x3ea159c7, Next: 4230000}}, // First Byzantium block
{4229999, ID{Hash: 0x3ea159c7, Next: 4230000}}, // Last Byzantium block
{4230000, ID{Hash: 0x97b544f3, Next: 4939394}}, // First Constantinople block
{4939393, ID{Hash: 0x97b544f3, Next: 4939394}}, // Last Constantinople block
{4939394, ID{Hash: 0xd6e2149b, Next: 0}}, // First Petersburg block
{5822692, ID{Hash: 0xd6e2149b, Next: 0}}, // Today Petersburg block
},
},
// Rinkeby test cases
{
params.RinkebyChainConfig,
params.RinkebyGenesisHash,
[]testcase{
{0, ID{Hash: 0x3b8e0691, Next: 1}}, // Unsynced, last Frontier block
{1, ID{Hash: 0x60949295, Next: 2}}, // First and last Homestead block
{2, ID{Hash: 0x8bde40dd, Next: 3}}, // First and last Tangerine block
{3, ID{Hash: 0xcb3a64bb, Next: 1035301}}, // First Spurious block
{1035300, ID{Hash: 0xcb3a64bb, Next: 1035301}}, // Last Spurious block
{1035301, ID{Hash: 0x8d748b57, Next: 3660663}}, // First Byzantium block
{3660662, ID{Hash: 0x8d748b57, Next: 3660663}}, // Last Byzantium block
{3660663, ID{Hash: 0xe49cab14, Next: 4321234}}, // First Constantinople block
{4321233, ID{Hash: 0xe49cab14, Next: 4321234}}, // Last Constantinople block
{4321234, ID{Hash: 0xafec6b27, Next: 0}}, // First Petersburg block
{4586649, ID{Hash: 0xafec6b27, Next: 0}}, // Today Petersburg block
},
},
// Goerli test cases
{
params.GoerliChainConfig,
params.GoerliGenesisHash,
[]testcase{
{0, ID{Hash: 0xa3f5ab08, Next: 0}}, // Unsynced, last Frontier, Homestead, Tangerine, Spurious, Byzantium, Constantinople and first Petersburg block
{795329, ID{Hash: 0xa3f5ab08, Next: 0}}, // Today Petersburg block
},
},
}
Here's a suite of tests of the different states a Mainnet node might be in and the different remote fork identifiers it might be required to validate and decide to accept or reject:
tests := []struct {
head uint64
id ID
err error
}{
// Local is mainnet Petersburg, remote announces the same. No future fork is announced.
{7987396, ID{Hash: 0x668db0af, Next: 0}, nil},
// Local is mainnet Petersburg, remote announces the same. Remote also announces a next fork
// at block 0xffffffff, but that is uncertain.
{7987396, ID{Hash: 0x668db0af, Next: math.MaxUint64}, nil},
// Local is mainnet currently in Byzantium only (so it's aware of Petersburg), remote announces
// also Byzantium, but it's not yet aware of Petersburg (e.g. non updated node before the fork).
// In this case we don't know if Petersburg passed yet or not.
{7279999, ID{Hash: 0xa00bc324, Next: 0}, nil},
// Local is mainnet currently in Byzantium only (so it's aware of Petersburg), remote announces
// also Byzantium, and it's also aware of Petersburg (e.g. updated node before the fork). We
// don't know if Petersburg passed yet (will pass) or not.
{7279999, ID{Hash: 0xa00bc324, Next: 7280000}, nil},
// Local is mainnet currently in Byzantium only (so it's aware of Petersburg), remote announces
// also Byzantium, and it's also aware of some random fork (e.g. misconfigured Petersburg). As
// neither forks passed at neither nodes, they may mismatch, but we still connect for now.
{7279999, ID{Hash: 0xa00bc324, Next: math.MaxUint64}, nil},
// Local is mainnet Petersburg, remote announces Byzantium + knowledge about Petersburg. Remote
// is simply out of sync, accept.
{7987396, ID{Hash: 0x668db0af, Next: 7280000}, nil},
// Local is mainnet Petersburg, remote announces Spurious + knowledge about Byzantium. Remote
// is definitely out of sync. It may or may not need the Petersburg update, we don't know yet.
{7987396, ID{Hash: 0x3edd5b10, Next: 4370000}, nil},
// Local is mainnet Byzantium, remote announces Petersburg. Local is out of sync, accept.
{7279999, ID{Hash: 0x668db0af, Next: 0}, nil},
// Local is mainnet Spurious, remote announces Byzantium, but is not aware of Petersburg. Local
// out of sync. Local also knows about a future fork, but that is uncertain yet.
{4369999, ID{Hash: 0xa00bc324, Next: 0}, nil},
// Local is mainnet Petersburg. remote announces Byzantium but is not aware of further forks.
// Remote needs software update.
{7987396, ID{Hash: 0xa00bc324, Next: 0}, ErrRemoteStale},
// Local is mainnet Petersburg, and isn't aware of more forks. Remote announces Petersburg +
// 0xffffffff. Local needs software update, reject.
{7987396, ID{Hash: 0x5cddc0e1, Next: 0}, ErrLocalIncompatibleOrStale},
// Local is mainnet Byzantium, and is aware of Petersburg. Remote announces Petersburg +
// 0xffffffff. Local needs software update, reject.
{7279999, ID{Hash: 0x5cddc0e1, Next: 0}, ErrLocalIncompatibleOrStale},
// Local is mainnet Petersburg, remote is Rinkeby Petersburg.
{7987396, ID{Hash: 0xafec6b27, Next: 0}, ErrLocalIncompatibleOrStale},
}
Here's a couple of tests to verify the proper RLP encoding (since FORK_HASH
is a 4 byte binary but FORK_NEXT
is an 8 byte quantity):
tests := []struct {
id ID
want []byte
}{
{
ID{Hash: checksumToBytes(0), Next: 0},
common.Hex2Bytes("c6840000000080"),
},
{
ID{Hash: checksumToBytes(0xdeadbeef), Next: 0xBADDCAFE},
common.Hex2Bytes("ca84deadbeef84baddcafe"),
},
{
ID{Hash: checksumToBytes(math.MaxUint32), Next: math.MaxUint64},
common.Hex2Bytes("ce84ffffffff88ffffffffffffffff"),
},
}
Implementation
https://github.com/ethereum/go-ethereum/pull/19738
Copyright
Copyright and related rights waived via CC0.