The base networking protocol (DEVp2p) used by Ethereum currently does not employ any form of compression. This results in a massive amount of bandwidth wasted in the entire network, making both initial sync as well as normal operation slower and laggier.
This EIP proposes a tiny extension to the DEVp2p protocol to enable [Snappy compression](https://en.wikipedia.org/wiki/Snappy_(compression)) on all message payloads after the initial handshake. After extensive benchmarks, results show that data traffic is decreased by 60-80% for initial sync. You can find exact numbers below.
Synchronizing the Ethereum main network (block 4,248,000) in Geth using fast sync currently consumes 1.01GB upload and 33.59GB download bandwidth. On the Rinkeby test network (block 852,000) it's 55.89MB upload and 2.51GB download.
However, most of this data (blocks, transactions) are heavily compressable. By enabling compression at the message payload level, we can reduce the previous numbers to 1.01GB upload / 13.46GB download on the main network, and 46.21MB upload / 463.65MB download on the test network.
The motivation behind doing this at the DEVp2p level (opposed to eth for example) is that it would enable compression for all sub-protocols (eth, les, bzz) seamlessly, reducing any complexity those protocols might incur in trying to individually optimize for data traffic.
Bump the advertised DEVp2p version number from `4` to `5`. If during handshake, the remote side advertises support only for version `4`, run the exact same protocol as until now.
Similarly to message sending, when receiving a DEVp2p v5 message from a remote node, insert a Snappy decompression step right after the decrypting the DEVp2p message:
*Note: Snappy supports uncompressed binary literals (up to 4GB) too, leaving room for fine-tuned future optimisations for already compressed or encrypted data that would have no gain of compression (Snappy usually detects this case automatically).*
Currently a DEVp2p message length is limited to 24 bits, amounting to a maximum size of 16MB. With the introduction of Snappy compression, care must be taken not to blindy decompress messages, since they may get significantly larger than 16MB.
However, Snappy is capable of calculating the decompressed size of an input message without inflating it in memory (*[the stream starts with the uncompressed length up to a maximum of `2^32 - 1` stored as a little-endian varint](https://github.com/google/snappy/blob/master/format_description.txt#L20)*). This can be used to discard any messages which decompress above some threshold. **The proposal is to use the same limit (16MB) as the threshold for decompressed messages.** This retains the same guarantees that the current DEVp2p protocol does, so there won't be surprises in application level protocols.
This proposal is fully backward compatible. Clients upgrading to the proposed DEVp2p protocol version `5` should still support skipping the compression step for connections that only advertise version `4` of the DEVp2p protocol.
There is more than one valid encoding of any given input, and there is more than one good internal compression algorithm within Snappy when trading off throughput for output size. As such, different implementations might produce slight variations in the compressed form, but all should be cross compatible between each other.
As an example, take hex encoded RLP of block #272621 from the Rinkeby test network: [block.rlp (~3MB)](https://gist.githubusercontent.com/karalabe/72a1a6c4c1dbe6d4996879e415697f06/raw/195bf0c0050ee9805fcd5db4b5b650c58879a55f/block.rlp).
* Encoding the raw RLP via [Go's Snappy library](https://github.com/golang/snappy) yields: [block.go.snappy (~70KB)](https://gist.githubusercontent.com/karalabe/72a1a6c4c1dbe6d4996879e415697f06/raw/195bf0c0050ee9805fcd5db4b5b650c58879a55f/block.go.snappy).
* Encoding the raw RLP via [Python's Snappy library](https://github.com/andrix/python-snappy) yields: [block.py.snappy (~70KB)](https://gist.githubusercontent.com/karalabe/72a1a6c4c1dbe6d4996879e415697f06/raw/195bf0c0050ee9805fcd5db4b5b650c58879a55f/block.py.snappy).
You can verify that an encoded binary can be decoded into the proper plaintext using the following snippets:
### Go
```
$ go get https://github.com/golang/snappy
```
```go
package main
import (
"bytes"
"encoding/hex"
"fmt"
"io/ioutil"
"log"
"os"
"github.com/golang/snappy"
)
func main() {
// Read and decode the decompressed file
plainhex, err := ioutil.ReadFile(os.Args[1])
if err != nil {
log.Fatalf("Failed to read decompressed file %s: %v", os.Args[1], err)
}
plain, err := hex.DecodeString(string(plainhex))
if err != nil {
log.Fatalf("Failed to decode decompressed file: %v", err)
}
// Read and decode the compressed file
comphex, err := ioutil.ReadFile(os.Args[2])
if err != nil {
log.Fatalf("Failed to read compressed file %s: %v", os.Args[2], err)
}
comp, err := hex.DecodeString(string(comphex))
if err != nil {
log.Fatalf("Failed to decode compressed file: %v", err)
}
// Make sure they match
decomp, err := snappy.Decode(nil, comp)
if err != nil {
log.Fatalf("Failed to decompress compressed file: %v", err)
}
if !bytes.Equal(plain, decomp) {
fmt.Println("Booo, decompressed file does not match provided plain text!")
return
}
fmt.Println("Yay, decompressed data matched provided plain text!")
}
```
```
$ go run main.go block.rlp block.go.snappy
Yay, decompressed data matched provided plain text!
$ go run main.go block.rlp block.py.snappy
Yay, decompressed data matched provided plain text!
```
### Python
```bash
$ pip install python-snappy
```
```py
import snappy
import sys
# Read and decode the decompressed file
with open(sys.argv[1], 'rb') as file:
plainhex = file.read()
plain = plainhex.decode("hex")
# Read and decode the compressed file
with open(sys.argv[2], 'rb') as file:
comphex = file.read()
comp = comphex.decode("hex")
# Make sure they match
decomp = snappy.uncompress(comp)
if plain != decomp:
print "Booo, decompressed file does not match provided plain text!"
else:
print "Yay, decompressed data matched provided plain text!"
```
```
$ python main.py block.rlp block.go.snappy
Yay, decompressed data matched provided plain text!
$ python main.py block.rlp block.py.snappy
Yay, decompressed data matched provided plain text!