---
layout: post
name: "Where Are We at with Waku?"
title: "Where Are We at with Waku?"
date: 2020-02-14 12:00:00 +0800
author: oskarth
published: true
permalink: /where-are-we-at-with-waku
categories: research
summary: A research log. Where are we at with Waku?
image: /assets/img/TODOFIXME.png
discuss: https://forum.vac.dev/t/TODOFIXME
---
TODO: Picture for bottleneck or Waku

TODO: Fix discuss link

TODO: Fix date

# Where are we at with Waku?

Waku is our fork of Whisper, in which we address Whisper's shortcomings in an iterative manner. We saw in a [previous post](https://vac.dev/fixing-whisper-with-waku) that Whisper doesn't scale, and why. Here's an update on where we are at with Waku since then.

## Current state

**Specs:**

We've split the spec into three components:

- Waku (main spec), currently in [version 0.3.0](https://specs.vac.dev/waku/waku.html)
- Waku envelope data field, currently in [version 0.1.0](https://specs.vac.dev/waku/envelope-data-format.html)
- Waku mailserver, currently in [version 0.2.0](https://specs.vac.dev/waku/mailserver.html)

**Clients:**

There are currently two clients that implement Waku: [Nimbus](https://github.com/status-im/nimbus/tree/master/waku) in Nim and [status-go](https://github.com/status-im/status-go) in Go.

At the time of writing, the Nimbus client implements the main protocol fully but lacks mail server/client, rate limiting, and confirmation capabilities. The status-go client implements everything except bridging mode, which is currently a work in progress.

For more details, see the [implementation matrix](https://specs.vac.dev/waku/waku.html#appendix-b-implementation-notes).

In terms of end user applications, work is currently in progress to integrate Waku into the [Status core app](https://github.com/status-im/status-react/pull/9949), and it is expected to ship in their upcoming 1.1 release.

TODO: Fact check this with Adam and Kim

**Simulation:**

We've got a [simulation](https://github.com/status-im/nimbus/tree/master/waku#testing-waku-protocol) in the Nimbus client that verifies - or rather, fails to falsify - the scalability model described in an [earlier post](https://vac.dev/fixing-whisper-with-waku). More on this below.

## How many users does Waku support?
This is our current understanding of how many users a network running the Waku protocol can support, specifically in the context of the Status chat app, since that's the most immediate consumer of Waku. It should generalize fairly well to most deployments, but YMMV.

**tl;dr (for Status app):**

- beta: 100 DAU
- v1: 1k DAU
- v1.1 (waku only): 10k DAU (up to x10 with deployment hotfixes)
- v1.2 (waku+dns): 100k DAU (can optionally be folded into v1.1)

*Assuming 10 concurrent users = 100 DAU. Estimate uncertainty increases for each order of magnitude until real-world data is observed.*
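
To make that assumption concrete, here is a small illustrative Go sketch (not part of any client) that converts the DAU targets above into the concurrent-user counts the bandwidth estimates actually depend on, using the stated 10-concurrent-per-100-DAU ratio:

```go
package main

import "fmt"

func main() {
	// Assumption from the note above: 10 concurrent users per 100 DAU,
	// i.e. a 10% concurrency ratio. This is an estimate, not measured data.
	const concurrencyRatio = 0.10

	targets := []struct {
		release string
		dau     int
	}{
		{"beta", 100},
		{"v1", 1000},
		{"v1.1 (waku only)", 10000},
		{"v1.2 (waku+dns)", 100000},
	}

	for _, t := range targets {
		concurrent := float64(t.dau) * concurrencyRatio
		fmt.Printf("%-18s %7d DAU ~= %6.0f concurrent users\n", t.release, t.dau, concurrent)
	}
}
```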
As far as we know right now, these are the bottlenecks we have:

- Immediate bottleneck - Receive bandwidth for end user clients (aka Fixing Whisper with Waku)
- Very likely bottleneck - Nodes and cluster capacity (aka DNS based node discovery)
- Conjecture, but not unlikely to appear - Full node traffic (aka the routing / partition problem)

We already discussed the first bottleneck in the initial post. Dean wrote a post on [DNS based discovery](https://vac.dev/dns-based-discovery) which explains how we will address the likely second bottleneck. More on the third one in future posts.

For more details on these bottlenecks, uncertainties, and mitigations, see [Scalability estimate: How many users can Waku and the Status app support?](https://discuss.status.im/t/scalability-estimate-how-many-users-can-waku-and-the-status-app-support/1514).

TODO: Elaborate on bottleneck 3, kad etc

## Simulation
The ultimate test is real-world usage. Until then, we have a simulation, thanks to Kim De Mey from the Nimbus team!

![](assets/img/waku_simulation.jpeg)
We have two network topologies, star and full mesh, with 6 randomly connected nodes, one traditional light node with a bloom filter (Whisper style), and one Waku light node.

One of the full nodes sends 1 envelope over 1 of the 100 topics that the two light nodes subscribe to. After that, it sends 10000 envelopes over random topics.

For the light node, the bloom filter is set to almost 10% false positives (n=100, k=3, m=512). The tables below show the number of valid and invalid envelopes received by the different nodes.
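
As a sanity check on that "almost 10%" figure, here is a minimal Go sketch (illustrative only, not taken from either client) that plugs the simulation's bloom filter parameters into the standard false-positive approximation. With roughly 8.7% false positives, about 870 of the 10000 random-topic envelopes should leak through to the Whisper-style light node, which is in the same ballpark as the counts in the tables below.

```go
package main

import (
	"fmt"
	"math"
)

// bloomFalsePositive returns the standard approximation of a Bloom filter's
// false-positive probability: (1 - e^(-k*n/m))^k, where n is the number of
// inserted items (topics), k the number of hash functions, and m the filter
// size in bits.
func bloomFalsePositive(n, k, m float64) float64 {
	return math.Pow(1-math.Exp(-k*n/m), k)
}

func main() {
	// Parameters from the simulation: 100 subscribed topics, k=3, m=512 bits.
	p := bloomFalsePositive(100, 3, 512)
	fmt.Printf("false-positive rate: %.1f%%\n", p*100)           // ~8.7%, i.e. "almost 10%"
	fmt.Printf("expected leaked envelopes: ~%.0f of 10000\n", p*10000)
}
```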
**Star network:**

| Description | Peers | Valid | Invalid |
|-----------------|-------|-------|---------|
| Master node | 7 | 10001 | 0 |
| Full node 1 | 3 | 10001 | 0 |
| Full node 2 | 1 | 10001 | 0 |
| Full node 3 | 1 | 10001 | 0 |
| Full node 4 | 1 | 10001 | 0 |
| Full node 5 | 1 | 10001 | 0 |
| Light node | 2 | 815 | 0 |
| Waku light node | 2 | 1 | 0 |

**Full mesh:**

| Description | Peers | Valid | Invalid |
|-----------------|-------|-------|---------|
| Full node 0 | 7 | 10001 | 20676 |
| Full node 1 | 7 | 10001 | 9554 |
| Full node 2 | 5 | 10001 | 23304 |
| Full node 3 | 5 | 10001 | 11983 |
| Full node 4 | 5 | 10001 | 24425 |
| Full node 5 | 5 | 10001 | 23472 |
| Light node | 2 | 803 | 803 |
| Waku light node | 2 | 1 | 1 |

Things to note:

- Whisper light node with ~10% false positive gets ~10% of total traffic
- Waku light node gets ~1000x fewer envelopes than Whisper light node
- Full mesh results in a lot more duplicate messages, except for Waku light node

Run the simulation yourself [here](https://github.com/status-im/nimbus/tree/master/waku#testing-waku-protocol). The parameters are configurable, and it is integrated with Prometheus and Grafana.

## Difference between Waku and Whisper
Summary of the main differences between the Waku v0 spec and Whisper v6, as described in [EIP-627](https://eips.ethereum.org/EIPS/eip-627):

- Handshake/Status message is not compatible with shh/6 nodes; options are specified as an association list
- topic-interest is included in the Status handshake
- Upgradability policy
- `topic-interest` packet code
- RLPx subprotocol is changed from shh/6 to waku/0
- Light node capability is added
- Optional rate limiting is added
- Status packet has the following additional parameters: light-node, confirmations-enabled and rate-limits
- Mail Server and Mail Client functionality is now part of the specification
- P2P Message packet contains a list of envelopes instead of a single envelope
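
To make the handshake change more tangible, here is a rough, hypothetical Go sketch of a status message expressed as an association list of options rather than shh/6's fixed positional fields. The key names follow the parameters listed above; the value types and wire encoding (RLP) are simplified assumptions, so consult the spec for the authoritative format.

```go
package main

import "fmt"

// StatusOption is one [key, value] pair in the waku/0 status handshake's
// association list. Unlike shh/6, peers only send the options they support,
// which is what makes the handshake easier to extend.
type StatusOption struct {
	Key   string
	Value interface{}
}

func main() {
	// Hypothetical options a Waku light node might advertise. Key names follow
	// the spec summary above; value types here are illustrative, not normative.
	status := []StatusOption{
		{"light-node", true},
		{"confirmations-enabled", true},
		{"rate-limits", []uint64{10, 10, 10}},                   // assumed shape: per-IP, per-peer, per-topic
		{"topic-interest", [][4]byte{{0xde, 0xad, 0xbe, 0xef}}}, // exact 4-byte topics instead of a bloom filter
	}

	for _, opt := range status {
		fmt.Printf("%-22s %v\n", opt.Key, opt.Value)
	}
}
```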
## Next steps and future plans
There are many remaining challenges in making Waku a robust and suitable base communication protocol. Here are a few that we aim to address:

- scalability of the network
- incentivized infrastructure and spam resistance
- building with resource-restricted devices in mind, including nodes that are mostly offline

When it comes to the third bottleneck, a likely candidate for addressing it is Kademlia routing. Stay tuned.
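
For a flavour of what Kademlia-style routing involves, here is a tiny, purely illustrative Go sketch of the XOR distance metric Kademlia uses to decide which peers are "close" to a given node or content ID; none of this is Waku code yet, and the ID size here is arbitrary.

```go
package main

import (
	"fmt"
	"math/bits"
)

// xorDistance interprets the XOR of two IDs as a big-endian integer.
// In Kademlia, smaller distance means "closer", and lookups repeatedly
// hop to peers closer to the target ID.
func xorDistance(a, b [4]byte) uint32 {
	var d uint32
	for i := 0; i < len(a); i++ {
		d = d<<8 | uint32(a[i]^b[i])
	}
	return d
}

// bucketIndex returns which k-bucket a peer falls into relative to our own
// ID: the position of the highest differing bit (-1 if the IDs are equal).
func bucketIndex(self, peer [4]byte) int {
	d := xorDistance(self, peer)
	return 31 - bits.LeadingZeros32(d)
}

func main() {
	self := [4]byte{0xde, 0xad, 0xbe, 0xef}
	peer := [4]byte{0xde, 0xad, 0x00, 0x01}
	fmt.Println("distance:", xorDistance(self, peer)) // 48878
	fmt.Println("bucket:  ", bucketIndex(self, peer)) // 15
}
```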