From db9e54a499f9316ee2904aecf2fb141e28769e30 Mon Sep 17 00:00:00 2001
From: Oskar Thoren
Date: Thu, 28 Nov 2019 13:19:56 +0800
Subject: [PATCH] fresh attempt

---
 _drafts/fixing-whisper-with-waku.md | 201 ++++++++++++++++++++++++++++
 _drafts/fixing-whisper.md           |  37 +----
 2 files changed, 207 insertions(+), 31 deletions(-)
 create mode 100644 _drafts/fixing-whisper-with-waku.md

diff --git a/_drafts/fixing-whisper-with-waku.md b/_drafts/fixing-whisper-with-waku.md
new file mode 100644
index 0000000..e192ae9
--- /dev/null
+++ b/_drafts/fixing-whisper-with-waku.md
@@ -0,0 +1,201 @@
+---
+layout: post
+name: "Waku - Fixing Whisper"
+title: "Waku - Fixing Whisper"
+date: 2019-11-28 12:00:00 +0800
+author: oskarth
+published: true
+permalink: /fixing-whisper-with-waku
+categories: research
+summary: A research log. Why Whisper can't scale and how to fix it.
+image: /assets/img/whisper_scalability.png
+---
+
+This post introduces Waku, a fork of Whisper that addresses some of its
+shortcomings in an iterative way. It also presents a theoretical scaling model
+for Status.
+
+- Description of Whisper and a recap of its issues (gossip, 'darkness', PoW, incentives, spec, etc.)
+- Introduce the model
+- Motivation for a new protocol
+- Progress so far
+
+## Whisper theoretical model
+
+This is a theoretical model of Whisper that attempts to encode its main
+characteristics, specifically for a use case such as the one Status has (see
+[Status Whisper usage spec](https://github.com/status-im/specs/blob/master/status-whisper-usage-spec.md)).
+
+### Goals
+1. Ensure the network scales by being user- or usage-bound, as opposed to bandwidth growing in proportion to network size.
+2. Stay within a reasonable bandwidth limit for limited data plans.
+3. Do the above without materially impacting existing nodes.
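+
+To make the numbers in the cases below concrete, here is a minimal sketch of
+the kind of calculation behind them. It is illustrative only, not the actual
+model code (that is linked at the end of this section), and the names are made
+up; it simply multiplies out the A1-A3 assumptions of 1024-byte envelopes, 10
+envelopes per message and 100 received messages per user per day.
+
+```python
+# Illustrative back-of-the-envelope version of the receive-bandwidth model.
+ENVELOPE_SIZE = 1024        # bytes per envelope (A1)
+ENVELOPES_PER_MESSAGE = 10  # envelopes per message (A2)
+MESSAGES_PER_DAY = 100      # received messages per user per day (A3)
+
+def received_bytes_per_day(users_heard_from):
+    """Bytes received per day by a node that sees the traffic of `users_heard_from` users."""
+    return users_heard_from * MESSAGES_PER_DAY * ENVELOPES_PER_MESSAGE * ENVELOPE_SIZE
+
+# Case 1: only messages meant for you, independent of network size.
+print(received_bytes_per_day(1) / 1024)  # ~1000 KB/day
+
+# Case 2: messages for everyone, grows linearly with network size.
+for users in (100, 10_000, 1_000_000):
+    print(users, received_bytes_per_day(users) / 1024 ** 3)  # ~0.1, ~9.5, ~953.7 GB/day
+```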
+
+```
+Case 1. Only receiving messages meant for you [naive case]
+
+Assumptions:
+- A1. Envelope size (static): 1024 bytes
+- A2. Envelopes / message (static): 10
+- A3. Received messages / day (static): 100
+- A4. Only receiving messages meant for you.
+
+For 100 users, receiving bandwidth is 1000.0KB/day
+For 10k users, receiving bandwidth is 1000.0KB/day
+For 1m users, receiving bandwidth is 1000.0KB/day
+
+------------------------------------------------------------
+
+Case 2. Receiving messages for everyone [naive case]
+
+Assumptions:
+- A1. Envelope size (static): 1024 bytes
+- A2. Envelopes / message (static): 10
+- A3. Received messages / day (static): 100
+- A5. Received messages for everyone.
+
+For 100 users, receiving bandwidth is 97.7MB/day
+For 10k users, receiving bandwidth is 9.5GB/day
+For 1m users, receiving bandwidth is 953.7GB/day
+
+------------------------------------------------------------
+
+Case 3. All private messages go over one discovery topic
+
+Assumptions:
+- A1. Envelope size (static): 1024 bytes
+- A2. Envelopes / message (static): 10
+- A3. Received messages / day (static): 100
+- A6. Proportion of private messages (static): 0.5
+- A7. Public messages only received by relevant recipients (static).
+- A8. All private messages are received by everyone (same topic) (static).
+
+For 100 users, receiving bandwidth is 49.3MB/day
+For 10k users, receiving bandwidth is 4.8GB/day
+For 1m users, receiving bandwidth is 476.8GB/day
+
+------------------------------------------------------------
+
+Case 4. All private messages are partitioned into shards [naive case]
+
+Assumptions:
+- A1. Envelope size (static): 1024 bytes
+- A2. Envelopes / message (static): 10
+- A3. Received messages / day (static): 100
+- A6. Proportion of private messages (static): 0.5
+- A7. Public messages only received by relevant recipients (static).
+- A9. Private messages partitioned across partition shards (static), n=5000
+
+For 100 users, receiving bandwidth is 1000.0KB/day
+For 10k users, receiving bandwidth is 1.5MB/day
+For 1m users, receiving bandwidth is 98.1MB/day
+
+------------------------------------------------------------
+
+Case 5. 4 + Bloom filter with false positive rate
+
+Assumptions:
+- A1. Envelope size (static): 1024 bytes
+- A2. Envelopes / message (static): 10
+- A3. Received messages / day (static): 100
+- A6. Proportion of private messages (static): 0.5
+- A7. Public messages only received by relevant recipients (static).
+- A9. Private messages partitioned across partition shards (static), n=5000
+- A10. Bloom filter size (m) (static): 512
+- A11. Bloom filter hash functions (k) (static): 3
+- A12. Bloom filter elements, i.e. topics, (n) (static): 100
+- A13. Bloom filter assuming optimal k choice (sensitive to m, n).
+- A14. Bloom filter false positive proportion of full traffic, p=0.1
+
+For 100 users, receiving bandwidth is 10.7MB/day
+For 10k users, receiving bandwidth is 978.0MB/day
+For 1m users, receiving bandwidth is 95.5GB/day
+
+NOTE: Traffic is extremely sensitive to bloom false positives.
+This completely dominates network traffic at scale.
+With p=1% we get ~100MB/day for 10k users and ~10GB/day for 1m users.
+
+------------------------------------------------------------
+
+Case 6. Case 5 + Benign duplicate receives
+
+Assumptions:
+- A1. Envelope size (static): 1024 bytes
+- A2. Envelopes / message (static): 10
+- A3. Received messages / day (static): 100
+- A6. Proportion of private messages (static): 0.5
+- A7. Public messages only received by relevant recipients (static).
+- A9. Private messages partitioned across partition shards (static), n=5000
+- A10. Bloom filter size (m) (static): 512
+- A11. Bloom filter hash functions (k) (static): 3
+- A12. Bloom filter elements, i.e. topics, (n) (static): 100
+- A13. Bloom filter assuming optimal k choice (sensitive to m, n).
+- A14. Bloom filter false positive proportion of full traffic, p=0.1
+- A15. Benign duplicate receives factor (static): 2
+- A16. No bad envelopes, bad PoW, expired, etc. (static).
+
+For 100 users, receiving bandwidth is 21.5MB/day
+For 10k users, receiving bandwidth is 1.9GB/day
+For 1m users, receiving bandwidth is 190.9GB/day
+
+------------------------------------------------------------
+
+Case 7. 6 + Mailserver under good conditions; small bloom fp; mostly offline
+
+Assumptions:
+- A1. Envelope size (static): 1024 bytes
+- A2. Envelopes / message (static): 10
+- A3. Received messages / day (static): 100
+- A6. Proportion of private messages (static): 0.5
+- A7. Public messages only received by relevant recipients (static).
+- A9. Private messages partitioned across partition shards (static), n=5000
+- A10. Bloom filter size (m) (static): 512
+- A11. Bloom filter hash functions (k) (static): 3
+- A12. Bloom filter elements, i.e. topics, (n) (static): 100
+- A13. Bloom filter assuming optimal k choice (sensitive to m, n).
+- A14. Bloom filter false positive proportion of full traffic, p=0.1
+- A15. Benign duplicate receives factor (static): 2
+- A16. No bad envelopes, bad PoW, expired, etc. (static).
+- A17. User is offline p% of the time (static), p=0.9
+- A18. No bad requests or duplicate messages for mailservers; overlap perfect (static).
+- A19. Mailserver requests can change false positive rate to be p=0.01
+
+For 100 users, receiving bandwidth is 3.9MB/day
+For 10k users, receiving bandwidth is 284.8MB/day
+For 1m users, receiving bandwidth is 27.8GB/day
+
+------------------------------------------------------------
+
+Case 8. No metadata protection w bloom filter; 1 node connected; static shard
+
+Next step up is to either only use the contact code, or to shard more aggressively.
+Note that this requires a change in other nodes' behavior, not just the local node.
+
+Assumptions:
+- A1. Envelope size (static): 1024 bytes
+- A2. Envelopes / message (static): 10
+- A3. Received messages / day (static): 100
+- A6. Proportion of private messages (static): 0.5
+- A7. Public messages only received by relevant recipients (static).
+- A9. Private messages partitioned across partition shards (static), n=5000
+
+For 100 users, receiving bandwidth is 1000.0KB/day
+For 10k users, receiving bandwidth is 1.5MB/day
+For 1m users, receiving bandwidth is 98.1MB/day
+
+------------------------------------------------------------
+```
+
+See [source](https://github.com/vacp2p/research/tree/master/whisper_scalability)
+for more detail on the model and its assumptions.
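+
+As a quick sanity check on A10-A14 (again illustrative rather than the actual
+model code), the false positive proportion used above follows from the standard
+bloom filter approximation for a filter of m bits with k hash functions and n
+inserted topics:
+
+```python
+# Standard bloom filter false positive approximation: p = (1 - e^(-k*n/m))^k,
+# evaluated with the A10-A12 values m = 512 bits, k = 3 hashes, n = 100 topics.
+import math
+
+def bloom_false_positive_rate(m, k, n):
+    return (1 - math.exp(-k * n / m)) ** k
+
+print(bloom_false_positive_rate(m=512, k=3, n=100))  # ~0.09, roughly the p = 0.1 used above
+```
+
+Because each false positive pulls in a share of the full network traffic, this
+term dominates bandwidth at scale, which is what the note under Case 5 points
+out.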
+
+### Takeaways
+
+The results are summed up in the following graph. Notice the log-log scale. The
+colored backgrounds correspond to the following bandwidth usage:
+
+- Blue: <10mb/d (<~300mb/month)
+- Green: <30mb/d (<~1gb/month)
+- Yellow: <100mb/d (<~3gb/month)
+- Red: >100mb/d (>3gb/month)
+
+![](/assets/img/whisper_scalability.png)
+
+## Progress so far
+
+...
diff --git a/_drafts/fixing-whisper.md b/_drafts/fixing-whisper.md
index 89c8299..4896f5d 100644
--- a/_drafts/fixing-whisper.md
+++ b/_drafts/fixing-whisper.md
@@ -7,44 +7,19 @@ author: oskarth
 published: true
 permalink: /fixing-whisper
 categories: research
-summary: A research log.
+summary: A research log. Why Whisper can't scale and how to fix it.
 image: /assets/img/whisper_scalability.png
 ---
 
 **tldr: Whisper currently can’t scale. This post shows why, and how to fix it.**
 
-## Background
-
-TODO: Too Statu specific
-
-We have very few users for the Status app. Despite this, we have issues with bandwidth usage. One of the most common complaints I hear about Status, and the reason core contributors often don’t use it at events for group coordination, is that it consumes too much bandwidth. People often have a limited data plan, and especially at big events we’ve seen community members have their whole data plan drained just by using Status.
-
-For more precise user reports and some rough numbers, see e.g.:
-
- https://github.com/status-im/status-react/issues/9081 2
- https://github.com/status-im/status-react/issues/9185 2
-
-We have made some improvements in this regard, both in the past and for the v1 release. Most recently by moving to a partitioned topic as opposed to a single discovery topic. There have also been improvements to mailserver performance 1.
-
-Still, this isn’t enough. At a fundamental level, the confidence that Whisper will scale to any reasonable level is very low, and for good reasons. However, this is more of a rough intuition, and we haven’t done any real studies on this or how to fix it. Right now it’s more like a pebble in our shoe that we keep walking around with, hoping it’ll go away.
-
-There are a few reasons we haven’t made progress on making Whisper more scalable:
-
-1) **Lack of adoption.** Few users means the problem haven’t hit us in any serious way, and the “scalability” issues we’ve solved have mostly been relevant for ~100-1k users. The issues we have seen have not been taken seriously enough, because people don’t depend on Status to function.
-
-2) **Church of Darkness.** One of our core principles is privacy, and this, coupled with lack of rigorous understanding of the protocols we use and their properties, have lead us to put an irrationally high premium on the metadata protection capabilities that Whisper provides.
-
-3) **Timeline expectations.** There are more longer-term plans for replacing Whisper. This is the work that is happening with Vac 1 and together with entities like Block.Science, Swarm and Nym. This means we’ve historically not seen fixing Whisper ourselves as a big priority in the short to medium term.
-
-## Going foward
-
-With v1 of the app soon being out of the door (amazing job everyone!), we are going to start pushing for more adoption. For people to use Status, we need reasonable performance, on par with alternative solutions.
-
-### On metadata protection and a reality check
-
-Considering the financial constraints, we need to push for traction and make Status a joy to use sooner rather than later. This means we can’t have people burn up their data plan and uninstall the app. Later on, we can enhance it with more rigorous guarantees around things like metadata protection, for example through mixnets such as the one Nym is working on.
-
-As an end user, most people care more about being able to use the thing at all than theoretical (and somewhat unrigorous) metadata protection guarantees. Additionally, the proposed solutions will still enable hardcore users to get stronger receiver-anonymity guarantees if they so wish.
+Very few people use Whisper. One of its major consumers, Status, has major issues with bandwidth.
+
+While general confidence that Whisper will scale is low, the reasons for this aren't quite clear.
+
+As an end user, most people care more about being able to use the thing at all than theoretical (and somewhat unrigorous) metadata protection guarantees. Additionally, the proposed solutions will still enable hardcore users to get stronger receiver-anonymity guarantees if they so wish.
 
 It is also worth pointing out that, unlike apps like Signal, we don’t tie users to their identity by a phone number or email address. This is already huge when it comes to privacy. Other apps like Briar also outsource the metadata protection to running on Tor. Now, this comes with issues regarding spam resistance, but that’s a topic for another time.