Merge pull request #47 from waku-org/feat--waku-sync-2

Waku store sync 2.0 specification
2026-01-05 23:53:12 +00:00 · 2025-01-22 17:03:28 -05:00 · 2025-01-22 17:03:28 -05:00 · f2df957ace
commit f2df957ace
parent 2513311b69 27aa8bc9c2
1 changed files with 167 additions and 88 deletions
--- a/standards/core/sync.md
+++ b/standards/core/sync.md
@@ -7,82 +7,190 @@ contributors:
 - Hanno Cornelius <hanno@status.im>
 ---
-## Abstract
+# Abstract
-This specification explains the `WAKU-SYNC` protocol
+This specification explains `WAKU-SYNC`
-which enables the reconciliation of two sets of message hashes
+which enables the synchronization of messages between nodes storing sets of [`14/WAKU2-MESSAGE`](https://rfc.vac.dev/waku/standards/core/14/message)
 in the context of keeping multiple Store nodes synchronized.
 Waku Sync is a wrapper around
 [Negentropy](https://github.com/hoytech/negentropy), a [range-based set reconciliation protocol](https://logperiodic.com/rbsr.html).
-## Specification
+# Specification
-**Protocol identifier**: `/vac/waku/sync/1.0.0`
+Waku Sync consists of two libp2p protocols: `reconciliation` and `transfer`.
 The Reconciliation protocol finds differences in sets of messages.
 The Transfer protocol is used to exchange the differences found with other peers.
 The end goal is for peers to hold the same set of messages.
-### Terminology
+#### Terminology
 The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, 
 “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in [RFC2119](https://www.ietf.org/rfc/rfc2119.txt).
-The term Negentropy refers to the protocol of the same name.
+## Reconciliation
 Negentropy payload refers to
 the messages created by the Negentropy protocol.
 Client always refers to the initiator
 and the server the receiver of the first payload.
-### Design Requirements
+**Libp2p Protocol identifier**: `/vac/waku/reconciliation/1.0.0`
 Nodes enabling Waku Sync SHOULD
 manage and keep message hashes in a local cache
 for the range of time
 during which synchronization is required.
 Nodes SHOULD use the same time range,
 for Waku we chose one hour as the global default.
 Waku Relay or Light Push protocol MAY be enabled
 and used in conjunction with Sync
 as a source of new message hashes
 for the cache.
-Nodes MAY use the Store protocol
+The protocol finds differences between two peers by
-to request missing messages once reconciliation is complete
+comparing _fingerprints_ of _ranges_ of message _IDs_.
-or to provide messages to requesting clients.
+_Ranges_ are encoded into payloads and exchanged between the peers; when the _fingerprints_ of a range differ, the range is split into smaller (sub)ranges.
 This process repeats until _ranges_ include a small number of messages.
 At this point lists of message _IDs_ are sent for comparison instead of _fingerprints_ over entire ranges of messages.
-### Payload
+#### Overview
 The `reconciliation` protocol proceeds as follows:
 1. The requestor chooses a time range to sync.
 2. The range is encoded into a payload and sent.
 3. The requestee receives the payload and decodes it.
 4. The range is processed and, if a difference with the local range is detected, a set of subranges is produced.
 5. The new ranges are encoded and sent.
 6. This process repeats while differences found are sent to the `transfer` protocol.
 7. The synchronization ends when all ranges have been processed and no differences are left.
 #### Message IDs
 Message _IDs_ MUST be composed of the timestamp and the hash of the [`14/WAKU2-MESSAGE`](https://rfc.vac.dev/waku/standards/core/14/message).
 The timestamp MUST be the time of creation and
 the hash MUST follow the
 [deterministic message hashing specification](https://rfc.vac.dev/waku/standards/core/14/message#deterministic-message-hashing).
 > This way the message IDs can always be totally ordered,
 first chronologically according to the timestamp and then
 disambiguated based on the hash lexical order
 in cases where the timestamp is the same.
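The total order described above can be illustrated with a minimal Python sketch. The `MessageId` type, its field names and the sample values are hypothetical and only serve the example; the specification does not prescribe a representation.

```python
from typing import NamedTuple


class MessageId(NamedTuple):
    """Hypothetical message ID: creation timestamp plus message hash."""
    timestamp: int   # time of creation (the unit is an assumption here)
    msg_hash: bytes  # deterministic message hash


ids = [
    MessageId(1002, bytes.fromhex("3560d9c4")),
    MessageId(1000, bytes.fromhex("4a8a769a")),
    MessageId(1002, bytes.fromhex("351c5e86")),
]

# Tuples compare field by field: first by timestamp,
# then by the lexical order of the hash bytes when timestamps tie.
ordered = sorted(ids)
```

Representing an ID as a tuple keeps the comparison rule implicit in the field order, which is one simple way to get the ordering the note describes.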
 #### Range Bounds
 A _range_ MUST consist of two _IDs_: the first bound is
 inclusive, the second bound is exclusive.
 The first bound MUST be strictly smaller than the second one.
 #### Range Fingerprinting
 The _fingerprint_ of a range MUST be the XOR of
 the hashes of all message _IDs_ included in that _range_.
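A minimal Python sketch of fingerprinting over a half-open range follows. It assumes that an ID is a `(timestamp, hash)` tuple, that "the hash" means the hash component of each ID, and that hashes are 32 bytes long; the helper names and the SHA-256 sample data are hypothetical.

```python
import hashlib

HASH_LEN = 32  # assumption: matches the 32-byte fingerprint on the wire


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def fingerprint(ids, lower, upper) -> bytes:
    """XOR the hashes of all IDs with lower <= id < upper."""
    acc = bytes(HASH_LEN)  # all zeros, the identity element for XOR
    for msg_id in ids:
        if lower <= msg_id < upper:
            acc = xor_bytes(acc, msg_id[1])
    return acc


def sample_id(timestamp: int, payload: bytes):
    """Build an illustrative ID; SHA-256 stands in for the real message hash."""
    return (timestamp, hashlib.sha256(payload).digest())


ids = sorted([sample_id(1000, b"a"), sample_id(1002, b"b"), sample_id(1003, b"c")])
low = (1000, bytes(HASH_LEN))
mid = (1002, bytes(HASH_LEN))
high = (1004, bytes(HASH_LEN))

whole = fingerprint(ids, low, high)
parts = xor_bytes(fingerprint(ids, low, mid), fingerprint(ids, mid, high))
```

Because XOR is associative and commutative, the fingerprint of a range equals the XOR of the fingerprints of its adjacent subranges, so `whole` and `parts` are equal here.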
 #### Range Type
 Every _range_ MUST have one of the following types: _fingerprint_, _skip_ or _item set_.
 - _Fingerprint_ type contains a _fingerprint_.
 - _Skip_ type contains nothing and is used to signal already processed _ranges_.
 - _Item set_ type contains message _IDs_ and a _resolved_ boolean.
 > _Item sets_ are an optimization: sending multiple _IDs_ instead of
 recursing further reduces the number of round-trips.
 #### Range Processing
 _Ranges_ have to be processed differently according to their types.
 - _Fingerprint_ ranges MUST be compared.
  - **Equal** ranges MUST become _skip ranges_.
  - **Unequal** ranges MUST be split into smaller _fingerprint_ or _item set_ ranges based on an implementation-specific threshold.
 - **Unresolved** _item set_ ranges MUST be compared, differences sent to the `transfer` protocol and marked resolved.
 - **Resolved** _item set_ ranges MUST be compared, differences sent to the `transfer` protocol and become skip ranges.
 - _Skip_ ranges MUST be merged with other consecutive _skip ranges_.
 When only _skip ranges_ remain, the synchronization is done.
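The handling of a received _fingerprint_ range could be sketched in Python as below. This is only one possible reading: the `Range` class, the string type tags, the splitting strategy (even split at local IDs) and the `threshold` and `partitions` parameters are all assumptions made for illustration, not part of the specification.

```python
from dataclasses import dataclass, field

HASH_LEN = 32  # assumed hash size, matching the 32-byte fingerprint


@dataclass
class Range:
    lower: tuple             # inclusive bound, a (timestamp, hash) ID
    upper: tuple             # exclusive bound
    kind: str                # "fingerprint", "skip" or "item_set"
    fingerprint: bytes = b""
    items: list = field(default_factory=list)
    resolved: bool = False


def fingerprint(ids) -> bytes:
    """XOR of the hash component of every ID."""
    acc = bytes(HASH_LEN)
    for _, msg_hash in ids:
        acc = bytes(x ^ y for x, y in zip(acc, msg_hash))
    return acc


def make_range(lower, upper, ids, threshold):
    """Small ranges become item sets, larger ones carry a fingerprint."""
    if len(ids) <= threshold:
        return Range(lower, upper, "item_set", items=list(ids))
    return Range(lower, upper, "fingerprint", fingerprint=fingerprint(ids))


def process_fingerprint_range(local_ids, remote, threshold, partitions):
    """Compare a remote fingerprint range against the sorted local IDs."""
    mine = [i for i in local_ids if remote.lower <= i < remote.upper]
    if fingerprint(mine) == remote.fingerprint:
        return [Range(remote.lower, remote.upper, "skip")]
    if len(mine) <= threshold:
        return [make_range(remote.lower, remote.upper, mine, threshold)]
    # Split at local IDs so that each subrange holds about the same count.
    step = -(-len(mine) // partitions)  # ceiling division
    cuts = [mine[k] for k in range(step, len(mine), step)]
    bounds = [remote.lower] + cuts + [remote.upper]
    return [
        make_range(lo, hi, [i for i in mine if lo <= i < hi], threshold)
        for lo, hi in zip(bounds, bounds[1:])
    ]
```

Splitting at existing local IDs keeps every new lower bound strictly smaller than its upper bound, which the range bound rules require.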
 ### Delta Encoding
 _Ranges_ and timestamps MUST be delta encoded as follows for efficient transmission.
 All _ranges_ to be transmitted MUST be ordered and only upper bounds used.
 > Inclusive lower bounds can be omitted because they are always
 the same as the exclusive upper bounds of the previous range or zero.
 To achieve this, it MAY be needed to add _skip ranges_.
 > For example, a _skip range_ can be added with
 an exclusive upper bound equal to the first range lower bound.
 This way the receiving peer knows to ignore the range from zero to the start of the sync time window.
 The timestamp of every _ID_ after the first MUST be encoded as the difference from the previous one.
 If the timestamp is the same, zero MUST be used and the hash MUST be added.
 The added hash MUST be truncated up to and including the first differentiating byte.
 | Timestamp | Hash | Timestamp (encoded) | Hash (encoded) 
 | - | - | - | -
 | 1000 | 0x4a8a769a... | 1000 | -
 | 1002 | 0x351c5e86... | 2 | -
 | 1002 | 0x3560d9c4... | 0 | 0x3560
 | 1003 | 0xbeabef25... | 1 | -
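The table can be reproduced with a short Python sketch. The function name and the `(timestamp, hash)` tuple representation are hypothetical, and the hashes are shortened to the four bytes shown in the table.

```python
def delta_encode(ids):
    """Delta encode a sorted list of (timestamp, hash) IDs.

    Returns (encoded_timestamp, hash_prefix) pairs; the prefix is empty
    unless the timestamp equals the previous one.
    """
    encoded = []
    prev_ts, prev_hash = 0, b""
    for position, (ts, msg_hash) in enumerate(ids):
        if position == 0:
            encoded.append((ts, b""))            # first timestamp sent as-is
        elif ts != prev_ts:
            encoded.append((ts - prev_ts, b""))  # difference from previous
        else:
            # Same timestamp: keep the hash up to and including
            # the first byte that differs from the previous hash.
            k = 0
            while (k < len(msg_hash) and k < len(prev_hash)
                   and msg_hash[k] == prev_hash[k]):
                k += 1
            encoded.append((0, msg_hash[:k + 1]))
        prev_ts, prev_hash = ts, msg_hash
    return encoded


ids = [
    (1000, bytes.fromhex("4a8a769a")),
    (1002, bytes.fromhex("351c5e86")),
    (1002, bytes.fromhex("3560d9c4")),
    (1003, bytes.fromhex("beabef25")),
]
rows = delta_encode(ids)
```

For the third row, the hashes `0x351c...` and `0x3560...` share the first byte and differ in the second, so the truncated hash is `0x3560`.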
 #### Varints
 All _varints_ MUST be little-endian base 128 variable length integers (LEB128) and minimally encoded.
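As an illustration, a minimal Python sketch of unsigned LEB128 encoding and decoding is shown below. The function names are hypothetical, and treating the values as unsigned is an assumption, since the text only says LEB128.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as minimal unsigned LEB128."""
    if value < 0:
        raise ValueError("only non-negative integers are supported")
    out = bytearray()
    while True:
        byte = value & 0x7F          # low 7 bits go first (little-endian)
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data: bytes, offset: int = 0):
    """Decode one varint from data; returns (value, next_offset)."""
    result = 0
    shift = 0
    pos = offset
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            break
    # A multi-byte encoding ending in 0x00 has redundant padding.
    if pos - offset > 1 and byte == 0:
        raise ValueError("varint is not minimally encoded")
    return result, pos
```

Rejecting a trailing zero byte on decode is one way to enforce the minimal encoding requirement on the receiving side.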
 #### Payload encoding
 The wire level payload MUST be encoded as follows.
 > The & denotes concatenation.
 1. _varint_ bytes of the delta encoded timestamp &
 2. if timestamp is zero, 1 byte for the hash bytes length & the hash bytes &
 3. 1 byte, the _range_ type &
 4. either
    - 32 bytes _fingerprint_ or
    - _varint_ bytes of the item set length & bytes of every item or
    - if _skip range_, do nothing
 5. repeat steps 1 to 4 for all ranges.
 ## Transfer Protocol
 **Libp2p Protocol identifier**: `/vac/waku/transfer/1.0.0`
 The transfer protocol SHOULD send messages as soon as
 a difference is found via reconciliation.
 It MUST only accept messages from peers the node is reconciling with.
 New message IDs MUST be added to the reconciliation protocol.
 The payload sent MUST follow the wire specification below.
 ### Wire specification
 ```protobuf
 syntax = "proto3";
-package waku.sync.v1;
+package waku.sync.transfer.v1;
-message SyncPayload {
+import "waku/message/v1/message.proto";
  optional bytes negentropy = 1;
-  repeated bytes hashes = 20;
+message WakuMessageAndTopic {
  // Full message content and associated pubsub_topic as value
  optional waku.message.v1.WakuMessage message = 1;
  optional string pubsub_topic = 2;
 }
 ```
-### Session Flow
+# Implementation
-A client initiates a session with a server
+The flexibility of the protocol implies that much is left to the implementers.
-by sending a `SyncPayload` with
+What will follow is NOT part of the specification.
-only the `negentropy` field set.
+This section was created to inform implementations.
 This field MUST contain
 the first negentropy payload
 created by the client
 for this session.
-The server receives a `SyncPayload`.
+#### Parameters 
-A new negentropy payload is computed from the received one.
+Two useful parameters to add to your implementation are partitioning count and the item set threshold.
 The server sends back a `SyncPayload` to the client.
-The client receives a `SyncPayload`.
+The partitioning count is the number of subranges a range is split into.
-A new negentropy payload OR an empty one is computed.
+A higher value reduces round trips at the cost of computing more fingerprints.
-If a new payload is computed then
+
-the exchanges between client and server continues until
+The item set threshold determines when item sets are sent instead of fingerprints.
-the client computes an empty payload.
+A higher value sends more items which means higher chance of duplicates but
-This client computation also outputs any hash differences found,
+reduces the amount of round trips overall.
-those MUST be stored.
+
-In the case of an empty payload,
+#### Storage
-the reconciliation is done,
+The storage implementation should reflect the context.
-the client MUST send back a `SyncPayload`
+Most messages that will be added will be recent and
-with all the missing server hashes in the `hashes` field and
+removed messages will be older ones.
-an empty `nengentropy` field.
+When differences are found some messages will have to be inserted randomly.
 It is expected to be a less likely case than time based insertion and removal.
 Last but not least it must be optimized for fingerprinting
 as it is the most often used operation.
 #### Sync Interval
 Ad-hoc syncing can be useful in some cases but continuous periodic sync
 minimizes the differences in messages stored across the network.
 Syncing early and often is the best strategy.
 The default used in Nwaku is a 5-minute interval between syncs with a range of 1 hour.
 #### Sync Window
 By default we offset the sync window by 20 seconds in the past.
 The actual start of the sync range is T-01:00:20 and the end T-00:00:20 in most cases.
 This is to handle the inherent jitters of GossipSub.
 In other words, it is the amount of time needed to confirm if a message is missing or not.
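The window arithmetic can be sketched in Python as follows. The constant and function names are hypothetical; the values are the defaults quoted in this section.

```python
from datetime import datetime, timedelta, timezone

SYNC_RANGE = timedelta(hours=1)      # default range quoted above
SYNC_OFFSET = timedelta(seconds=20)  # default offset into the past


def sync_window(now: datetime):
    """Return (start, end) of the sync window for the given time."""
    end = now - SYNC_OFFSET    # T-00:00:20
    start = end - SYNC_RANGE   # T-01:00:20
    return start, end


now = datetime(2025, 1, 22, 12, 0, 0, tzinfo=timezone.utc)
start, end = sync_window(now)
```

With `now` at 12:00:00 UTC, the window runs from 10:59:40 to 11:59:40 UTC.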
 #### Peer Choice
 Wrong peering strategies can lead to inadvertently segregating peers and
 reduce sampling diversity.
 Nwaku randomly selects peers to sync with for simplicity and robustness.
 More sophisticated strategies may be implemented in the future.
 ## Attack Vectors
 Nodes using `WAKU-SYNC` are fully trusted.
@@ -92,41 +200,12 @@ Further refinements to the protocol are planned
 to reduce the trust level required to operate.
 One notable refinement is verifying each message's RLN proof at reception.
 ## Implementation
 The following is not part of the specifications but good to know implementation details.
 ### Peer Choice
 Peering strategies can lead to inadvertently segregating peers and reduce sampling diversity.
 We randomly select peers to sync with for simplicity and robustness.
 A good strategy can be devised but we chose not to.
 ### Interval
 Ad-hoc syncing can be useful in some cases but continuous periodic sync
 minimize the differences in messages stored across the network.
 Syncing early and often is the best strategy.
 The default used in nwaku is 5 minutes interval between sync with a range of 1 hour.
 ### Range
 We also offset the sync range by 20 seconds in the past.
 The actual start of the sync range is T-01:00:20 and the end T-00:00:20
 This is to handle the inherent jitters of GossipSub.
 In other words, it is the amount of time needed to confirm if a message is missing or not.
 ### Storage
 The storage implementation should reflect the Waku context.
 Most messages that will be added will be recent and
 all removed messages will be older ones.
 When differences are found some messages will have to be inserted randomly.
 It is expected to be a less likely case than time based insertion and removal.
 Last but not least it must be optimized for sequential read
 as it is the most often used operation.
 ## Copyright
 Copyright and related rights waived via
 [CC0](https://creativecommons.org/publicdomain/zero/1.0/).
 ## References
- - https://logperiodic.com/rbsr.html
+ - [RBSR](https://github.com/AljoschaMeyer/rbsr_short/blob/main/main.pdf)
- - https://github.com/hoytech/negentropy
+ - [Negentropy Explainer](https://logperiodic.com/rbsr.html)
 - [Master Thesis](https://github.com/AljoschaMeyer/master_thesis/blob/main/main.pdf)