From 564225920c8073b78bf85e29578413da40d2e715 Mon Sep 17 00:00:00 2001
From: Jazz Turner-Baggs <473256+jazzz@users.noreply.github.com>
Date: Tue, 21 Oct 2025 21:40:28 -0700
Subject: [PATCH 01/20] WIP

---
 standards/application/privatev1.md | 268 +++++++++++++++++++++++++++++
 1 file changed, 268 insertions(+)
 create mode 100644 standards/application/privatev1.md

diff --git a/standards/application/privatev1.md b/standards/application/privatev1.md
new file mode 100644
index 0000000..06c50c5
--- /dev/null
+++ b/standards/application/privatev1.md
@@ -0,0 +1,268 @@
+---
+title: PRIVATE1
+name: Private conversation
+category: Standards Track
+tags: an optional list of tags, not standard
+editor: Jazz Alyxzander (@Jazzz)
+contributors:
+---
+
+# Abstract
+
+
+# Background
+
+Pairwise encrypted messaging channels are a foundational component in building chat systems. They allow for confidential, authenticated payloads to be delivered between two clients. Groupchats an more conversation based communication often rely on pairwise channels (at least partially) to deliver state updates and coordination messages.
+
+Having robust pairwise communication channels allow for 1:1 communication while also providing the infrastructure for more complicated communication.
+
+# Private V1
+
+PrivateV1 is conversation type which establishes a full-duplex secure channel between two participants.
+
+Private Conversations have the following properties:
+ - Payload Confidentiality: Only the participants can read the contents of any message sent.
+ - Content Integrity: Recipients can detect if the contents were modified by a third party.
+ - Sender Privacy: Only the recipient can determine who the sender was.
+ - Forward Secrecy: A compromise in the future does not allow previous messages to be decrypted by a third party.
+ - Post Compromise Security: Conversations eventually recover from a compromise which occurs today.
+ - Message Reliablity: Messages sent with this protocol are
+ - Partial Message Order: !TODO:
+
+## Definitions
+
+This document makes use of the shared terminology defined in the [CHAT-DEFINITIONS](https://github.com/waku-org/specs/blob/jazzz/chatdefs/informational/chatdefs.md) specification.
+
+The terms include:
+- Recipient
+- Sender
+- Payload
+- Content
+- Participant
+- Application
+
+
+## Architecture
+
+This conversation type assumes there is some service or application which wishes to generate and receive encrypted content. It also assumes that some other component will be responsible for delivering the generated payloads. How messages are sent are to be determined by implementors.
+
+```mermaid
+flowchart LR
+ Content:::plain--> Privatev1 --> Payload:::plain
+ classDef plain fill:none,stroke:transparent;
+```
+
+### Application
+Responsible for the creation and generation of content.
+
+### Delivery Service
+This protocols assumes there is a abstract delivery service which is responsible for routing payloads to their destination. See: [TODO](link.to.spec).
+
+## Initialization
+
+Prior to operation participants MUST agree on the following parameters fopr each conversation.
+- `rda` - delivery address (recipient)
+- `sda` - delivery address (sender)
+- `sk` - initial secret key [32 bytes]
+
+To maintain the security properties `sk`:
+- MUST be known only by the participants.
+- MUST be mutually authenticated.
+
+Additionally implementations MUST determine the following constants:
+- `max_seg_size` - maximum segmentation size.
+- `max_skip` - number of keys which can be skipped per session.
+
+## Operation
+
+There are 3 phases to operation.
+
+
+```mermaid
+flowchart TD
+ C("Content"):::plain
+ S(Segmentation)
+ R(Reliability)
+ E(Encryption)
+ D(Delivery):::plain
+ C --> S --> R --> E --> D
+
+ classDef plain fill:none,stroke:transparent;
+```
+
+- **Segmentation**: Divides contents into smaller fragments for transportation.
+- **Reliability**: Adds tracking information to detect dropped messages.
+- **Encryption**:
+
+The output of each phase of the operational pipeline is the input of the next.
+
+### Segmentation
+This protocol places no restriction on the size of the content to be delivered. In order to support restrictions of any delivery service messages are segmented to a predefined size.
+
+The segmentation strategy used is defined by [!TODO: Flatten link once completed](https://github.com/waku-org/specs/pull/91)
+
+!TODO: ^Spec currently has a limit of
+
+### Message Reliability
+Scalable Data Sync is used to detect missing messages and provide delivery receipts to the sender after successful reception of a payload.
+SDS is implementated according to the [specification](https://github.com/vacp2p/rfc-index/blob/3505da6bd66d2830e5711deb0b5c2b4de9212a4d/vac/raw/sds.md).
+
+
+!TODO: define: sender_id mapping
+!TODO: define: message_id mapping
+!TODO: update to latest version and inlcude SDS-R
+
+!NOTE: The defaultConfig in nim-SDS creates a bloom filter with the parameters n=10000, p=0.001 which has a size of ~18KiB. The bloom filter is included in every message which results in a bestcase overhead rate of 13.3% (assuming waku's MSS of 150KB). Given a target content size of 4KB, that puts the utilization factor at 80+% (Without considering other layers). This needs to be looked at, lowering to n=2000 would lower overhead to ~3.5 KiB.
+
+### Encryption
+
+Payloads are encrypted using [doubleratchet](https://signal.org/docs/specifications/doubleratchet/).
+
+With the following choices for external fucntions:
+- `DH`: X25519
+- `KDF_RF`: HKDF with SHA256, info = "logoschat_privatev1"
+- `KDF_CK`: HKDF with SHA256, input = "0x01 for message key, and "0x02" for chainkey
+- `KDF_MK`: HKDF with SHA256, hdkf.info = "PrivateV1MessageKey"
+- `ENCRYPT`: Implemented with AEAD_CHACHA20_POLY1305
+
+!TODO: Define AssociatedData
+
+AEAD_CHACHA20_POLY1305 is implemented using randomly generated nonces. The nonce and tag are combined with the ciphertext for transport where `ciphertext = nonce || encrypted_bytes || tag`.
+
+
+# Wire Format Specification / Syntax
+
+## Payload Parse Tree
+
+A deterministic parse tree is used to avoid ambiguity when recieving payloads.
+
+```mermaid
+flowchart TD
+
+ D[DoubleRatchet]
+ S[SDS Message]
+ Seg1[ Segment]
+ Seg2[ Segment]
+ Seg3[ Segment]
+ P[PrivateV1Frame]
+
+ start@{ shape: start }
+ start --> D
+ D -->|Payload| S
+ S -->|Payload| Seg1
+
+ Seg1 --> P
+ Seg2:::plain --> P
+ Seg3:::plain --> P
+
+ P --> T{frame_type}
+ T --> ContentFrame
+ T --> Placeholder
+
+ classDef plain fill:none,stroke:transparent;
+```
+!TODO: Replace placeholder
+
+
+## Payloads
+!TODO: Don't duplicate payload definitions from other specs. Though its helpful for now.
+
+### Encrypted Payload
+```protobuf
+message Doubleratchet {
+ bytes dh = 1; // 32 byte publickey
+ uint32 msgNum = 2;
+ uint32 prevChainLen = 3;
+ bytes ciphertext = 4; // arbitrary length bytes
+}
+```
+**dh**: the x component of the dh_pair.publickey encoded as raw bytes.
+**ciphertext**: A protobuf encoded SDS Message
+
+### SDS Message
+
+This payload is used without modification from the SDS Spec.
+
+```protobuf
+message HistoryEntry {
+ string message_id = 1;
+ bytes retrieval_hint = 2;
+ }
+
+message ReliablePayload {
+ string message_id = 2;
+ string channel_id = 3;
+ int32 lamport_timestamp = 10;
+ repeated HistoryEntry causal_history = 11;
+ bytes bloom_filter = 12;
+ bytes content = 20;
+ }
+```
+
+**content:** This field is an protobuf encoded `Segment`
+
+!TODO: Why is SDS using signed int for timestamps?
+
+### Segmentation
+
+This payload is used without modification form the Segmentation [specification](https://github.com/waku-org/specs/blob/fa2993b427f12796356a232c54be75814fac5d98/standards/application/segmentation.md)
+
+```proto
+
+message SegmentMessageProto {
+ bytes entire_message_hash = 1; // 32 Bytes
+ uint32 index = 2;
+ uint32 segments_count = 3;
+ bytes payload = 4;
+ uint32 parity_segment_index = 5;
+ uint32 parity_segments_count = 6;
+}
+
+```
+
+**payload**: This field is an protobuf encoded `PrivateV1Frame`
+
+### Frame
+
+```protobuf
+message PrivateV1Frame {
+ uint64 timestamp = 3; // Sender reported timestamp
+ oneof frame_type {
+ common_frames.ContentFrame content = 10;
+ Placeholder placeholder = 11;
+ // ....
+ }
+}
+```
+
+
+
+## Implementation Suggestions
+An *implementation suggestions* section may provide suggestions on how to approach implementation details,
+as well as more context an implementer may need to be aware off when proceeding with the implementation.
+
+if available, point to existing implementations for reference.
+
+
+## (Further Optional Sections)
+
+
+## Security/Privacy Considerations
+
+### Segmentation Session Binding
+
+
+
+
+### Privacy - ContentSize
+
+
+
+
+## Copyright
+
+Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).
+
+## References
+
+A list of references.
\ No newline at end of file

From 56f7272a6ce672a99bd4d222e4bb91ec8783a9e7 Mon Sep 17 00:00:00 2001
From: Jazz Turner-Baggs <473256+jazzz@users.noreply.github.com>
Date: Wed, 22 Oct 2025 10:35:23 -0700
Subject: [PATCH 02/20] Cleanup of Arch, Initialization and Segmentation

---
 standards/application/privatev1.md | 29 ++++++++++++++++-------------
 1 file changed, 16 insertions(+), 13 deletions(-)

diff --git a/standards/application/privatev1.md b/standards/application/privatev1.md
index 06c50c5..f4feb3d 100644
--- a/standards/application/privatev1.md
+++ b/standards/application/privatev1.md
@@ -34,17 +34,18 @@ Private Conversations have the following properties:
 This document makes use of the shared terminology defined in the [CHAT-DEFINITIONS](https://github.com/waku-org/specs/blob/jazzz/chatdefs/informational/chatdefs.md) specification.
 
 The terms include:
-- Recipient
-- Sender
-- Payload
+- Application
 - Content
 - Participant
-- Application
+- Payload
+- Recipient
+- Sender
 
 
 ## Architecture
 
-This conversation type assumes there is some service or application which wishes to generate and receive encrypted content. It also assumes that some other component will be responsible for delivering the generated payloads. How messages are sent are to be determined by implementors.
+This conversation type assumes there is some service or application which wishes to generate and receive encrypted content.
+It also assumes that some other component will be responsible for delivering the generated payloads.
 
 ```mermaid
 flowchart LR
@@ -52,16 +53,16 @@ flowchart LR
  classDef plain fill:none,stroke:transparent;
 ```
 
-### Application
-Responsible for the creation and generation of content.
+### Payload Delivery
+How payloads are sent and received by clients is not described in this protocol.
+The choice of delivery method has no impact on the security of this conversation type, though the choice may affect sender privacy and censorship resistance.
+In practice, any best-effort method of transmitting payloads will suffice, as no assumptions are made.
 
-### Delivery Service
-This protocols assumes there is a abstract delivery service which is responsible for routing payloads to their destination. See: [TODO](link.to.spec).
 
 ## Initialization
 
-Prior to operation participants MUST agree on the following parameters fopr each conversation.
-- `rda` - delivery address (recipient)
+Prior to operation participants MUST agree on the following parameters for each conversation.
+- `rda` - delivery address (recipient) !TODO: Can delivery addresses be removed from this spec?
 - `sda` - delivery address (sender)
 - `sk` - initial secret key [32 bytes]
 
@@ -92,12 +93,14 @@ flowchart TD
 
 - **Segmentation**: Divides contents into smaller fragments for transportation.
 - **Reliability**: Adds tracking information to detect dropped messages.
-- **Encryption**:
+- **Encryption**: Provides confidentiality and tamper resistence.
 
 The output of each phase of the operational pipeline is the input of the next.
 
 ### Segmentation
-This protocol places no restriction on the size of the content to be delivered. In order to support restrictions of any delivery service messages are segmented to a predefined size.
+Thought the protocol has no limitation, it is assumed that a delivery mechanism MAY have restrictions on the max message size.
+While this is a transport level issue, it's included here because defering segementation has negative impacts on bandwidth efficiency and privacy. Forcing the transport layer to handle segmentation would require either reassembling unauthenticated segments which are open to malicious interference or implementing encryption at the transport layer.
+In the event of a dropped payload, segmentation after reliability would require clients to re-broadcast entire frames, rather than only the missing segments. Increasing load on the network, and increasing a DOS attack surface. To optimize the entire pipeline, segmentation is handled first, so that segments can benefit from the reliability and robust encryption already in place.
 
 The segmentation strategy used is defined by [!TODO: Flatten link once completed](https://github.com/waku-org/specs/pull/91)
 

From 1578ba29b2016ab85d9c733940b8b79cf5a1db2f Mon Sep 17 00:00:00 2001
From: Jazz Turner-Baggs <473256+jazzz@users.noreply.github.com>
Date: Fri, 24 Oct 2025 13:19:17 -0700
Subject: [PATCH 03/20] frame handling + initialization

---
 standards/application/privatev1.md | 109 +++++++++++++++++++----------
 1 file changed, 72 insertions(+), 37 deletions(-)

diff --git a/standards/application/privatev1.md b/standards/application/privatev1.md
index f4feb3d..ad48c27 100644
--- a/standards/application/privatev1.md
+++ b/standards/application/privatev1.md
@@ -12,7 +12,7 @@ contributors:
 
 # Background
 
-Pairwise encrypted messaging channels are a foundational component in building chat systems. They allow for confidential, authenticated payloads to be delivered between two clients. Groupchats an more conversation based communication often rely on pairwise channels (at least partially) to deliver state updates and coordination messages.
+Pairwise encrypted messaging channels are a foundational component in building chat systems. They allow for confidential, authenticated payloads to be delivered between two clients. Groupchats and channel based communication often rely on pairwise channels (at least partially) to deliver state updates and coordination messages.
 
 Having robust pairwise communication channels allow for 1:1 communication while also providing the infrastructure for more complicated communication.
 
@@ -26,7 +26,7 @@ Private Conversations have the following properties:
  - Sender Privacy: Only the recipient can determine who the sender was.
  - Forward Secrecy: A compromise in the future does not allow previous messages to be decrypted by a third party.
  - Post Compromise Security: Conversations eventually recover from a compromise which occurs today.
- - Message Reliablity: Messages sent with this protocol are
+ - Message Reliability: Messages sent with this protocol are
  - Partial Message Order: !TODO:
 
 ## Definitions
@@ -49,7 +49,7 @@ It also assumes that some other component will be responsible for delivering the
 
 ```mermaid
 flowchart LR
- Content:::plain--> Privatev1 --> Payload:::plain
+ ContentFrame:::plain--> Privatev1 --> Payload:::plain
  classDef plain fill:none,stroke:transparent;
 ```
 
@@ -58,23 +58,32 @@ How payloads are sent and received by clients is not described in this protocol.
 The choice of delivery method has no impact on the security of this conversation type, though the choice may affect sender privacy and censorship resistance.
 In practice, any best-effort method of transmitting payloads will suffice, as no assumptions are made.
 
+### Content
+This protocol expects that all content be wrapped in a ContentFrame as per [CONTENTFRAME](https://github.com/waku-org/specs/blob/jazzz/content_frame/standards/application/contentframe.md) specification.
+
+This increases observability when issues arise due to client versions mismatches. By enforcing that only ContentFrames will be passed to applications, this creates a clear boundary between Content and protocol owned meta messages.
 
 ## Initialization
 
-Prior to operation participants MUST agree on the following parameters for each conversation.
+The channel is initialized by both sender and recipient agreeing on the following values for each conversation:
+- `sk` - initial secret key [32 bytes]
+- `ssk` - sender DH seed key
+- `rsk` - recipient DH seed key
 - `rda` - delivery address (recipient) !TODO: Can delivery addresses be removed from this spec?
 - `sda` - delivery address (sender)
-- `sk` - initial secret key [32 bytes]
 
-To maintain the security properties `sk`:
-- MUST be known only by the participants.
-- MUST be mutually authenticated.
+
+To maintain the security properties:
+- `sk` MUST be known only by the participants.
+- `sk` MUST be derived in a way that ensures mutual authentication of the participants
+- `sk` SHOULD have forward secrecy by incorporating ephemeral key material
+- `rsk` and `ssk` SHOULD incorporate ephemeral key material
 
 Additionally implementations MUST determine the following constants:
-- `max_seg_size` - maximum segmentation size.
-- `max_skip` - number of keys which can be skipped per session.
+- `max_seg_size` - maximum segmentation size to be used.
+- `max_skip` - number of keys which can be skipped per session. Values are determined by
 
-## Operation
+## Frame Encoding
 
 There are 3 phases to operation.
 
@@ -92,12 +102,14 @@ flowchart TD
 
 - **Segmentation**: Divides contents into smaller fragments for transportation.
 - **Reliability**: Adds tracking information to detect dropped messages.
-- **Encryption**: Provides confidentiality and tamper resistence.
+- **Encryption**: Provides confidentiality and tamper resistance.
 
 The output of each phase of the operational pipeline is the input of the next.
 
 ### Segmentation
 Thought the protocol has no limitation, it is assumed that a delivery mechanism MAY have restrictions on the max message size.
-While this is a transport level issue, it's included here because defering segementation has negative impacts on bandwidth efficiency and privacy. Forcing the transport layer to handle segmentation would require either reassembling unauthenticated segments which are open to malicious interference or implementing encryption at the transport layer.
-In the event of a dropped payload, segmentation after reliability would require clients to re-broadcast entire frames, rather than only the missing segments. Increasing load on the network, and increasing a DOS attack surface. To optimize the entire pipeline, segmentation is handled first, so that segments can benefit from the reliability and robust encryption already in place.
+While this is a transport level issue, it's included here because deferring segmentation has negative impacts on bandwidth efficiency and privacy.
+Forcing the transport layer to handle segmentation would require either reassembling unauthenticated segments (which are open to malicious interference) or implementing encryption at the transport layer.
+In the event of a dropped payload, segmentation after reliability would require clients to re-broadcast entire frames, rather than only the missing segments.
+This unnecessarily increases load on the network/clients and increases a DOS attack surface.
+To optimize the entire pipeline, segmentation is handled first, so that segments can benefit from the reliability and robust encryption already in place.
 
 The segmentation strategy used is defined by [!TODO: Flatten link once completed](https://github.com/waku-org/specs/pull/91)
 
+Implementation specifics:
+- Error correction is not used, as reliable delivery is already provided by lower layers.
+- `segmentSize` = `max_seg_size`
 !TODO: ^Spec currently has a limit of
 
 ### Message Reliability
 Scalable Data Sync is used to detect missing messages and provide delivery receipts to the sender after successful reception of a payload.
-SDS is implementated according to the [specification](https://github.com/vacp2p/rfc-index/blob/3505da6bd66d2830e5711deb0b5c2b4de9212a4d/vac/raw/sds.md).
-
+SDS is implemented according to the [specification](https://github.com/vacp2p/rfc-index/blob/3505da6bd66d2830e5711deb0b5c2b4de9212a4d/vac/raw/sds.md).
 
 !TODO: define: sender_id mapping
 !TODO: define: message_id mapping
-!TODO: update to latest version and inlcude SDS-R
+!TODO: update to latest version and include SDS-R
 
-!NOTE: The defaultConfig in nim-SDS creates a bloom filter with the parameters n=10000, p=0.001 which has a size of ~18KiB. The bloom filter is included in every message which results in a bestcase overhead rate of 13.3% (assuming waku's MSS of 150KB). Given a target content size of 4KB, that puts the utilization factor at 80+% (Without considering other layers). This needs to be looked at, lowering to n=2000 would lower overhead to ~3.5 KiB.
+!NOTE: The defaultConfig in nim-SDS creates a bloom filter with the parameters n=10000, p=0.001 which has a size of ~18KiB. The bloom filter is included in every message which results in a best-case overhead rate of 13.3% (assuming waku's MSS of 150KB). Given a target content size of 4KB, that puts the utilization factor at 80+% (Without considering other layers). This needs to be looked at, lowering to n=2000 would lower overhead to ~3.5 KiB.
 
 ### Encryption
 
-Payloads are encrypted using [doubleratchet](https://signal.org/docs/specifications/doubleratchet/).
+Payloads are encrypted using [`doubleratchet`](https://signal.org/docs/specifications/doubleratchet/).
 
-With the following choices for external fucntions:
+With the following choices for external functions:
 - `DH`: X25519
-- `KDF_RF`: HKDF with SHA256, info = "logoschat_privatev1"
-- `KDF_CK`: HKDF with SHA256, input = "0x01 for message key, and "0x02" for chainkey
+- `KDF_RF`: HKDF with SHA256, info = `logoschat_privatev1`
+- `KDF_CK`: HKDF with SHA256, input = "0x01 for message_key, and "0x02" for chain_key
 - `KDF_MK`: HKDF with SHA256, hdkf.info = "PrivateV1MessageKey"
 - `ENCRYPT`: Implemented with AEAD_CHACHA20_POLY1305
 
@@ -133,30 +147,39 @@ With the following choices for external fucntions:
 AEAD_CHACHA20_POLY1305 is implemented using randomly generated nonces. The nonce and tag are combined with the ciphertext for transport where `ciphertext = nonce || encrypted_bytes || tag`.
 
 
+## Frame Handling
+
+This protocol uses explicit tagging of content, to remove ambiguity when parsing/handling frames.
+This allows for clear distinction between content and frames providing protocol functionality. Even if new frames are added in the future, Clients can be certain whether the payload is intended for itself or applications. This is achieved through an invariant - All non-content frames are intended to be consumed by the client. When a new unknown frame arrives it can be certain that a version compatibility issue has occurred.
+
+- Clients SHALL only pass content frames to Applications
+- Clients MAY drop unrecognized frames
+
+
 # Wire Format Specification / Syntax
 
 ## Payload Parse Tree
 
-A deterministic parse tree is used to avoid ambiguity when recieving payloads.
+A deterministic parse tree is used to avoid ambiguity when receiving payloads.
 
 ```mermaid
 flowchart TD
 
  D[DoubleRatchet]
  S[SDS Message]
- Seg1[ Segment]
- Seg2[ Segment]
- Seg3[ Segment]
+ Segment1[ Segment]
+ Segment2[ Segment]
+ Segment3[ Segment]
  P[PrivateV1Frame]
 
  start@{ shape: start }
  start --> D
  D -->|Payload| S
- S -->|Payload| Seg1
+ S -->|Payload| Segment1
 
- Seg1 --> P
- Seg2:::plain --> P
- Seg3:::plain --> P
+ Segment1 --> P
+ Segment2:::plain --> P
+ Segment3:::plain --> P
 
  P --> T{frame_type}
  T --> ContentFrame
@@ -208,9 +231,9 @@ message ReliablePayload {
 
 ### Segmentation
 
-This payload is used without modification form the Segmentation [specification](https://github.com/waku-org/specs/blob/fa2993b427f12796356a232c54be75814fac5d98/standards/application/segmentation.md)
+This payload is used without modification from the Segmentation [specification](https://github.com/waku-org/specs/blob/fa2993b427f12796356a232c54be75814fac5d98/standards/application/segmentation.md)
 
-```proto
+```protobuf
 
 message SegmentMessageProto {
  bytes entire_message_hash = 1; // 32 Bytes
@@ -225,6 +248,8 @@ message SegmentMessageProto {
 
 **payload**: This field is an protobuf encoded `PrivateV1Frame`
 
+!TODO: This should be encoded as a FrameType so it can be optional.
+
 ### Frame
 
 ```protobuf
@@ -238,20 +263,30 @@ message PrivateV1Frame {
 }
 ```
 
+!TODO: This is the only place where this protocol is explicitly dependent on ContentFrame.
+A concept of a content tagging is required, but the exact structure could be a implementation detail. How best to abstract the exact type away?
+
 
 
 ## Implementation Suggestions
-An *implementation suggestions* section may provide suggestions on how to approach implementation details,
-as well as more context an implementer may need to be aware off when proceeding with the implementation.
 
-if available, point to existing implementations for reference.
+### Initialization
+Mutual authentication is provided by the `sk`, so there is no requirement of using authenticated keys for `ssk` and `rsk`. Implementations SHOULD use the most ephemeral key available in order incorporate as much key material as possible. This means that senders SHOULD generate a new ephemeral key for `ssk` for every conversation assuming channels are asynchronously initialized.
 
 
-## (Further Optional Sections)
+
+### Excessive Skipped Message
+
+Handling of skipped message keys is not strictly defined in double ratchet. Implementations need to choose an strategy which works best for their environment, and delivery mechanism. Halting operation of the channel is the safest, as it bounds resource utilization in the event of a DOS attack but is not always possible.
+
+If eventual delivery of messages is not guaranteed, implementors should regularly delete keys that are older than a given time window. Unreliable delivery mechanisms will result in increased key storage over time, as more messages are lost with no hope of delivery.
+
+!TODO: Worth making deletion of stale keys part of the spec?
 
 
 ## Security/Privacy Considerations
 
+### Sender Derivation
+
+
 ### Segmentation Session Binding
 
 
@@ -268,4 +303,4 @@ Copyright and related rights waived via [CC0](https://creativecommons.org/public
 
 ## References
 
-A list of references.
\ No newline at end of file
+A list of references. use SHA256 or SAH256.
\ No newline at end of file

From 0d9a2b2b575b599e86e641f1e949c6eb0f99661f Mon Sep 17 00:00:00 2001
From: Jazz Turner-Baggs <473256+jazzz@users.noreply.github.com>
Date: Mon, 27 Oct 2025 09:17:44 -0700
Subject: [PATCH 04/20] towards semBr

---
 standards/application/privatev1.md | 34 +++++++++++++++++++++---------
 1 file changed, 24 insertions(+), 10 deletions(-)

diff --git a/standards/application/privatev1.md b/standards/application/privatev1.md
index ad48c27..655cdd5 100644
--- a/standards/application/privatev1.md
+++ b/standards/application/privatev1.md
@@ -2,7 +2,7 @@
 title: PRIVATE1
 name: Private conversation
 category: Standards Track
-tags: an optional list of tags, not standard
+tags:
 editor: Jazz Alyxzander (@Jazzz)
 contributors:
 ---
@@ -12,7 +12,9 @@ contributors:
 
 # Background
 
-Pairwise encrypted messaging channels are a foundational component in building chat systems. They allow for confidential, authenticated payloads to be delivered between two clients. Groupchats and channel based communication often rely on pairwise channels (at least partially) to deliver state updates and coordination messages.
+Pairwise encrypted messaging channels are a foundational component in building chat systems.
+They allow for confidential, authenticated payloads to be delivered between two clients.
+Groupchats and channel based communication often rely on pairwise channels (at least partially) to deliver state updates and coordination messages.
 
 Having robust pairwise communication channels allow for 1:1 communication while also providing the infrastructure for more complicated communication.
 
@@ -61,7 +63,8 @@ In practice, any best-effort method of transmitting payloads will suffice, as no
 ### Content
 This protocol expects that all content be wrapped in a ContentFrame as per [CONTENTFRAME](https://github.com/waku-org/specs/blob/jazzz/content_frame/standards/application/contentframe.md) specification.
 
-This increases observability when issues arise due to client versions mismatches. By enforcing that only ContentFrames will be passed to applications, this creates a clear boundary between Content and protocol owned meta messages.
+This increases observability when issues arise due to client versions mismatches.
+By enforcing that only ContentFrames will be passed to applications, this creates a clear boundary between Content and protocol owned meta messages.
 
 ## Initialization
 
@@ -112,7 +115,7 @@ While this is a transport level issue, it's included here because deferring segm
 Forcing the transport layer to handle segmentation would require either reassembling unauthenticated segments (which are open to malicious interference) or implementing encryption at the transport layer.
 In the event of a dropped payload, segmentation after reliability would require clients to re-broadcast entire frames, rather than only the missing segments.
 This unnecessarily increases load on the network/clients and increases a DOS attack surface.
-To optimize the entire pipeline, segmentation is handled first, so that segments can benefit from the reliability and robust encryption already in place.
+To optimize the entire pipeline, segmentation is handled first so that segments can benefit from the reliability and robust encryption already in place.
 
 The segmentation strategy used is defined by [!TODO: Flatten link once completed](https://github.com/waku-org/specs/pull/91)
 
@@ -129,7 +132,9 @@ SDS is implemented according to the [specification](https://github.com/vacp2p/rf
 !TODO: define: message_id mapping
 !TODO: update to latest version and include SDS-R
 
-!NOTE: The defaultConfig in nim-SDS creates a bloom filter with the parameters n=10000, p=0.001 which has a size of ~18KiB. The bloom filter is included in every message which results in a best-case overhead rate of 13.3% (assuming waku's MSS of 150KB). Given a target content size of 4KB, that puts the utilization factor at 80+% (Without considering other layers). This needs to be looked at, lowering to n=2000 would lower overhead to ~3.5 KiB.
+!NOTE: The defaultConfig in nim-SDS creates a bloom filter with the parameters n=10000, p=0.001 which has a size of ~18KiB. The bloom filter is included in every message which results in a best-case overhead rate of 13.3% (assuming waku's MSS of 150KB).
+Given a target content size of 4KB, that puts the utilization factor at 80+% (Without considering other layers).
+This needs to be looked at, lowering to n=2000 would lower overhead to ~3.5 KiB.
 
 ### Encryption
 
@@ -144,13 +149,17 @@ With the following choices for external functions:
 
 !TODO: Define AssociatedData
 
-AEAD_CHACHA20_POLY1305 is implemented using randomly generated nonces. The nonce and tag are combined with the ciphertext for transport where `ciphertext = nonce || encrypted_bytes || tag`.
+AEAD_CHACHA20_POLY1305 is implemented using randomly generated nonces.
+The nonce and tag are combined with the ciphertext for transport where `ciphertext = nonce || encrypted_bytes || tag`.
 
 
 ## Frame Handling
 
 This protocol uses explicit tagging of content, to remove ambiguity when parsing/handling frames.
-This allows for clear distinction between content and frames providing protocol functionality. Even if new frames are added in the future, Clients can be certain whether the payload is intended for itself or applications. This is achieved through an invariant - All non-content frames are intended to be consumed by the client. When a new unknown frame arrives it can be certain that a version compatibility issue has occurred.
+This allows for clear distinction between content and frames providing protocol functionality.
+Even if new frames are added in the future, Clients can be certain whether the payload is intended for itself or applications.
+This is achieved through an invariant - All non-content frames are intended to be consumed by the client. +When a new unknown frame arrives it can be certain that a version compatibility issue has occurred. - Clients SHALL only pass content frames to Applications - Clients MAY drop unrecognized frames @@ -272,13 +281,18 @@ A concept of a content tagging is required, but the exact structure could be a i ### Initialization -Mutual authentication is provided by the `sk`, so there is no requirement of using authenticated keys for `ssk` and `rsk`. Implementations SHOULD use the most ephemeral key available in order incorporate as much key material as possible. This means that senders SHOULD generate a new ephemeral key for `ssk` for every conversation assuming channels are asynchronously initialized. +Mutual authentication is provided by the `sk`, so there is no requirement of using authenticated keys for `ssk` and `rsk`. +Implementations SHOULD use the most ephemeral key available in order incorporate as much key material as possible. +This means that senders SHOULD generate a new ephemeral key for `ssk` for every conversation assuming channels are asynchronously initialized. ### Excessive Skipped Message -Handling of skipped message keys is not strictly defined in double ratchet. Implementations need to choose an strategy which works best for their environment, and delivery mechanism. Halting operation of the channel is the safest, as it bounds resource utilization in the event of a DOS attack but is not always possible. +Handling of skipped message keys is not strictly defined in double ratchet. +Implementations need to choose an strategy which works best for their environment, and delivery mechanism. +Halting operation of the channel is the safest, as it bounds resource utilization in the event of a DOS attack but is not always possible. 
-If eventual delivery of messages is not guaranteed, implementors should regularly delete keys that are older than a given time window. Unreliable delivery mechanisms will result in increased key storage over time, as more messages are lost with no hope of delivery. +If eventual delivery of messages is not guaranteed, implementors should regularly delete keys that are older than a given time window. +Unreliable delivery mechanisms will result in increased key storage over time, as more messages are lost with no hope of delivery. !TODO: Worth making deletion of stale keys part of the spec? From cb429897e0e5022475e6820b8accf87c152db9a6 Mon Sep 17 00:00:00 2001 From: Jazz Turner-Baggs <473256+jazzz@users.noreply.github.com> Date: Mon, 27 Oct 2025 11:45:53 -0700 Subject: [PATCH 05/20] Abstract away contentFrame --- standards/application/privatev1.md | 42 ++++++++++++++++++------------ 1 file changed, 26 insertions(+), 16 deletions(-) diff --git a/standards/application/privatev1.md b/standards/application/privatev1.md index 655cdd5..918b511 100644 --- a/standards/application/privatev1.md +++ b/standards/application/privatev1.md @@ -46,25 +46,31 @@ The terms include: ## Architecture -This conversation type assumes there is some service or application which wishes to generate and receive encrypted content. -It also assumes that some other component will be responsible for delivering the generated payloads. +This conversation type assumes there is some service or application which wishes to generate and receive end-to-end encrypted content. +It also assumes that some other component will be responsible for delivering the generated payloads. At its core this protocol takes the content provided and creates a series of payloads to be sent to the recipient. 
```mermaid flowchart LR - ContentFrame:::plain--> Privatev1 --> Payload:::plain + Content:::plain--> Privatev1 --> Payload:::plain classDef plain fill:none,stroke:transparent; ``` +### Content + +Content is provided to the protocol as encoded bytes. +Due to segmentation limitations there is a restriction on the maximum size of content. +This value is variable and is dependent upon which delivery service is used. +In practice content size MUST be less that `255` * `max_seg_size` see: [initialization](#initialization) + +Other than its size, the protocol is agnostic of content. + + ### Payload Delivery + How payloads are sent and received by clients is not described in this protocol. The choice of delivery method has no impact on the security of this conversation type, though the choice may affect sender privacy and censorship resistance. In practice, any best-effort method of transmitting payloads will suffice, as no assumptions are made. -### Content -This protocol expects that all content be wrapped in a ContentFrame as per [CONTENTFRAME](https://github.com/waku-org/specs/blob/jazzz/content_frame/standards/application/contentframe.md) specification. - -This increases observability when issues arise due to client versions mismatches. -By enforcing that only ContentFrames will be passed to applications, this creates a clear boundary between Content and protocol owned meta messages. ## Initialization @@ -103,7 +109,7 @@ flowchart TD classDef plain fill:none,stroke:transparent; ``` -- **Segmentation**: Divides contents into smaller fragments for transportation. +- **Segmentation**: Divides content into smaller fragments for transportation. - **Reliability**: Adds tracking information to detect dropped messages. - **Encryption**: Provides confidentiality and tamper resistance. 
@@ -191,7 +197,7 @@ flowchart TD Segment3:::plain --> P P --> T{frame_type} - T --> ContentFrame + T --content--> Bytes T --> Placeholder classDef plain fill:none,stroke:transparent; @@ -263,22 +269,25 @@ message SegmentMessageProto { ```protobuf message PrivateV1Frame { - uint64 timestamp = 3; // Sender reported timestamp + uint64 timestamp = 1; // Sender reported timestamp oneof frame_type { - common_frames.ContentFrame content = 10; + bytes content = 10; Placeholder placeholder = 11; // .... } } ``` -!TODO: This is the only place where this protocol is explicitly dependent on ContentFrame. -A concept of a content tagging is required, but the exact structure could be a implementation detail. How best to abstract the exact type away? - +**content:** is encoded as bytes in order to allow implementations to define the type at runtime. ## Implementation Suggestions +### Content Types + +Implementors need to be mindful of maintaining interoperability between clients, when deciding how content is encoded prior to transmission. +In a decentralized context, clients cannot be assumed to be using the same version let alone application. It is recommended that implementors use a self-describing content payload such as [CONTENTFRAME](https://github.com/waku-org/specs/blob/jazzz/content_frame/standards/application/contentframe.md) specification. This provides the ability for clients to determine support for incoming frames, regardless of the software used to receive them. + ### Initialization Mutual authentication is provided by the `sk`, so there is no requirement of using authenticated keys for `ssk` and `rsk`. @@ -317,4 +326,5 @@ Copyright and related rights waived via [CC0](https://creativecommons.org/public ## References -A list of references. use SHA256 or SAH256. \ No newline at end of file +A list of references. use SHA256 or SAH256. 
+ From 403bb7bc5fa4ce33ed5b77ea694e088844e95f92 Mon Sep 17 00:00:00 2001 From: Jazz Turner-Baggs <473256+jazzz@users.noreply.github.com> Date: Mon, 27 Oct 2025 12:11:07 -0700 Subject: [PATCH 06/20] update frame handing --- standards/application/privatev1.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/standards/application/privatev1.md b/standards/application/privatev1.md index 918b511..8f749a6 100644 --- a/standards/application/privatev1.md +++ b/standards/application/privatev1.md @@ -64,7 +64,6 @@ In practice content size MUST be less that `255` * `max_seg_size` see: [initiali Other than its size, the protocol is agnostic of content. - ### Payload Delivery How payloads are sent and received by clients is not described in this protocol. @@ -162,12 +161,13 @@ The nonce and tag are combined with the ciphertext for transport where `cipherte ## Frame Handling This protocol uses explicit tagging of content, to remove ambiguity when parsing/handling frames. -This allows for clear distinction between content and frames providing protocol functionality. +This creates a clear distinction between frames generated by the protocol, and content which was passed in. Even if new frames are added in the future, Clients can be certain whether the payload is intended for itself or applications. This is achieved through an invariant - All non-content frames are intended to be consumed by the client. When a new unknown frame arrives it can be certain that a version compatibility issue has occurred. -- Clients SHALL only pass content frames to Applications +- All application level content MUST use the `content` frameType. 
+- Clients SHALL only pass `content` tagged frames to Applications - Clients MAY drop unrecognized frames From 3c52c13ab99b92f8e31cf03bcf9e78fd3d36990d Mon Sep 17 00:00:00 2001 From: Jazz Turner-Baggs <473256+jazzz@users.noreply.github.com> Date: Mon, 27 Oct 2025 12:14:15 -0700 Subject: [PATCH 07/20] Remove delivery address --- standards/application/privatev1.md | 3 --- 1 file changed, 3 deletions(-) diff --git a/standards/application/privatev1.md b/standards/application/privatev1.md index 8f749a6..5b143fa 100644 --- a/standards/application/privatev1.md +++ b/standards/application/privatev1.md @@ -77,9 +77,6 @@ The channel is initialized by both sender and recipient agreeing on the followin - `sk` - initial secret key [32 bytes] - `ssk` - sender DH seed key - `rsk` - recipient DH seed key -- `rda` - delivery address (recipient) !TODO: Can delivery addresses be removed from this spec? -- `sda` - delivery address (sender) - To maintain the security properties: - `sk` MUST be known only by the participants. From 8ff0caae760073d8213853c585c6a32adb92be6a Mon Sep 17 00:00:00 2001 From: Jazz Turner-Baggs <473256+jazzz@users.noreply.github.com> Date: Mon, 27 Oct 2025 12:32:13 -0700 Subject: [PATCH 08/20] remove partial order property --- standards/application/privatev1.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/standards/application/privatev1.md b/standards/application/privatev1.md index 5b143fa..3c6b4cd 100644 --- a/standards/application/privatev1.md +++ b/standards/application/privatev1.md @@ -28,8 +28,7 @@ Private Conversations have the following properties: - Sender Privacy: Only the recipient can determine who the sender was. - Forward Secrecy: A compromise in the future does not allow previous messages to be decrypted by a third party. - Post Compromise Security: Conversations eventually recover from a compromise which occurs today. 
- - Message Reliability: Messages sent with this protocol are - - Partial Message Order: !TODO: + - Dropped Message Observability: Messages which were lost in transit are eventually visible to both sender and recipient. ## Definitions From 9d11d4add53af69a82d2a167f96e13ca25509463 Mon Sep 17 00:00:00 2001 From: Jazz Turner-Baggs <473256+jazzz@users.noreply.github.com> Date: Mon, 27 Oct 2025 14:12:08 -0700 Subject: [PATCH 09/20] Update background --- standards/application/privatev1.md | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/standards/application/privatev1.md b/standards/application/privatev1.md index 3c6b4cd..c84f233 100644 --- a/standards/application/privatev1.md +++ b/standards/application/privatev1.md @@ -9,14 +9,19 @@ contributors: # Abstract +This specification defines PRIVATE1, a conversation protocol for establishing secure, full-duplex encrypted communication channels between two participants. PRIVATE1 provides end-to-end encryption with forward secrecy and post-compromise security using the DoubleRatchet algorithm, combined with reliable message delivery via Scalable Data Sync (SDS) and efficient segmentation for transport-constrained environments. + +The protocol is transport-agnostic and designed to support both direct messaging and as a foundation for group communication systems. PRIVATE1 ensures payload confidentiality, content integrity, sender privacy, and message reliability while remaining resilient to network disruptions and message reordering. # Background -Pairwise encrypted messaging channels are a foundational component in building chat systems. -They allow for confidential, authenticated payloads to be delivered between two clients. -Groupchats and channel based communication often rely on pairwise channels (at least partially) to deliver state updates and coordination messages. 
+Pairwise encrypted messaging channels represent the foundational building block upon which modern secure communication systems are constructed. While end-to-end encrypted group chats and public channels dominate user-facing features and capture the majority of user attention, the underlying infrastructure enabling these complex communication patterns relies fundamentally on secure one-to-one communication primitives. Just as higher-level network protocols are built upon reliable transport primitives like TCP, sophisticated group communication systems depend on robust pairwise channels to function correctly and securely. + +These channels serve purposes beyond simple content delivery. They transmit not only user-visible messages but also critical metadata, coordination signals, and state synchronization information between clients. This signaling capability makes pairwise channels essential infrastructure for distributed systems: key material distribution, membership updates, administrative actions, and protocol coordination all flow through these channels. While more sophisticated group communication strategies can achieve better efficiency at scale—particularly for broadcast-style communication patterns with many participants—they struggle to match the privacy and security properties that pairwise channels provide inherently. The fundamental asymmetry of two-party communication enables stronger guarantees: minimal metadata exposure, simpler key management, clearer authentication boundaries, and more straightforward security analysis. + +However, being encrypted is merely the starting point, not the complete solution. Production-quality one-to-one channels must function reliably in the messy reality of modern networks. Real-world deployment demands resilience to unreliable networks where messages may be lost, delayed, duplicated, or arrive out of order. 
Channels must efficiently handle arbitrarily large payloads—from short text messages to multi-megabyte file transfers—while respecting the maximum transmission unit constraints imposed by various transport layers. Perhaps most critically, the protocol must remain fully operational even when one or more participants are offline or intermittently connected, a common scenario in mobile environments where users move between network conditions, battery limitations force background restrictions, or time zone differences mean participants are rarely simultaneously active. These practical requirements shape the protocol design as significantly as cryptographic considerations, demanding careful attention to segmentation strategies, reliability mechanisms, state management, and resource constraints alongside the core security properties. + -Having robust pairwise communication channels allow for 1:1 communication while also providing the infrastructure for more complicated communication. # Private V1 @@ -139,7 +144,7 @@ This needs to be looked at, lowering to n=2000 would lower overhead to ~3.5 KiB. ### Encryption -Payloads are encrypted using [`doubleratchet`](https://signal.org/docs/specifications/doubleratchet/). +Payloads are encrypted using the [doubleratchet](https://signal.org/docs/specifications/doubleratchet/) protocol. 
With the following choices for external functions: - `DH`: X25519 From 49355afdb08aa8e45ab0cc1ced4b8d44ae4a8c4c Mon Sep 17 00:00:00 2001 From: Jazz Turner-Baggs <473256+jazzz@users.noreply.github.com> Date: Mon, 27 Oct 2025 15:04:46 -0700 Subject: [PATCH 10/20] readability updates --- standards/application/privatev1.md | 47 +++++++++++++++++------------- 1 file changed, 26 insertions(+), 21 deletions(-) diff --git a/standards/application/privatev1.md b/standards/application/privatev1.md index c84f233..f1c1309 100644 --- a/standards/application/privatev1.md +++ b/standards/application/privatev1.md @@ -15,7 +15,7 @@ The protocol is transport-agnostic and designed to support both direct messaging # Background -Pairwise encrypted messaging channels represent the foundational building block upon which modern secure communication systems are constructed. While end-to-end encrypted group chats and public channels dominate user-facing features and capture the majority of user attention, the underlying infrastructure enabling these complex communication patterns relies fundamentally on secure one-to-one communication primitives. Just as higher-level network protocols are built upon reliable transport primitives like TCP, sophisticated group communication systems depend on robust pairwise channels to function correctly and securely. +Pairwise encrypted messaging channels represent the foundational building block of modern secure communication systems. While end-to-end encrypted group chats capture user attention, the underlying infrastructure that makes these systems possible relies (at least somewhat) on secure one-to-one communication primitives. Just as higher-level network protocols are built upon reliable transport primitives like TCP, sophisticated communication systems depend on robust pairwise channels to function correctly and securely. These channels serve purposes beyond simple content delivery. 
They transmit not only user-visible messages but also critical metadata, coordination signals, and state synchronization information between clients. This signaling capability makes pairwise channels essential infrastructure for distributed systems: key material distribution, membership updates, administrative actions, and protocol coordination all flow through these channels. While more sophisticated group communication strategies can achieve better efficiency at scale—particularly for broadcast-style communication patterns with many participants—they struggle to match the privacy and security properties that pairwise channels provide inherently. The fundamental asymmetry of two-party communication enables stronger guarantees: minimal metadata exposure, simpler key management, clearer authentication boundaries, and more straightforward security analysis. @@ -25,15 +25,14 @@ However, being encrypted is merely the starting point, not the complete solution # Private V1 -PrivateV1 is conversation type which establishes a full-duplex secure channel between two participants. +PrivateV1 is a conversation type specification that establishes a full-duplex secure communication channel between two participants. It combines the Double Ratchet algorithm for encryption with Scalable Data Sync (SDS) for reliable delivery and an efficient segmentation strategy to handle transport constraints. -Private Conversations have the following properties: - - Payload Confidentiality: Only the participants can read the contents of any message sent. - - Content Integrity: Recipients can detect if the contents were modified by a third party. - - Sender Privacy: Only the recipient can determine who the sender was. - - Forward Secrecy: A compromise in the future does not allow previous messages to be decrypted by a third party. - - Post Compromise Security: Conversations eventually recover from a compromise which occurs today. 
- - Dropped Message Observability: Messages which were lost in transit are eventually visible to both sender and recipient. +- **Payload Confidentiality**: Only the two participants can read the contents of any message sent. Observers, transport providers, and other third parties cannot decrypt message contents. +- **Content Integrity**: Recipients can detect if message contents were modified by a third party. Any tampering with encrypted payloads will cause decryption to fail, preventing corrupted messages from being accepted as authentic. +- **Sender Privacy**: Only the recipient can determine who the sender was. Observers cannot identify the sender from encrypted payloads, though both participants can authenticate each other's messages. +- **Forward Secrecy**: A compromise in the future does not allow previous messages to be decrypted by a third party. Message keys are deleted immediately after use and cannot be reconstructed from current state, even if long-term keys are later compromised. +- **Post-Compromise Security**: Conversations eventually recover from a key compromise. After an attacker loses access to a device, the security properties are eventually restored. +- **Dropped Message Observability**: Messages lost in transit are eventually observable to both sender and recipient. ## Definitions @@ -48,6 +47,7 @@ The terms include: - Sender + ## Architecture This conversation type assumes there is some service or application which wishes to generate and receive end-to-end encrypted content. @@ -58,22 +58,28 @@ flowchart LR Content:::plain--> Privatev1 --> Payload:::plain classDef plain fill:none,stroke:transparent; ``` +### Content -### Content +Applications provide content as encoded bytes, which is then packaged into payloads for transmission -Content is provided to the protocol as encoded bytes. -Due to segmentation limitations there is a restriction on the maximum size of content. 
-This value is variable and is dependent upon which delivery service is used. -In practice content size MUST be less that `255` * `max_seg_size` see: [initialization](#initialization) +**Size Limit** -Other than its size, the protocol is agnostic of content. +Content MUST be smaller than `255 * max_seg_size` +due to segmentation protocol limitations. ### Payload Delivery +How payloads are sent and received by clients is deliberately not specified by this protocol. +Transport choice is an implementation decision that should be made based on deployment requirements. -How payloads are sent and received by clients is not described in this protocol. -The choice of delivery method has no impact on the security of this conversation type, though the choice may affect sender privacy and censorship resistance. -In practice, any best-effort method of transmitting payloads will suffice, as no assumptions are made. +The choice of transport mechanism has no impact on PRIVATE1's security properties. +Confidentiality, integrity, and forward secrecy are provided regardless of how payloads are delivered. +However, transport choice may affect other properties. +**Sender Privacy:** +Implementations may leak sensitive metadata. + +**Reliability** +While PRIVATE1 handles message losses, more reliable transports reduce retransmission overhead. ## Initialization @@ -92,7 +98,7 @@ Additionally implementations MUST determine the following constants: - `max_seg_size` - maximum segmentation size to be used. - `max_skip` - number of keys which can be skipped per session. Values are determined by -## Frame Encoding +## Protocol Operation There are 3 phases to operation. @@ -128,7 +134,6 @@ The segmentation strategy used is defined by [!TODO: Flatten link once completed Implementation specifics: - Error correction is not used, as reliable delivery is already provided by lower layers. 
- `segmentSize` = `max_seg_size` -!TODO: ^Spec currently has a limit of ### Message Reliability Scalable Data Sync is used to detect missing messages and provide delivery receipts to the sender after successful reception of a payload. @@ -150,7 +155,7 @@ With the following choices for external functions: - `DH`: X25519 - `KDF_RF`: HKDF with SHA256, info = `logoschat_privatev1` - `KDF_CK`: HKDF with SHA256, input = "0x01 for message_key, and "0x02" for chain_key -- `KDF_MK`: HKDF with SHA256, hdkf.info = "PrivateV1MessageKey" +- `KDF_MK`: HKDF with SHA256, hkdf.info = "PrivateV1MessageKey" - `ENCRYPT`: Implemented with AEAD_CHACHA20_POLY1305 !TODO: Define AssociatedData From b44c3c7dbe16187619ae778fe4a12329d9c8a284 Mon Sep 17 00:00:00 2001 From: Jazz Turner-Baggs <473256+jazzz@users.noreply.github.com> Date: Mon, 27 Oct 2025 16:04:55 -0700 Subject: [PATCH 11/20] Update encryption and segmentation --- standards/application/privatev1.md | 92 ++++++++++++++++++++++-------- 1 file changed, 69 insertions(+), 23 deletions(-) diff --git a/standards/application/privatev1.md b/standards/application/privatev1.md index f1c1309..9815fe3 100644 --- a/standards/application/privatev1.md +++ b/standards/application/privatev1.md @@ -60,13 +60,17 @@ flowchart LR ``` ### Content -Applications provide content as encoded bytes, which is then packaged into payloads for transmission +Applications provide content as encoded bytes, which is then packaged into payloads for transmission. **Size Limit** Content MUST be smaller than `255 * max_seg_size` due to segmentation protocol limitations. +**Agnostic** + +The protocol treats the contents as a arbitrary sequence of bytes and is agnostic to its contents. + ### Payload Delivery How payloads are sent and received by clients is deliberately not specified by this protocol. Transport choice is an implementation decision that should be made based on deployment requirements. 
@@ -100,8 +104,8 @@ Additionally implementations MUST determine the following constants: ## Protocol Operation -There are 3 phases to operation. - +PRIVATE1 processes messages through a three-stage pipeline where each stage's output becomes the next stage's input. +The specific ordering of these stages is critical for maintaining security properties while enabling efficient operation. ```mermaid flowchart TD @@ -114,20 +118,31 @@ flowchart TD classDef plain fill:none,stroke:transparent; ``` +**Pipeline Stages:** +- **Segmentation**: Divides content into transport-appropriate fragments +- **Reliability (SDS)**: Adds tracking metadata for delivery detection and ordering +- **Encryption (Double Ratchet)**: Provides confidentiality, authentication, and forward secrecy -- **Segmentation**: Divides content into smaller fragments for transportation. -- **Reliability**: Adds tracking information to detect dropped messages. -- **Encryption**: Provides confidentiality and tamper resistance. -The output of each phase of the operational pipeline is the input of the next. ### Segmentation -Thought the protocol has no limitation, it is assumed that a delivery mechanism MAY have restrictions on the max message size. -While this is a transport level issue, it's included here because deferring segmentation has negative impacts on bandwidth efficiency and privacy. -Forcing the transport layer to handle segmentation would require either reassembling unauthenticated segments (which are open to malicious interference) or implementing encryption at the transport layer. -In the event of a dropped payload, segmentation after reliability would require clients to re-broadcast entire frames, rather than only the missing segments. -This unnecessarily increases load on the network/clients and increases a DOS attack surface. -To optimize the entire pipeline, segmentation is handled first so that segments can benefit from the reliability and robust encryption already in place. 
+ +While PRIVATE1 itself has no inherent message size limitation, practical transport mechanisms typically impose maximum payload sizes. +Segmentation is intentionally placed as the first pipeline stage rather than deferring it to the transport layer + +**Why Segment Before Encryption** + +Segmenting after encryption would force the transport layer to handle fragmentation of ciphertext blobs, creating several problems. +- Transport-layer segmentation would require buffering all segments before any can be authenticated, increasing the DOS attack surface. +- Unauthenticated segment reassembly opens the door to malicious segment injection and substitution attacks. +- Unencrypted segmentation metadata reveals size and other metadata about the content in transit. + +**Why Segment Before Reliability** + +Placing segmentation after reliability tracking would mean retransmission of a dropped payload requires re-broadcasting the entire frame. +By segmenting first, the reliability layer can track individual segments and request retransmission of only the missing fragments. + +**Implementation** The segmentation strategy used is defined by [!TODO: Flatten link once completed](https://github.com/waku-org/specs/pull/91) @@ -149,20 +164,51 @@ This needs to be looked at, lowering to n=2000 would lower overhead to ~3.5 KiB. ### Encryption -Payloads are encrypted using the [doubleratchet](https://signal.org/docs/specifications/doubleratchet/) protocol. 
+Payloads are encrypted using the [Double Ratchet](https://signal.org/docs/specifications/doubleratchet/) algorithm with the following cryptographic primitive choices: -With the following choices for external functions: -- `DH`: X25519 -- `KDF_RF`: HKDF with SHA256, info = `logoschat_privatev1` -- `KDF_CK`: HKDF with SHA256, input = "0x01 for message_key, and "0x02" for chain_key -- `KDF_MK`: HKDF with SHA256, hkdf.info = "PrivateV1MessageKey" -- `ENCRYPT`: Implemented with AEAD_CHACHA20_POLY1305 +**Double Ratchet Configuration** -!TODO: Define AssociatedData +- `DH`: X25519 for Diffie-Hellman operations +- `KDF_RK`: HKDF with SHA256, `info = "PrivateV1RootKey"` +- `KDF_CK`: HKDF with SHA256, using `input`=`0x01` for message keys and `input`=`0x02` for chain keys +- `KDF_MK`: HKDF with SHA256, `info = "PrivateV1MessageKey"` +- `ENCRYPT`: AEAD_CHACHA20_POLY1305 -AEAD_CHACHA20_POLY1305 is implemented using randomly generated nonces. -The nonce and tag are combined with the ciphertext for transport where `ciphertext = nonce || encrypted_bytes || tag`. +**AEAD Implementation** +ChaCha20-Poly1305 is used with randomly generated 96-bit (12-byte) nonces. +The nonce MUST be generated using a cryptographically secure random number generator for each message. +The complete ciphertext format for transport is: +``` +encrypted_payload = nonce || ciphertext || tag +``` + +Where `nonce` is 12 bytes, `ciphertext` is variable length, and `tag` is 16 bytes. + +## Frame Handling + +This protocol uses explicit frame type tagging to remove ambiguity when parsing and handling frames. +This creates a clear distinction between protocol-generated frames and application content. + +**Type Discrimination** + +All frames carry an explicit type field that identifies their purpose. +The `content` frame type is reserved exclusively for application-level data. +All other frame types are protocol-owned and intended for client processing, not application consumption. 
+ +This establishes a critical invariant: any frame that is not `content` is meant for the protocol layer. +When a client encounters an unknown frame type, it can definitively conclude this represents a version compatibility issue rather than corrupted application data. + +**Processing Rules** + +- All application-level content MUST use the `content` frame type +- Clients SHALL only pass `content` frames to applications +- Clients MAY drop unrecognized frame types + +**Future Extensibility** + +This explicit tagging mechanism allows the protocol to evolve without breaking existing implementations. +Future versions may define additional frame types for protocol-level functionality while legacy clients continue processing `content` frames normally. ## Frame Handling From 9c2e71d9ebd36047b88a020072891a14bb4a9c12 Mon Sep 17 00:00:00 2001 From: Jazz Turner-Baggs <473256+jazzz@users.noreply.github.com> Date: Tue, 28 Oct 2025 08:27:49 -0700 Subject: [PATCH 12/20] Remove stale section --- standards/application/privatev1.md | 13 ------------- 1 file changed, 13 deletions(-) diff --git a/standards/application/privatev1.md b/standards/application/privatev1.md index 9815fe3..51ce787 100644 --- a/standards/application/privatev1.md +++ b/standards/application/privatev1.md @@ -210,19 +210,6 @@ When a client encounters an unknown frame type, it can definitively conclude thi This explicit tagging mechanism allows the protocol to evolve without breaking existing implementations. Future versions may define additional frame types for protocol-level functionality while legacy clients continue processing `content` frames normally. -## Frame Handling - -This protocol uses explicit tagging of content, to remove ambiguity when parsing/handling frames. -This creates a clear distinction between frames generated by the protocol, and content which was passed in. -Even if new frames are added in the future, Clients can be certain whether the payload is intended for itself or applications. 
-This is achieved through an invariant - All non-content frames are intended to be consumed by the client. -When a new unknown frame arrives it can be certain that a version compatibility issue has occurred. - -- All application level content MUST use the `content` frameType. -- Clients SHALL only pass `content` tagged frames to Applications -- Clients MAY drop unrecognized frames - - # Wire Format Specification / Syntax ## Payload Parse Tree From 1e8ba4a786f7e1e82b5b4c2029f765d0c469962c Mon Sep 17 00:00:00 2001 From: Jazz Turner-Baggs <473256+jazzz@users.noreply.github.com> Date: Tue, 28 Oct 2025 08:28:18 -0700 Subject: [PATCH 13/20] update payload delivery warnings --- standards/application/privatev1.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/standards/application/privatev1.md b/standards/application/privatev1.md index 51ce787..97d64e6 100644 --- a/standards/application/privatev1.md +++ b/standards/application/privatev1.md @@ -77,12 +77,12 @@ Transport choice is an implementation decision that should be made based on depl The choice of transport mechanism has no impact on PRIVATE1's security properties. Confidentiality, integrity, and forward secrecy are provided regardless of how payloads are delivered. -However, transport choice may affect other properties. +However, transport choice may affect other properties and characteristics. -**Sender Privacy:** -Implementations may leak sensitive metadata. +**Recipient Privacy:** +The routing/addressing layer may leak sensitive metadata including the recipients identity. The payloads generated by this protocol do not reveal the participants of a conversation, however the overall privacy properties are determined by the delivery mechanism used to transport payloads. -**Reliability** +**Reliability Performance** While PRIVATE1 handles message losses, more reliable transports reduce retransmission overhead. 
## Initialization From eaff6caff0a2537be6459fccfe5a1fdf97ac498f Mon Sep 17 00:00:00 2001 From: Jazz Turner-Baggs <473256+jazzz@users.noreply.github.com> Date: Tue, 28 Oct 2025 14:14:51 -0700 Subject: [PATCH 14/20] Update SDS initialization --- standards/application/privatev1.md | 60 +++++++++++++++++++++++++++++- 1 file changed, 59 insertions(+), 1 deletion(-) diff --git a/standards/application/privatev1.md b/standards/application/privatev1.md index 97d64e6..be33c99 100644 --- a/standards/application/privatev1.md +++ b/standards/application/privatev1.md @@ -91,6 +91,7 @@ The channel is initialized by both sender and recipient agreeing on the followin - `sk` - initial secret key [32 bytes] - `ssk` - sender DH seed key - `rsk` - recipient DH seed key +- `conversation_id` - globally unique identifier To maintain the security properties: - `sk` MUST be known only by the participants. @@ -98,10 +99,40 @@ To maintain the security properties: - `sk` SHOULD have forward secrecy by incorporating ephemeral key material - `rsk` and `ssk` SHOULD incorporate ephemeral key material +As PRIVATE1 is agnostic to identity defining a unique identifier is difficult at this layer. The exact derivation is left to implementations to determine. +- `conversation_id` MUST be unique across all instances of chat conversations +- `conversation_id` SHOULD be consistent across applications to maintain interoperability + Additionally implementations MUST determine the following constants: - `max_seg_size` - maximum segmentation size to be used. - `max_skip` - number of keys which can be skipped per session. Values are determined by +## Value Derivations + +These values are derived during protocol operation and are deterministically computed from protocol data. + +### Frame Identifier + +For reliability tracking, every payload MUST have a unique deterministic identifier. 
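As a rough sketch of such an identifier, the derivation this patch goes on to define (`rhex(blake2b(encoded_frame_bytes))` with a 128-bit digest) might look like the following in Python. The function name `frame_id` is chosen for the example, and the input is assumed to be the exact protobuf-encoded bytes that are transmitted; the protobuf encoding step itself is not shown.

```python
import hashlib


def frame_id(encoded_frame_bytes: bytes) -> str:
    """Hash the exact transmitted bytes into a 32-character identifier."""
    # BLAKE2b with a 16-byte (128-bit) digest; hexdigest() yields
    # lowercase hexadecimal without a "0x" prefix.
    return hashlib.blake2b(encoded_frame_bytes, digest_size=16).hexdigest()
```

Hashing the received bytes directly, rather than re-serializing a decoded frame, sidesteps the problem that protobuf serialization is not guaranteed to be byte-identical across encoders.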
+ +The frame identifier is computed as: +``` +frame_id = rhex(blake2b(encoded_frame_bytes)) +``` + +Where: +- `rhex` is lowercase hexadecimal encoding without the `0x` prefix +- `blake2b` is BLAKE2b hash function with 128-bit output +- `encoded_frame_bytes` is the protobuf-encoded `PrivateV1Frame` +- `frame_id` is a 32 character string + +**Protobuf Encoding Considerations** + +Protobuf does not guarantee byte-identical outputs for multiple serializations of the same logical message. +Because of this, the `frame_id` represents the hash of specific encoded bytes rather than an abstract frame structure. +Implementations MUST compute `frame_id` from the actual bytes being transmitted to ensure sender and receiver derive identical identifiers. + + ## Protocol Operation PRIVATE1 processes messages through a three-stage pipeline where each stage's output becomes the next stage's input. @@ -148,7 +179,32 @@ The segmentation strategy used is defined by [!TODO: Flatten link once completed Implementation specifics: - Error correction is not used, as reliable delivery is already provided by lower layers. -- `segmentSize` = `max_seg_size` +- `segmentSize` = `max_seg_size + +### Message Reliability + +Scalable Data Sync (SDS) is used to detect missing messages, provide delivery confirmation, and handle retransmission of payloads. +SDS is implemented according to the [specification](https://github.com/vacp2p/rfc-index/blob/main/vac/raw/sds.md). + +**SDS Field Mappings** + +The following mappings connect PRIVATE1 concepts to SDS fields: + +- `sender_id`: !TODO: This requires PRIVATE1 to be identity aware +- `message_id`: uses the `frame_id` definition. +- `channel_id`: uses the `conversation_id` parameter. + +**Sender Validation** +SDS uses a `sender_id` payload field to determine whether a message was sent by the remote party. This value is sender reported and not validated which can have unknown implications if trusted in other contexts. 
For security hygiene Clients SHOULD drop SDS messages if `sender_id` != the sender derived from the encryption layer. + + +**Bloom Filter Configuration** + +PRIVATE1 uses bloom filter parameters of `n=2000` (expected elements) and `p=0.001` (false positive probability). +This configuration produces bloom filters of approximately 3.5 KiB per message. + +!TODO: Can the bloom filter be dropped in 1:1 communication? + ### Message Reliability Scalable Data Sync is used to detect missing messages and provide delivery receipts to the sender after successful reception of a payload. @@ -344,6 +400,8 @@ Unreliable delivery mechanisms will result in increased key storage over time, a !TODO: Worth making deletion of stale keys part of the spec? + + ## Security/Privacy Considerations ### Sender Derivation From f6d6e6cf52dcf76b6404fce745a37ff1665fe4b6 Mon Sep 17 00:00:00 2001 From: Jazz Turner-Baggs <473256+jazzz@users.noreply.github.com> Date: Tue, 28 Oct 2025 14:31:33 -0700 Subject: [PATCH 15/20] Update security considerations --- standards/application/privatev1.md | 32 ++++++------------------------ 1 file changed, 6 insertions(+), 26 deletions(-) diff --git a/standards/application/privatev1.md b/standards/application/privatev1.md index be33c99..4042ba8 100644 --- a/standards/application/privatev1.md +++ b/standards/application/privatev1.md @@ -205,19 +205,6 @@ This configuration produces bloom filters of approximately 3.5 KiB per message. !TODO: Can the bloom filter be dropped in 1:1 communication? - -### Message Reliability -Scalable Data Sync is used to detect missing messages and provide delivery receipts to the sender after successful reception of a payload. -SDS is implemented according to the [specification](https://github.com/vacp2p/rfc-index/blob/3505da6bd66d2830e5711deb0b5c2b4de9212a4d/vac/raw/sds.md). 
- -!TODO: define: sender_id mapping -!TODO: define: message_id mapping -!TODO: update to latest version and include SDS-R - -!NOTE: The defaultConfig in nim-SDS creates a bloom filter with the parameters n=10000, p=0.001 which has a size of ~18KiB. The bloom filter is included in every message which results in a best-case overhead rate of 13.3% (assuming waku's MSS of 150KB). -Given a target content size of 4KB, that puts the utilization factor at 80+% (Without considering other layers). -This needs to be looked at, lowering to n=2000 would lower overhead to ~3.5 KiB. - ### Encryption Payloads are encrypted using the [Double Ratchet](https://signal.org/docs/specifications/doubleratchet/) algorithm with the following cryptographic primitive choices: @@ -398,23 +385,16 @@ Halting operation of the channel is the safest, as it bounds resource utilizatio If eventual delivery of messages is not guaranteed, implementors should regularly delete keys that are older than a given time window. Unreliable delivery mechanisms will result in increased key storage over time, as more messages are lost with no hope of delivery. -!TODO: Worth making deletion of stale keys part of the spec? - - - ## Security/Privacy Considerations -### Sender Derivation - - -### Segmentation Session Binding - - - - -### Privacy - ContentSize +### Sender Deniability and Authentication** +Encrypted messages do not have a cryptographically provable sender to third parties due to the deniability property of the Double Ratchet algorithm. +However, participants in a conversation can authenticate each other through the shared cryptographic state. +When receiving a message, the recipient knows it must have come from the other participant because only they possess the necessary key material to produce valid ciphertexts. 
+Because sender identity is implicitly authenticated through shared secrets rather than explicit signatures, it is critical that the initial shared secret `sk` be derived from an authenticated key exchange process. +Without proper authentication during initialization, an adversary could perform a man-in-the-middle attack and establish separate sessions with each participant, allowing them to read and modify all messages. ## Copyright From 8e1e2536a101c458a2fc4fed26d64892d6d62572 Mon Sep 17 00:00:00 2001 From: Jazz Turner-Baggs <473256+jazzz@users.noreply.github.com> Date: Tue, 28 Oct 2025 14:54:53 -0700 Subject: [PATCH 16/20] Cleanup --- standards/application/privatev1.md | 57 ++++++++++++++---------------- 1 file changed, 26 insertions(+), 31 deletions(-) diff --git a/standards/application/privatev1.md b/standards/application/privatev1.md index 4042ba8..7a6e397 100644 --- a/standards/application/privatev1.md +++ b/standards/application/privatev1.md @@ -15,17 +15,18 @@ The protocol is transport-agnostic and designed to support both direct messaging # Background -Pairwise encrypted messaging channels represent the foundational building block of modern secure communication systems. While end-to-end encrypted group chats capture user attention, the underlying infrastructure that makes these systems possible relies (at least somewhat) on secure one-to-one communication primitives. Just as higher-level network protocols are built upon reliable transport primitives like TCP, sophisticated communication systems depend on robust pairwise channels to function correctly and securely. +Pairwise encrypted messaging channels represent a foundational building block of modern secure communication systems. While end-to-end encrypted group chats capture user attention, the underlying infrastructure that makes these systems possible relies (at least somewhat) on secure one-to-one communication primitives. 
Just as higher-level network protocols are built upon reliable transport primitives like TCP, sophisticated communication systems depend on robust pairwise channels to function correctly and securely. -These channels serve purposes beyond simple content delivery. They transmit not only user-visible messages but also critical metadata, coordination signals, and state synchronization information between clients. This signaling capability makes pairwise channels essential infrastructure for distributed systems: key material distribution, membership updates, administrative actions, and protocol coordination all flow through these channels. While more sophisticated group communication strategies can achieve better efficiency at scale—particularly for broadcast-style communication patterns with many participants—they struggle to match the privacy and security properties that pairwise channels provide inherently. The fundamental asymmetry of two-party communication enables stronger guarantees: minimal metadata exposure, simpler key management, clearer authentication boundaries, and more straightforward security analysis. - -However, being encrypted is merely the starting point, not the complete solution. Production-quality one-to-one channels must function reliably in the messy reality of modern networks. Real-world deployment demands resilience to unreliable networks where messages may be lost, delayed, duplicated, or arrive out of order. Channels must efficiently handle arbitrarily large payloads—from short text messages to multi-megabyte file transfers—while respecting the maximum transmission unit constraints imposed by various transport layers. 
Perhaps most critically, the protocol must remain fully operational even when one or more participants are offline or intermittently connected, a common scenario in mobile environments where users move between network conditions, battery limitations force background restrictions, or time zone differences mean participants are rarely simultaneously active. These practical requirements shape the protocol design as significantly as cryptographic considerations, demanding careful attention to segmentation strategies, reliability mechanisms, state management, and resource constraints alongside the core security properties. +These channels serve purposes beyond simple content delivery. They transmit not only user-visible messages but also critical metadata, coordination signals, and state synchronization information between clients. This signaling capability makes pairwise channels essential infrastructure for distributed systems: key material distribution, membership updates, administrative actions, and protocol coordination all flow through these channels. While more sophisticated group communication strategies can achieve better efficiency at scale—particularly for broadcast-style communication patterns — they struggle to match the privacy and security properties that pairwise channels provide inherently. The fundamental asymmetry of two-party communication enables stronger guarantees: minimal metadata exposure, simpler key management, clearer authentication boundaries, and more straightforward security analysis. +However, being encrypted is merely the starting point, not the complete solution. Production-quality one-to-one channels must function reliably in the messy reality of modern networks. Real-world deployment demands resilience to unreliable networks where messages may be lost, delayed, duplicated, or arrive out of order. 
Channels must efficiently handle arbitrarily large payloads—from short text messages to multi-megabyte file transfers—while respecting the maximum transmission unit constraints imposed by various transport layers. Perhaps most critically, the protocol must remain fully operational even when one or more participants are offline or intermittently connected. # Private V1 -PrivateV1 is a conversation type specification that establishes a full-duplex secure communication channel between two participants. It combines the Double Ratchet algorithm for encryption with Scalable Data Sync (SDS) for reliable delivery and an efficient segmentation strategy to handle transport constraints. +PrivateV1 is a conversation type specification that establishes a full-duplex secure communication channel between two participants. It combines the Double Ratchet algorithm for encryption with Scalable Data Sync (SDS) for reliable delivery and an efficient segmentation strategy to handle transport constraints. + +PRIVATE1 provides the following properties: - **Payload Confidentiality**: Only the two participants can read the contents of any message sent. Observers, transport providers, and other third parties cannot decrypt message contents. - **Content Integrity**: Recipients can detect if message contents were modified by a third party. Any tampering with encrypted payloads will cause decryption to fail, preventing corrupted messages from being accepted as authentic. @@ -51,7 +52,7 @@ The terms include: ## Architecture This conversation type assumes there is some service or application which wishes to generate and receive end-to-end encrypted content. -It also assumes that some other component will be responsible for delivering the generated payloads. At its core this protocol takes the content provided and creates a series of payloads to be sent to the recipient. +It also assumes that some other component is responsible for delivering the generated payloads. 
At its core this protocol takes the content provided and creates a series of payloads to be sent to the recipient. ```mermaid flowchart LR @@ -99,7 +100,7 @@ To maintain the security properties: - `sk` SHOULD have forward secrecy by incorporating ephemeral key material - `rsk` and `ssk` SHOULD incorporate ephemeral key material -As PRIVATE1 is agnostic to identity defining a unique identifier is difficult at this layer. The exact derivation is left to implementations to determine. +PRIVATE1 requires a unique identifier, however the exact derivation is left to implementations to determine. - `conversation_id` MUST be unique across all instances of chat conversations - `conversation_id` SHOULD be consistent across applications to maintain interoperability @@ -159,7 +160,6 @@ flowchart TD ### Segmentation While PRIVATE1 itself has no inherent message size limitation, practical transport mechanisms typically impose maximum payload sizes. -Segmentation is intentionally placed as the first pipeline stage rather than deferring it to the transport layer **Why Segment Before Encryption** @@ -170,7 +170,7 @@ Segmenting after encryption would force the transport layer to handle fragmentat **Why Segment Before Reliability** -Placing segmentation after reliability tracking would mean retransmission of a dropped payload requires re-broadcasting the entire frame. +Placing segmentation after reliability tracking would mean retransmission of a dropped segment would require re-broadcasting the entire frame. By segmenting first, the reliability layer can track individual segments and request retransmission of only the missing fragments. **Implementation** @@ -180,6 +180,7 @@ The segmentation strategy used is defined by [!TODO: Flatten link once completed Implementation specifics: - Error correction is not used, as reliable delivery is already provided by lower layers. - `segmentSize` = `max_seg_size +- All payloads regardless of size are wrapped in a segmentation message. 
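The segmentation constraints listed above can be illustrated with a small sketch. It only shows the byte-splitting step under the stated limits (`segmentSize` equal to `max_seg_size`, content smaller than `255 * max_seg_size`, and every payload wrapped in at least one segment); the actual segment message encoding comes from the Segmentation specification and is not modeled here. The function name `split_into_segments` is an assumption made for the example.

```python
from typing import List


def split_into_segments(content: bytes, max_seg_size: int) -> List[bytes]:
    """Split content into chunks of at most max_seg_size bytes."""
    if max_seg_size <= 0:
        raise ValueError("max_seg_size must be positive")
    # The spec requires content to be smaller than 255 * max_seg_size.
    if len(content) >= 255 * max_seg_size:
        raise ValueError("content too large for segmentation")
    # Ceiling division, with a minimum of one segment so that even
    # small (or empty) content is still wrapped in a segment.
    count = max(1, -(-len(content) // max_seg_size))
    return [
        content[i * max_seg_size:(i + 1) * max_seg_size]
        for i in range(count)
    ]
```

Emitting a single segment for small content mirrors the rule that all payloads are wrapped regardless of size, so the receiving side only needs one code path.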
### Message Reliability @@ -195,8 +196,7 @@ The following mappings connect PRIVATE1 concepts to SDS fields: - `channel_id`: uses the `conversation_id` parameter. **Sender Validation** -SDS uses a `sender_id` payload field to determine whether a message was sent by the remote party. This value is sender reported and not validated which can have unknown implications if trusted in other contexts. For security hygiene Clients SHOULD drop SDS messages if `sender_id` != the sender derived from the encryption layer. - +SDS uses a `sender_id` payload field to determine whether a message was sent by the remote party. This value is sender reported and not validated which can have unknown implications if trusted in other contexts. For security hygiene Clients SHOULD drop SDS messages if `sender_id` != the sender derived from the encryption layer. !TODO: PrivateV1 is not sender aware currently **Bloom Filter Configuration** @@ -240,7 +240,7 @@ The `content` frame type is reserved exclusively for application-level data. All other frame types are protocol-owned and intended for client processing, not application consumption. This establishes a critical invariant: any frame that is not `content` is meant for the protocol layer. -When a client encounters an unknown frame type, it can definitively conclude this represents a version compatibility issue rather than corrupted application data. +When a client encounters an unknown frame type, it can definitively conclude this represents a version compatibility issue. **Processing Rules** @@ -280,11 +280,10 @@ flowchart TD P --> T{frame_type} T --content--> Bytes - T --> Placeholder + classDef plain fill:none,stroke:transparent; ``` -!TODO: Replace placeholder ## Payloads @@ -308,24 +307,23 @@ This payload is used without modification from the SDS Spec. 
```protobuf message HistoryEntry { - string message_id = 1; - bytes retrieval_hint = 2; - } - -message ReliablePayload { - string message_id = 2; - string channel_id = 3; - int32 lamport_timestamp = 10; - repeated HistoryEntry causal_history = 11; - bytes bloom_filter = 12; - bytes content = 20; - } + string message_id = 1; // Unique identifier of the SDS message, as defined in `Message` + optional bytes retrieval_hint = 2; // Optional information to help remote parties retrieve this SDS message; For example, A Waku deterministic message hash or routing payload hash +} + +message Message { + string sender_id = 1; // Participant ID of the message sender + string message_id = 2; // Unique identifier of the message + string channel_id = 3; // Identifier of the channel to which the message belongs + optional int32 lamport_timestamp = 10; // Logical timestamp for causal ordering in channel + repeated HistoryEntry causal_history = 11; // List of preceding message IDs that this message causally depends on. Generally 2 or 3 message IDs are included. + optional bytes bloom_filter = 12; // Bloom filter representing received message IDs in channel + optional bytes content = 20; // Actual content of the message +} ``` **content:** This field is an protobuf encoded `Segment` -!TODO: Why is SDS using signed int for timestamps? - ### Segmentation This payload is used without modification from the Segmentation [specification](https://github.com/waku-org/specs/blob/fa2993b427f12796356a232c54be75814fac5d98/standards/application/segmentation.md) @@ -401,7 +399,4 @@ Without proper authentication during initialization, an adversary could perform Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/). -## References - -A list of references. use SHA256 or SAH256. 
From bfe3f14e12b73c83da08d5c5a095621f30fd7ec1 Mon Sep 17 00:00:00 2001 From: Jazz Turner-Baggs <473256+jazzz@users.noreply.github.com> Date: Tue, 28 Oct 2025 15:12:17 -0700 Subject: [PATCH 17/20] review comments --- standards/application/privatev1.md | 41 ++++++++++++++++++++---------- 1 file changed, 28 insertions(+), 13 deletions(-) diff --git a/standards/application/privatev1.md b/standards/application/privatev1.md index 7a6e397..4477d23 100644 --- a/standards/application/privatev1.md +++ b/standards/application/privatev1.md @@ -9,7 +9,7 @@ contributors: # Abstract -This specification defines PRIVATE1, a conversation protocol for establishing secure, full-duplex encrypted communication channels between two participants. PRIVATE1 provides end-to-end encryption with forward secrecy and post-compromise security using the DoubleRatchet algorithm, combined with reliable message delivery via Scalable Data Sync (SDS) and efficient segmentation for transport-constrained environments. +This specification defines PRIVATE1, a conversation protocol for establishing secure, full-duplex encrypted communication channels between two participants. PRIVATE1 provides end-to-end encryption with forward secrecy and post-compromise security using the Double Ratchet algorithm, combined with reliable message delivery via Scalable Data Sync (SDS) and efficient segmentation for transport-constrained environments. The protocol is transport-agnostic and designed to support both direct messaging and as a foundation for group communication systems. PRIVATE1 ensures payload confidentiality, content integrity, sender privacy, and message reliability while remaining resilient to network disruptions and message reordering. @@ -24,7 +24,7 @@ However, being encrypted is merely the starting point, not the complete solution # Private V1 -PrivateV1 is a conversation type specification that establishes a full-duplex secure communication channel between two participants. 
It combines the Double Ratchet algorithm for encryption with Scalable Data Sync (SDS) for reliable delivery and an efficient segmentation strategy to handle transport constraints. +PRIVATE1 is a conversation type specification that establishes a full-duplex secure communication channel between two participants. It combines the Double Ratchet algorithm for encryption with Scalable Data Sync (SDS) for reliable delivery and an efficient segmentation strategy to handle transport constraints. PRIVATE1 provides the following properties: @@ -56,7 +56,7 @@ It also assumes that some other component is responsible for delivering the gene ```mermaid flowchart LR - Content:::plain--> Privatev1 --> Payload:::plain + Content:::plain--> PrivateV1 --> Payload:::plain classDef plain fill:none,stroke:transparent; ``` ### Content @@ -68,9 +68,9 @@ Applications provide content as encoded bytes, which is then packaged into paylo Content MUST be smaller than `255 * max_seg_size` due to segmentation protocol limitations. -**Agnostic** +**Structure** -The protocol treats the contents as a arbitrary sequence of bytes and is agnostic to its contents. +The protocol treats the contents as an arbitrary sequence of bytes and is agnostic to its contents. ### Payload Delivery How payloads are sent and received by clients is deliberately not specified by this protocol. @@ -81,7 +81,7 @@ Confidentiality, integrity, and forward secrecy are provided regardless of how p However, transport choice may affect other properties and characteristics. **Recipient Privacy:** -The routing/addressing layer may leak sensitive metadata including the recipients identity. The payloads generated by this protocol do not reveal the participants of a conversation, however the overall privacy properties are determined by the delivery mechanism used to transport payloads. +The routing/addressing layer may leak sensitive metadata including the recipient's identity. 
The payloads generated by this protocol do not reveal the participants of a conversation, however the overall privacy properties are determined by the delivery mechanism used to transport payloads. **Reliability Performance** While PRIVATE1 handles message losses, more reliable transports reduce retransmission overhead. @@ -106,7 +106,7 @@ PRIVATE1 requires a unique identifier, however the exact derivation is left to i Additionally implementations MUST determine the following constants: - `max_seg_size` - maximum segmentation size to be used. -- `max_skip` - number of keys which can be skipped per session. Values are determined by +- `max_skip` - number of keys which can be skipped per session. ## Value Derivations @@ -196,7 +196,7 @@ The following mappings connect PRIVATE1 concepts to SDS fields: - `channel_id`: uses the `conversation_id` parameter. **Sender Validation** -SDS uses a `sender_id` payload field to determine whether a message was sent by the remote party. This value is sender reported and not validated which can have unknown implications if trusted in other contexts. For security hygiene Clients SHOULD drop SDS messages if `sender_id` != the sender derived from the encryption layer. !TODO: PrivateV1 is not sender aware currently +SDS uses a `sender_id` payload field to determine whether a message was sent by the remote party. This value is sender reported and not validated which can have unknown implications if trusted in other contexts. For security hygiene Clients SHOULD drop SDS messages if `sender_id` != the sender derived from the encryption layer. 
!TODO: PRIVATE1 is not sender aware currently **Bloom Filter Configuration** @@ -291,7 +291,7 @@ flowchart TD ### Encrypted Payload ```protobuf -message Doubleratchet { +message DoubleRatchet { bytes dh = 1; // 32 byte publickey uint32 msgNum = 2; uint32 prevChainLen = 3; @@ -322,7 +322,7 @@ message Message { } ``` -**content:** This field is an protobuf encoded `Segment` +**content:** This field is a protobuf encoded `Segment` ### Segmentation @@ -345,7 +345,7 @@ message SegmentMessageProto { !TODO: This should be encoded as a FrameType so it can be optional. -### Frame +### PrivateV1Frame ```protobuf message PrivateV1Frame { @@ -377,7 +377,7 @@ This means that senders SHOULD generate a new ephemeral key for `ssk` for every ### Excessive Skipped Message Handling of skipped message keys is not strictly defined in double ratchet. -Implementations need to choose an strategy which works best for their environment, and delivery mechanism. +Implementations need to choose a strategy which works best for their environment, and delivery mechanism. Halting operation of the channel is the safest, as it bounds resource utilization in the event of a DOS attack but is not always possible. If eventual delivery of messages is not guaranteed, implementors should regularly delete keys that are older than a given time window. @@ -385,7 +385,7 @@ Unreliable delivery mechanisms will result in increased key storage over time, a ## Security/Privacy Considerations -### Sender Deniability and Authentication** +### Sender Deniability and Authentication Encrypted messages do not have a cryptographically provable sender to third parties due to the deniability property of the Double Ratchet algorithm. However, participants in a conversation can authenticate each other through the shared cryptographic state. 
@@ -400,3 +400,18 @@ Without proper authentication during initialization, an adversary could perform Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/). +## References +- **[DOUBLERATCHET]** "The Double Ratchet Algorithm", Signal, 2016. + https://signal.org/docs/specifications/doubleratchet/ + +- **[SDS]** "Scalable Data Sync Specification", vac, 2024. + https://github.com/vacp2p/rfc-index/blob/main/vac/raw/sds.md + +- **[SEGMENTATION]** "Message Segmentation Specification", Waku, 2024. + https://github.com/waku-org/specs/blob/main/standards/application/segmentation.md + +- **[CONTENTFRAME]** "ContentFrame Specification", Waku, 2024. + https://github.com/waku-org/specs/blob/main/standards/application/contentframe.md + +- **[CHAT-DEFINITIONS]** "Chat Definitions Specification", Waku, 2024. + https://github.com/waku-org/specs/blob/main/informational/chatdefs.md From cfa59f37fa112fa8b7a8023c8daf88b5ae730a33 Mon Sep 17 00:00:00 2001 From: Jazz Turner-Baggs <473256+jazzz@users.noreply.github.com> Date: Tue, 28 Oct 2025 16:25:51 -0700 Subject: [PATCH 18/20] spelling fixes --- .wordlist.txt | 28 ++++++++++++++++++++++++++++ standards/application/privatev1.md | 12 ++++++------ 2 files changed, 34 insertions(+), 6 deletions(-) diff --git a/.wordlist.txt b/.wordlist.txt index 8e754de..41e2034 100644 --- a/.wordlist.txt +++ b/.wordlist.txt @@ -1,22 +1,50 @@ +AEAD ALLOC +ciphertext creativecommons +cryptographic +cryptographically danielkaiser +decrypt +decrypted +dh DHT +Diffie DoS github GITHUB gossipsub GossipSub +Groupchats +Hellman +HKDF https iana IANA +implementers +KDF +KiB +lamport libp2p md +nonces +observability +protobuf +publickey pubsub +retransmission +Retransmission rfc RFC +Scalable +SDS +SHA SHARDING subnets +TCP +uint +uint32 +Unencrypted Waku WAKU www diff --git a/standards/application/privatev1.md b/standards/application/privatev1.md index 4477d23..de77611 100644 --- a/standards/application/privatev1.md +++ 
b/standards/application/privatev1.md @@ -179,7 +179,7 @@ The segmentation strategy used is defined by [!TODO: Flatten link once completed Implementation specifics: - Error correction is not used, as reliable delivery is already provided by lower layers. -- `segmentSize` = `max_seg_size +- `segmentSize` = `max_seg_size` - All payloads regardless of size are wrapped in a segmentation message. ### Message Reliability @@ -215,11 +215,11 @@ Payloads are encrypted using the [Double Ratchet](https://signal.org/docs/specif - `KDF_RK`: HKDF with SHA256, `info = "PrivateV1RootKey"` - `KDF_CK`: HKDF with SHA256, using `input`=`0x01` for message keys and `input`=`0x02` for chain keys - `KDF_MK`: HKDF with SHA256, `info = "PrivateV1MessageKey"` -- `ENCRYPT`: AEAD_CHACHA20_POLY1305 +- `ENCRYPT`: `AEAD_CHACHA20_POLY1305` **AEAD Implementation** -ChaCha20-Poly1305 is used with randomly generated 96-bit (12-byte) nonces. +`ChaCha20-Poly1305` is used with randomly generated 96-bit (12-byte) nonces. The nonce MUST be generated using a cryptographically secure random number generator for each message. The complete ciphertext format for transport is: ``` @@ -365,8 +365,8 @@ message PrivateV1Frame { ### Content Types -Implementors need to be mindful of maintaining interoperability between clients, when deciding how content is encoded prior to transmission. -In a decentralized context, clients cannot be assumed to be using the same version let alone application. It is recommended that implementors use a self-describing content payload such as [CONTENTFRAME](https://github.com/waku-org/specs/blob/jazzz/content_frame/standards/application/contentframe.md) specification. This provides the ability for clients to determine support for incoming frames, regardless of the software used to receive them. +Implementers need to be mindful of maintaining interoperability between clients, when deciding how content is encoded prior to transmission. 
+In a decentralized context, clients cannot be assumed to be using the same version let alone application. It is recommended that implementers use a self-describing content payload such as [CONTENTFRAME](https://github.com/waku-org/specs/blob/jazzz/content_frame/standards/application/contentframe.md) specification. This provides the ability for clients to determine support for incoming frames, regardless of the software used to receive them.

 ### Initialization

@@ -380,7 +380,7 @@ Handling of skipped message keys is not strictly defined in double ratchet.
 Implementations need to choose a strategy which works best for their environment, and delivery mechanism.
 Halting operation of the channel is the safest, as it bounds resource utilization in the event of a DOS attack but is not always possible.

-If eventual delivery of messages is not guaranteed, implementors should regularly delete keys that are older than a given time window.
+If eventual delivery of messages is not guaranteed, implementers should regularly delete keys that are older than a given time window.
 Unreliable delivery mechanisms will result in increased key storage over time, as more messages are lost with no hope of delivery.

 ## Security/Privacy Considerations

From 872b715b17738110e05f3a45dfb4af925c768734 Mon Sep 17 00:00:00 2001
From: Jazz Turner-Baggs <473256+jazzz@users.noreply.github.com>
Date: Tue, 28 Oct 2025 16:26:06 -0700
Subject: [PATCH 19/20] fix: pyspelling config

---
 .spellcheck.yml | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/.spellcheck.yml b/.spellcheck.yml
index 0e94ad8..b7743a1 100644
--- a/.spellcheck.yml
+++ b/.spellcheck.yml
@@ -8,13 +8,21 @@ matrix:
     mode: none
     add-filter:
     - url
+    camel-case: true # accept CamelCase words
+    run-together: true
   dictionary:
     wordlists:
     - .wordlist.txt
   pipeline:
-  - pyspelling.filters.markdown
-  - pyspelling.filters.html:
-      comments: false
-  - pyspelling.filters.text
+  - pyspelling.filters.markdown:
+      markdown_extensions:
+      - markdown.extensions.extra
+  - pyspelling.filters.html:
+      comments: false
+      attributes:
+      - title
+      - alt
+      ignores:
+      - code
   default_encoding: utf-8
   suggest: true

From f00cff3800eea40b58a104ec44a05c3085f6c1ea Mon Sep 17 00:00:00 2001
From: Jazz Turner-Baggs <473256+jazzz@users.noreply.github.com>
Date: Tue, 18 Nov 2025 22:04:27 -0800
Subject: [PATCH 20/20] Update hash function for consistency

---
 standards/application/privatev1.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/standards/application/privatev1.md b/standards/application/privatev1.md
index de77611..e67e0e9 100644
--- a/standards/application/privatev1.md
+++ b/standards/application/privatev1.md
@@ -212,9 +212,9 @@ Payloads are encrypted using the [Double Ratchet](https://signal.org/docs/specif
 **Double Ratchet Configuration**

 - `DH`: X25519 for Diffie-Hellman operations
-- `KDF_RK`: HKDF with SHA256, `info = "PrivateV1RootKey"`
-- `KDF_CK`: HKDF with SHA256, using `input`=`0x01` for message keys and `input`=`0x02` for chain keys
-- `KDF_MK`: HKDF with SHA256, `info = "PrivateV1MessageKey"`
+- `KDF_RK`: HKDF with BLAKE2b, `info = "PrivateV1RootKey"`
+- `KDF_CK`: HKDF with BLAKE2b, using `input`=`0x01` for message keys and `input`=`0x02` for chain keys
+- `KDF_MK`: HKDF with BLAKE2b, `info = "PrivateV1MessageKey"`
 - `ENCRYPT`: `AEAD_CHACHA20_POLY1305`

 **AEAD Implementation**
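
Note on the `KDF_RK` line changed in the last hunk: the following is a minimal sketch of what "HKDF with BLAKE2b" could look like, not the specification's reference implementation. It assumes the RFC 5869 construction built on HMAC-BLAKE2b, and it follows the Double Ratchet recommendation of using the root key as the HKDF salt and the DH output as input key material. The 64-byte output split into two 32-byte keys, and the helper names `hkdf_extract`, `hkdf_expand`, and `kdf_rk`, are assumptions made for illustration; the patch itself only fixes the hash function and the `info` string.

```python
import hashlib
import hmac

HASH = hashlib.blake2b            # assumption: unkeyed BLAKE2b-512 used inside HMAC
HASH_LEN = HASH().digest_size     # 64 bytes


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """RFC 5869 extract step: PRK = HMAC(salt, IKM)."""
    if not salt:
        salt = bytes(HASH_LEN)    # RFC 5869 default salt: HashLen zero bytes
    return hmac.new(salt, ikm, HASH).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """RFC 5869 expand step: T(i) = HMAC(PRK, T(i-1) | info | i)."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output is too long for HKDF")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), HASH).digest()
        okm += block
        counter += 1
    return okm[:length]


def kdf_rk(root_key: bytes, dh_out: bytes) -> tuple[bytes, bytes]:
    """Hypothetical KDF_RK: returns (new_root_key, new_chain_key)."""
    prk = hkdf_extract(salt=root_key, ikm=dh_out)
    okm = hkdf_expand(prk, b"PrivateV1RootKey", 64)  # info string taken from the patch
    return okm[:32], okm[32:]                        # assumed 32/32 split
```

Calling `kdf_rk(root_key, dh_out)` with a 32-byte root key and an X25519 shared secret would yield the next root key and chain key under these assumptions. Implementations would need to agree on the exact salt, output length, and split, since a mismatch in any of them makes two clients derive different keys.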