mirror of https://github.com/logos-messaging/specs.git synced 2026-01-02 14:13:06 +00:00

Jazz Turner-Baggs eaff6caff0

2025-10-28 14:14:51 -07:00

title: PRIVATE1
name: Private conversation
category: Standards Track
tags:
editor: Jazz Alyxzander (@Jazzz)
contributors:

Abstract

This specification defines PRIVATE1, a conversation protocol for establishing secure, full-duplex encrypted communication channels between two participants. PRIVATE1 provides end-to-end encryption with forward secrecy and post-compromise security using the Double Ratchet algorithm, combined with reliable message delivery via Scalable Data Sync (SDS) and efficient segmentation for transport-constrained environments.

The protocol is transport-agnostic and designed both to support direct messaging and to serve as a foundation for group communication systems. PRIVATE1 ensures payload confidentiality, content integrity, sender privacy, and message reliability while remaining resilient to network disruptions and message reordering.

Background

Pairwise encrypted messaging channels represent the foundational building block of modern secure communication systems. While end-to-end encrypted group chats capture user attention, the underlying infrastructure that makes these systems possible relies, at least in part, on secure one-to-one communication primitives. Just as higher-level network protocols are built upon reliable transport primitives like TCP, sophisticated communication systems depend on robust pairwise channels to function correctly and securely.

These channels serve purposes beyond simple content delivery. They transmit not only user-visible messages but also critical metadata, coordination signals, and state synchronization information between clients. This signaling capability makes pairwise channels essential infrastructure for distributed systems: key material distribution, membership updates, administrative actions, and protocol coordination all flow through these channels. While more sophisticated group communication strategies can achieve better efficiency at scale—particularly for broadcast-style communication patterns with many participants—they struggle to match the privacy and security properties that pairwise channels provide inherently. The fundamental asymmetry of two-party communication enables stronger guarantees: minimal metadata exposure, simpler key management, clearer authentication boundaries, and more straightforward security analysis.

However, being encrypted is merely the starting point, not the complete solution. Production-quality one-to-one channels must function reliably in the messy reality of modern networks. Real-world deployment demands resilience to unreliable networks where messages may be lost, delayed, duplicated, or arrive out of order. Channels must efficiently handle arbitrarily large payloads—from short text messages to multi-megabyte file transfers—while respecting the maximum transmission unit constraints imposed by various transport layers. Perhaps most critically, the protocol must remain fully operational even when one or more participants are offline or intermittently connected, a common scenario in mobile environments where users move between network conditions, battery limitations force background restrictions, or time zone differences mean participants are rarely simultaneously active. These practical requirements shape the protocol design as significantly as cryptographic considerations, demanding careful attention to segmentation strategies, reliability mechanisms, state management, and resource constraints alongside the core security properties.

Private V1

PrivateV1 is a conversation type specification that establishes a full-duplex secure communication channel between two participants. It combines the Double Ratchet algorithm for encryption with Scalable Data Sync (SDS) for reliable delivery and an efficient segmentation strategy to handle transport constraints.

Payload Confidentiality: Only the two participants can read the contents of any message sent. Observers, transport providers, and other third parties cannot decrypt message contents.
Content Integrity: Recipients can detect if message contents were modified by a third party. Any tampering with encrypted payloads will cause decryption to fail, preventing corrupted messages from being accepted as authentic.
Sender Privacy: Only the recipient can determine who the sender was. Observers cannot identify the sender from encrypted payloads, though both participants can authenticate each other's messages.
Forward Secrecy: A compromise in the future does not allow previous messages to be decrypted by a third party. Message keys are deleted immediately after use and cannot be reconstructed from current state, even if long-term keys are later compromised.
Post-Compromise Security: Conversations eventually recover from a key compromise. After an attacker loses access to a device, the security properties are eventually restored.
Dropped Message Observability: Messages lost in transit are eventually observable to both sender and recipient.

Definitions

This document makes use of the shared terminology defined in the CHAT-DEFINITIONS specification.

The terms include:

Application
Content
Participant
Payload
Recipient
Sender

Architecture

This conversation type assumes there is some service or application which wishes to generate and receive end-to-end encrypted content. It also assumes that some other component will be responsible for delivering the generated payloads. At its core this protocol takes the content provided and creates a series of payloads to be sent to the recipient.

flowchart LR
    Content:::plain--> Privatev1 --> Payload:::plain
    classDef plain fill:none,stroke:transparent;

Content

Applications provide content as encoded bytes, which is then packaged into payloads for transmission.

Size Limit

Content MUST be smaller than 255 * max_seg_size due to segmentation protocol limitations.

Agnostic

The protocol treats the content as an arbitrary sequence of bytes and is agnostic to its meaning.

Payload Delivery

How payloads are sent and received by clients is deliberately not specified by this protocol. Transport choice is an implementation decision that should be made based on deployment requirements.

The choice of transport mechanism has no impact on PRIVATE1's security properties. Confidentiality, integrity, and forward secrecy are provided regardless of how payloads are delivered. However, transport choice may affect other properties and characteristics.

Recipient Privacy: The routing/addressing layer may leak sensitive metadata, including the recipient's identity. The payloads generated by this protocol do not reveal the participants of a conversation. However, the overall privacy properties are determined by the delivery mechanism used to transport payloads.

Reliability Performance: While PRIVATE1 handles message losses, more reliable transports reduce retransmission overhead.

Initialization

The channel is initialized by both sender and recipient agreeing on the following values for each conversation:

sk - initial secret key [32 bytes]
ssk - sender DH seed key
rsk - recipient DH seed key
conversation_id - globally unique identifier

To maintain the security properties:

sk MUST be known only by the participants.
sk MUST be derived in a way that ensures mutual authentication of the participants.
sk SHOULD have forward secrecy by incorporating ephemeral key material.
rsk and ssk SHOULD incorporate ephemeral key material.

As PRIVATE1 is agnostic to identity, defining a unique identifier is difficult at this layer. The exact derivation is left to implementations to determine.

conversation_id MUST be unique across all instances of chat conversations.
conversation_id SHOULD be consistent across applications to maintain interoperability.

Additionally, implementations MUST determine the following constants:

max_seg_size - maximum segmentation size to be used.
max_skip - number of keys which can be skipped per session. Values are determined by

Value Derivations

These values are derived during protocol operation and are deterministically computed from protocol data.

Frame Identifier

For reliability tracking, every payload MUST have a unique deterministic identifier.

The frame identifier is computed as:

frame_id = rhex(blake2b(encoded_frame_bytes))

Where:

rhex is lowercase hexadecimal encoding without the 0x prefix
blake2b is BLAKE2b hash function with 128-bit output
encoded_frame_bytes is the protobuf-encoded PrivateV1Frame
frame_id is a 32 character string
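
As a minimal sketch of the derivation above, the following Python function computes a frame identifier with the standard library. The function name is illustrative, and it assumes unkeyed BLAKE2b with the digest length parameter set to 16 bytes (rather than a truncated 512-bit digest), which the text does not state explicitly.

```python
import hashlib


def frame_id(encoded_frame_bytes: bytes) -> str:
    """Return the lowercase hex BLAKE2b-128 digest of the exact encoded bytes.

    Assumption: unkeyed BLAKE2b with digest_size=16 (128-bit output),
    no salt and no personalization string.
    """
    digest = hashlib.blake2b(encoded_frame_bytes, digest_size=16).digest()
    # bytes.hex() emits lowercase characters and no "0x" prefix.
    return digest.hex()


if __name__ == "__main__":
    # The input stands in for a protobuf-encoded PrivateV1Frame.
    print(frame_id(b"example encoded frame"))
```

The input must be the exact bytes that are transmitted; re-serializing the frame before hashing may yield a different identifier.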

Protobuf Encoding Considerations

Protobuf does not guarantee byte-identical outputs for multiple serializations of the same logical message. Because of this, the frame_id represents the hash of specific encoded bytes rather than an abstract frame structure. Implementations MUST compute frame_id from the actual bytes being transmitted to ensure sender and receiver derive identical identifiers.

Protocol Operation

PRIVATE1 processes messages through a three-stage pipeline where each stage's output becomes the next stage's input. The specific ordering of these stages is critical for maintaining security properties while enabling efficient operation.

flowchart TD
    C("Content"):::plain
    S(Segmentation)
    R(Reliability)
    E(Encryption) 
     D(Delivery):::plain
    C --> S --> R --> E --> D

    classDef plain fill:none,stroke:transparent;

Pipeline Stages:

Segmentation: Divides content into transport-appropriate fragments
Reliability (SDS): Adds tracking metadata for delivery detection and ordering
Encryption (Double Ratchet): Provides confidentiality, authentication, and forward secrecy

Segmentation

While PRIVATE1 itself has no inherent message size limitation, practical transport mechanisms typically impose maximum payload sizes. Segmentation is intentionally placed as the first pipeline stage rather than deferring it to the transport layer.

Why Segment Before Encryption

Segmenting after encryption would force the transport layer to handle fragmentation of ciphertext blobs, creating several problems.

Transport-layer segmentation would require buffering all segments before any can be authenticated, increasing the DoS attack surface.
Unauthenticated segment reassembly opens the door to malicious segment injection and substitution attacks.
Unencrypted segmentation metadata reveals size and other metadata about the content in transit.

Why Segment Before Reliability

Placing segmentation after reliability tracking would mean retransmission of a dropped payload requires re-broadcasting the entire frame. By segmenting first, the reliability layer can track individual segments and request retransmission of only the missing fragments.

Implementation

The segmentation strategy used is defined by !TODO: Flatten link once completed

Implementation specifics:

Error correction is not used, as reliable delivery is already provided by lower layers.
segmentSize = max_seg_size
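
The sketch below illustrates only the splitting step and the size limit from the Content section. It is not the referenced segmentation strategy: the function name and the constant MAX_SEGMENTS are hypothetical, and the per-segment metadata (hash, index, counts) is omitted.

```python
from typing import List

# Limit taken from the Size Limit section: data must be smaller than
# 255 * max_seg_size. The constant name is an assumption for this sketch.
MAX_SEGMENTS = 255


def split_into_segments(data: bytes, max_seg_size: int) -> List[bytes]:
    """Split data into consecutive chunks of at most max_seg_size bytes."""
    if max_seg_size <= 0:
        raise ValueError("max_seg_size must be positive")
    if len(data) >= MAX_SEGMENTS * max_seg_size:
        raise ValueError("data is too large to segment")
    return [
        data[offset:offset + max_seg_size]
        for offset in range(0, len(data), max_seg_size)
    ]


if __name__ == "__main__":
    parts = split_into_segments(b"abcdefghij", max_seg_size=4)
    print(parts)
```

Concatenating the returned chunks in order reproduces the original bytes, which is the property reassembly depends on.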

Message Reliability

Scalable Data Sync (SDS) is used to detect missing messages, provide delivery confirmation, and handle retransmission of payloads. SDS is implemented according to the specification.

SDS Field Mappings

The following mappings connect PRIVATE1 concepts to SDS fields:

sender_id: !TODO: This requires PRIVATE1 to be identity aware
message_id: uses the frame_id definition.
channel_id: uses the conversation_id parameter.

Sender Validation: SDS uses a sender_id payload field to determine whether a message was sent by the remote party. This value is reported by the sender and is not validated, which can have unknown implications if it is trusted in other contexts. For security hygiene, clients SHOULD drop SDS messages if sender_id does not match the sender derived from the encryption layer.

Bloom Filter Configuration

PRIVATE1 uses bloom filter parameters of n=2000 (expected elements) and p=0.001 (false positive probability). This configuration produces bloom filters of approximately 3.5 KiB per message.
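
The size estimate can be reproduced with the standard formula for an optimally sized bloom filter, m = -n * ln(p) / (ln 2)^2 bits. The helper names below are illustrative, and an actual SDS implementation may round or pad the filter differently.

```python
import math


def bloom_filter_bits(n: int, p: float) -> int:
    """Optimal number of bits for n expected elements at false positive rate p."""
    return math.ceil(-n * math.log(p) / (math.log(2) ** 2))


def bloom_filter_kib(n: int, p: float) -> float:
    """Same size expressed in KiB (1 KiB = 1024 bytes)."""
    return bloom_filter_bits(n, p) / 8 / 1024


if __name__ == "__main__":
    for n in (2000, 10000):
        print(f"n={n}, p=0.001 -> {bloom_filter_kib(n, 0.001):.2f} KiB")
```

With n=2000 and p=0.001 the formula gives roughly 3.5 KiB, matching the figure above.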

!TODO: Can the bloom filter be dropped in 1:1 communication?

Message Reliability

Scalable Data Sync is used to detect missing messages and provide delivery receipts to the sender after successful reception of a payload. SDS is implemented according to the specification.

!TODO: define: sender_id mapping
!TODO: define: message_id mapping
!TODO: update to latest version and include SDS-R

!NOTE: The defaultConfig in nim-SDS creates a bloom filter with the parameters n=10000, p=0.001, which has a size of ~18 KiB. The bloom filter is included in every message, which results in a best-case overhead rate of 13.3% (assuming Waku's MSS of 150KB). Given a target content size of 4KB, that puts the utilization factor at 80+% (without considering other layers). This needs to be looked at; lowering to n=2000 would lower the overhead to ~3.5 KiB.

Encryption

Payloads are encrypted using the Double Ratchet algorithm with the following cryptographic primitive choices:

Double Ratchet Configuration

DH: X25519 for Diffie-Hellman operations
KDF_RK: HKDF with SHA256, info = "PrivateV1RootKey"
KDF_CK: HKDF with SHA256, using input=0x01 for message keys and input=0x02 for chain keys
KDF_MK: HKDF with SHA256, info = "PrivateV1MessageKey"
ENCRYPT: AEAD_CHACHA20_POLY1305

AEAD Implementation

ChaCha20-Poly1305 is used with randomly generated 96-bit (12-byte) nonces. The nonce MUST be generated using a cryptographically secure random number generator for each message. The complete ciphertext format for transport is:

encrypted_payload = nonce || ciphertext || tag

Where nonce is 12 bytes, ciphertext is variable length, and tag is 16 bytes.
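
The layout above can be packed and parsed with plain byte slicing. The following is a sketch only: the type and function names are hypothetical, and it performs no encryption or authentication. Many AEAD libraries return the ciphertext with the tag already appended, in which case only the nonce needs to be prepended.

```python
from typing import NamedTuple

NONCE_LEN = 12  # 96-bit nonce
TAG_LEN = 16    # Poly1305 tag


class EncryptedPayload(NamedTuple):
    nonce: bytes
    ciphertext: bytes
    tag: bytes


def pack_payload(nonce: bytes, ciphertext: bytes, tag: bytes) -> bytes:
    """Build nonce || ciphertext || tag after checking the fixed lengths."""
    if len(nonce) != NONCE_LEN or len(tag) != TAG_LEN:
        raise ValueError("unexpected nonce or tag length")
    return nonce + ciphertext + tag


def unpack_payload(blob: bytes) -> EncryptedPayload:
    """Split a received blob into its three parts; layout check only."""
    if len(blob) < NONCE_LEN + TAG_LEN:
        raise ValueError("payload too short")
    return EncryptedPayload(
        nonce=blob[:NONCE_LEN],
        ciphertext=blob[NONCE_LEN:len(blob) - TAG_LEN],
        tag=blob[len(blob) - TAG_LEN:],
    )
```

A length check of this kind lets a receiver reject truncated payloads before attempting decryption.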

Frame Handling

This protocol uses explicit frame type tagging to remove ambiguity when parsing and handling frames. This creates a clear distinction between protocol-generated frames and application content.

Type Discrimination

All frames carry an explicit type field that identifies their purpose. The content frame type is reserved exclusively for application-level data. All other frame types are protocol-owned and intended for client processing, not application consumption.

This establishes a critical invariant: any frame that is not content is meant for the protocol layer. When a client encounters an unknown frame type, it can definitively conclude this represents a version compatibility issue rather than corrupted application data.

Processing Rules

All application-level content MUST use the content frame type
Clients SHALL only pass content frames to applications
Clients MAY drop unrecognized frame types
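
One way to apply these rules is sketched below. The Frame class is a hypothetical stand-in for a decoded PrivateV1Frame, and the handler registry is an assumption of this example, not part of the protocol.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Frame:
    """Hypothetical decoded frame; frame_type names the populated oneof field."""
    timestamp: int
    frame_type: str
    body: bytes = b""


def handle_frame(
    frame: Frame,
    deliver_to_app: Callable[[bytes], None],
    protocol_handlers: Dict[str, Callable[[Frame], None]],
) -> bool:
    """Route a frame; return False when it was dropped as unrecognized."""
    if frame.frame_type == "content":
        # Only content frames are ever passed to the application.
        deliver_to_app(frame.body)
        return True
    handler = protocol_handlers.get(frame.frame_type)
    if handler is None:
        # Unknown type: treated as a version mismatch and dropped.
        return False
    handler(frame)  # protocol-owned frame, never shown to the application
    return True
```

Because non-content frames never reach the application callback, a client that does not recognize a newer frame type can ignore it without affecting content delivery.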

Future Extensibility

This explicit tagging mechanism allows the protocol to evolve without breaking existing implementations. Future versions may define additional frame types for protocol-level functionality while legacy clients continue processing content frames normally.

Wire Format Specification / Syntax

Payload Parse Tree

A deterministic parse tree is used to avoid ambiguity when receiving payloads.

flowchart TD

    D[DoubleRatchet]
    S[SDS Message]
    Segment1[ Segment]
    Segment2[ Segment]
    Segment3[ Segment]
    P[PrivateV1Frame]

    start@{ shape: start }
    start --> D
    D -->|Payload| S
    S -->|Payload| Segment1

    Segment1 --> P
    Segment2:::plain --> P
    Segment3:::plain --> P
    
    P --> T{frame_type}
    T --content--> Bytes
    T --> Placeholder

    classDef plain fill:none,stroke:transparent;

!TODO: Replace placeholder

Payloads

!TODO: Don't duplicate payload definitions from other specs. Though it's helpful for now.

Encrypted Payload

message Doubleratchet {
    bytes dh = 1;               // 32 byte publickey
    uint32 msgNum = 2;          
    uint32 prevChainLen = 3;     
    bytes ciphertext = 4;       // arbitrary length bytes
}

dh: the x component of the dh_pair.publickey encoded as raw bytes.
ciphertext: a protobuf-encoded SDS Message

SDS Message

This payload is used without modification from the SDS Spec.

message HistoryEntry {
    string message_id = 1;
    bytes retrieval_hint = 2;
}

message ReliablePayload {
    string message_id = 2;
    string channel_id = 3;
    int32 lamport_timestamp = 10;
    repeated HistoryEntry causal_history = 11;
    bytes bloom_filter = 12;
    bytes content = 20;
}

content: This field is a protobuf-encoded Segment

!TODO: Why is SDS using signed int for timestamps?

Segmentation

This payload is used without modification from the Segmentation specification.


message SegmentMessageProto {
  bytes  entire_message_hash    = 1; // 32 Bytes
  uint32 index                  = 2; 
  uint32 segments_count         = 3;
  bytes  payload                = 4; 
  uint32 parity_segment_index   = 5;
  uint32 parity_segments_count  = 6; 
}

payload: This field is a protobuf-encoded PrivateV1Frame

!TODO: This should be encoded as a FrameType so it can be optional.

Frame

message PrivateV1Frame {
    uint64 timestamp = 1;             // Sender reported timestamp
    oneof frame_type {
        bytes content = 10;
        Placeholder placeholder = 11;
        // ....
    }
}

content: is encoded as bytes in order to allow implementations to define the type at runtime.

Implementation Suggestions

Content Types

Implementors need to be mindful of maintaining interoperability between clients when deciding how content is encoded prior to transmission. In a decentralized context, clients cannot be assumed to be using the same version, let alone the same application. It is recommended that implementors use a self-describing content payload, such as the one defined in the CONTENTFRAME specification. This allows clients to determine whether they support an incoming frame, regardless of the software used to receive it.

Initialization

Mutual authentication is provided by the sk, so there is no requirement to use authenticated keys for ssk and rsk. Implementations SHOULD use the most ephemeral key available in order to incorporate as much key material as possible. This means that senders SHOULD generate a new ephemeral key for ssk for every conversation, assuming channels are asynchronously initialized.

Excessive Skipped Message

Handling of skipped message keys is not strictly defined in the Double Ratchet algorithm. Implementations need to choose a strategy which works best for their environment and delivery mechanism. Halting operation of the channel is the safest option, as it bounds resource utilization in the event of a DoS attack, but it is not always possible.

If eventual delivery of messages is not guaranteed, implementors should regularly delete keys that are older than a given time window. Unreliable delivery mechanisms will result in increased key storage over time, as more messages are lost with no hope of delivery.
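
The two strategies above (a hard bound and time-based deletion) could be combined as in the following sketch. The class, its method names, and the interpretation of max_skip as a cap on stored keys are assumptions made for illustration.

```python
import time
from typing import Dict, Optional, Tuple

KeyId = Tuple[bytes, int]  # (ratchet public key, message number)


class SkippedKeyStore:
    """Bounded store for skipped message keys with age-based pruning."""

    def __init__(self, max_skip: int, max_age_seconds: float) -> None:
        self.max_skip = max_skip
        self.max_age = max_age_seconds
        self._keys: Dict[KeyId, Tuple[bytes, float]] = {}

    def put(self, dh_pub: bytes, msg_num: int, message_key: bytes,
            now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        if len(self._keys) >= self.max_skip:
            # The caller decides how to react; halting the channel is safest.
            raise OverflowError("max_skip exceeded")
        self._keys[(dh_pub, msg_num)] = (message_key, now)

    def take(self, dh_pub: bytes, msg_num: int) -> Optional[bytes]:
        """Return and delete the key, so it can be used at most once."""
        entry = self._keys.pop((dh_pub, msg_num), None)
        return None if entry is None else entry[0]

    def prune(self, now: Optional[float] = None) -> int:
        """Delete keys older than max_age; return how many were removed."""
        now = time.time() if now is None else now
        expired = [key_id for key_id, (_, stored_at) in self._keys.items()
                   if now - stored_at > self.max_age]
        for key_id in expired:
            del self._keys[key_id]
        return len(expired)
```

Calling prune periodically keeps storage bounded even when lost messages are never delivered, at the cost of being unable to decrypt messages that arrive after their key has expired.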
