slug: 32
title: 32/RLN
name: Rate Limit Nullifier
status: raw
editor: Blagoj Dimovski <blagoj.dimovski@yandex.com>
contributors:
  • Barry Whitehat <barrywhitehat@protonmail.com>
  • Sanaz Taheri <sanaz@status.im>
  • Oskar Thorén <oskar@status.im>
  • Onur Kilic <onurkilic1004@gmail.com>

Abstract

The following specification covers the RLN construct as well as some auxiliary libraries useful for interacting with it. Rate limiting nullifier (RLN) is a construct based on zero-knowledge proofs that provides an anonymous rate-limited signaling/messaging framework suitable for decentralized (and centralized) environments. Anonymity refers to the unlinkability of messages to their owner.

Motivation

RLN guarantees that a messaging rate is enforced cryptographically while preserving the anonymity of the message owners. A wide range of applications can benefit from RLN and provide desirable security features. For example, an e-voting system can integrate RLN to contain the voting rate while protecting voter-vote unlinkability. Another use case is to protect an anonymous messaging system against DDoS and spam attacks by containing the messaging rate of users. This latter use case is explained in the 17/WAKU2-RLN-RELAY RFC.

Flow

The users participate in the protocol by first registering to an application-defined group referred to as the membership group. Registration to the group is mandatory for signaling in the application. After registration, group members can generate a zero-knowledge proof of membership for their signals and can participate in the application. Usually, the membership requires a financial or social stake, which helps prevent Sybil attacks and double-signaling. Group members are allowed to send one signal per external nullifier (an identifier that groups signals and can be thought of as a voting booth). If a user generates more signals than allowed, the user risks being slashed - by having their membership secret credentials revealed. If a financial stake is put in place, the user also risks their stake being taken.

Generally the flow can be described by the following steps:

  1. Registration
  2. Signaling
  3. Verification and slashing

Registration

Depending on the application requirements, the registration can be implemented in different ways, for example:

  • centralized registrations, by using a central server
  • decentralized registrations, by using a smart contract

What is important is that the users' identity commitments (explained in the User identity section) are stored in a Merkle tree, and that the users can obtain a Merkle proof proving that they are part of the group.

Also depending on the application requirements, usually a financial or social stake is introduced.

An example for financial stake is: For each registration a certain amount of ETH is required. An example for social stake is using InterRep as a registry - users need to prove that they have a highly reputable social media account.

Implementation notes

User identity

The user's identity is composed of:

{
    identity_secret: [identity_nullifier, identity_trapdoor],
    identity_secret_hash: poseidonHash(identity_secret),
    identity_commitment: poseidonHash([identity_secret_hash])
}

For registration, the user needs to submit their identity_commitment (along with any additional registration requirements) to the registry. Upon registration, they should receive a leaf_index value which represents their position in the Merkle tree. Receiving a leaf_index is not a hard requirement and is application specific. Alternatively, the users can calculate the leaf_index themselves upon successful registration.

Signaling

After registration, the users can participate in the application by sending signals to the other participants in a decentralised manner or to a centralised server. Along with their signal, they need to generate a ZK-Proof by using the circuit described in the ZK Circuits specification section.

For generating a proof, the users need to obtain the required parameters or compute them themselves, depending on the application implementation and client libraries supported by the application. For example the users can store the membership Merkle tree on their end and generate a Merkle proof whenever they want to generate a signal.

Implementation notes

Signal hash

The signal hash can be generated by hashing the raw signal (or content) using the keccak256 hash function.

External nullifier

The external nullifier MUST be computed as the Poseidon hash of the current epoch (e.g. a value equal to or derived from the current UNIX timestamp divided by the epoch length) and the RLN identifier.

external_nullifier = poseidonHash([epoch, rln_identifier])
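
The epoch derivation mentioned above can be sketched as follows. This is a minimal illustration, assuming epochs are fixed-length windows counted from the UNIX epoch; the function names and the `epoch_length_seconds` parameter are hypothetical and not defined by this specification. The Poseidon hashing step that combines the epoch with the rln_identifier is not shown.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Maps a UNIX timestamp (in seconds) to an epoch index by integer division.
/// Assumption: epochs are fixed-length windows counted from the UNIX epoch.
fn epoch_from_timestamp(unix_seconds: u64, epoch_length_seconds: u64) -> u64 {
    assert!(epoch_length_seconds > 0, "epoch length must be non-zero");
    unix_seconds / epoch_length_seconds
}

/// Epoch index for the current system time (hypothetical helper).
fn current_epoch(epoch_length_seconds: u64) -> u64 {
    let now = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock is set before the UNIX epoch")
        .as_secs();
    epoch_from_timestamp(now, epoch_length_seconds)
}
```

All timestamps that fall inside the same window map to the same epoch, so every signal sent within that window shares one external nullifier.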

Obtaining Merkle proof

The Merkle proof should be obtained locally or from a trusted third party. By using the incremental Merkle tree algorithm, the Merkle proof can be obtained by providing the leaf_index of the identity_commitment. The proof (Merkle_proof) is composed of the following fields:

{
  root: bigint
  indices: number[]
  path_elements: bigint[][]
}

  1. root - The root of the membership group Merkle tree at the time of publishing the message
  2. indices - The index fields of the leaves in the Merkle tree - used by the Merkle tree algorithm for verification
  3. path_elements - Auxiliary data structure used for storing the path to the leaf - used by the Merkle proof algorithm for verification
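
How these three fields fit together can be sketched by recomputing the root from a leaf. This is an illustration only: `toy_hash` is a non-cryptographic stand-in for Poseidon, the modulus `P` is a toy value rather than the circuit's field, and the code assumes that `indices[i]` is the position of the current node among its siblings at level `i`. All names here are hypothetical.

```rust
/// Toy prime modulus (2^61 - 1). Assumption: stands in for the circuit's field.
const P: u64 = (1 << 61) - 1;

/// Stand-in for poseidonHash: FNV-1a style mixing reduced mod P.
/// NOT Poseidon and not cryptographically secure.
fn toy_hash(inputs: &[u64]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for &v in inputs {
        for b in v.to_le_bytes() {
            h ^= b as u64;
            h = h.wrapping_mul(0x100000001b3);
        }
    }
    h % P
}

struct MerkleProof {
    root: u64,
    indices: Vec<usize>,
    path_elements: Vec<Vec<u64>>,
}

/// Walks from the leaf to the root, hashing the node together with its siblings.
fn compute_root(leaf: u64, indices: &[usize], path_elements: &[Vec<u64>]) -> u64 {
    let mut node = leaf;
    for (&idx, siblings) in indices.iter().zip(path_elements) {
        let mut children = siblings.clone();
        children.insert(idx, node); // place the current node among its siblings
        node = toy_hash(&children);
    }
    node
}

/// Checks the proof shape first, then compares the recomputed root.
fn verify_membership(leaf: u64, proof: &MerkleProof) -> bool {
    proof.indices.len() == proof.path_elements.len()
        && proof
            .indices
            .iter()
            .zip(&proof.path_elements)
            .all(|(&i, siblings)| i <= siblings.len())
        && compute_root(leaf, &proof.indices, &proof.path_elements) == proof.root
}
```

Because each level carries a list of siblings, the same loop covers a binary tree (one sibling per level) and wider trees (several siblings per level).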

Generating proof

For proof generation, the user needs to submit the following fields to the circuit:

    {
      identity_secret: identity_secret_hash,
      path_elements: Merkle_proof.path_elements,
      identity_path_index: Merkle_proof.indices,
      x: signal_hash,
      epoch: epoch,
      rln_identifier: rln_identifier
    }

Calculating output

The proof output is calculated locally, in order for the required fields for proof verification to be sent along with the proof. The proof output is composed of the y share of the secret equation and the internal_nullifier. The internal_nullifier represents a unique fingerprint of a user for a given epoch and app. The following fields are needed for proof output calculation:

{
  identity_secret_hash: bigint, 
  epoch: bigint, 
  rln_identifier: bigint,
  x: bigint, 
}

The output [y, internal_nullifier] is calculated in the following way:

external_nullifier = poseidonHash([epoch, rln_identifier])

a_0 = identity_secret_hash
a_1 = poseidonHash([a_0, external_nullifier])

y = a_0 + x * a_1

internal_nullifier = poseidonHash([a_1])

This calculation relies on the properties of Shamir's Secret Sharing scheme.
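
The output calculation can be sketched end to end as follows. This is a hedged illustration, not the circuit: `toy_hash` is a non-cryptographic stand-in for poseidonHash, `P` is a toy prime instead of the circuit's field characteristic, and `calculate_output` is a hypothetical name.

```rust
/// Toy prime modulus (2^61 - 1). Assumption: stands in for the circuit's field.
const P: u64 = (1 << 61) - 1;

/// Stand-in for poseidonHash: FNV-1a style mixing reduced mod P.
/// NOT Poseidon and not cryptographically secure.
fn toy_hash(inputs: &[u64]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for &v in inputs {
        for b in v.to_le_bytes() {
            h ^= b as u64;
            h = h.wrapping_mul(0x100000001b3);
        }
    }
    h % P
}

/// Modular multiplication through a 128-bit intermediate to avoid overflow.
fn mul_mod(a: u64, b: u64) -> u64 {
    ((a as u128 * b as u128) % P as u128) as u64
}

struct ProofOutput {
    y: u64,
    internal_nullifier: u64,
}

/// Mirrors the steps listed above: external_nullifier, a_0, a_1, y, internal_nullifier.
fn calculate_output(identity_secret_hash: u64, epoch: u64, rln_identifier: u64, x: u64) -> ProofOutput {
    let external_nullifier = toy_hash(&[epoch, rln_identifier]);
    let a_0 = identity_secret_hash % P;
    let a_1 = toy_hash(&[a_0, external_nullifier]);
    let y = (a_0 + mul_mod(x % P, a_1)) % P;
    let internal_nullifier = toy_hash(&[a_1]);
    ProofOutput { y, internal_nullifier }
}
```

Two signals sent by the same identity in the same epoch and app yield the same internal_nullifier but different y values, which is what lets verifiers group the shares.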

Sending the output message

The user's output message (output_message), which carries the signal, should contain at least the following fields:

{
    signal: signal, # non-hashed signal
    proof: zk_proof,
    internal_nullifier: internal_nullifier,
    x: x, # signal_hash
    y: y,
    rln_identifier: rln_identifier
}

Additionally depending on the application, the following fields might be required:

  {
      root: Merkle_proof.root,
      epoch: epoch
  }

Verification and slashing

The slashing implementation is dependent on the type of application. If the application is implemented in a centralised manner, and everything is stored on a single server, the slashing will be implemented only on the server. Otherwise if the application is distributed, the slashing will be implemented on each user's client.

Implementation notes

Each participant that verifies messages (a server or a user's client) needs to store metadata for every message received from each sender during the given epoch. The data can be deleted once the epoch has passed. Storing this metadata is required so that a user who sends more than one unique signal per epoch can be slashed and removed from the protocol. The stored metadata contains the x and y shares and the internal_nullifier of each message. If enough shares are present for the same internal_nullifier, the user's secret can be retrieved.

One way of storing received metadata (messaging_metadata) is the following format:

{
    [external_nullifier]: {
        [internal_nullifier]: {
            x_shares: [],
            y_shares: []
        }
    }
}
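
One possible in-memory shape for this structure, including the deletion of metadata from past epochs, is sketched below. The type and method names are hypothetical, and nullifiers and shares are shown as `u64` for brevity where a real implementation would use field elements.

```rust
use std::collections::HashMap;

/// The x and y shares collected for one internal_nullifier.
#[derive(Default)]
struct Shares {
    x_shares: Vec<u64>,
    y_shares: Vec<u64>,
}

/// external_nullifier -> internal_nullifier -> shares
#[derive(Default)]
struct MessagingMetadata {
    by_external: HashMap<u64, HashMap<u64, Shares>>,
}

impl MessagingMetadata {
    /// Stores one (x, y) share and returns how many shares are now held
    /// for this (external_nullifier, internal_nullifier) pair.
    fn add_share(&mut self, external_nullifier: u64, internal_nullifier: u64, x: u64, y: u64) -> usize {
        let shares = self
            .by_external
            .entry(external_nullifier)
            .or_default()
            .entry(internal_nullifier)
            .or_default();
        shares.x_shares.push(x);
        shares.y_shares.push(y);
        shares.x_shares.len()
    }

    /// Drops all metadata that does not belong to the current external nullifier.
    fn prune_except(&mut self, current_external_nullifier: u64) {
        self.by_external
            .retain(|external, _| *external == current_external_nullifier);
    }
}
```

Keying the outer map by external_nullifier makes it cheap to discard a whole epoch at once.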

Verification

The output message verification consists of the following steps:

  • external_nullifier correctness
  • non-duplicate message check
  • zk_proof verification
  • double signaling (spam) verification

1. external_nullifier correctness

Upon receiving an output_message, the epoch and rln_identifier fields are checked first, to ensure that the message matches the current external_nullifier. If the external_nullifier is correct, the verification continues; otherwise, the message is discarded.

2. non-duplicate message check

The received message is checked to ensure it is not a duplicate. The check verifies that the x and y fields do not already exist in the messaging_metadata object. If the x and y fields exist in the x_shares and y_shares arrays for the message's external_nullifier and internal_nullifier, the message is considered a duplicate. Duplicate messages are discarded.

3. zk_proof verification

The zk_proof should be verified by providing the zk_proof field to the circuit verifier along with the public_signal:

[
    y,
    Merkle_proof.root,
    internal_nullifier,
    x, # signal_hash
    epoch,
    rln_identifier
]

If the proof is valid, the verification continues; otherwise, the message is discarded.

4. Double signaling verification

After the proof is verified, the x and y fields are added to the x_shares and y_shares arrays stored in messaging_metadata under the message's external_nullifier and internal_nullifier. If the number of stored shares reaches the signaling threshold (two shares, given the limit of one signal per external nullifier), the user can be slashed.
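
The four verification steps can be sketched as a single function. This is an illustration under stated assumptions: the zk_proof check is represented by a placeholder boolean (`proof_is_valid`) rather than a real verifier call, the store is a flat map keyed by (external_nullifier, internal_nullifier), and all names are hypothetical.

```rust
use std::collections::HashMap;

/// Assumption: one signal is allowed per external nullifier.
const SIGNAL_LIMIT: usize = 1;

#[derive(Debug, PartialEq)]
enum Verdict {
    WrongExternalNullifier,
    Duplicate,
    InvalidProof,
    Accepted,
    DoubleSignal,
}

struct OutputMessage {
    external_nullifier: u64,
    internal_nullifier: u64,
    x: u64,
    y: u64,
    /// Placeholder for the result of the real zk_proof verification.
    proof_is_valid: bool,
}

/// (external_nullifier, internal_nullifier) -> list of (x, y) shares
type ShareStore = HashMap<(u64, u64), Vec<(u64, u64)>>;

fn verify_message(store: &mut ShareStore, msg: &OutputMessage, current_external_nullifier: u64) -> Verdict {
    // 1. external_nullifier correctness
    if msg.external_nullifier != current_external_nullifier {
        return Verdict::WrongExternalNullifier;
    }
    // 2. non-duplicate message check
    let key = (msg.external_nullifier, msg.internal_nullifier);
    let share = (msg.x, msg.y);
    if store.get(&key).map_or(false, |shares| shares.contains(&share)) {
        return Verdict::Duplicate;
    }
    // 3. zk_proof verification (placeholder flag, see the struct comment)
    if !msg.proof_is_valid {
        return Verdict::InvalidProof;
    }
    // 4. double signaling verification: store the share, then count
    let shares = store.entry(key).or_default();
    shares.push(share);
    if shares.len() > SIGNAL_LIMIT {
        Verdict::DoubleSignal
    } else {
        Verdict::Accepted
    }
}
```

Shares are stored only after the proof check passes, so messages with invalid proofs cannot fill the store.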

Slashing

After the verification, the user can be slashed if two different shares are present for their internal_nullifier, because these are enough to reconstruct their identity_secret_hash from the x_shares and y_shares fields. The secret can be retrieved by the properties of Shamir's Secret Sharing scheme. In particular, the secret (a_0) can be retrieved by computing Lagrange polynomials.

After the secret is retrieved, the user's identity_commitment can be generated from the secret and used to remove the user from the membership Merkle tree (zeroing out the leaf that contains the user's identity_commitment). Additionally, depending on the application, the identity_secret_hash can be used for taking the user's provided stake.
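
For the degree-1 polynomial used here, the Lagrange interpolation at zero reduces to a closed form over two shares: a_0 = (y_1 * x_2 - y_2 * x_1) / (x_2 - x_1). A minimal sketch follows, assuming a toy prime modulus `P` in place of the circuit's field; `recover_secret` and the helper names are hypothetical and not part of any RLN library.

```rust
/// Toy prime modulus (2^61 - 1). Assumption: stands in for the circuit's field.
const P: u64 = (1 << 61) - 1;

fn mul_mod(a: u64, b: u64) -> u64 {
    ((a as u128 * b as u128) % P as u128) as u64
}

/// (a - b) mod P for inputs already reduced mod P.
fn sub_mod(a: u64, b: u64) -> u64 {
    (a + P - b) % P
}

/// Square-and-multiply exponentiation mod P.
fn pow_mod(mut base: u64, mut exp: u64) -> u64 {
    let mut acc = 1u64;
    base %= P;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = mul_mod(acc, base);
        }
        base = mul_mod(base, base);
        exp >>= 1;
    }
    acc
}

/// Modular inverse via Fermat's little theorem (valid because P is prime).
fn inv_mod(a: u64) -> u64 {
    pow_mod(a, P - 2)
}

/// Recovers a_0 from two shares of y = a_0 + x * a_1.
/// Returns None when both shares use the same x, since the line is then undetermined.
fn recover_secret(share_1: (u64, u64), share_2: (u64, u64)) -> Option<u64> {
    let (x1, y1) = (share_1.0 % P, share_1.1 % P);
    let (x2, y2) = (share_2.0 % P, share_2.1 % P);
    if x1 == x2 {
        return None;
    }
    let numerator = sub_mod(mul_mod(y1, x2), mul_mod(y2, x1));
    let denominator = sub_mod(x2, x1);
    Some(mul_mod(numerator, inv_mod(denominator)))
}
```

The recovered value corresponds to the identity_secret_hash, from which the identity_commitment can be recomputed and the matching leaf removed.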

Technical overview

The main RLN construct is implemented using a ZK-SNARK circuit. However, it is helpful to describe the other necessary outside components for interaction with the circuit, which together with the ZK-SNARK circuit enable the above mentioned features.

Terminology

| Term | Description |
|---|---|
| ZK-SNARK | https://z.cash/technology/zksnarks/ |
| Stake | Financial or social stake required for registering in the RLN applications. Common stake examples are: locking cryptocurrency (financial), linking reputable social identity. |
| Identity secret | An array of two unique random components (identity nullifier and identity trapdoor), which must be kept private by the user. Secret hash and identity commitment are derived from this array. |
| Identity nullifier | Random 32 byte value used as component for identity secret generation. |
| Identity trapdoor | Random 32 byte value used as component for identity secret generation. |
| Identity secret hash | The hash of the identity secret, obtained using the Poseidon hash function. It is used for deriving the identity commitment of the user, and as a private input for zk proof generation. The secret hash should be kept private by the user. |
| Identity commitment | Hash obtained from the Identity secret hash by using the Poseidon hash function. It is used by the users for registering in the protocol. |
| Signal | The message generated by a user. It is an arbitrary bit string that may represent a chat message, a URL request, protobuf message, etc. |
| Signal hash | Keccak256 hash of the signal modulo the circuit's field characteristic, used as an input in the RLN circuit. |
| RLN Identifier | Random finite field value unique per RLN app. It is used for additional cross-application security. The role of the RLN identifier is protection of the user secrets from being compromised when signals are being generated with the same credentials in different apps. |
| RLN membership tree | Merkle tree data structure, filled with identity commitments of the users. Serves as a data structure that ensures user registrations. |
| Merkle proof | Proof that a user is a member of the RLN membership tree. |

RLN ZK-Circuit specific terms

| Term | Description |
|---|---|
| x | Keccak hash of the signal, same as signal hash (defined above). |
| A0 | The identity secret hash. |
| A1 | Poseidon hash of [A0, External nullifier] (see External nullifier below). |
| y | The result of the polynomial equation (y = a0 + a1*x). The public output of the circuit. |
| External nullifier | Poseidon hash of [Epoch, RLN Identifier]. An identifier that groups signals and can be thought of as a voting booth. |
| Internal nullifier | Poseidon hash of [A1]. This field ensures that a user can send only one valid signal per external nullifier without risking being slashed. Public output of the circuit. |

ZK Circuits specification

Anonymous signaling with a controlled rate limit is enabled by proving that the user is part of a group which has high barriers to entry (a form of stake) and by enabling secret reveal if more than one unique signal is produced per external nullifier. The membership part is implemented using membership Merkle trees and Merkle proofs, while the secret reveal part is enabled by Shamir's Secret Sharing scheme. Essentially, the protocol requires users to generate a zero-knowledge proof in order to send signals and participate in the application. The zero-knowledge proof proves that the user is a member of the group, and also forces the user to reveal a share of their secret for each signal within an external nullifier. The external nullifier is usually represented by a timestamp or a time interval. It can also be thought of as a voting booth in voting applications.

The ZK Circuit is implemented using a Groth-16 ZK-SNARK, using the circomlib library.

System parameters

  • n_levels - Merkle tree depth

Circuit parameters

Public Inputs

  • x
  • epoch
  • rln_identifier

Private Inputs

  • identity_secret_hash
  • path_elements - rln membership proof component
  • identity_path_index - rln membership proof component

Outputs

  • y
  • root - the rln membership tree root
  • internal_nullifier

Hash function

Canonical Poseidon hash implementation is used, as implemented in the circomlib library, according to the Poseidon paper. This Poseidon hash version (canonical implementation) uses the following parameters:

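| Hash inputs | t | RF | RP |
|---|---|---|---|
| 1 | 2 | 8 | 56 |
| 2 | 3 | 8 | 57 |
| 3 | 4 | 8 | 56 |
| 4 | 5 | 8 | 60 |
| 5 | 6 | 8 | 60 |
| 6 | 7 | 8 | 63 |
| 7 | 8 | 8 | 64 |
| 8 | 9 | 8 | 63 |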

Membership implementation

For a valid signal, a user's identity_commitment (more on identity commitments below) must exist in the identity membership tree. Membership is proven by providing a membership proof (witness). The fields from the membership proof required for the verification are path_elements and identity_path_index.

IncrementalQuinTree algorithm is used for constructing the Membership Merkle tree. The circuits are reused from this repository. You can find out more details about the IncrementalQuinTree algorithm here.

Slashing and Shamir's Secret Sharing

Slashing is enabled by using polynomials and Shamir's Secret Sharing. In order to produce a valid proof, the identity_secret_hash is provided as a private input to the circuit. Then a secret equation is created in the form of:

y = a_0 + x * a_1,

where a_0 is the identity_secret_hash and a_1 = hash(a_0, external nullifier). Along with the generated proof, the users need to provide an (x, y) share which satisfies the line equation, in order for their proof to be verified. x is the hashed signal, i.e. the evaluation point, while y is the circuit output. With two or more distinct shares, anyone can derive a_0, i.e. the identity_secret_hash. In this way, a member who sends more than one unique signal per external_nullifier risks their identity secret being revealed.
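
As a worked step, the derivation of a_0 from two shares can be written out explicitly. The share names (x_1, y_1) and (x_2, y_2) are introduced here for illustration, and all arithmetic is understood to be in the circuit's finite field.

```latex
\begin{aligned}
y_1 &= a_0 + x_1 a_1, \qquad y_2 = a_0 + x_2 a_1, \qquad x_1 \neq x_2 \\
a_1 &= \frac{y_2 - y_1}{x_2 - x_1} \\
a_0 &= y_1 - x_1 a_1 = \frac{y_1 x_2 - y_2 x_1}{x_2 - x_1}
\end{aligned}
```

With only one share (and x_1 ≠ 0) there is a single equation in the two unknowns a_0 and a_1, so this linear relation alone does not determine a_0.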

Note that shares used in different epochs and different RLN apps cannot be used to derive the identity secret hash.

Thanks to the external_nullifier definition, shares computed from the same secret within the same epoch but in different RLN apps also cannot be used to derive the identity secret hash.

The rln_identifier is a random value from a finite field, unique per RLN app, and is used for additional cross-application security: it protects user secrets from being compromised when the same credentials are used across different RLN apps. Without the rln_identifier, if a user uses the same credentials and sends a different message in two different RLN apps that share the same external_nullifier, their signals can be grouped by the internal_nullifier, which could lead to the user's secret being revealed. This is because two separate signals under the same internal_nullifier can be treated as a rate limiting violation. Adding the rln_identifier field makes the internal_nullifier differ between apps, which mitigates this kind of attack.

Identity credentials generation

In order to be able to generate valid proofs, the users need to be part of the identity membership Merkle tree. They are part of the identity membership Merkle tree if their identity_commitment is placed in a leaf in the tree.

The identity credentials of a user are composed of:

  • identity_secret
  • identity_secret_hash
  • identity_commitment

identity_secret

The identity_secret is generated in the following way:

    identity_nullifier = random_32_byte_buffer
    identity_trapdoor = random_32_byte_buffer
    identity_secret = [identity_nullifier, identity_trapdoor]

The same secret should not be used across different protocols, because revealing the secret in one protocol could break the user's privacy in the other protocols.

identity_secret_hash

The identity_secret_hash is generated by obtaining a Poseidon hash of the identity_secret array:

    identity_secret_hash = poseidonHash(identity_secret)

identity_commitment

The identity_commitment is generated by obtaining a Poseidon hash of the identity_secret_hash:

identity_commitment = poseidonHash([identity_secret_hash])
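
The three derivation steps can be chained as in the following sketch. It is illustrative only: `toy_hash` is a non-cryptographic stand-in for poseidonHash, the two random components are passed in as `u64` values supplied by the caller instead of random 32 byte buffers, and `derive_credentials` is a hypothetical name.

```rust
/// Toy prime modulus (2^61 - 1). Assumption: stands in for the circuit's field.
const P: u64 = (1 << 61) - 1;

/// Stand-in for poseidonHash: FNV-1a style mixing reduced mod P.
/// NOT Poseidon and not cryptographically secure.
fn toy_hash(inputs: &[u64]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for &v in inputs {
        for b in v.to_le_bytes() {
            h ^= b as u64;
            h = h.wrapping_mul(0x100000001b3);
        }
    }
    h % P
}

struct IdentityCredentials {
    identity_secret: [u64; 2],
    identity_secret_hash: u64,
    identity_commitment: u64,
}

/// Derives the secret hash and the commitment from the two random components.
/// The caller is responsible for supplying fresh randomness.
fn derive_credentials(identity_nullifier: u64, identity_trapdoor: u64) -> IdentityCredentials {
    let identity_secret = [identity_nullifier, identity_trapdoor];
    let identity_secret_hash = toy_hash(&identity_secret);
    let identity_commitment = toy_hash(&[identity_secret_hash]);
    IdentityCredentials {
        identity_secret,
        identity_secret_hash,
        identity_commitment,
    }
}
```

Only the identity_commitment is meant to leave the user's device; the other two fields stay private.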

Appendix A: Security considerations

RLN is an experimental and still unaudited technology: the circuits have not yet been audited. Another consideration is the security of the underlying primitives. zk-SNARKs require a trusted setup for generating the prover and verifier keys. The standard approach is to use a trusted Multi-Party Computation (MPC) ceremony, which requires two phases. A trusted MPC ceremony has not yet been performed for the RLN circuits.

Appendix B: Identity scheme choice

The hashing scheme used is based on the design decisions which also include the Semaphore circuits. Our goal was to ensure compatibility of the secrets for apps that use Semaphore and RLN circuits while also not compromising on security because of using the same secrets.

For example, let's say there is a voting app that uses Semaphore, and also a chat app that uses RLN. The UX would be better if the users did not need to care about complicated identity management (secrets and commitments) for each app they use, and it would be much better if they could use a single id commitment for this. Also, in some cases this kind of dependency is required - for example, an RLN chat app using Interep as a registry (instead of using financial stake). One potential concern about this interoperability is a slashed user on the RLN app side having their security compromised on the Semaphore side apps as well, i.e. by obtaining the user's secret, anyone would be able to generate valid Semaphore proofs as the slashed user. We don't want that, and we should keep a user's app-specific security threats in the domain of that app alone.

To achieve the above interoperability UX while preventing the shared app security model (i.e. slashing a user on an RLN app having an impact on Semaphore apps), we had to do the following with regard to the identity secret and identity commitment:

    identity_secret = [identity_nullifier, identity_trapdoor]
    identity_secret_hash = poseidonHash(identity_secret)
    identity_commitment = poseidonHash([identity_secret_hash])

Secret components for generating Semaphore proof:

identity_nullifier
identity_trapdoor

Secret components for generating RLN proof:

identity_secret_hash

When a user is slashed on the RLN app side, their identity secret hash is revealed. However a semaphore proof can't be generated because we do not know the user's nullifier and trapdoor.

With this design we achieve:

identity commitment (Semaphore) == identity commitment (RLN)
secret (Semaphore) != secret (RLN)

This is the only option we had for the scheme in order to satisfy the properties described above.

Also, the RLN circuit takes a single secret component as input. Thus we need to hash the secret array (two components) into a secret hash, and we use that as the secret component input.

Appendix C: Auxiliary tooling

There are a few additional tools implemented for easier integration and usage of the RLN protocol.

zerokit is a set of Zero Knowledge modules, written in Rust and designed to be used in many different environments. Among different modules, it supports Semaphore and RLN.

zk-kit is a TypeScript library which exposes APIs for identity credentials generation, as well as proof generation. It supports various protocols (Semaphore, RLN).

zk-keeper is a browser plugin which allows for safe credential storing and proof generation. You can think of it as MetaMask for ZK-Proofs. It uses zk-kit under the hood.

Appendix D: Example usage

The following examples are code snippets using the zerokit RLN module. The examples are written in Rust.

Creating an RLN object

    use rln::protocol::*;
    use rln::public::*;
    use std::io::Cursor;

    // We set the RLN parameters: 
    // - the tree height;
    // - the circuit resource folder (requires a trailing "/").
    let tree_height = 20;
    let resources = Cursor::new("../zerokit/rln/resources/tree_height_20/");

    // We create a new RLN instance
    let mut rln = RLN::new(tree_height, resources);

Generating identity credentials

    // We generate an identity tuple
    let mut buffer = Cursor::new(Vec::<u8>::new());
    rln.extended_key_gen(&mut buffer).unwrap();

    // We deserialize the keygen output to obtain the identity components
    // (trapdoor, nullifier, secret hash) and the id_commitment
    let (identity_trapdoor, identity_nullifier, identity_secret_hash, id_commitment) = deserialize_identity_tuple(buffer.into_inner());

Adding ID commitment to the RLN Merkle tree

    // We define the tree index where id_commitment will be added
    let id_index = 10;

    // We serialize id_commitment and pass it to set_leaf
    let mut buffer = Cursor::new(serialize_field_element(id_commitment));
    rln.set_leaf(id_index, &mut buffer).unwrap();

Setting epoch and signal

    // We generate the epoch from a date seed and ensure it is
    // mapped to a field element by hashing-to-field its content
    let epoch = hash_to_field(b"Today at noon, this year");

    // We set our signal 
    let signal = b"RLN is awesome";

Generating proof

    // We prepare input to the proof generation routine, using the
    // identity_secret_hash obtained from the identity tuple above
    let proof_input = prepare_prove_input(identity_secret_hash, id_index, epoch, signal);

    // We generate a RLN proof for proof_input
    let mut in_buffer = Cursor::new(proof_input);
    let mut out_buffer = Cursor::new(Vec::<u8>::new());
    rln.generate_rln_proof(&mut in_buffer, &mut out_buffer)
        .unwrap();

    // We get the public outputs returned by the circuit evaluation
    let proof_data = out_buffer.into_inner();

Verifying proof

    // We prepare input to the proof verification routine
    let verify_data = prepare_verify_input(proof_data, signal);

    // We verify the zk-proof against the provided proof values
    let mut in_buffer = Cursor::new(verify_data);
    let verified = rln.verify(&mut in_buffer).unwrap();

    // We ensure the proof is valid
    assert!(verified);

For more details please visit the zerokit library.

Copyright

Copyright and related rights waived via CC0

References