> **Note:** This version of WAKU2-STORE is earmarked to replace RFC [13/WAKU2-STORE](https://github.com/vacp2p/rfc-index/blob/7b443c1aab627894e3f22f5adfbb93f4c4eac4f6/waku/standards/core/13/store.md) once it reaches draft status.
The concept of `ephemeral` messages introduced in [`WAKU2-MESSAGE`](https://github.com/vacp2p/rfc-index/blob/7b443c1aab627894e3f22f5adfbb93f4c4eac4f6/waku/standards/core/14/message.md) affects `WAKU2-STORE` as well.
Nodes running `WAKU2-STORE` SHOULD support `ephemeral` messages as specified in [14/WAKU2-MESSAGE](https://github.com/vacp2p/rfc-index/blob/7b443c1aab627894e3f22f5adfbb93f4c4eac4f6/waku/standards/core/14/message.md).
Nodes running `WAKU2-STORE` SHOULD NOT store messages with the `ephemeral` flag set to `true`.
The store query protocol operates as a query protocol for a key-value store of historical Waku messages,
with each entry having a [14/WAKU2-MESSAGE](https://github.com/vacp2p/rfc-index/blob/7b443c1aab627894e3f22f5adfbb93f4c4eac4f6/waku/standards/core/14/message.md) and associated pubsub topic as value,
and [deterministic message hash](https://github.com/vacp2p/rfc-index/blob/7b443c1aab627894e3f22f5adfbb93f4c4eac4f6/waku/standards/core/14/message.md#deterministic-message-hashing) as key.
The store can be queried to return either a set of keys or a set of key-value pairs.
Within the store query protocol, Waku message keys and values MUST be represented in a `WakuMessageKeyValue` message.
This message MUST contain the deterministic `message_hash` as key.
It MAY contain the full `WakuMessage` and associated pubsub topic as value in the `message` and `pubsub_topic` fields,
depending on the use case as set out below.
If the message contains a value entry in addition to the key,
both the `message` and `pubsub_topic` fields MUST be populated.
The message MUST NOT have either `message` or `pubsub_topic` populated with the other unset.
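The key/value rules above can be sketched as a small well-formedness check. The field names follow the text; the Python types (a serialized `WakuMessage` held as opaque bytes) and the helper name `is_well_formed` are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WakuMessageKeyValue:
    # Field names follow the spec text; the Python types are assumptions.
    message_hash: bytes
    message: Optional[bytes] = None      # serialized WakuMessage, treated as opaque here
    pubsub_topic: Optional[str] = None


def is_well_formed(entry: WakuMessageKeyValue) -> bool:
    """The key is mandatory; the two value fields are set together or not at all."""
    if not entry.message_hash:
        return False
    has_message = entry.message is not None
    has_topic = entry.pubsub_topic is not None
    return has_message == has_topic
```

A key-only entry and a full key-value entry both pass; an entry with only one of `message` or `pubsub_topic` populated does not.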
In order for a Waku message to be eligible for storage:
- it MUST be a _valid_ [14/WAKU2-MESSAGE](https://github.com/vacp2p/rfc-index/blob/7b443c1aab627894e3f22f5adfbb93f4c4eac4f6/waku/standards/core/14/message.md).
- the `timestamp` field MUST be populated with the Unix epoch time at which the message was generated in nanoseconds.
If at the time of storage the `timestamp` deviates by more than 20 seconds
either into the past or the future when compared to the store node’s internal clock,
the store node MAY reject the message.
- the `ephemeral` field MUST be set to `false`.
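As a rough illustration of the eligibility rules above, the following sketch checks the `ephemeral` flag and the timestamp window. The `Message` class is a hypothetical stand-in for `WakuMessage`, the check for general message validity is omitted, and the sketch chooses to reject messages outside the 20-second window, which the text only permits (MAY) rather than requires.

```python
import time
from dataclasses import dataclass
from typing import Optional

MAX_CLOCK_DRIFT_NS = 20 * 1_000_000_000  # 20 seconds, expressed in nanoseconds


@dataclass
class Message:
    # Hypothetical stand-in for WakuMessage; only the fields used here.
    timestamp: int = 0        # Unix epoch in nanoseconds; 0 is treated as unset
    ephemeral: bool = False


def is_eligible(msg: Message, now_ns: Optional[int] = None) -> bool:
    """Return True if the message may be stored under this sketch's policy."""
    if now_ns is None:
        now_ns = time.time_ns()   # store node's internal clock
    if msg.ephemeral:
        return False              # ephemeral messages are not stored
    if msg.timestamp <= 0:
        return False              # timestamp must be populated
    # Reject only when the deviation is strictly more than 20 seconds.
    return abs(now_ns - msg.timestamp) <= MAX_CLOCK_DRIFT_NS
```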
### Waku message sorting
The key-value entries in the store MUST be time-sorted by the `WakuMessage` `timestamp` attribute.
Where two or more key-value entries have identical `timestamps`,
the entries MUST be further sorted by the natural order of their message hash keys.
Within the context of traversing over key-value entries in the store,
_"forward"_ indicates traversing the entries in ascending order,
whereas _"backward"_ indicates traversing the entries in descending order.
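The ordering above amounts to sorting on the pair (timestamp, message hash). A minimal sketch, assuming entries are represented as `(timestamp, message_hash)` tuples and that the "natural order" of hash keys means lexicographic byte order (the text does not define it further):

```python
from typing import List, Tuple

Entry = Tuple[int, bytes]  # (timestamp in nanoseconds, message hash); an assumed representation


def sort_entries(entries: List[Entry], forward: bool = True) -> List[Entry]:
    """Sort by timestamp, breaking ties by message hash; descending when not forward."""
    return sorted(entries, key=lambda e: (e[0], e[1]), reverse=not forward)
```

For example, two entries sharing timestamp `1` are ordered by their hash bytes, and a backward traversal is simply the reverse of the forward one.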
### Pagination
If a large number of entries in the store service node match the query criteria provided in a `StoreQueryRequest`,
the client MAY make use of pagination
in a chain of store query request and response transactions
to retrieve the full response in smaller batches termed _"pages"_.
Pagination can be performed either in [a _forward_ or _backward_ direction](#waku-message-sorting).
A store query client MAY indicate the maximum number of matching entries it wants in the `StoreQueryResponse`,
by setting the page size limit in the `pagination_limit` field.
Note that a store service node MAY enforce its own limit
if the `pagination_limit` is unset
or larger than the service node's internal page size limit.
A `StoreQueryResponse` with a populated `pagination_cursor` indicates that more stored entries match the query than included in the response.
A `StoreQueryResponse` without a populated `pagination_cursor` indicates that
there are no more matching entries in the store.
The client MAY request the next page of entries from the store service node
by populating a subsequent `StoreQueryRequest` with the `pagination_cursor` received in the `StoreQueryResponse`.
All other fields and query criteria MUST be the same as in the preceding `StoreQueryRequest`.
A `StoreQueryRequest` without a populated `pagination_cursor` indicates that
the client wants to retrieve the "first page" of the stored entries matching the query.
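The request and response chain described above could look roughly like the following client-side loop. It is a sketch for forward queries only: requests and responses are modelled as plain dictionaries, and `send` is a hypothetical callable that handles transport (including the `request_id`) and returns the decoded response.

```python
from typing import Callable, List, Optional


def fetch_all(send: Callable[[dict], dict], base_request: dict) -> List[bytes]:
    """Collect the message hashes from every page of a forward query."""
    hashes: List[bytes] = []
    cursor: Optional[bytes] = None
    while True:
        request = dict(base_request)              # query criteria stay identical across pages
        if cursor is not None:
            request["pagination_cursor"] = cursor  # resume after the previous page
        response = send(request)
        if not 200 <= response["status_code"] < 300:
            raise RuntimeError(f"store query failed: {response['status_code']}")
        hashes.extend(m["message_hash"] for m in response["messages"])
        cursor = response.get("pagination_cursor")
        if cursor is None:                         # no cursor: no more matching entries
            return hashes
```

An unset cursor on the first request asks for the first page; the loop stops as soon as a response arrives without a `pagination_cursor`.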
## Store Query Request
A client node MUST send all historical message queries within a `StoreQueryRequest` message.
This request MUST contain a `request_id`.
The `request_id` MUST be a uniquely generated string.
If the store query client requires the store service node to include Waku message values in the query response,
it MUST set `include_data` to `true`.
If the store query client requires the store service node to return only message hash keys in the query response,
it SHOULD set `include_data` to `false`.
By default, therefore, the store service node assumes `include_data` to be `false`.
A store query client MAY include query filter criteria in the `StoreQueryRequest`.
There are two types of filter use cases:
1. Content filtered queries and
2. Message hash lookup queries
### Content filtered queries
A store query client MAY request the store service node to filter historical entries by a content filter.
Such a client MAY create a filter on content topic, on time range or on both.
To filter on content topic, the client MUST populate _both_ the `pubsub_topic` _and_ `content_topics` fields.
The client MUST NOT populate either `pubsub_topic` or `content_topics` and leave the other unset.
Both fields MUST either be set or unset.
A mixed content topic filter with just one of either `pubsub_topic` or `content_topics` set SHOULD be regarded as an invalid request.
To filter on time range, the client MUST set `time_start`, `time_end` or both.
Each `time_` field should contain a Unix epoch timestamp in nanoseconds.
An unset `time_start` SHOULD be interpreted as "from the oldest stored entry".
An unset `time_end` SHOULD be interpreted as "up to the youngest stored entry".
If any of the content filter fields are set,
namely `pubsub_topic`, `content_topics`, `time_start`, or `time_end`,
the client MUST NOT set the `message_hashes` field.
### Message hash lookup queries
A store query client MAY request the store service node to filter historical entries by one or more matching message hash keys.
This type of query acts as a "lookup" against a message hash key or set of keys already known to the client.
In order to perform a lookup query, the store query client MUST populate the `message_hashes` field with the list of message hash keys it wants to look up in the store service node.
If the `message_hashes` field is set,
the client MUST NOT set any of the content filter fields,
namely `pubsub_topic`, `content_topics`, `time_start`, or `time_end`.
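The constraints on the two query types can be summarised in a validation routine. This is a sketch only: the dataclass mirrors the field names used in the text, the Python types and the `validate` helper are assumptions, and an empty list is treated as "unset".

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StoreQueryRequest:
    # Field names follow the spec text; types and defaults are assumptions.
    request_id: str
    include_data: bool = False
    pubsub_topic: Optional[str] = None
    content_topics: List[str] = field(default_factory=list)
    time_start: Optional[int] = None
    time_end: Optional[int] = None
    message_hashes: List[bytes] = field(default_factory=list)


def validate(req: StoreQueryRequest) -> Optional[str]:
    """Return None for a valid request, otherwise a short reason string."""
    if not req.request_id:
        return "missing request_id"
    topic_set = req.pubsub_topic is not None
    content_set = len(req.content_topics) > 0
    if topic_set != content_set:
        return "pubsub_topic and content_topics must be set together"
    has_content_filter = (
        topic_set or req.time_start is not None or req.time_end is not None
    )
    if has_content_filter and req.message_hashes:
        return "content filter and message_hashes are mutually exclusive"
    return None
```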
### Presence queries
A presence query is a special type of lookup query that allows a client to check for the presence of one or more messages in the store service node,
without retrieving the full contents (values) of the messages.
This can, for example, be used as part of a reliability mechanism,
whereby store query clients verify that previously published messages have been successfully stored.
In order to perform a presence query,
the store query client MUST populate the `message_hashes` field in the `StoreQueryRequest` with the list of message hashes
for which it wants to verify presence in the store service node.
The `include_data` property MUST be set to `false`.
The client SHOULD interpret every `message_hash` returned in the `messages` field of the `StoreQueryResponse` as present in the store.
The client SHOULD assume that all other message hashes included in the original `StoreQueryRequest` but not in the `StoreQueryResponse` are not present in the store.
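A client could interpret a presence query response along these lines. The sketch assumes the response is complete (no `pagination_cursor` was returned) and models each entry in `messages` as a dictionary with a `message_hash` key; `split_presence` is a hypothetical helper name.

```python
from typing import Dict, List, Tuple


def split_presence(
    requested: List[bytes], response_messages: List[Dict[str, bytes]]
) -> Tuple[List[bytes], List[bytes]]:
    """Partition the requested hashes into (present, missing), keeping request order."""
    returned = {m["message_hash"] for m in response_messages}
    present = [h for h in requested if h in returned]
    missing = [h for h in requested if h not in returned]
    return present, missing
```

In a reliability mechanism, the `missing` list would be the set of previously published messages that the client may want to republish.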
### Pagination info
The store query client MAY include a message hash as `pagination_cursor`,
to indicate at which key-value entry a store service node SHOULD start the query.
The `pagination_cursor` is treated as exclusive
and the corresponding entry will not be included in subsequent store query responses.
For forward queries, only messages following (see [sorting](#waku-message-sorting)) the one indexed at `pagination_cursor` will be returned.
For backward queries, only messages preceding (see [sorting](#waku-message-sorting)) the one indexed at `pagination_cursor` will be returned.
If the store query client requires the store service node to perform a forward query,
it MUST set `pagination_forward` to `true`.
If the store query client requires the store service node to perform a backward query,
it SHOULD set `pagination_forward` to `false`.
By default, therefore, the store service node assumes pagination to be backward.
A store query client MAY indicate the maximum number of matching entries it wants in the `StoreQueryResponse`,
by setting the page size limit in the `pagination_limit` field.
Note that a store service node MAY enforce its own limit
if the `pagination_limit` is unset
or larger than the service node's internal page size limit.
See [pagination](#pagination) for more on how the pagination info is used in store transactions.
## Store Query Response
In response to any `StoreQueryRequest`,
a store service node SHOULD respond with a `StoreQueryResponse` with a `request_id` matching that of the request.
This response MUST contain a `status_code` indicating if the request was successful or not.
Successful status codes are in the `2xx` range.
Client nodes SHOULD consider all other status codes as error codes and assume that the requested operation has failed.
In addition, the store service node MAY choose to provide a more detailed status description in the `status_desc` field.
### Filter matching
For [content filtered queries](#content-filtered-queries), an entry in the store service node matches the filter criteria in a `StoreQueryRequest` if each of the following conditions is met:
- its `content_topic` is in the request `content_topics` set
and it was published on a matching `pubsub_topic` OR the request `content_topics` and `pubsub_topic` fields are unset
- its `timestamp` is _greater than or equal to_ the request `time_start` OR the request `time_start` is unset
- its `timestamp` is _smaller_ than the request `time_end` OR the request `time_end` is unset
Note that for content filtered queries, `time_start` is treated as _inclusive_ and `time_end` is treated as _exclusive_.
For [message hash lookup queries](#message-hash-lookup-queries), an entry in the store service node matches the filter criteria if its `message_hash` is in the request `message_hashes` set.
The store service node SHOULD respond with an error code and discard the request
if the store query request contains both content filter criteria and message hashes.
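The matching rules for both query types can be expressed as two predicates. This is an illustrative sketch: `StoredEntry` is a hypothetical flattened view of a stored key-value entry, and the function names are not part of the protocol.

```python
from dataclasses import dataclass
from typing import Collection, Optional


@dataclass
class StoredEntry:
    # Hypothetical flattened view of a stored key-value entry.
    message_hash: bytes
    pubsub_topic: str
    content_topic: str
    timestamp: int  # Unix epoch in nanoseconds


def matches_content_filter(
    entry: StoredEntry,
    pubsub_topic: Optional[str] = None,
    content_topics: Collection[str] = (),
    time_start: Optional[int] = None,
    time_end: Optional[int] = None,
) -> bool:
    """Apply the content filter; unset criteria match everything."""
    if pubsub_topic is not None or content_topics:
        if entry.pubsub_topic != pubsub_topic:
            return False
        if entry.content_topic not in content_topics:
            return False
    if time_start is not None and entry.timestamp < time_start:
        return False   # start of the range is inclusive
    if time_end is not None and entry.timestamp >= time_end:
        return False   # end of the range is exclusive
    return True


def matches_hash_lookup(entry: StoredEntry, message_hashes: Collection[bytes]) -> bool:
    """A lookup query matches purely on the message hash key."""
    return entry.message_hash in message_hashes
```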
### Populating response messages
The store service node SHOULD populate the `messages` field in the response
only with entries matching the filter criteria provided in the corresponding request.
Regardless of whether the response is to a _forward_ or _backward_ query,
the `messages` field in the response MUST be ordered in a forward direction
according to the [message sorting rules](#waku-message-sorting).
If the corresponding `StoreQueryRequest` has `include_data` set to true,
the service node SHOULD populate both the `message_hash` and `message` for each entry in the response.
In all other cases, the store service node SHOULD populate only the `message_hash` field for each entry in the response.
### Paginating the response
The response SHOULD NOT contain more `messages` than the `pagination_limit` provided in the corresponding `StoreQueryRequest`.
It is RECOMMENDED that the store node defines its own maximum page size internally.
If the `pagination_limit` in the request is unset,
or exceeds this internal maximum page size,
the store service node SHOULD ignore the `pagination_limit` field and apply its own internal maximum page size.
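The page size rule above reduces to a small clamp. In this sketch `MAX_PAGE_SIZE` is a hypothetical internal limit chosen for illustration, and an unset `pagination_limit` is modelled as `None`.

```python
from typing import Optional

MAX_PAGE_SIZE = 100  # hypothetical internal maximum; the spec leaves the value to the node


def effective_page_size(
    pagination_limit: Optional[int], max_page_size: int = MAX_PAGE_SIZE
) -> int:
    """Use the client's limit unless it is unset or exceeds the node's own maximum."""
    if pagination_limit is None or pagination_limit > max_page_size:
        return max_page_size
    return pagination_limit
```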
In response to a _forward_ `StoreQueryRequest`:
- if the `pagination_cursor` is set,
the store service node SHOULD populate the `messages` field
with matching entries following the `pagination_cursor` (exclusive).
- if the `pagination_cursor` is unset,
the store service node SHOULD populate the `messages` field
with matching entries from the first entry in the store.
- if there are still more matching entries in the store
after the maximum page size is reached while populating the response,
the store service node SHOULD populate the `pagination_cursor` in the `StoreQueryResponse`
with the message hash key of the _last_ entry _included_ in the response.
In response to a _backward_ `StoreQueryRequest`:
- if the `pagination_cursor` is set,
the store service node SHOULD populate the `messages` field
with matching entries preceding the `pagination_cursor` (exclusive).
- if the `pagination_cursor` is unset,
the store service node SHOULD populate the `messages` field
with matching entries from the last entry in the store.
- if there are still more matching entries in the store
after the maximum page size is reached while populating the response,
the store service node SHOULD populate the `pagination_cursor` in the `StoreQueryResponse`
with the message hash key of the _first_ entry _included_ in the response.
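The forward and backward rules above can be sketched together as a single service-side routine. The sketch makes several assumptions: `matching` holds the hash keys of all matching entries already in forward sort order, `limit` is a positive effective page size, and any supplied cursor refers to an entry in `matching` (a real node would need to handle a cursor it cannot find).

```python
from typing import List, Optional, Tuple


def paginate(
    matching: List[bytes],
    forward: bool,
    limit: int,
    cursor: Optional[bytes] = None,
) -> Tuple[List[bytes], Optional[bytes]]:
    """Return (page, next_cursor); the page is always in forward order."""
    if forward:
        # Start after the cursor (exclusive), or at the first entry.
        start = matching.index(cursor) + 1 if cursor is not None else 0
        page = matching[start:start + limit]
        more = start + limit < len(matching)
        next_cursor = page[-1] if more and page else None   # last entry included
    else:
        # End before the cursor (exclusive), or at the last entry.
        end = matching.index(cursor) if cursor is not None else len(matching)
        begin = max(0, end - limit)
        page = matching[begin:end]
        more = begin > 0
        next_cursor = page[0] if more and page else None    # first entry included
    return page, next_cursor
```

With five matching entries and a page size of two, a forward query yields pages of two, two and one entries, and the final page carries no cursor; a backward query walks the same entries from the newest end, with each page still ordered forward.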
The main security consideration to take into account while using this protocol is that a querying node has to reveal its content filters of interest to the queried node, potentially compromising its privacy.
# Future Work
- **Anonymous query**: This feature guarantees that nodes can anonymously query historical messages from other nodes i.e.,
without disclosing the exact topics of [14/WAKU2-MESSAGE](https://github.com/vacp2p/rfc-index/blob/7b443c1aab627894e3f22f5adfbb93f4c4eac4f6/waku/standards/core/14/message.md) they are interested in.
As such, no adversary in the `WAKU2-STORE` protocol would be able to learn which peer is interested in which content filters i.e.,
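content topics of [14/WAKU2-MESSAGE](https://github.com/vacp2p/rfc-index/blob/7b443c1aab627894e3f22f5adfbb93f4c4eac4f6/waku/standards/core/14/message.md).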
The current version of the `WAKU2-STORE` protocol does not provide anonymity for historical queries,
as the querying node needs to directly connect to another node in the `WAKU2-STORE` protocol and
explicitly disclose the content filters of its interest to retrieve the corresponding messages.
However, one can consider preserving anonymity through one of the following ways:
such data fields must be treated carefully to achieve query anonymity.
<!-- TODO: if nodes have to disclose their PeerIDs (e.g., for authentication purposes) when connecting to other nodes in the store protocol, then Tor does not preserve anonymity since it only helps in hiding the IP. So, the PeerId usage in switches must be investigated further. Depending on how PeerId is used, one may be able to link between a querying node and its queried topics despite hiding the IP address-->
Examples of such 2PC protocols are secure one-way Private Set Intersections (PSI).
<!-- TODO: add a reference for PSIs? --><!-- TODO: more techniques to be included -->
<!-- TODO: Censorship resistant: this is about a node that hides the historical messages from other nodes. This attack is not included in the specs since it does not fit the passive adversarial model (the attacker needs to deviate from the store protocol).-->