Merge pull request #19 from logos-storage/spec/codex-status-go-prototype

codex-status-go prototype docs
This commit is contained in:
Marcin Czenko 2025-12-15 03:55:10 +01:00 committed by GitHub
commit 2cf61aecd8
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
34 changed files with 5207 additions and 8 deletions

1
.gitignore vendored
View File

@ -1,5 +1,6 @@
.DS_Store
.obsidian/publish.json
.obsidian/workspace*.json
.obsidian/themes/Omarchy
00 Planner
10 Notes/Inbox.md

File diff suppressed because one or more lines are too long

View File

@ -1 +1 @@
{"id":"query-control","name":"Query Control","version":"0.7.13","minAppVersion":"1.7.2","description":"An experimental Obsidian plugin that adds additional control to queries","author":"NothingIsLost & reply2za","authorUrl":"https://github.com/reply2za","isDesktopOnly":false}
{"id":"query-control","name":"Query Control","version":"0.8.0","minAppVersion":"1.7.2","description":"An experimental Obsidian plugin that adds additional control to queries","author":"NothingIsLost & reply2za","authorUrl":"https://github.com/reply2za","isDesktopOnly":false}

View File

@ -7,6 +7,40 @@
justify-content: center; /* works around issues with minimal theme */
}
.workspace-leaf-content[data-type="markdown"] .internal-query .internal-query-header {
display: flex;
flex-direction: column;
justify-content: center;
align-items: center;
gap: 0.25em;
padding: 0.6em 0.9em;
margin: 0.4em auto 0.8em;
max-width: 90%;
text-align: center;
border-radius: var(--radius-m, 10px);
background-color: var(--background-secondary, rgba(0, 0, 0, 0.05));
border: 1px solid var(--background-modifier-border, rgba(0, 0, 0, 0.08));
box-shadow: 0 5px 16px -12px var(--shadow-s, rgba(0, 0, 0, 0.4));
font-size: var(--font-ui-medium, 1.02em);
font-weight: 600;
letter-spacing: 0.013em;
line-height: 1.32;
}
.workspace-leaf-content[data-type="markdown"] .internal-query .internal-query-header::after {
content: "";
display: block;
width: 68px;
height: 2.3px;
border-radius: 999px;
background: linear-gradient(
90deg,
var(--interactive-accent, #7f6df2) 0%,
var(--text-highlight-bg, rgba(255, 215, 0, 0.9)) 100%
);
opacity: 0.88;
}
.workspace-leaf-content[data-type="markdown"] .internal-query .is-hidden {
display: none;
}

View File

@ -0,0 +1,363 @@
## Abstract
Messages are stored permanently by store nodes ([13/WAKU2-STORE](https://github.com/vacp2p/rfc-index/blob/main/waku/standards/core/13/store.md)) for up to a certain configurable period of time, limited by the overall storage provided by a store node. Messages older than that period are no longer provided by store nodes, making it impossible for other nodes to request historical messages that go beyond that time range. This raises issues in the case of Status communities, where recently joined members of a community are not able to request complete message histories of the community channels.
This specification describes how **Control Nodes** (which are specific nodes in Status communities) archive historical message data of their communities, beyond the time range limit provided by Store Nodes using the [BitTorrent](https://bittorrent.org) protocol. It also describes how the archives are distributed to community members via the Status network, so they can fetch them and gain access to a complete message history.
## Terminology
The following terminology is used throughout this specification. Note that some of the actors listed here are nodes that operate in Waku networks only, while others operate in the Status communities layer:
| Name | References |
| --------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| Waku node | A Waku node ([10/WAKU2](https://github.com/vacp2p/rfc-index/blob/main/waku/standards/core/10/waku2.md)) that implements [11/WAKU2-RELAY](https://github.com/vacp2p/rfc-index/blob/main/waku/standards/core/11/relay.md) |
| Store node | A Waku node that implements [13/WAKU2-STORE](https://github.com/vacp2p/rfc-index/blob/main/waku/standards/core/13/store.md) |
| Waku network | A group of Waku nodes forming a graph, connected via [11/WAKU2-RELAY](https://github.com/vacp2p/rfc-index/blob/main/waku/standards/core/11/relay.md) |
| Status user | A Status account that is used in a Status consumer product, such as Status Mobile or Status Desktop |
| Status node | A Status client run by a Status application |
| Control node | A Status node that owns the private key for a Status community |
| Community member | A Status user that is part of a Status community, not owning the private key of the community |
| Community member node | A Status node with message archive capabilities enabled, run by a community member |
| Live messages | Waku messages received through the Waku network |
| BitTorrent client | A program implementing the [BitTorrent](https://bittorrent.org) protocol |
| Torrent/Torrent file | A file containing metadata about data to be downloaded by BitTorrent clients |
| Magnet link | A link encoding the metadata provided by a torrent file ([Magnet URI scheme](https://en.wikipedia.org/wiki/Magnet_URI_scheme)) |
## Requirements / Assumptions
This specification has the following assumptions:
- Store nodes ([13/WAKU2-STORE](https://github.com/vacp2p/rfc-index/blob/main/waku/standards/core/13/store.md)) are available 24/7, ensuring constant live message availability.
- The storage time range limit is 30 days.
- Store nodes have enough storage to persist historical messages for up to 30 days.
- No store nodes have storage to persist historical messages older than 30 days.
- All nodes are honest.
- The network is reliable.
Furthermore, it assumes that:
- Control nodes have enough storage to persist historical messages older than 30 days.
- Control nodes provide archives with historical messages **at least** every 30 days.
- Control nodes receive all community messages.
- Control nodes are honest.
- Control nodes know at least one store node from which they can query historical messages.
These assumptions are less than ideal and will be enhanced in future work.
This [forum discussion](https://forum.vac.dev/t/status-communities-protocol-and-product-point-of-view/114) provides more details.
## Overview
The following is a high-level overview of the user flow and features this specification describes. For more detailed descriptions, read the dedicated sections in this specification.
### Serving community history archives
Control nodes go through the following (high level) process to provide community members with message histories:
1. Community owner creates a Status community (previously known as [org channels](https://github.com/status-im/specs/pull/151)) which makes its node a Control node.
2. Community owner enables message archive capabilities (on by default but can be turned off as well - see [UI feature spec](https://github.com/status-im/feature-specs/pull/36)).
3. A special type of channel to exchange metadata about the archival data is created; this channel is not visible in the user interface.
4. Community owner invites community members.
5. Control node receives messages published in channels and stores them into a local database.
6. After 7 days, the control node exports and compresses the last 7 days' worth of messages from the database and bundles them together with a [message archive index](#wakumessagearchiveindex) into a torrent, from which it then creates a magnet link ([Magnet URI scheme](https://en.wikipedia.org/wiki/Magnet_URI_scheme), [Extensions for Peers to Send Metadata Files](https://www.bittorrent.org/beps/bep_0009.html)).
7. Control node sends the magnet link created in step 6 to community members via special channel created in step 3 through the Waku network.
8. Every subsequent 7 days, steps 6 and 7 are repeated and the new message archive data is appended to the previously created message archive data.
### Serving archives for missed messages
If the control node goes offline (where "offline" means the control node's main process is no longer running), it MUST go through the following process:
1. Control node restarts
2. Control node requests messages from store nodes for the missed time range for all channels in their community
3. All missed messages are stored into control node's local message database
4. If 7 or more days have elapsed since the last message history torrent was created, the control node will perform steps 6 and 7 of [Serving community history archives](#serving-community-history-archives) for every 7 days' worth of messages in the missed time range (e.g. if the node was offline for 30 days, it will create 4 message history archives)
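The arithmetic in step 4 is integer division over full 7-day intervals, matching the example given (30 days offline yields 4 archives). A minimal Go sketch with a hypothetical helper name:

```go
package main

import "fmt"

// archivesToCreate returns how many full archive intervals fit into a
// missed time range; only complete intervals produce an archive.
func archivesToCreate(missedDays, intervalDays int) int {
	return missedDays / intervalDays
}

func main() {
	fmt.Println(archivesToCreate(30, 7)) // 4
}
```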
### Receiving community history archives
Community member nodes go through the following (high level) process to fetch and restore community message histories:
1. User joins community and becomes community member (see [org channels spec](https://github.com/vacp2p/rfc-index/blob/main/status/56/communities.md))
2. By joining a community, member nodes automatically subscribe to the special channel for message archive metadata exchange provided by the community
3. Member node requests live message history (last 30 days) of all the community channels, including the special channel from store nodes
4. Member node receives Waku message ([14/WAKU2-MESSAGE](https://github.com/vacp2p/rfc-index/blob/main/waku/standards/core/14/message.md)) that contains the metadata magnet link from the special channel
5. Member node extracts the magnet link from the Waku message and passes it to torrent client
6. Member node downloads [message archive index](#message-history-archive-index) file and determines which message archives are not downloaded yet (all or some)
7. Member node fetches missing message archive data via torrent
8. Member node unpacks and decompresses the message archive data and hydrates its local database, deleting any messages for that community that were previously stored within the time range covered by the message history archive
## Storing live messages
For archival data serving, the control node MUST store live messages as [14/WAKU2-MESSAGE](https://github.com/vacp2p/rfc-index/blob/main/waku/standards/core/14/message.md). This is in addition to their database of application messages. This is required to provide confidentiality, authenticity, and integrity of message data distributed via the BitTorrent layer, and later validated by community members when they unpack message history archives.
Control nodes SHOULD remove those messages from their local databases once they are older than 30 days and after they have been turned into message archives and distributed to the BitTorrent network.
### Exporting messages for bundling
Control nodes export Waku messages from their local database for creating and bundling history archives using the following criteria:
- Waku messages to be exported MUST have a `contentTopic` that match any of the topics of the community channels
- Waku messages to be exported MUST have a `timestamp` that lies within a time range of 7 days
The `timestamp` range is determined by the context in which the control node attempts to create a message history archive, as described below:
1. The control node attempts to create an archive periodically for the past seven days (including the current day). In this case, the `timestamp` has to lie within those 7 days.
2. The control node has been offline (control node's main process has stopped and needs restart) and attempts to create archives for all the live messages it has missed since it went offline. In this case, the `timestamp` has to lie within the day the latest message was received and the current day.
Exported messages MUST be restored as [14/WAKU2-MESSAGE](https://github.com/vacp2p/rfc-index/blob/main/waku/standards/core/14/message.md) for bundling. Waku messages that are older than 30 days and have been exported for bundling can be removed from the control node's database (control nodes still maintain a database of application messages).
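The two export criteria translate directly into a filter over stored messages. A hedged Go sketch; `wakuMessage` here is a simplified stand-in for a [14/WAKU2-MESSAGE](https://github.com/vacp2p/rfc-index/blob/main/waku/standards/core/14/message.md), not the actual status-go type:

```go
package main

import "fmt"

// wakuMessage is a simplified illustration of a stored Waku message.
type wakuMessage struct {
	ContentTopic string
	Timestamp    int64 // unix seconds
}

// selectForArchive returns the messages whose contentTopic matches one of
// the community channel topics and whose timestamp falls in [from, to).
func selectForArchive(msgs []wakuMessage, topics map[string]bool, from, to int64) []wakuMessage {
	var out []wakuMessage
	for _, m := range msgs {
		if topics[m.ContentTopic] && m.Timestamp >= from && m.Timestamp < to {
			out = append(out, m)
		}
	}
	return out
}

func main() {
	topics := map[string]bool{"/status/1/chat-a/proto": true}
	msgs := []wakuMessage{
		{"/status/1/chat-a/proto", 100}, // in range, matching topic
		{"/status/1/chat-a/proto", 900}, // outside the time range
		{"/other/topic", 100},           // wrong topic
	}
	fmt.Println(len(selectForArchive(msgs, topics, 0, 500))) // 1
}
```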
## Message history archives
Message history archives are represented as `WakuMessageArchive` and created from Waku messages exported from the local database. Message history archives are implemented using the following protocol buffer.
### WakuMessageArchive
The `from` field SHOULD contain a timestamp of the time range's lower bound. The type parallels the `timestamp` of [WakuMessage](https://github.com/vacp2p/rfc-index/blob/main/waku/standards/core/14/message.md).
The `to` field SHOULD contain a timestamp of the time range's higher bound.
The `contentTopic` field MUST contain a list of all community channel topics.
The `messages` field MUST contain all messages that belong into the archive given its `from`, `to` and `contentTopic` fields.
The `padding` field MUST contain the amount of zero bytes needed so that the overall byte size of the protobuf encoded `WakuMessageArchive` is a multiple of the `pieceLength` used to divide the message archive data into pieces. This is needed for seamless encoding and decoding of archival data in interaction with BitTorrent, as explained in [creating message archive torrents](#creating-message-archive-torrents).
```protobuf
syntax = "proto3";

message WakuMessageArchiveMetadata {
  uint32 version = 1; // proto3 has no uint8 scalar; uint32 is the closest type
  uint64 from = 2;
  uint64 to = 3;
  repeated string contentTopic = 4;
}

message WakuMessageArchive {
  uint32 version = 1;
  WakuMessageArchiveMetadata metadata = 2;
  repeated WakuMessage messages = 3; // `WakuMessage` is provided by 14/WAKU2-MESSAGE
  bytes padding = 4;
}
```
## Message History Archive Index
Control nodes MUST provide message archives for the entire community history. The entire history consists of a set of `WakuMessageArchive`s, where each archive contains a subset of historical `WakuMessage`s for a time range of seven days. All the `WakuMessageArchive`s are concatenated into a single file as a byte string (see [Ensuring reproducible data pieces](#ensuring-reproducible-data-pieces)).
Control nodes MUST create a message history archive index (`WakuMessageArchiveIndex`) with metadata that allows receiving nodes to only fetch the message history archives they are interested in.
### WakuMessageArchiveIndex
A `WakuMessageArchiveIndex` is a map where the key is the KECCAK-256 hash of the `WakuMessageArchiveIndexMetadata` derived from a 7-day archive and the value is an instance of that `WakuMessageArchiveIndexMetadata` corresponding to that archive.
The `offset` field MUST contain the position at which the message history archive starts in the byte string of the total message archive data. This MUST be the sum of the length of all previously created message archives in bytes (see [Creating message archive torrents](#creating-message-archive-torrents)).
```protobuf
syntax = "proto3";

message WakuMessageArchiveIndexMetadata {
  uint32 version = 1; // proto3 has no uint8 scalar; uint32 is the closest type
  WakuMessageArchiveMetadata metadata = 2;
  uint64 offset = 3;
  uint64 num_pieces = 4;
}

message WakuMessageArchiveIndex {
  map<string, WakuMessageArchiveIndexMetadata> archives = 1;
}
```
The control node MUST update the `WakuMessageArchiveIndex` every time it creates one or more `WakuMessageArchive`s and bundle it into a new torrent. For every created `WakuMessageArchive`, there MUST be a `WakuMessageArchiveIndexMetadata` entry in the `archives` field of the `WakuMessageArchiveIndex`.
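The offset bookkeeping described above amounts to cumulative sums: each archive starts where the previous data ended, and, because padded archives are multiples of `pieceLength`, the piece count is an exact division. A minimal Go sketch with hypothetical helper names:

```go
package main

import "fmt"

// nextOffset returns the position at which a new archive starts in the
// concatenated `data` byte string: the sum of all previous archive sizes.
func nextOffset(previousSizes []uint64) uint64 {
	var off uint64
	for _, s := range previousSizes {
		off += s
	}
	return off
}

// numPieces assumes the padded archive size is a multiple of pieceLength.
func numPieces(size, pieceLength uint64) uint64 {
	return size / pieceLength
}

func main() {
	sizes := []uint64{4096, 8192} // two previously bundled archives
	fmt.Println(nextOffset(sizes))     // 12288
	fmt.Println(numPieces(8192, 1024)) // 8
}
```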
## Creating message archive torrents
Control nodes MUST create a torrent file ("torrent") containing metadata to all message history archives. To create a torrent file, and later serve the message archive data in the BitTorrent network, control nodes MUST store the necessary data in dedicated files on the file system.
A torrent's source folder MUST contain the following two files:
- `data` - Contains all protobuf encoded `WakuMessageArchive`s (as byte strings) concatenated in ascending order based on their time range
- `index` - Contains the protobuf encoded `WakuMessageArchiveIndex`
Control nodes SHOULD store these files in a dedicated folder that is identifiable via the community id.
### Ensuring reproducible data pieces
The control node MUST ensure that the byte string resulting from the protobuf encoded `data` is equal to the byte string `data` from the previously generated message archive torrent, plus the data of the latest 7 days worth of messages encoded as `WakuMessageArchive`. Therefore, the size of `data` grows every seven days as it's append only.
The control nodes also MUST ensure that the byte size of every individual protobuf encoded `WakuMessageArchive` is a multiple of `pieceLength: ???` (**TODO**) using the `padding` field. If the protobuf encoded `WakuMessageArchive` is not a multiple of `pieceLength`, its `padding` field MUST be filled with zero bytes and the `WakuMessageArchive` MUST be re-encoded until its size becomes a multiple of `pieceLength`.
This is necessary because the content of the `data` file will be split into pieces of `pieceLength` when the torrent file is created, and the SHA1 hash of every piece is then stored in the torrent file and later used by other nodes to request the data for each individual data piece.
By fitting message archives into a multiple of `pieceLength` and filling any remaining space with zero bytes, control nodes prevent the **next** message archive from occupying the remaining space of the last piece, which would result in a different SHA1 hash for that piece.
#### **Example: Without padding**
Let `WakuMessageArchive` "A1" be of size 20 bytes:
```text
0 11 22 33 44 55 66 77 88 99
10 11 12 13 14 15 16 17 18 19
```
With a `pieceLength` of 10 bytes, A1 will fit into `20 / 10 = 2` pieces:
```text
0 11 22 33 44 55 66 77 88 99 // piece[0] SHA1: 0x123
10 11 12 13 14 15 16 17 18 19 // piece[1] SHA1: 0x456
```
#### **Example: With padding**
Let `WakuMessageArchive` "A2" be of size 21 bytes:
```text
0 11 22 33 44 55 66 77 88 99
10 11 12 13 14 15 16 17 18 19
20
```
With a `pieceLength` of 10 bytes, A2 does not fit into `floor(21 / 10) = 2` pieces; the one-byte remainder introduces a third piece:
```text
0 11 22 33 44 55 66 77 88 99 // piece[0] SHA1: 0x123
10 11 12 13 14 15 16 17 18 19 // piece[1] SHA1: 0x456
20 // piece[2] SHA1: 0x789
```
The next `WakuMessageArchive` "A3" (denoted `#3` below) will be appended to the existing data and occupy the remaining space of the third data piece.
The piece at index 2 will now produce a different SHA1 hash:
```text
0 11 22 33 44 55 66 77 88 99 // piece[0] SHA1: 0x123
10 11 12 13 14 15 16 17 18 19 // piece[1] SHA1: 0x456
20 #3 #3 #3 #3 #3 #3 #3 #3 #3 // piece[2] SHA1: 0xeef
#3 #3 #3 #3 #3 #3 #3 #3 #3 #3 // piece[3]
```
By filling the remaining space of the third piece with zero bytes from A2's `padding` field, it is guaranteed that the piece's SHA1 hash stays the same:
```text
0 11 22 33 44 55 66 77 88 99 // piece[0] SHA1: 0x123
10 11 12 13 14 15 16 17 18 19 // piece[1] SHA1: 0x456
20 0 0 0 0 0 0 0 0 0 // piece[2] SHA1: 0x999
#3 #3 #3 #3 #3 #3 #3 #3 #3 #3 // piece[3]
#3 #3 #3 #3 #3 #3 #3 #3 #3 #3 // piece[4]
```
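The padding in the examples above is the complement of the encoded size modulo `pieceLength` (with the spec's caveat that re-encoding with padding can itself change the size, so an implementation iterates until the size is stable). A minimal Go sketch:

```go
package main

import "fmt"

// paddingFor returns how many zero bytes must be appended so that size
// becomes a multiple of pieceLength.
func paddingFor(size, pieceLength int) int {
	if r := size % pieceLength; r != 0 {
		return pieceLength - r
	}
	return 0
}

func main() {
	fmt.Println(paddingFor(21, 10)) // 9 — the A2 example above
	fmt.Println(paddingFor(20, 10)) // 0 — A1 needs no padding
}
```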
### Seeding message history archives
The control node MUST seed the [generated torrent](#creating-message-archive-torrents) until a new `WakuMessageArchive` is created.
The control node SHOULD NOT seed torrents for older message history archives. Only one torrent at a time should be seeded.
### Creating magnet links
Once a torrent file for all message archives is created, the control node MUST derive a magnet link following the [Magnet URI scheme](https://en.wikipedia.org/wiki/Magnet_URI_scheme) using the underlying BitTorrent protocol client.
### Message archive distribution
Message archives are available via the BitTorrent network as they are being [seeded by the control node](#seeding-message-history-archives). Other community member nodes will download the message archives from the BitTorrent network once they receive a magnet link that contains a message archive index.
The control node MUST send magnet links containing message archives and the message archive index to a special community channel. The topic of that special channel has the following format:
```text
/{application-name}/{version-of-the-application}/{content-topic-name}/{encoding}
```
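For illustration, the four topic segments can be joined as follows; the segment values used here are placeholders, not the actual Status channel values:

```go
package main

import "fmt"

// contentTopic assembles the special channel topic from its four segments.
func contentTopic(app, version, name, encoding string) string {
	return fmt.Sprintf("/%s/%s/%s/%s", app, version, name, encoding)
}

func main() {
	fmt.Println(contentTopic("waku", "1", "community-archives", "proto"))
	// → /waku/1/community-archives/proto
}
```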
All messages sent with this topic MUST be instances of `ApplicationMetadataMessage` ([62/STATUS-PAYLOADS](https://github.com/vacp2p/rfc-index/blob/main/status/62/payloads.md)) with a `payload` of `CommunityMessageArchiveIndex`.
Only the control node MAY post to the special channel; other messages on this channel MUST be ignored by clients. Community members MUST NOT have permission to send messages to the special channel. However, community member nodes MUST subscribe to the special channel to receive Waku messages containing magnet links for message archives.
### Canonical message histories
Only control nodes are allowed to distribute messages with magnet links via the special channel for magnet link exchange. Community members MUST NOT be allowed to post any messages to the special channel.
Status nodes MUST ensure that any message that isn't signed by the control node in the special channel is ignored.
Since the magnet links are created from the control node's database (and previously distributed archives), the message history provided by the control node becomes the canonical message history and single source of truth for the community.
Community member nodes MUST replace messages in their local databases with the messages extracted from archives within the same time range. Messages that the control node didn't receive MUST be removed and are no longer part of the message history of interest, even if it already existed in a community member node's database.
## Fetching message history archives
Generally, fetching message history archives is a three-step process:
1. Receive a [message archive index](#message-history-archive-index) magnet link as described in [Message archive distribution](#message-archive-distribution)
2. Download the `index` file from the torrent and determine which message archives are missing
3. Download the individual missing archives
Community member nodes subscribe to the special channel that control nodes publish magnet links for message history archives to. There are two scenarios in which member nodes can receive such a magnet link message from the special channel:
1. The member node receives it via live messages, by listening to the special channel
2. The member node requests messages for a time range of up to 30 days from store nodes (this is the case when a new community member joins a community)
### Downloading message archives
When member nodes receive a message with a `CommunityMessageHistoryArchive` ([62/STATUS-PAYLOADS](https://github.com/vacp2p/rfc-index/blob/main/status/62/payloads.md)) from the aforementioned channel, they MUST extract the `magnet_uri` and pass it to their underlying BitTorrent client so they can fetch the latest message history archive index, which is the `index` file of the torrent (see [Creating message archive torrents](#creating-message-archive-torrents)).
Due to the nature of distributed systems, there's no guarantee that a received message is the "last" message. This is especially true when member nodes request historical messages from store nodes.
Therefore, member nodes MUST wait for 20 seconds after receiving the last `CommunityMessageArchive` before they start extracting the magnet link to fetch the latest archive index.
Once a message history archive index is downloaded and parsed back into `WakuMessageArchiveIndex`, community member nodes use a local lookup table to determine which of the listed archives are missing using the KECCAK-256 hashes stored in the index.
For this lookup to work, member nodes MUST store the KECCAK-256 hashes of the `WakuMessageArchiveIndexMetadata` provided by the `index` file for all of the message history archives that have been downloaded in their local database.
Given a `WakuMessageArchiveIndex`, member nodes can access individual `WakuMessageArchiveIndexMetadata` to download individual archives.
Community member nodes MUST choose one of the following options:
1. **Download all archives** - Request and download all data pieces for `data` provided by the torrent (this is the case for new community member nodes that haven't downloaded any archives yet)
2. **Download only the latest archive** - Request and download all pieces starting at the `offset` of the latest `WakuMessageArchiveIndexMetadata` (this is the case for any member node that already has downloaded all previous history and is now interested in only the latest archive)
3. **Download specific archives** - Look into `from` and `to` fields of every `WakuMessageArchiveIndexMetadata` and determine the pieces for archives of a specific time range (can be the case for member nodes that have recently joined the network and are only interested in a subset of the complete history)
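Because padded archives start on piece boundaries, options 2 and 3 reduce to simple index arithmetic over the metadata. A hedged Go sketch with an illustrative helper name:

```go
package main

import "fmt"

// pieceRange returns the first and last piece indices covering an archive,
// given its byte offset in `data` and its piece count from the index
// metadata. Offsets are multiples of pieceLength by construction (padding),
// so the division is exact.
func pieceRange(offset, numPieces, pieceLength uint64) (first, last uint64) {
	first = offset / pieceLength
	return first, first + numPieces - 1
}

func main() {
	first, last := pieceRange(30, 2, 10)
	fmt.Println(first, last) // pieces 3 and 4
}
```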
### Storing historical messages
When message archives are fetched, community member nodes MUST unwrap the resulting `WakuMessage` instances into `ApplicationMetadataMessage` instances and store them in their local database. Community member nodes SHOULD NOT store the wrapped `WakuMessage` messages.
All messages within the same time range MUST be replaced with the messages provided by the message history archive.
Community members nodes MUST ignore the expiration state of each archive message.
## Considerations
The following are things to consider when implementing this specification.
### Control node honesty
This spec assumes that all control nodes are honest and behave according to the spec, meaning they neither inject their own messages into, nor remove any messages from, historic archives.
### Bandwidth consumption
Community member nodes will download the latest archive they've received from the archive index, which includes messages from the last seven days. Assuming that community member nodes were online for that time range, they have already received that message data as live messages and will now download an archive that contains the same messages.
This means there's a possibility member nodes will download the same data at least twice.
### Multiple community owners
It is possible for control nodes to export the private key of their owned community and pass it to other users so they become control nodes as well. This means, it's possible for multiple control nodes to exist.
This might conflict with the assumption that the control node serves as a single source of truth. Multiple control nodes can have different message histories.
Not only will multiple control nodes multiply the amount of archive index messages being distributed to the network, they might also contain different sets of magnet links and their corresponding hashes.
Even if just a single message is missing from one of the histories, the hashes presented in the archive indices will look completely different, causing community member nodes to download the corresponding archive (which might be identical to an archive that was already downloaded, except for that one message).
## Copyright
Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).
## References
- [13/WAKU2-STORE](https://github.com/vacp2p/rfc-index/blob/main/waku/standards/core/13/store.md)
- [BitTorrent](https://bittorrent.org)
- [10/WAKU2](https://github.com/vacp2p/rfc-index/blob/main/waku/standards/core/10/waku2.md)
- [11/WAKU2-RELAY](https://github.com/vacp2p/rfc-index/blob/main/waku/standards/core/11/relay.md)
- [Magnet URI scheme](https://en.wikipedia.org/wiki/Magnet_URI_scheme)
- [forum discussion](https://forum.vac.dev/t/status-communities-protocol-and-product-point-of-view/114)
- [org channels](https://github.com/status-im/specs/pull/151)
- [UI feature spec](https://github.com/status-im/feature-specs/pull/36)
- [Extensions for Peers to Send Metadata Files](https://www.bittorrent.org/beps/bep_0009.html)
- [org channels spec](https://github.com/vacp2p/rfc-index/blob/main/status/56/communities.md)
- [14/WAKU2-MESSAGE](https://github.com/vacp2p/rfc-index/blob/main/waku/standards/core/14/message.md)
- [62/STATUS-PAYLOADS](https://github.com/vacp2p/rfc-index/blob/main/status/62/payloads.md)

View File

@ -0,0 +1,17 @@
Common conventional commit prefixes are:
- **feat**: New feature
- **fix**: Bug fix
- **docs**: Documentation changes
- **style**: Code style changes (formatting, missing semicolons, etc.)
- **refactor**: Code refactoring (neither fixes a bug nor adds a feature)
- **perf**: Performance improvements
- **test**: Adding or updating tests
- **build**: Changes to build system or dependencies
- **ci**: Changes to CI configuration files and scripts
- **chore**: Other changes that don't modify src or test files
- **revert**: Reverts a previous commit
Optional scope can be added in parentheses: `feat(api):`, `fix(wallet):`, etc.
This is especially relevant when working with status-go repo. See also [[testing codex-status-go integration]].
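These prefixes can be checked mechanically; a small Go sketch (the regular expression below is an illustration, not an official conventional-commits grammar):

```go
package main

import (
	"fmt"
	"regexp"
)

// conventionalCommit matches "type(optional-scope): subject" for the
// prefixes listed above.
var conventionalCommit = regexp.MustCompile(`^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)(\([a-z0-9-]+\))?: .+`)

func main() {
	fmt.Println(conventionalCommit.MatchString("feat(api): add archive endpoint")) // true
	fmt.Println(conventionalCommit.MatchString("update stuff"))                    // false
}
```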

View File

@ -0,0 +1,197 @@
---
related-to:
- "[[Team-NLBR Solution Proposal]]"
- "[[status-go publishing magnet links]]"
- "[[status-go processing magnet links]]"
- "[[status-go-codex integration - design notes]]"
- "[[Creating History Archives - InitHistoryArchiveTasks]]"
---
## **Connection Flow: `InitHistoryArchiveTasks` → `StartHistoryArchiveTasksInterval`**
### **1. Where `InitHistoryArchiveTasks` is Called**
See `protocol/messenger.go`, lines 640 to 655.
`InitHistoryArchiveTasks` is called from **two main places**:
**A. During `Messenger.Start()` (startup):**
```go
// In messenger.go Start() method
controlledCommunities, err := m.communitiesManager.Controlled()
if err != nil {
return nil, err
}
if m.archiveManager.IsReady() {
go func() {
// Wait for store node to be available
select {
case <-m.ctx.Done():
return
case <-m.messaging.OnStorenodeAvailable():
}
m.InitHistoryArchiveTasks(controlledCommunities) // CALLED HERE
}()
}
```
**B. When enabling archive protocol:**
```go
// In EnableCommunityHistoryArchiveProtocol()
controlledCommunities, err := m.communitiesManager.Controlled()
if err != nil {
return err
}
if len(controlledCommunities) > 0 {
go m.InitHistoryArchiveTasks(controlledCommunities) // CALLED HERE
}
```
### **2. What `InitHistoryArchiveTasks` Does**
See `protocol/messenger_communities.go`, lines 3770 to 3895.
### **3. How `StartHistoryArchiveTasksInterval` Works**
See `protocol/communities/manager_archive.go`, lines 320 to 345.
## **Complete Connection Flow**
Here's the detailed connection between `InitHistoryArchiveTasks` and `StartHistoryArchiveTasksInterval`:
### **1. Initialization Chain**
```
Messenger.Start() [startup]
Wait for store node availability
InitHistoryArchiveTasks(controlledCommunities)
For each controlled community:
├─ Check if archive support enabled
├─ Seed existing torrents if available
├─ Get community chat filters/topics
├─ Request missed messages from store nodes
├─ Check last archive timestamp
└─ Decide archive strategy:
├─ Case 1: No archives yet → StartHistoryArchiveTasksInterval()
├─ Case 2: Recent archive → Seed + delayed start → StartHistoryArchiveTasksInterval()
└─ Case 3: Old archive → Create new + StartHistoryArchiveTasksInterval()
```
### **2. The Three Archive Strategy Cases**
**Case 1: No Previous Archives (`lastArchiveEndDateTimestamp == 0`)**
```go
// No prior messages to be archived, so we just kick off the archive creation loop
// for future archives
go m.archiveManager.StartHistoryArchiveTasksInterval(c, messageArchiveInterval)
```
**Case 2: Recent Archive (`durationSinceLastArchive < messageArchiveInterval`)**
```go
// Last archive is less than `interval` old, wait until `interval` is complete,
// then create archive and kick off archive creation loop for future archives
// Seed current archive in the meantime
err := m.archiveManager.SeedHistoryArchiveTorrent(c.ID())
timeToNextInterval := messageArchiveInterval - durationSinceLastArchive
time.AfterFunc(timeToNextInterval, func() {
err := m.archiveManager.CreateAndSeedHistoryArchive(c.ID(), topics, lastArchiveEndDate, to.Add(timeToNextInterval), messageArchiveInterval, c.Encrypted())
go m.archiveManager.StartHistoryArchiveTasksInterval(c, messageArchiveInterval)
})
```
**Case 3: Old Archive (`durationSinceLastArchive >= messageArchiveInterval`)**
```go
// Looks like the last archive was generated more than `interval`
// ago, so lets create a new archive now and then schedule the archive
// creation loop
err := m.archiveManager.CreateAndSeedHistoryArchive(c.ID(), topics, lastArchiveEndDate, to, messageArchiveInterval, c.Encrypted())
go m.archiveManager.StartHistoryArchiveTasksInterval(c, messageArchiveInterval)
```
### **3. What `StartHistoryArchiveTasksInterval` Does**
```go
func (m *ArchiveManager) StartHistoryArchiveTasksInterval(community *Community, interval time.Duration) {
// 1. Check if already running for this community
if _, exists := m.historyArchiveTasks.Load(id); exists {
return // Already running
}
// 2. Set up cancellation and tracking
cancel := make(chan struct{})
m.historyArchiveTasks.Store(id, cancel)
m.historyArchiveTasksWaitGroup.Add(1)
// 3. Start periodic ticker
    ticker := time.NewTicker(interval) // messageArchiveInterval, default: 7 days
// 4. Archive creation loop
for {
select {
case <-ticker.C:
// Every interval, create and seed new archive
err = m.CreateAndSeedHistoryArchive(community.ID(), topics, lastArchiveEndDate, to, interval, community.Encrypted())
case <-cancel:
// Cleanup and exit
m.UnseedHistoryArchiveTorrent(community.ID())
m.historyArchiveTasks.Delete(id)
m.historyArchiveTasksWaitGroup.Done()
return
}
}
}
```
### **4. The Interval Constants**
Searched text for `messageArchiveInterval` (`**/messenger_communities.go`), 20 results
The default interval is **7 days** (`7 * 24 * time.Hour`).
### **5. Complete Flow Summary**
```
System Startup
Messenger.Start()
Wait for Store Node Availability
InitHistoryArchiveTasks(controlledCommunities)
├─ For each community owner controls:
│ ├─ Check if archive support enabled
│ ├─ Seed existing torrents (if available)
│  ├─ CreateAndSeedHistoryArchive
│ ├─ Get community topics and sync missed messages
│ ├─ Check when last archive was created
│ └─ Based on last archive timing:
│ ├─ No archives → StartHistoryArchiveTasksInterval() immediately
│ ├─ Recent archive → Seed + delayed CreateAndSeedHistoryArchive followed by StartHistoryArchiveTasksInterval()
│ └─ Old archive → Create new archive + CreateAndSeedHistoryArchive + StartHistoryArchiveTasksInterval()
└─ Each StartHistoryArchiveTasksInterval():
├─ Runs as background goroutine
├─ Creates ticker with 7-day interval
├─ Every 7 days: CreateAndSeedHistoryArchive()
├─ After seeding: publishes HistoryArchivesSeedingSignal
├─ Signal triggers: dispatchMagnetlinkMessage()
└─ Magnetlink sent to all community members via Waku
```
So `InitHistoryArchiveTasks` is the **initialization and setup function** that analyzes each community's archive state and decides how to start the periodic archive creation, while `StartHistoryArchiveTasksInterval` is the **long-running background worker** that actually performs the periodic archive creation every 7 days.
The connection is that `InitHistoryArchiveTasks` **always ends up calling** `StartHistoryArchiveTasksInterval` for each community (either immediately, delayed, or after creating a catch-up archive), making it the entry point that bootstraps the entire periodic archive system.

---
**Q: Where does `protobuf.ApplicationMetadataMessage_COMMUNITY_MESSAGE_ARCHIVE_MAGNETLINK` come from?**
Looking at the code, `protobuf.ApplicationMetadataMessage_COMMUNITY_MESSAGE_ARCHIVE_MAGNETLINK` appears to be a constant from the protobuf package. Let me search for where this is defined.
Searched text for `COMMUNITY_MESSAGE_ARCHIVE_MAGNETLINK`, 2 results
Now let me look at the protobuf file to see the full context:
Searched text for `enum Type` (`**/protocol/protobuf/application_metadata_message.proto`), 1 result
Searched text for `COMMUNITY_MESSAGE_ARCHIVE` (`**/protocol/protobuf/application_metadata_message.proto`), 1 result
Let me see more context around this enum definition:
Searched text for `(COMMUNITY_MESSAGE_ARCHIVE|= 4[0-9];|= 5[0-9];)` (`**/protocol/protobuf/application_metadata_message.proto`), 20 results
Now let me check if there's a similar constant for INDEX_CID:
Searched text for `COMMUNITY_MESSAGE_ARCHIVE_INDEX_CID`, 1 result
Let me check if this constant exists in the protobuf file:
Searched text for `INDEX_CID` (`**/protocol/protobuf/application_metadata_message.proto`), no results
It seems like `COMMUNITY_MESSAGE_ARCHIVE_INDEX_CID` is being used but not defined in the protobuf file. Let me check the broader context around line 60 in the protobuf file:
Symbol `COMMUNITY_MESSAGE_ARCHIVE_INDEX_CID` not found
Let me look at the entire protobuf file to see the complete enum:
Searched codebase for "ApplicationMetadataMessage Type enum protobuf", 22 results
Based on my search, I can now answer your question about where `protobuf.ApplicationMetadataMessage_COMMUNITY_MESSAGE_ARCHIVE_MAGNETLINK` comes from:
## Answer
`protobuf.ApplicationMetadataMessage_COMMUNITY_MESSAGE_ARCHIVE_MAGNETLINK` comes from the protobuf enum definition in application_metadata_message.proto.
Specifically, it's defined at line 60 of the proto file as:
```protobuf
COMMUNITY_MESSAGE_ARCHIVE_MAGNETLINK = 43;
```
This is part of the `Type` enum within the `ApplicationMetadataMessage` protobuf definition. When the protobuf file is compiled to Go code (using the `//go:generate protoc` directive found in `protocol/protobuf/service.go`), it generates the Go constant `protobuf.ApplicationMetadataMessage_COMMUNITY_MESSAGE_ARCHIVE_MAGNETLINK` with the value `43`.
The protobuf compilation process converts the proto enum value `COMMUNITY_MESSAGE_ARCHIVE_MAGNETLINK = 43` into the Go constant that you see being used throughout the codebase for message type identification.
**Note:** I also notice that in the code there's a reference to `COMMUNITY_MESSAGE_ARCHIVE_INDEX_CID` which doesn't appear to be defined in the protobuf file yet. This suggests it may need to be added to the enum if the Codex functionality requires it.
**Q: in persistance - current file - there are a couple of attributes operating on the communities_archive_info (table?). Is it hard to add additional attributes? For my index CIDs I need the following:**
- **last_cid which corresponds to `last_magnetlink_uri`**
- **indexcid_clock => `magnetlink_clock`**
**What is this clock used for?**
Looking at the persistence.go file, I can see that the `communities_archive_info` table is used to store archive-related information for communities. Let me analyze the existing structure and answer your questions.
## Current `communities_archive_info` table structure
Based on the methods in the file, the table appears to have these columns:
- `community_id` (string)
- `magnetlink_clock` (uint64)
- `last_magnetlink_uri` (string)
- `last_message_archive_end_date` (uint64)
## Adding new attributes
It should **not be hard** to add your new attributes. You would need to:
1. **Add the new columns to the table schema** (likely in a migration file)
2. **Add corresponding methods** similar to the existing ones
For your requirements:
- `last_cid` (corresponds to `last_magnetlink_uri`)
- `indexcid_clock` (corresponds to `magnetlink_clock`)
You could add methods like:
```go
// Get methods
func (p *Persistence) GetLastSeenIndexCid(communityID types.HexBytes) (string, error) {
var indexCid string
err := p.db.QueryRow(`SELECT last_cid FROM communities_archive_info WHERE community_id = ?`, communityID.String()).Scan(&indexCid)
if err == sql.ErrNoRows {
return "", nil
}
return indexCid, err
}
func (p *Persistence) GetIndexCidMessageClock(communityID types.HexBytes) (uint64, error) {
var indexCidClock uint64
err := p.db.QueryRow(`SELECT indexcid_clock FROM communities_archive_info WHERE community_id = ?`, communityID.String()).Scan(&indexCidClock)
if err == sql.ErrNoRows {
return 0, nil
}
return indexCidClock, err
}
// Update methods
func (p *Persistence) UpdateLastSeenIndexCid(communityID types.HexBytes, indexCid string) error {
_, err := p.db.Exec(`UPDATE communities_archive_info SET last_cid = ? WHERE community_id = ?`,
indexCid, communityID.String())
return err
}
func (p *Persistence) UpdateIndexCidMessageClock(communityID types.HexBytes, clock uint64) error {
_, err := p.db.Exec(`UPDATE communities_archive_info SET indexcid_clock = ? WHERE community_id = ?`,
clock, communityID.String())
return err
}
```
## What is the clock used for?
The **clock** is a **logical timestamp** used for:
1. **Message ordering** - Ensures messages are processed in the correct order
2. **Conflict resolution** - When multiple nodes update the same data, the higher clock value wins
3. **Synchronization** - Helps determine which updates are newer when syncing between devices/nodes
4. **Deduplication** - Prevents processing the same update multiple times
Looking at the existing code:
```go
func (p *Persistence) UpdateMagnetlinkMessageClock(communityID types.HexBytes, clock uint64) error {
	// ...
}
```
This suggests that when a new magnetlink message is received/processed, the clock value is updated to track the latest state. This is common in distributed systems to maintain consistency without requiring synchronized system clocks.
## Next steps
You'll need to:
1. **Create a database migration** to add the new columns
2. **Update the `SaveCommunityArchiveInfo` method** to include the new fields
3. **Add the getter/setter methods** for your new attributes
4. **Update any initialization code** that creates archive info records
The pattern is already established in the codebase, so adding your index CID attributes should be straightforward following the same structure as the magnetlink ones.
Absolutely! I can see you've already added the four new methods for handling Index CID operations. Let me help you complete the remaining steps to fully integrate your Index CID functionality.
## Step 1: Database Migration
First, you'll need to create a database migration to add the new columns to the `communities_archive_info` table. Let me search for existing migration files to follow the pattern:
Searched for files matching `**/migrations/**/*.sql`, 20 matches
Let me look for community-related migrations specifically:
Searched for files matching `**/migrations/**/*communit*.sql`, 20 matches
Let me look for the communities_archive_info table creation:
Searched text for `communities_archive_info` (`**/migrations/**/*.sql`), 2 results
Great! Let me check the structure of the existing table and the most recent migration:
Searched codebase for "communities_archive_info table CREATE TABLE migration", 21 results
Now let me look at the actual migration files:
Read the `appdatabase/migrations/sql` directory listing.
Let me check the latest migration number and the specific migration that adds the archive_info table:
Searched codebase for "1650616788_add_communities_archives_info_table CREATE TABLE communities_archive_info", 18 results
Perfect! Now I can see the migration pattern. Let me create the migration file. Based on the existing files, I need to create a new migration with a timestamp.
Let me check the current latest migration to get the right timestamp:
Searched text for `1757511667`, no results
Based on the file listing, the latest migration is `1757511667_add_messages-backup_enabled.up.sql`. So your new migration should have a timestamp after that. Here's what you need:
**Create the migration file:**
`/home/mc2/code/status-im/status-go/appdatabase/migrations/sql/1757511668_add_index_cid_columns_to_communities_archive_info.up.sql`
````sql
ALTER TABLE communities_archive_info ADD COLUMN last_cid TEXT DEFAULT "";
ALTER TABLE communities_archive_info ADD COLUMN indexcid_clock INTEGER DEFAULT 0;
````
## Step 2: Update `SaveCommunityArchiveInfo` Method
You'll need to modify the existing method to include the new fields. Looking at your existing code, I can see you've already added the getter/setter methods, but you need to update the initial save method:
````go
func (p *Persistence) SaveCommunityArchiveInfo(communityID types.HexBytes, clock uint64, lastArchiveEndDate uint64) error {
_, err := p.db.Exec(`INSERT INTO communities_archive_info (magnetlink_clock, last_message_archive_end_date, community_id, last_cid, indexcid_clock) VALUES (?, ?, ?, ?, ?)`,
clock,
lastArchiveEndDate,
communityID.String(),
"", // default empty last_cid
0, // default indexcid_clock
)
return err
}
````
## Step 3: Update the Method Calls in Manager
You'll need to find where these new methods are called. Let me search for usage patterns:
Searched text for `UpdateMagnetlinkMessageClock`, 6 results
Searched text for `UpdateLastSeenMagnetlink`, 4 results
Great! Now I can see the pattern. You'll need to add corresponding manager methods and update the handlers.
## Step 4: Add Manager Methods
Add these methods to manager.go:
````go
// Add these methods after the existing magnetlink methods around line 3730
func (m *Manager) UpdateIndexCidMessageClock(communityID types.HexBytes, clock uint64) error {
return m.persistence.UpdateIndexCidMessageClock(communityID, clock)
}
func (m *Manager) GetIndexCidMessageClock(communityID types.HexBytes) (uint64, error) {
return m.persistence.GetIndexCidMessageClock(communityID)
}
func (m *Manager) UpdateLastSeenIndexCid(communityID types.HexBytes, indexCid string) error {
return m.persistence.UpdateLastSeenIndexCid(communityID, indexCid)
}
func (m *Manager) GetLastSeenIndexCid(communityID types.HexBytes) (string, error) {
return m.persistence.GetLastSeenIndexCid(communityID)
}
````
## Step 5: Update Community Description Structure (if needed)
Looking at the protobuf structure, you might want to add an `archive_indexcid_clock` field similar to `archive_magnetlink_clock`. Let me search for that:
Searched text for `archive_magnetlink_clock`, 1 result
You might want to add a similar field to the protobuf for consistency:
````proto
// In /home/mc2/code/status-im/status-go/protocol/protobuf/communities.proto
// Add after line 116:
uint64 archive_indexcid_clock = 20; // or the next available number
````
## Step 6: Add Message Handlers (similar to magnetlink)
Looking at the existing pattern in messenger_handler.go, you'll need to add handlers for Index CID messages. Let me check where the magnetlink handler is:
Searched text for `COMMUNITY_MESSAGE_ARCHIVE_MAGNETLINK` (`**/protocol/messenger_handler.go`), no results
Let me search for the magnetlink message handling:
Searched text for `CommunityMessageArchiveMagnetlink`, 4 results
Perfect! Now I can see the complete pattern. Here's what you need to do:
## Step 7: Create Index CID Protobuf Message
Add to communities.proto:
````proto
// Add after CommunityMessageArchiveMagnetlink message
message CommunityMessageArchiveIndexCid {
uint64 clock = 1;
string cid = 2;
}
````
## Step 8: Add Index CID Message Type
Add to application_metadata_message.proto:
````proto
// Add after COMMUNITY_MESSAGE_ARCHIVE_MAGNETLINK = 43;
COMMUNITY_MESSAGE_ARCHIVE_INDEX_CID = 76; // or next available number
````
## Step 9: Add Index CID Message Handler
Add to messenger_handler.go (similar to the magnetlink handler):
````go
func (m *Messenger) HandleCommunityMessageArchiveIndexCid(state *ReceivedMessageState, message *protobuf.CommunityMessageArchiveIndexCid, statusMessage *messagingtypes.Message) error {
logger := m.logger.Named("HandleCommunityMessageArchiveIndexCid")
haveArchiveInfo, err := m.communitiesManager.HasCommunityArchiveInfo(statusMessage.CommunityID)
if err != nil {
logger.Debug("failed to check if we have community archive info", zap.Error(err))
return err
}
if !haveArchiveInfo {
logger.Debug("ignoring index cid message for community with no archive info", zap.String("communityID", types.EncodeHex(statusMessage.CommunityID)))
return nil
}
currentIndexCidClock, err := m.communitiesManager.GetIndexCidMessageClock(statusMessage.CommunityID)
if err != nil {
logger.Debug("failed to get index cid message clock", zap.Error(err))
return err
}
if currentIndexCidClock >= message.Clock {
// ignore older index cid messages
return nil
}
id := statusMessage.CommunityID
clock := message.Clock
err = m.communitiesManager.UpdateIndexCidMessageClock(id, clock)
if err != nil {
logger.Debug("failed to update index cid message clock", zap.Error(err))
return err
}
if len(message.Cid) > 0 {
err = m.communitiesManager.UpdateLastSeenIndexCid(id, message.Cid)
if err != nil {
logger.Debug("failed to update last seen index cid", zap.Error(err))
return err
}
}
return nil
}
````
## Step 10: Update `dispatchIndexCidMessage` Method
Update the method in messenger_communities.go to use the correct message type:
````go
func (m *Messenger) dispatchIndexCidMessage(communityID string) error {
community, err := m.communitiesManager.GetByIDString(communityID)
if err != nil {
return err
}
indexCid, err := m.archiveManager.GetHistoryArchiveIndexCid(community.ID())
if err != nil {
return err
}
indexCidMessage := &protobuf.CommunityMessageArchiveIndexCid{
Clock: m.getTimesource().GetCurrentTime(),
Cid: indexCid,
}
encodedMessage, err := proto.Marshal(indexCidMessage)
if err != nil {
return err
}
chatID := community.MagnetlinkMessageChannelID()
rawMessage := messagingtypes.RawMessage{
LocalChatID: chatID,
Sender: community.PrivateKey(),
Payload: encodedMessage,
MessageType: protobuf.ApplicationMetadataMessage_COMMUNITY_MESSAGE_ARCHIVE_INDEX_CID,
SkipGroupMessageWrap: true,
PubsubTopic: community.PubsubTopic(),
Priority: &messagingtypes.LowPriority,
}
_, err = m.messaging.SendPublic(context.Background(), chatID, rawMessage)
if err != nil {
return err
}
err = m.communitiesManager.UpdateCommunityDescriptionIndexCidMessageClock(community.ID(), indexCidMessage.Clock)
if err != nil {
return err
}
return m.communitiesManager.UpdateIndexCidMessageClock(community.ID(), indexCidMessage.Clock)
}
````
## Summary
Here are all the files you need to modify:
1. **Migration**: Create `1757511668_add_index_cid_columns_to_communities_archive_info.up.sql`
2. **Persistence**: Update `SaveCommunityArchiveInfo` method (already done the getters/setters)
3. **Manager**: Add the 4 new methods for Index CID operations
4. **Protobuf**: Add `CommunityMessageArchiveIndexCid` message and `COMMUNITY_MESSAGE_ARCHIVE_INDEX_CID` enum
5. **Handler**: Add `HandleCommunityMessageArchiveIndexCid` method
6. **Dispatch**: Update `dispatchIndexCidMessage` to use correct message type
The pattern is exactly the same as the magnetlink implementation, just with "IndexCid" instead of "Magnetlink". This approach ensures consistency with the existing codebase and makes it easy to maintain.
Would you like me to help you with any specific part of this implementation?

---
_Last updated: 2025-11-10_
Status-Go juggles several identifiers while shuttling messages between the protocol layer and Waku. This note ties the names together so future readers can reason about filters, channels, and archives without spelunking through `messenger.go`.
## Quick Glossary
| Term | Defined in | Purpose |
| --- | --- | --- |
| `ChatID` | `messaging/types/filters.go:23-48` | Status-level label for a logical chat/channel (e.g., `communityID-memberUpdate`, `communityID-general`, `0x…` contact IDs). |
| `LocalChatID` | `messagingtypes.RawMessage` (various call sites) | Field embedded in outgoing raw messages so higher layers know which chat to update; does **not** change network routing. |
| Content Topic | `messaging/layers/transport/topic.go:18-21` | Waku topic (4 bytes) derived from `ChatID` via `Keccak256(chatID)[:4]`. Real network “channel.” |
| Pubsub Topic | `messaging/types/pubsub_topics.go` (see helpers in `messagingtypes`) | Waku v2 gossipsub domain (e.g., `/waku/2/rs/16/32`, `/waku/2/default-waku/proto`). Same content topic on different pubsub topics → distinct subscriptions. |
| `transport.Filter` | `messaging/layers/transport/filter.go` | Stores `ChatID`, `FilterID`, content topic, pubsub topic, symmetric key, and flags. Returned by transport code to upper layers. |
| `messagingtypes.ChatFilter` | `messaging/types/filters.go` | Thin wrapper exposed to the protocol (`messenger.go`); created from `transport.Filter`. |
> **Rule of thumb:** The `chatID` that created a filter is the only input to `ToTopic`, so _picking a chatID_ at send time uniquely determines the Waku content topic.
## Where Chat IDs Come From
Community helpers mint deterministic chat IDs (`protocol/communities/community.go:1544-1590`):
- `Community.ChatIDs()` returns legacy per-channel identifiers (one per Status channel).
- `Community.MemberUpdateChannelID()` produces `communityID-memberUpdate`.
- `Community.UniversalChatID()` aliases the member-update channel so one topic can carry **all** community messages during the universal-channel rollout.
- Contact/discovery/chat code helpers live in `messaging/layers/transport/topic.go:27-45`.
When a community loads, `Messenger.DefaultFilters` asks to subscribe to:
1. `communityID` on the community shards pubsub topic.
2. `communityID-memberUpdate` (universal channel) on the same pubsub topic.
3. The hex-encoded community pubkey on both the global content topic and the default non-protected topic.
4. Optional fallbacks when the community does not publish a shard (`protocol/messenger_communities.go:2463-2480`).
These `ChatID` + pubsub pairs become actual Waku subscriptions via the FiltersManager.
## From Chat ID to Transport Filter
`FiltersManager.LoadPublic` is the main entry point (`messaging/layers/transport/filters_manager.go:540-591`):
1. Derive a map key (`chatID` or `chatID||pubsub` when `distinctByPubsub` is true).
2. If no filter exists yet, call `addSymmetric(chatID, pubsubTopic)` which:
- Computes `ToTopic(chatID)` → content topic.
- Calls into the Waku service (`filters_service.Subscribe`) to register the subscription.
   - Returns Waku's `FilterID`, symmetric key id, and topic bytes.
3. Store and return the populated `transport.Filter`.
`InitCommunities` / `InitPublicChats` simply loop over `ChatsToInitialize` and call `LoadPublic` for each entry, so a single community normally yields several transport filters (legacy per-channel, universal, control/pubkey, etc.).
### Diagram: Subscription Lifecycle
```mermaid
flowchart TD
A[Messenger.DefaultFilters<br/>community.go helpers] --> B[ChatsToInitialize]
B --> C[Transport.InitPublicChats]
C --> D["FiltersManager.LoadPublic(chatID, pubsub)"]
D -->|compute Keccak| E["ToTopic(chatID)"]
E --> F["filters_service.Subscribe<br/>(content topic, pubsub)"]
F --> G["transport.Filter stored<br/>filters → chatID key"]
G --> H["messagingtypes.NewChatFilter<br/>exposed to messenger"]
```
## Sending Flow
All public/community traffic eventually funnels through `MessageSender.SendPublic` (`messaging/common/message_sender.go:565-681`). Important details:
1. The caller supplies `chatName` (usually `community.UniversalChatID()`).
2. After wrapping/encrypting, SendPublic calls `transport.SendPublic(ctx, newMessage, chatName)` (`messaging/layers/transport/transport.go:263-280`).
3. `transport.SendPublic` loads the filter keyed by `chatName`, then copies its symmetric key, content topic, and pubsub topic into the Waku message before posting.
Therefore **every** universal-channel message (chat, pin, magnetlink, indexCID, etc.) shares a content topic derived from `communityID-memberUpdate`. Legacy per-channel messages keep using their old chat IDs until migration completes.
### Diagram: Send Path
```mermaid
sequenceDiagram
participant Proto as protocol/messenger_communities.go
participant Msg as messaging/common/message_sender.go
participant Trans as messaging/layers/transport/transport.go
participant FM as FiltersManager
participant W as Waku
Proto->>Msg: SendPublic(chatID = communityID-memberUpdate, rawMessage)
Msg->>Trans: SendPublic(ctx, newMessage, chatID)
Trans->>FM: LoadPublic(chatID, pubsub, distinct=false)
FM-->>Trans: transport.Filter{ContentTopic, SymKeyID, PubsubTopic}
Trans->>W: Post(message with ContentTopic=ToTopic(chatID))
W-->>Trans: Hash
Trans-->>Msg: Hash
Msg-->>Proto: MessageID/Hash
```
## Receiving Flow
Incoming envelopes land inside Waku filter queues. Retrieval proceeds as follows:
1. `transport.RetrieveRawAll` iterates over **every** registered filter, calls `api.GetFilterMessages(filter.FilterID)`, drops cached duplicates, and groups results by filter (`messaging/layers/transport/transport.go:213-258`).
2. `messenger.RetrieveAll` converts transport filters into `messagingtypes.ChatFilter` objects and feeds the map into `handleRetrievedMessages` (`protocol/messenger.go:2610`, `3042-3230`).
3. For each `(filter, []*ReceivedMessage)` pair:
- If `filter.ChatID()` matches an owned community (legacy ID or universal ID) and `storeWakuMessages == true`, the raw Waku message is persisted for archive building (`protocol/messenger.go:3051-3082`, `protocol/communities/manager.go:4372-4405`).
- `messaging.HandleReceivedMessages` decodes the payload(s).
- Each decoded Status message is dispatched by type (`dispatchToHandler`), eventually ending up in chat history, member updates, archive downloads, etc.
### Diagram: Receive Path
```mermaid
flowchart LR
Waku["Waku subscription queues<br/>(per content topic & pubsub)"] -->|GetFilterMessages| Transport
Transport -->|"map[Filter][]Message"| Messenger.RetrieveAll
Messenger.RetrieveAll -->|handleRetrievedMessages| Loop["for each filter batch"]
Loop --> Decision{"owned chat & storeWakuMessages?"}
Decision -->|Yes| Store[StoreWakuMessage]
Decision -->|No| Skip["(skip storage)"]
Store --> Decode["messaging.HandleReceivedMessages"]
Skip --> Decode
Decode --> Dispatch["dispatchToHandler<br/>(type-specific logic)"]
Dispatch --> DB["User DB / UI updates / archive triggers"]
```
## Persistence & Archives
- Community owners call `GetOwnedCommunitiesChatIDs()` to load every legacy per-channel ID and `GetOwnedCommunitiesUniversalChatIDs()` for the universal ID (`protocol/communities/manager.go:4372-4400`). The union is the allowlist.
- `handleRetrievedMessages` is invoked in two distinct contexts:
1. **Live retrieval loop** (`RetrieveAll`): `storeWakuMessages = true`, `fromArchive = false`. Raw envelopes that match the allowlist are stored in `waku_messages`, and handlers run with `fromArchive=false`.
2. **Archive replay** (`handleArchiveMessages`): `storeWakuMessages = false`, `fromArchive = true`. No new rows are written, but handlers see `fromArchive=true` to suppress live-only side effects.
- Therefore, the code only writes to `waku_messages` when both the allowlist check and the “live retrieval” flags match (case 1 above).
- Stored rows retain raw payloads, timestamps, hashes, and third-party IDs (`protocol/communities/persistence.go:889-934`). Later, archive import/export code queries by content topic and timestamp windows.
Because magnetlink/indexCID/chat traffic now shares the universal channel, one subscription per community is enough to capture everything needed for archive creation; legacy filters stay in place until every client migrates.
## Debugging Checklist
1. **Which filter saw the message?** Log `filter.ChatID()` in `handleRetrievedMessages` to confirm whether it was the universal channel, a legacy chat, or the admin/pubkey topic.
2. **Did the message get stored?** Ensure the `ChatID` appears in either `GetOwnedCommunitiesChatIDs()` or `GetOwnedCommunitiesUniversalChatIDs()` and that `storeWakuMessages` was true for this pass.
3. **Unexpected content topic?** Remember that changing the `chatID` you pass into `SendPublic` changes the `ToTopic` hash. Verify the string passed to `SendPublic` (search `LocalChatID:` in the caller).
4. **Multiple filters per community?** Enumerate `t.filters.Filters()` at runtime; it's normal to see several entries per community because the migration keeps both universal and legacy subscriptions alive.
## Appendix: Useful References
- `messaging/layers/transport/topic.go`: helpers that convert strings/public keys into Waku topics.
- `messaging/layers/transport/filters_manager.go`: how filters are registered, keyed, and updated.
- `messaging/common/message_sender.go`: wrapping, encrypting, and handing messages to the transport.
- `messaging/layers/transport/transport.go`: message retrieval, cache checks, and calls into messenger.
- `protocol/messenger_communities.go`: default chat IDs per community and subscription bootstrap.
- `protocol/messenger.go`: storage/dispatch logic (`handleRetrievedMessages`).

---
marcinczenko: can you remind me how `ArchiveIndexCidClock` from the community description is used? Can you summarize how community description messages are sent and received and how they relate to magnet links/indexCid updates - I remembered that community description had some impact on processing magnet links. I just want to remember the details...
GitHub Copilot: I'll help you understand how `ArchiveIndexCidClock` is used and how it relates to community description messages and index CID updates.
Let me search for relevant code to give you a complete picture.
Searched text for `ArchiveIndexCidClock`, 2 results
Searched text for `ArchiveMagnetlinkClock`, 2 results
Searched text for `HandleCommunityDescriptionMessage`, 12 results
Now let me look at the specific section where the clock comparison happens:
Read `protocol/communities/manager.go`, lines 2310 to 2360.
Now let me check how these clocks are updated from the persistence layer and how magnetlink/indexCid messages are handled:
Searched text for `UpdateCommunityDescriptionIndexCidMessageClock`, 2 results
Searched text for `UpdateCommunityDescriptionMagnetlinkMessageClock`, 2 results
Searched codebase for "handle archive magnetlink index CID message torrent community", 14 results
Now let me look at the index CID related code:
Searched text for `HandleIndexCidMessage`, no results
Searched text for `dispatchIndexCidMessage`, 2 results
Searched text for `CommunityMessageArchiveIndexCid`, 9 results
Now let me look at these handler functions:
Read `protocol/messenger_handler.go`, lines 3785 to 3825.
Read `protocol/messenger_communities.go`, lines 4145 to 4205.
Now let me check the HandleHistoryArchiveIndexCidMessage:
Searched text for `HandleHistoryArchiveIndexCidMessage`, 2 results
Read `protocol/messenger_handler.go`, lines 1344 to 1414.
Perfect! Now let me check how the persistence saves this and look at the protobuf definition:
Read `protocol/protobuf/communities.proto`, lines 220 to 270.
Searched text for `SaveCommunityArchiveInfo`, 2 results
Read `protocol/communities/persistence.go`, lines 1023 to 1083.
Perfect! Now I have a complete picture. Let me provide you with a comprehensive summary:
## Summary: How `ArchiveIndexCidClock` and Community Description Messages Work
### **Community Description Clocks for Archives**
The community description contains two clock fields for tracking archive distribution:
1. **`ArchiveMagnetlinkClock`** - tracks the clock of the last magnetlink (torrent) message
2. **`ArchiveIndexCidClock`** - tracks the clock of the last index CID (Codex) message
These clocks are stored in the **community description protobuf** itself and are propagated when the community description is updated and distributed.
---
### **How Community Description Messages Are Sent**
1. **Control node creates/updates community** → Community description is updated with current state
2. **Community description is marshaled** → Converted to protobuf format
3. **Wrapped in ApplicationMetadataMessage** → Signed and wrapped
4. **Published to community channels** → Sent to all members via waku
The community description includes:
- All community metadata (name, description, members, etc.)
- Token permissions
- **Archive clocks**: `ArchiveMagnetlinkClock` and `ArchiveIndexCidClock`
---
### **How Community Description Messages Are Received**
When a node receives a community description message:
**In `handleCommunityDescriptionMessageCommon()` (line 2295):**
```go
// 1. Update the community with the new description
changes, err := community.UpdateCommunityDescription(description, payload, newControlNode)
// 2. Check if archive info exists in local DB
hasCommunityArchiveInfo, err := m.persistence.HasCommunityArchiveInfo(community.ID())
// 3. Extract clocks from the received community description
cdMagnetlinkClock := community.config.CommunityDescription.ArchiveMagnetlinkClock
cdIndexCidClock := community.config.CommunityDescription.ArchiveIndexCidClock
// 4. If no archive info exists, save it with the clocks from CD
if !hasCommunityArchiveInfo {
err = m.persistence.SaveCommunityArchiveInfo(community.ID(), cdMagnetlinkClock, 0, cdIndexCidClock)
} else {
// 5. If archive info exists, compare clocks and update if CD has newer clock
// For magnetlink:
magnetlinkClock, err := m.persistence.GetMagnetlinkMessageClock(community.ID())
if cdMagnetlinkClock > magnetlinkClock {
err = m.persistence.UpdateMagnetlinkMessageClock(community.ID(), cdMagnetlinkClock)
}
// For index CID:
indexCidClock, err := m.persistence.GetIndexCidMessageClock(community.ID())
if cdIndexCidClock > indexCidClock {
err = m.persistence.UpdateIndexCidMessageClock(community.ID(), cdIndexCidClock)
}
}
```
---
### **The Key Insight: Community Description Clock Comparison**
The **community description acts as a checkpoint** for archive distribution state. When nodes receive a community description:
- **They compare the CD's archive clocks** with their local archive info clocks
- **If CD has newer clocks**, they update their local tracking
- This ensures that even if a node **missed** the actual magnetlink/indexCid messages, they'll know there's a newer archive available when they receive the next community description
---
### **How Magnetlink/IndexCid Messages Work**
**When control node dispatches archive messages:**
1. **Control node seeds archive** → Creates torrent/uploads to Codex
2. **Dispatches message** (magnetlink or indexCid) → Sent to community
3. **Updates CD clock** → Updates `ArchiveMagnetlinkClock` or `ArchiveIndexCidClock` in community description
4. **Updates local tracking** → Also updates local persistence tables
**When member receives archive message:**
```go
// In HandleHistoryArchiveIndexCidMessage (line 1344):
// 1. Get the last clock from LOCAL tracking table
lastIndexCidClock, err := m.communitiesManager.GetIndexCidMessageClock(id)
lastMagnetlinkClock, err := m.communitiesManager.GetMagnetlinkMessageClock(id)
// 2. Compare with incoming message clock
// 3. If incoming clock >= lastClock, process it (download archives)
// 4. Update local tracking clocks
```
---
### **Why This Design?**
This dual-tracking system (CD clocks + local tracking) provides:
1. **Resilience**: If a node misses the actual archive message, the next community description will let them know there's a newer archive
2. **Synchronization**: New members get the latest archive clock immediately when they join (via CD)
3. **Deduplication**: Prevents re-downloading the same archive multiple times
4. **Preference support**: Can compare clocks from both methods (torrent vs Codex) to support distribution preferences
---
### **Database Storage**
All tracked in `communities_archive_info` table:
- `magnetlink_clock` - last seen magnetlink message clock
- `index_cid_clock` - last seen index CID message clock
- `last_magnetlink_uri` - actual magnetlink string
- `last_index_cid` - actual CID string
The community description clocks act as a "high water mark" that gets propagated with every CD update, while the local tracking tables record what archive messages were actually processed.
marcinczenko: ok, so it looks like the clocks in the community description prevent downloading and processing outdated archives?
Can you also show me:
(1) where the control node updates `ArchiveMagnetlinkClock` or `ArchiveIndexCidClock` in the community description
(2) when and where community descriptions are sent (show me where it is in the code as well)
GitHub Copilot: Great question! Yes, exactly - the clocks in the community description act as a "high water mark" to prevent processing outdated archives. Let me show you the code for both parts:
## (1) Where Control Node Updates Archive Clocks in Community Description
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger_communities.go#4081-4081), lines 4081 to 4131
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger_communities.go#250-250), lines 250 to 290
Now let me find where the community description clock is actually updated:
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger_communities.go#4130-4130), lines 4130 to 4160
Perfect! Now let me see where `UpdateCommunityDescriptionMagnetlinkMessageClock` and `UpdateCommunityDescriptionIndexCidMessageClock` actually update the CD:
Read [](file:///home/mc2/code/status-im/status-go/protocol/communities/manager.go#3696-3696), lines 3696 to 3726
Excellent! Now let me find where `SaveCommunity` triggers the community description to be published:
## (2) When and Where Community Descriptions Are Sent
Searched text for `func (m *Manager) SaveCommunity`, 4 results
Searched text for `func (m *Manager) saveAndPublish`, 2 results
Read [](file:///home/mc2/code/status-im/status-go/protocol/communities/manager.go#5102-5102), lines 5102 to 5162
Read [](file:///home/mc2/code/status-im/status-go/protocol/communities/manager.go#4604-4604), lines 4604 to 4644
Now let me find the `publish` function to see how it actually sends the community description:
Searched text for `func (m *Manager) publish(`, 1 result
Read [](file:///home/mc2/code/status-im/status-go/protocol/communities/manager.go#747-747), lines 747 to 767
Now let me find where the subscription is consumed and the actual message is sent:
Searched text for `handleCommunitiesSubscription`, 3 results
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger_communities.go#306-306), lines 306 to 406
Perfect! Now let me find the `publishOrg` function:
Searched text for `func (m *Messenger) publishOrg`, 1 result
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger_communities.go#115-115), lines 115 to 195
Perfect! Now I have all the information. Let me provide you with a comprehensive answer:
---
## Summary: How Archive Clocks Are Updated and Community Descriptions Are Sent
### (1) Where Control Node Updates Archive Clocks in Community Description
**The Flow:**
1. **Archive is seeded** → Archive manager detects seeding is complete
2. **Signal is published** → `HistoryArchivesSeedingSignal` is sent via subscription
3. **Control node dispatches archive messages** → Sends magnetlink/indexCid messages
4. **Control node updates CD clocks** → Updates `ArchiveMagnetlinkClock` or `ArchiveIndexCidClock`
**The Code Path:**
```
protocol/messenger_communities.go:250-270 (handleCommunitiesHistoryArchivesSubscription)
└─> When HistoryArchivesSeedingSignal received and IsControlNode():
├─> dispatchMagnetlinkMessage() (line 4099)
│ └─> UpdateCommunityDescriptionMagnetlinkMessageClock() (line 4138)
│ └─> manager.go:3696 - Sets community.config.CommunityDescription.ArchiveMagnetlinkClock
│ └─> SaveCommunity() - Saves to DB
└─> dispatchIndexCidMessage() (line 4145)
└─> UpdateCommunityDescriptionIndexCidMessageClock() (line 4183)
└─> manager.go:3708 - Sets community.config.CommunityDescription.ArchiveIndexCidClock
└─> SaveCommunity() - Saves to DB
```
**Key Code Locations:**
**In `messenger_communities.go:250-270`:**
```go
if c.IsControlNode() {
if sub.HistoryArchivesSeedingSignal.MagnetLink {
err := m.dispatchMagnetlinkMessage(sub.HistoryArchivesSeedingSignal.CommunityID)
}
if sub.HistoryArchivesSeedingSignal.IndexCid {
err := m.dispatchIndexCidMessage(sub.HistoryArchivesSeedingSignal.CommunityID)
}
}
```
**In `messenger_communities.go:4138-4142`:**
```go
// After sending magnetlink message:
err = m.communitiesManager.UpdateCommunityDescriptionMagnetlinkMessageClock(community.ID(), magnetLinkMessage.Clock)
```
**In `manager.go:3696-3705`:**
```go
func (m *Manager) UpdateCommunityDescriptionMagnetlinkMessageClock(communityID types.HexBytes, clock uint64) error {
community, err := m.GetByIDString(communityID.String())
community.config.CommunityDescription.ArchiveMagnetlinkClock = clock
return m.SaveCommunity(community) // ← Saves to DB
}
```
**Important:** Note that `SaveCommunity()` **only saves to the database** - it does NOT automatically publish the community description! The CD will be published later through the periodic check or when other changes trigger a publish.
---
### (2) When and Where Community Descriptions Are Sent
**There are TWO main ways community descriptions are published:**
#### **A. Event-Driven Publishing (Immediate)**
When changes happen (like updating clocks, editing community, accepting members), the code calls:
**`saveAndPublish()` → `manager.go:4608-4638`**
```go
func (m *Manager) saveAndPublish(community *Community) error {
err := m.SaveCommunity(community) // Save to DB
if err != nil {
return err
}
if community.IsControlNode() {
m.publish(&Subscription{Community: community}) // ← Publish event
return nil
}
// ... handle admin events
}
```
**`publish()` → `manager.go:747-759`**
```go
func (m *Manager) publish(subscription *Subscription) {
for _, s := range m.subscriptions {
s <- subscription // Send to subscription channel
}
}
```
#### **B. Subscription Handler (Actual Network Send)**
**`handleCommunitiesSubscription()` → `messenger_communities.go:307-406`**
This goroutine listens to the subscription channel:
```go
func (m *Messenger) handleCommunitiesSubscription(c chan *communities.Subscription) {
// Checks every 5 minutes
ticker := time.NewTicker(5 * time.Minute)
go func() {
for {
select {
case sub := <-c: // Receives from subscription channel
if sub.Community != nil {
publishOrgAndDistributeEncryptionKeys(community)
}
case <-ticker.C: // Periodic check every 5 minutes
// ... republish if needed
}
}
}()
}
```
The `publishOrgAndDistributeEncryptionKeys` function then calls:
**`publishOrg()` → `messenger_communities.go:115-159`**
```go
func (m *Messenger) publishOrg(org *communities.Community, shouldRekey bool) error {
payload, err := org.MarshaledDescription() // ← Marshal community description
rawMessage := messagingtypes.RawMessage{
Payload: payload,
Sender: org.PrivateKey(),
SkipEncryptionLayer: true,
MessageType: protobuf.ApplicationMetadataMessage_COMMUNITY_DESCRIPTION,
PubsubTopic: org.PubsubTopic(),
Priority: &messagingtypes.HighPriority,
}
messageID, err := m.messaging.SendPublic(context.Background(), org.IDString(), rawMessage)
// ← Actually sends to the network via Waku
return err
}
```
---
### **The Complete Flow:**
```
1. Control Node Seeds Archive
└─> HistoryArchivesSeedingSignal published
2. dispatchMagnetlinkMessage() or dispatchIndexCidMessage()
├─> Sends actual magnetlink/indexCid message to network
└─> UpdateCommunityDescriptionMagnetlinkMessageClock()
└─> Updates community.config.CommunityDescription.ArchiveMagnetlinkClock
└─> SaveCommunity() - saves to DB only
3. Later (when CD is published for any reason):
└─> saveAndPublish() called OR periodic check triggers
└─> publish(&Subscription{Community: community})
└─> handleCommunitiesSubscription receives event
└─> publishOrg()
└─> MarshaledDescription() - includes updated clocks
└─> SendPublic() - sends to Waku network
4. Members receive Community Description
└─> HandleCommunityDescriptionMessage()
└─> handleCommunityDescriptionMessageCommon()
└─> Compares CD clocks with local tracking
└─> Updates local tracking if CD has newer clocks
```
---
### **Key Insights:**
1. **Archive clocks are updated in CD immediately** when magnetlink/indexCid messages are dispatched
2. **CD is NOT immediately published** - it waits for the next publish event or periodic check
3. **CD includes the updated clocks** whenever it's eventually published
4. **This design ensures** that even if members miss the actual magnetlink/indexCid message, they'll get the updated clock in the next CD update
5. **The clocks prevent reprocessing** - members won't download the same archive twice because they compare incoming message clocks against both the CD clocks and their local tracking
This is why you can miss an archive message but still know there's a newer archive available - the CD acts as a "checkpoint" that gets propagated with every community update!

https://hackmd.io/@codex-storage/H1kBvIgMbg
---
related-to:
- "[[Running tests with gotestsum]]"
- "[[go testify assertions]]"
---
Let's start with what we find in the [status-go build instructions](https://github.com/status-im/status-go/blob/develop/_docs/how-to-build.md) as lots of things just work.
We have two options: (1) use the [nix](https://nixos.org/) develop shell, or (2) just use the standard shell of your system. Theoretically, nix should give you better isolation and repeatability, yet it is quite opinionated, has a learning curve, and adds quite a bit of complexity. For now, I thus decided to go with the more conservative plain shell environment, where I feel more comfortable.
> In what follows I am using BASH on Arch Linux in [Omarchy](https://omarchy.org/) distribution.
### Pre-requisites
You need to have `go` installed on your system. On Arch Linux I used `sudo pacman -S go` to install it.
### GO dependencies
Right after cloning, you should be able to run:
```bash
make status-go-deps
```
In addition, and this is something the original documentation does not mention, you will need `gotestsum` to conveniently run unit tests. We have to install it manually:
```bash
go install gotest.tools/gotestsum@latest
```
or, to use a specific version (`v1.13.0` was the most recent at the time of writing):
```bash
go install gotest.tools/gotestsum@v1.13.0
```
You can check it is available by running:
```bash
gotestsum --version
gotestsum version dev
```
The `dev` version comes from installing `gotestsum` with `@latest`. If you installed a concrete version, you will see:
```bash
gotestsum --version
gotestsum version v1.13.0
```
You may also need to manually install the Protobuf compiler, `protoc`. I followed the instructions from [Protocol Buffer Compiler Installation](https://protobuf.dev/installation/).
The following bash script (Arch Linux) can come in handy:
```bash
#!/usr/bin/env bash
set -euo pipefail
echo "installing go..."
sudo pacman -S --noconfirm --needed go
echo "installing go protoc compiler"
PB_REL="https://github.com/protocolbuffers/protobuf/releases"
VERSION="32.1"
FILE="protoc-${VERSION}-linux-x86_64.zip"
# 1. create a temp dir
TMP_DIR="$(mktemp -d)"
# ensure cleanup on exit
trap 'rm -rf "$TMP_DIR"' EXIT
echo "Created temp dir: $TMP_DIR"
# 2. download file into temp dir
curl -L -o "$TMP_DIR/$FILE" "$PB_REL/download/v$VERSION/$FILE"
# 3. unzip into ~/.local/share/go
mkdir -p "$HOME/.local/share/go"
unzip -o "$TMP_DIR/$FILE" -d "$HOME/.local/share/go"
# 4. cleanup handled automatically by trap
echo "protoc $VERSION installed into $HOME/.local/share/go"
```
After that make sure that `$HOME/.local/share/go/bin` is in your path, and you should get:
```bash
protoc --version
libprotoc 32.1
```
The `protoc-gen-go` plugin is also required to generate Go code from `.proto` files.
Install it with:
```bash
go install google.golang.org/protobuf/cmd/protoc-gen-go@v1.34.1
```
Make sure `$(go env GOPATH)/bin` is in your `$PATH` so protoc can find the plugin.
Verify the installation:
```bash
which protoc-gen-go
protoc-gen-go --version
# Should output: protoc-gen-go v1.34.1
```
### notes on regenerating mocks
In order to regenerate mocks you will need `mockgen`. You would expect `make status-go-deps` to install it, but it does not...
You can install it with:
```bash
go install go.uber.org/mock/mockgen
```
> Also make sure you have `$(go env GOPATH)/bin` in your `PATH`. If you do not,
add something like `export PATH="$PATH:$(go env GOPATH)/bin"`
to your `~/.bashrc` (adjusted to your shell and OS version).
This should be part of your standard Go installation.
If everything works well, you should see something like:
```bash
which mockgen && mockgen -version
/home/<your-user-name>/go/bin/mockgen
v0.6.0
```
If everything seems to be under control, we can now proceed with the actual generation, e.g. for the mocks relevant to `protocol/communities`:
```bash
go generate ./protocol/communities
```
The related mocks, if any, should be (re)generated.
> Notice that `mock` folders are git ignored in status-go.
#### go-zerokit-rln-x86_64 vendoring problem
If you try to run the tests for the first time, you may face the following error:
```bash
gotestsum --packages="./protocol/communities" -f standard-verbose --rerun-fails -- -v -run "TestCodexClientTestSuite" -count 1
FAIL github.com/status-im/status-go/protocol/communities [setup failed]
=== Failed
=== FAIL: protocol/communities (0.00s)
FAIL github.com/status-im/status-go/protocol/communities [setup failed]
=== Errors
vendor/github.com/waku-org/go-zerokit-rln/rln/link/x86_64.go:8:8: cannot find module providing package github.com/waku-org/go-zerokit-rln-x86_64/rln: import lookup disabled by -mod=vendor
(Go version in go.mod is at least 1.14 and vendor directory exists.)
DONE 0 tests, 1 failure, 1 error in 0.000s
ERROR rerun aborted because previous run had errors
```
The problem can be outlined as follows:
1. **The package *IS* declared in dependencies**:
- `go.mod` has: `github.com/waku-org/go-zerokit-rln-x86_64 v0.0.0-20230916171518-2a77c3734dd1 // indirect`
- `modules.txt` lists it as *vendored*
2. **BUT it's excluded from git**:
- `.gitignore` has: `vendor/github.com/waku-org/go-zerokit-rln-x86_64/`
- as a result, the actual vendor directory is **missing** from your file system
3. **Why it's excluded**: these seem to be platform-specific native libraries (RLN, Rate Limiting Nullifier) with large binary/compiled components. The project excludes them from version control.
4. **The build tag restriction**: The file x86_64.go has build tags:
`//go:build (linux || windows) && amd64 && !android`
So it only compiles on x86_64 Linux/Windows (and not on Android), but when it does compile, it needs the vendored package.
#### The Solution
We need to vendor the missing dependencies:
```bash
# This will download the missing vendored dependencies
go mod vendor
```
This will populate the `vendor/github.com/waku-org/go-zerokit-rln-x86_64/` directory with the necessary files, even though they're gitignored.
### Building backend and the libs
Just to check if everything is set up correctly, let's build `status-backend` (a wrapper over status-go that provides a web API, handy for testing), and then the status-go static and shared libraries:
```bash
make status-backend
```
It will be available as `./build/bin/status-backend`:
```bash
./build/bin/status-backend -h
Usage of ./build/bin/status-backend:
-address string
host:port to listen (default "127.0.0.1:0")
-pprof
enable pprof
-testify.m string
regular expression to select tests of the testify suite to run
```
Now, the libs. Static lib:
```bash
make statusgo-library
# ...
Building static library...
CGO_ENABLED=1 CGO_CFLAGS="-I/home/mc2/code/status-im/status-go/libs -I/home/mc2/code/status-im/nim-sds/library -I//include -I//include/darwin -I/home/mc2/code/status-im/nim-sds/library -I/home/mc2/code/status-im/status-go/libs" CGO_LDFLAGS="-Wl,-rpath,/home/mc2/code/status-im/status-go/libs -L/home/mc2/code/status-im/status-go/libs -lcodex -Wl,-rpath,/home/mc2/code/status-im/status-go/libs -L/home/mc2/code/status-im/nim-sds/build -lsds -L/home/mc2/code/status-im/nim-sds/build -lsds -L/home/mc2/code/status-im/status-go/libs -lcodex" LD_LIBRARY_PATH=/home/mc2/code/status-im/status-go/libs:/home/mc2/code/status-im/nim-sds/build:$LD_LIBRARY_PATH go build \
-tags 'gowaku_no_rln' \
-ldflags="" \
-buildmode=c-archive \
-o build/bin/libstatus.a \
"build/bin/statusgo-lib/main.go"
Static library built: build/bin/libstatus.a
```
and shared lib:
```bash
make statusgo-shared-library
# ...
Building shared library...
Tags: gowaku_no_rln
CGO_LDFLAGS="-L/home/mc2/code/status-im/status-go/libs -lcodex -Wl,-rpath,/home/mc2/code/status-im/status-go/libs -L/home/mc2/code/status-im/nim-sds/build -lsds -L/home/mc2/code/status-im/nim-sds/build -lsds -L/home/mc2/code/status-im/status-go/libs -lcodex" CGO_CFLAGS="-I/home/mc2/code/status-im/status-go/libs -I/home/mc2/code/status-im/nim-sds/library -I//include -I//include/darwin -I/home/mc2/code/status-im/nim-sds/library -I/home/mc2/code/status-im/status-go/libs" \
go build \
-tags 'gowaku_no_rln' \
-ldflags="" \
-buildmode=c-shared \
-o build/bin/libstatus.so \
./build/bin/statusgo-lib
cd build/bin && \
ls -lah . && \
mv ./libstatus.so ./libstatus.so.0 && \
ln -s ./libstatus.so.0 ./libstatus.so
total 495M
drwxr-xr-x 1 mc2 mc2 120 Dec 12 18:50 .
drwxr-xr-x 1 mc2 mc2 6 Dec 12 18:48 ..
-rw-r--r-- 1 mc2 mc2 261M Dec 12 18:49 libstatus.a
-rw-r--r-- 1 mc2 mc2 7.4K Dec 12 18:50 libstatus.h
-rw-r--r-- 1 mc2 mc2 122M Dec 12 18:50 libstatus.so
-rwxr-xr-x 1 mc2 mc2 113M Dec 12 18:48 status-backend
drwxr-xr-x 1 mc2 mc2 14 Dec 12 18:48 statusgo-lib
Shared library built:
-rw-r--r-- 1 mc2 mc2 272836554 Dec 12 18:49 build/bin/libstatus.a
-rw-r--r-- 1 mc2 mc2 7482 Dec 12 18:50 build/bin/libstatus.h
lrwxrwxrwx 1 mc2 mc2 16 Dec 12 18:50 build/bin/libstatus.so -> ./libstatus.so.0
-rw-r--r-- 1 mc2 mc2 127116728 Dec 12 18:50 build/bin/libstatus.so.0
```
### Running unit test
The obvious
```bash
make test
```
It currently runs all the developer tests except for the `protocol` tests.
`make test` uses `_assets/scripts/run_unit_tests.sh` under the hood, where we find the following fragment:
```bash
if [[ $HAS_PROTOCOL_PACKAGE == 'false' ]]; then
# This is the default single-line flow for testing all packages
# The `else` branch is temporary and will be removed once the `protocol` package runtime is optimized.
run_test_for_packages "${UNIT_TEST_PACKAGES}" "0" "${UNIT_TEST_COUNT}" "${DEFAULT_TIMEOUT_MINUTES}" "All packages"
else
# Spawn a process to test all packages except `protocol`
UNIT_TEST_PACKAGES_FILTERED=$(echo "${UNIT_TEST_PACKAGES}" | tr ' ' '\n' | grep -v '/protocol$' | tr '\n' ' ')
run_test_for_packages "${UNIT_TEST_PACKAGES_FILTERED}" "0" "${UNIT_TEST_COUNT}" "${DEFAULT_TIMEOUT_MINUTES}" "All packages except 'protocol'" &
bg_pids+=("$!")
# Spawn separate processes to run `protocol` package
for ((i=1; i<=UNIT_TEST_COUNT; i++)); do
run_test_for_packages github.com/status-im/status-go/protocol "${i}" 1 "${PROTOCOL_TIMEOUT_MINUTES}" "Only 'protocol' package" &
bg_pids+=("$!")
done
fi
```
From the comments, we see that the `else` branch is currently used when running the tests. It splits running tests into two parts:
1. All the tests except those for the `status-go/protocol` module; these are faster, which is reflected in the shorter timeout: `DEFAULT_TIMEOUT_MINUTES=5`.
2. The `protocol` tests; there are many of them (over 900) and they can take longer to run (`PROTOCOL_TIMEOUT_MINUTES=45`). By default `UNIT_TEST_COUNT=1`, which means the protocol tests are run only once.
> The timeout variables `DEFAULT_TIMEOUT_MINUTES` and `PROTOCOL_TIMEOUT_MINUTES` are used as the value of the `-timeout` option passed down to `go test`. It limits the total time the tests are allowed to take.
We may observe more or fewer failures, depending on the run, but one test will consistently fail, even if we modify the script to only run non-protocol tests: `TestBasicWakuV2`:
```bash
=== RUN TestBasicWakuV2
waku_test.go:172:
Error Trace: /home/mc2/code/status-im/status-go/messaging/waku/waku_test.go:172
Error: Received unexpected error:
Get "http://localhost:8645/debug/v1/info": dial tcp [::1]:8645: connect: connection refused
Test: TestBasicWakuV2
--- FAIL: TestBasicWakuV2 (0.00s)
```
Let's try to run this test without using the `run_unit_tests.sh` script to confirm the problem:
```bash
gotestsum --packages="./messaging/waku" -f testname --rerun-fails -- \
-count 1 -timeout "5m" \
-tags "gowaku_no_rln gowaku_skip_migrations" \
-run TestBasicWakuV2
```
and we will get the same error.
If we look into the source code of `messaging/waku/waku_test.go`, where `TestBasicWakuV2` is defined, and scroll down a bit (!!), we will find the following comment:
```go
// In order to run these tests, you must run an nwaku node
//
// Using Docker:
//
// IP_ADDRESS=$(hostname -I | awk '{print $1}');
// docker run \
// -p 60000:60000/tcp -p 9000:9000/udp -p 8645:8645/tcp harbor.status.im/wakuorg/nwaku:v0.36.0 \
// --tcp-port=60000 --discv5-discovery=true --cluster-id=16 --pubsub-topic=/waku/2/rs/16/32 --pubsub-topic=/waku/2/rs/16/64 \
// --nat=extip:${IP_ADDRESS} --discv5-discovery --discv5-udp-port=9000 --rest-address=0.0.0.0 --store
```
First, on Arch Linux it may happen that the `-I` option of `hostname` is not available. As an alternative you can use:
```bash
IP_ADDRESS=$(ip -o -4 addr show up primary scope global | awk '{print $4}' | cut -d/ -f1 | head -n1);
```
Unfortunately, if we try to start waku node as recommended, we will get an error:
```bash
docker run \
-p 60000:60000/tcp -p 9000:9000/udp -p 8645:8645/tcp \
harbor.status.im/wakuorg/nwaku:v0.36.0 \
--tcp-port=60000 --discv5-discovery=true --cluster-id=16 \
--pubsub-topic=/waku/2/rs/16/32 --pubsub-topic=/waku/2/rs/16/64 \
--nat=extip:${IP_ADDRESS} --discv5-discovery --discv5-udp-port=9000 \
--rest-address=0.0.0.0 --store
Unrecognized option 'pubsub-topic'
Try wakunode2 --help for more information.
```
Using a quick Copilot fix, it seems that we need to use `--shard` instead of `--pubsub-topic`, and the correct command to start a waku node (with a store) is:
```bash
docker run \
-p 60000:60000/tcp -p 9000:9000/udp -p 8645:8645/tcp \
harbor.status.im/wakuorg/nwaku:v0.36.0 \
--tcp-port=60000 --discv5-discovery=true --cluster-id=16 \
--shard=32 --shard=64 \
--nat=extip:${IP_ADDRESS} --discv5-udp-port=9000 \
--rest-address=0.0.0.0 --store
```
Now when we run `TestBasicWakuV2`, it passes:
```bash
gotestsum --packages="./messaging/waku" -f testname --rerun-fails -- -count 1 -timeout "5m" -tags "gowaku_no_rln gowaku_skip_migrations" -run TestBasicWakuV2
PASS messaging/waku.TestBasicWakuV2 (7.24s)
PASS messaging/waku
```
We can also run all the tests relevant to the `messaging/waku` package, and they should all pass:
```bash
gotestsum --packages="./messaging/waku" -f testname --rerun-fails -- -count 1 -timeout "5m" -tags "gowaku_no_rln gowaku_skip_migrations"
PASS messaging/waku.TestMultipleTopicCopyInNewMessageFilter (0.00s)
PASS messaging/waku.TestHandlePeerAddress/valid_enrtree (0.00s)
PASS messaging/waku.TestHandlePeerAddress/invalid_multiaddr (0.00s)
PASS messaging/waku.TestHandlePeerAddress/valid_multiaddr (0.00s)
PASS messaging/waku.TestHandlePeerAddress/valid_enr (0.00s)
PASS messaging/waku.TestHandlePeerAddress/unknown_address_format (0.00s)
PASS messaging/waku.TestHandlePeerAddress (0.00s)
PASS messaging/waku.TestDiscoveryV5 (4.20s)
2025-09-17T04:02:04.059+0200 DEBUG p2p-config config/log.go:21 [Fx] HOOK OnStop github.com/libp2p/go-libp2p/p2p/transport/quicreuse.(*ConnManager).Close-fm() executing (caller: github.com/libp2p/go-libp2p/config.(*Config).addTransports.func9)
PASS messaging/waku.TestRestartDiscoveryV5 (10.86s)
2025-09-17T04:02:14.914+0200 DEBUG p2p-config config/log.go:21 [Fx] HOOK OnStop github.com/libp2p/go-libp2p/p2p/transport/quicreuse.(*ConnManager).Close-fm() called by github.com/libp2p/go-libp2p/config.(*Config).addTransports.func9 ran successfully in 12.553µs
PASS messaging/waku.TestRelayPeers (0.01s)
PASS messaging/waku.TestBasicWakuV2 (7.46s)
PASS messaging/waku.TestPeerExchange (11.06s)
SKIP messaging/waku.TestWakuV2Filter (0.00s)
PASS messaging/waku.TestOnlineChecker (0.11s)
PASS messaging/waku
=== Skipped
=== SKIP: messaging/waku TestWakuV2Filter (0.00s)
waku_test.go:392: flaky test
DONE 14 tests, 1 skipped in 33.720s
```
Now, we should also be able to successfully run all the non-protocol tests using `make test`:
```bash
make test
...
DONE 1881 tests, 25 skipped in 93.841s
Gathering test coverage results: output: coverage_merged.out, input: ./test_0.coverage.out
Generating HTML coverage report
Testing finished
```
Following the practice from [[Running functional tests in status-go]], here too we have a handy script for starting a waku node suitable for unit testing:
```bash
_assets/scripts/run_waku.sh
Starting waku node...
0795dbb8e786cf9047c8162fcc1cbbb11cd22926a5091f4e07df0e1b259ef62d
Waku node started.
Press any button to exit...q
Removing containers...
0795dbb8e786
DONE!
```
Now, for the status-go-codex integration, most of the relevant tests are in the `TestManagerSuite`:
```bash
gotestsum --packages="./protocol/communities" -f testname --rerun-fails -- -count 1 -timeout "5m" -tags "gowaku_no_rln gowaku_skip_migrations" -run TestManagerSuite
PASS protocol/communities.TestManagerSuite/TestCheckAllChannelsPermissions (0.51s)
PASS protocol/communities.TestManagerSuite/TestCheckAllChannelsPermissions_EmptyPermissions (0.52s)
PASS protocol/communities.TestManagerSuite/TestCheckChannelPermissions_NoPermissions (0.53s)
PASS protocol/communities.TestManagerSuite/TestCheckChannelPermissions_ViewAndPostPermissions (0.51s)
PASS protocol/communities.TestManagerSuite/TestCheckChannelPermissions_ViewAndPostPermissionsCombination (0.53s)
PASS protocol/communities.TestManagerSuite/TestCheckChannelPermissions_ViewAndPostPermissionsCombination2 (0.50s)
PASS protocol/communities.TestManagerSuite/TestCheckChannelPermissions_ViewOnlyPermissions (0.52s)
PASS protocol/communities.TestManagerSuite/TestCommunityIDIsHydratedWhenMarshaling (0.25s)
PASS protocol/communities.TestManagerSuite/TestCommunityQueue (0.53s)
PASS protocol/communities.TestManagerSuite/TestCommunityQueueMultipleDifferentSigners (0.52s)
PASS protocol/communities.TestManagerSuite/TestCommunityQueueMultipleDifferentSignersIgnoreIfNotReturned (0.55s)
PASS protocol/communities.TestManagerSuite/TestCreateCommunity (0.27s)
PASS protocol/communities.TestManagerSuite/TestCreateCommunity_WithBanner (0.25s)
PASS protocol/communities.TestManagerSuite/TestCreateHistoryArchiveTorrentFromMessages (0.27s)
PASS protocol/communities.TestManagerSuite/TestCreateHistoryArchiveTorrentFromMessages_ShouldAppendArchives (0.25s)
PASS protocol/communities.TestManagerSuite/TestCreateHistoryArchiveTorrentFromMessages_ShouldCreateMultipleArchives (0.24s)
PASS protocol/communities.TestManagerSuite/TestCreateHistoryArchiveTorrent_ShouldAppendArchives (0.26s)
PASS protocol/communities.TestManagerSuite/TestCreateHistoryArchiveTorrent_ShouldCreateArchive (0.25s)
PASS protocol/communities.TestManagerSuite/TestCreateHistoryArchiveTorrent_ShouldCreateMultipleArchives (0.26s)
PASS protocol/communities.TestManagerSuite/TestCreateHistoryArchiveTorrent_WithoutMessages (0.26s)
PASS protocol/communities.TestManagerSuite/TestDetermineChannelsForHRKeysRequest (0.27s)
PASS protocol/communities.TestManagerSuite/TestEditCommunity (0.25s)
PASS protocol/communities.TestManagerSuite/TestFillMissingCommunityTokens (0.26s)
PASS protocol/communities.TestManagerSuite/TestGetControlledCommunitiesChatIDs (0.26s)
PASS protocol/communities.TestManagerSuite/TestRetrieveCollectibles (0.50s)
PASS protocol/communities.TestManagerSuite/TestRetrieveTokens (0.53s)
PASS protocol/communities.TestManagerSuite/TestSeedHistoryArchiveTorrent (0.25s)
PASS protocol/communities.TestManagerSuite/TestStartAndStopTorrentClient (0.27s)
PASS protocol/communities.TestManagerSuite/TestStartHistoryArchiveTasksInterval (10.25s)
PASS protocol/communities.TestManagerSuite/TestStartTorrentClient_DelayedUntilOnline (0.26s)
PASS protocol/communities.TestManagerSuite/TestStopHistoryArchiveTasksIntervals (2.25s)
PASS protocol/communities.TestManagerSuite/TestStopTorrentClient_ShouldStopHistoryArchiveTasks (2.24s)
PASS protocol/communities.TestManagerSuite/TestUnseedHistoryArchiveTorrent (0.26s)
PASS protocol/communities.TestManagerSuite/Test_GetPermissionedBalances (0.50s)
PASS protocol/communities.TestManagerSuite/Test_calculatePermissionedBalances (0.50s)
PASS protocol/communities.TestManagerSuite (26.65s)
PASS protocol/communities
DONE 36 tests in 26.685s
```
To run a single test, you can do something like the following:
```bash
gotestsum --packages="./protocol/communities" -f testname --rerun-fails -- -count 1 -timeout "5m" -tags "gowaku_no_rln gowaku_skip_migrations" -run TestManagerSuite/TestCreateHistoryArchiveTorrent_ShouldCreateArchive
PASS protocol/communities.TestManagerSuite/TestCreateHistoryArchiveTorrent_ShouldCreateArchive (0.27s)
PASS protocol/communities.TestManagerSuite (0.27s)
PASS protocol/communities
DONE 2 tests in 0.307s
```
> The *suite* counts as one test.
### Running "protocol" tests
When running all the tests (including the `protocol` tests) using `make test`, most of the tests pass, but not all. As already indicated above, in order to get better feedback and to track progress, I prefer to run them using the `gotestsum` command:
```bash
gotestsum --packages="./protocol" -f testname --rerun-fails -- -count 1 -timeout "45m" -tags "gowaku_no_rln gowaku_skip_migrations" | tee "test_1.log"
DONE 964 tests, 5 skipped in 1159.570s
```
Redirecting output to a file using `tee` is practically a necessity, as the `protocol` tests may generate a lot of log messages.
Here, just for the record, is an example failing session:
```bash
gotestsum --packages="./protocol" -f testname --rerun-fails -- -count 1 -timeout "45m" -tags "gowaku_no_rln gowaku_skip_migrations" | tee "test_1.log"
DONE 3 runs, 968 tests, 5 skipped, 6 failures in 1194.882s
```
The reason I recorded it here is that the output can be pretty misleading when you see it for the first time. The `6 failures` actually correspond to $1$ (one) failing test:
```bash
TestMessengerCommunitiesTokenPermissionsSuite/TestAnnouncementsChannelPermissions
```
This one failing test makes the whole suite `TestMessengerCommunitiesTokenPermissionsSuite` fail, which is also counted as a failure, and since we have `3 runs`, we see $3 \times 2 = 6$ failures in the end.
It is also important to mention that the `--rerun-fails` option defaults to $2$: after the first pass, any failing tests are rerun, and if some still fail after the second run, they are rerun once more, giving $3$ runs in total. Notice as well that the `-count 1` option we pass to the underlying `go test` ensures that each test execution is "fresh" and does not use cached results from previous runs.
On each rerun I am also observing the following error message:
```bash
failed to IncreaseLevel: invalid increase level, as level "fatal" is allowed by increased level, but not by existing core
```
This simply means that the logger (it seems to be [Zap logger](https://github.com/uber-go/zap)) tries to increase the log level before re-running failing tests, but apparently we are already at the max log level - thus the warning message.
About this one failing test: `TestAnnouncementsChannelPermissions`.
We see the following errors reported:
```
2025-09-17T07:01:42.088+0200 ERROR 6decc309-0976-4da3-ada2-d20566a30b02 node/node.go:386 error while mapping sync state {"mvds": {"error": "sql: database is closed"}}
ERROR 758f6b9d-fa86-44fd-851a-d727d8a57732 protocol/messenger.go:1802 can't post on chat {"chatID": "0x03aa871de09ab76e4fba610c520735125f36795fb2ca6cb03f10520da449fcd0c5ab08e1bc-61bf-4ce6-82cb-e8c40f37abd2", "chatName": "general", "messageType": "CHAT_MESSAGE"}
...
ERROR 758f6b9d-fa86-44fd-851a-d727d8a57732 protocol/messenger.go:1802 can't post on chat {"chatID": "0x03aa871de09ab76e4fba610c520735125f36795fb2ca6cb03f10520da449fcd0c5ab08e1bc-61bf-4ce6-82cb-e8c40f37abd2", "chatName": "general", "messageType": "CHAT_MESSAGE"}
...
communities_messenger_token_permissions_test.go:1169:
Error Trace: /home/mc2/code/status-im/status-go/protocol/communities_messenger_token_permissions_test.go:1169
Error: Received unexpected error:
no messages
Test: TestMessengerCommunitiesTokenPermissionsSuite/TestAnnouncementsChannelPermissions
communities_messenger_token_permissions_test.go:1169:
Error Trace: /home/mc2/code/status-im/status-go/protocol/communities_messenger_token_permissions_test.go:1169
Error: Received unexpected error:
no messages
Test: TestMessengerCommunitiesTokenPermissionsSuite/TestAnnouncementsChannelPermissions
# still some log messages
--- FAIL: TestMessengerCommunitiesTokenPermissionsSuite/TestAnnouncementsChannelPermissions (11.44s)
```
When running this single test in isolation, it is a bit flaky: sometimes it passes right away, sometimes only on the second run.
```bash
gotestsum --packages="./protocol" -f testname --rerun-fails -- -count 1 -timeout "45m" -tags "gowaku_no_rln gowaku_skip_migrations" -run TestMessengerCommunitiesTokenPermissionsSuite/TestAnnouncementsChannelPermissions
PASS protocol.TestMessengerCommunitiesTokenPermissionsSuite/TestAnnouncementsChannelPermissions (1.42s)
PASS protocol.TestMessengerCommunitiesTokenPermissionsSuite (1.42s)
PASS protocol
DONE 2 tests in 1.453s
gotestsum --packages="./protocol" -f testname --rerun-fails -- -count 1 -timeout "45m" -tags "gowaku_no_rln gowaku_skip_migrations" -run TestMessengerCommunitiesTokenPermissionsSuite/TestAnnouncementsChannelPermissions
...
DONE 3 runs, 6 tests, 4 failures in 27.876s
```
Thus, having been able to run all the tests successfully a number of times, we can reasonably conclude that our setup is correct.

Here I am documenting how to run the functional tests for status-go. They are not the primary focus of the status-go-codex integration, but they let us understand the dev env and the build system better in general. I first present step-by-step instructions to run everything manually from the command line, but because there are quite a lot of details, you may skip ahead to [[#Using development scripts to run the functional tests]] and study the scripts if you want to learn the details.
> I performed all those tests on Arch Linux ([Omarchy](https://omarchy.org/) distribution).
### Step by step instructions to run functional tests manually without a script
First, some helpers regarding naming:
```bash
# identifier
$ git rev-parse --short HEAD
c827909c8
# project_name
$ echo "status-go-func-tests-$(git rev-parse --short HEAD)"
status-go-func-tests-c827909c8
# image_name
$ echo "statusgo-$(git rev-parse --short HEAD)"
statusgo-c827909c8
```
If you do not have a Python virtual environment in place, create one with:
```bash
python3 -m venv "./tests-functional/.venv"
```
and then always make sure it is activated:
```bash
source "./tests-functional/.venv/bin/activate"
```
To check if you are in a venv, you can use:
```bash
$ echo $VIRTUAL_ENV
/home/mc2/code/status-im/status-go/tests-functional/.venv
```
Then make sure the dependencies are in place. You do not need to do this if you have done it already and have not introduced any new Python dependencies, but it does no harm and runs instantly:
```bash
pip install --upgrade pip && pip install -r "./tests-functional/requirements.txt"
```
Now, if you have not done so before, make sure you have built the status-go Docker image. From the status-go repo root folder:
```bash
docker build . \
--build-arg "build_tags='gowaku_no_rln'" \
--build-arg "enable_go_cache=false" \
--tag "statusgo-$(git rev-parse --short HEAD)"
```
This image will be used during the tests.
I normally like to make sure I have no old related containers and networks running, before starting a new session:
```bash
docker ps -a --filter "name=status-go-func-tests-$(git rev-parse --short HEAD)" -q | xargs -r docker rm -f && \
(docker network rm "status-go-func-tests-$(git rev-parse --short HEAD)_default" 2>/dev/null || true)
```
Above, I make sure to ignore the error when the network does not exist.
Now, we also need to start foundry, anvil, and the waku node. We do not need to build a custom image for that. Just run docker compose as follows:
```bash
docker compose -p "status-go-func-tests-$(git rev-parse --short HEAD)" \
-f "./tests-functional/docker-compose.anvil.yml" \
-f "./tests-functional/docker-compose.waku.yml" \
up -d --build --remove-orphans
```
or as a one liner:
```bash
docker compose -p "status-go-func-tests-$(git rev-parse --short HEAD)" -f tests-functional/docker-compose.anvil.yml -f tests-functional/docker-compose.waku.yml up -d --build --remove-orphans
```
You will have the following containers running:
![[Pasted image 20250911204521.png]]
Now, with everything in place, we can run the tests:
```bash
pytest --reruns 2 -m rpc -c ./tests-functional/pytest.ini -n 12 \
--dist loadgroup \
--log-cli-level=INFO \
--logs-dir=./tests-functional/logs \
--docker_project_name="status-go-func-tests-$(git rev-parse --short HEAD)" \
--docker-image="statusgo-$(git rev-parse --short HEAD)"
```
or if you prefer a one liner:
```bash
pytest --reruns 2 -m rpc -c ./tests-functional/pytest.ini -n 12 --dist=loadgroup --log-cli-level=INFO --logs-dir=./tests-functional/logs --docker_project_name="status-go-func-tests-$(git rev-parse --short HEAD)" --docker-image="statusgo-$(git rev-parse --short HEAD)"
```
After running the tests, make sure you clean up:
```bash
$ docker compose -p "status-go-func-tests-$(git rev-parse --short HEAD)" -f tests-functional/docker-compose.anvil.yml -f tests-functional/docker-compose.waku.yml stop
$ docker compose -p "status-go-func-tests-$(git rev-parse --short HEAD)" -f tests-functional/docker-compose.anvil.yml -f tests-functional/docker-compose.waku.yml down
```
and then make sure nothing is left (the above command will not clean everything if there are some other containers alive that use the resources):
```bash
docker ps -a --filter "name=status-go-func-tests-$(git rev-parse --short HEAD)" -q | xargs -r docker rm -f && \
(docker network rm "status-go-func-tests-$(git rev-parse --short HEAD)_default" 2>/dev/null || true)
```
To run an individual test, we can use the same command with `-k <test-name>` appended.
```bash
pytest --reruns 2 -m rpc -c ./tests-functional/pytest.ini -n 12 \
--dist=loadgroup \
--log-cli-level=INFO \
--logs-dir=./tests-functional/logs \
--docker_project_name="status-go-func-tests-$(git rev-parse --short HEAD)" \
--docker-image="statusgo-$(git rev-parse --short HEAD)" \
-k test_logging
```
Notice that if you have some simple test that does not need the status-go container, you can just run something like `pytest -k test_logging`.
### Using development scripts to run the functional tests
To keep things easy and clean, it is best to use a script. For development I created a simplified, more robust version of the original script `_assets/scripts/run_functional_tests.sh`. First, to make the tests run faster, I extracted the building of the status-go image into `_assets/scripts/build_status_go_docker.sh`:
```bash
$ _assets/scripts/build_status_go_docker.sh
```
> The scripts should be run from the top level directory so that the right docker files are used.
Then I removed the coverage data (not so important during development) and included explicit waiting for the docker services before running the tests. This way, it is easier to see which tests actually need a "rerun" and which just failed because the docker services were not ready yet (which is very much observable with the original script). The name of the new script is `_assets/scripts/run_functional_tests_dev.sh`. The new script takes one optional parameter, which can be either:
- a module name
- a `Test` class name
- a `Test` function name
- a *parameterized* test name.
You can easily find those names by looking at a typical Python test file:
```python
# File: test_wakuext_messages_transactions.py ← Module Name
import pytest

class TestTransactionsChatMessages:  # ← Test Class Name
    @pytest.mark.parametrize("wakuV2LightClient", [True, False])
    def test_accept_request_address_for_transaction(self, wakuV2LightClient):  # ← Test Function Name
        # Test logic here
        if wakuV2LightClient:
            pass  # Test with wakuV2LightClient_True ← Parameterized Test
        else:
            pass  # Test with wakuV2LightClient_False ← Parameterized Test
```
or in a more hierarchical view:
```
Module: test_wakuext_messages_transactions.py
└── Class: TestTransactionsChatMessages
    ├── Function: test_accept_request_address_for_transaction
    │   ├── [wakuV2LightClient_True]
    │   └── [wakuV2LightClient_False]
    └── Function: test_other_function
        ├── [param1]
        └── [param2]
```
You can target each of those tests by passing the name as an argument:
> I am skipping the prefix in the examples below. The scripts should be run from the top level directory, e.g.: `./_assets/scripts/run_functional_tests_dev.sh ...`
```bash
# Run all tests in the module
run_functional_tests_dev.sh "test_wakuext_messages_transactions"
# Run all tests in the class
run_functional_tests_dev.sh "TestTransactionsChatMessages"
# Run all variants of this test function
run_functional_tests_dev.sh "test_accept_request_address_for_transaction"
# Run only the False parameter variant
run_functional_tests_dev.sh "test_accept_request_address_for_transaction and wakuV2LightClient_False"
# Run only the True parameter variant
run_functional_tests_dev.sh "test_accept_request_address_for_transaction and wakuV2LightClient_True"
```
When the argument is provided, `run_functional_tests_dev.sh` will first tell you which tests will be run and let you decide whether to continue:
```bash
Discovering tests to be run...
Found 14 tests matching: test_wakuext_messages_transactions
Tests to execute:
1) test_request_transaction[wakuV2LightClient_False]
2) test_request_transaction[wakuV2LightClient_True]
3) test_decline_request_transaction[wakuV2LightClient_False]
4) test_decline_request_transaction[wakuV2LightClient_True]
5) test_accept_request_transaction[wakuV2LightClient_False]
6) test_accept_request_transaction[wakuV2LightClient_True]
7) test_request_address_for_transaction[wakuV2LightClient_False]
8) test_request_address_for_transaction[wakuV2LightClient_True]
9) test_decline_request_address_for_transaction[wakuV2LightClient_False]
10) test_decline_request_address_for_transaction[wakuV2LightClient_True]
11) test_accept_request_address_for_transaction[wakuV2LightClient_False]
12) test_accept_request_address_for_transaction[wakuV2LightClient_True]
13) test_send_transaction[wakuV2LightClient_False]
14) test_send_transaction[wakuV2LightClient_True]
Continue with execution? (y/n):
```
When running the script without any arguments, you will be warned that all tests will be run, shown the expected number of tests, and again given an option to stop:
```bash
./_assets/scripts/run_functional_tests_dev.sh
Using existing virtual environment
Installing dependencies
Discovering tests to be run...
No test pattern provided. This will run all 272 tests!
Continue with execution? (y/n):
```

A summary of running different test scenarios with `gotestsum` based on an example
## gotestsum Command Patterns
### 1. **Only Selected Test**
```bash
# Run a specific test function
gotestsum --packages="./communities" -f testname --rerun-fails -- -run "TestCodexArchiveDownloader_BasicSingleArchive$" -count 1
# Run specific testify test
gotestsum --packages="./communities" -f testname --rerun-fails -- -run "TestCodexArchiveDownloader_BasicSingleArchive_Testify$" -count 1
# Run multiple archives testify test
gotestsum --packages="./communities" -f testname --rerun-fails -- -run "TestCodexArchiveDownloader_MultipleArchives_Testify$" -count 1
```
### 2. **Only Selected Test Suite**
```bash
# Run the entire testify suite (all methods in the suite)
gotestsum --packages="./communities" -f testname --rerun-fails -- -run "TestCodexArchiveDownloaderSuite" -count 1
# Run a specific test method within the suite
gotestsum --packages="./communities" -f testname --rerun-fails -- -run "TestCodexArchiveDownloaderSuite/TestBasicSingleArchive" -count 1
```
### 3. **All Tests for Given Package**
```bash
# Run all tests in communities package
gotestsum --packages="./communities" -f testname --rerun-fails -- -count 1
# Alternative syntax (same result)
gotestsum --packages="./communities" -f testname --rerun-fails
# Run all tests with verbose output
gotestsum --packages="./communities" -f testname --rerun-fails -- -v -count 1
```
### 4. **Integration Tests**
```bash
# Run only integration tests (using build tags)
gotestsum --packages="./communities" -f testname --rerun-fails -- -tags=integration -run "Integration" -count 1
# Run integration tests with timeout (since they may take longer)
gotestsum --packages="./communities" -f testname --rerun-fails -- -tags=integration -run "Integration" -timeout=60s -count 1
# Run integration tests with specific environment variables
CODEX_HOST=localhost CODEX_API_PORT=8080 gotestsum --packages="./communities" -f testname --rerun-fails -- -tags=integration -run "Integration" -count 1
```
## Advanced gotestsum Patterns
### **Filter by Pattern (Multiple Tests)**
```bash
# Run all archive downloader tests (both standard and testify)
gotestsum --packages="./communities" -f testname --rerun-fails -- -run "ArchiveDownloader" -count 1
# Run only testify tests
gotestsum --packages="./communities" -f testname --rerun-fails -- -run "Testify" -count 1
# Run all CodexClient tests
gotestsum --packages="./communities" -f testname --rerun-fails -- -run "CodexClient" -count 1
```
### **Output Formats**
```bash
# Different output formats
gotestsum --packages="./communities" -f dots # Dots progress
gotestsum --packages="./communities" -f pkgname # Show package names
gotestsum --packages="./communities" -f testname # Show test names (recommended)
gotestsum --packages="./communities" -f standard-quiet # Minimal output
```
### **Multiple Packages**
```bash
# Run tests across multiple packages
gotestsum --packages="./communities,./cmd/upload,./cmd/download" -f testname --rerun-fails -- -count 1
# Run all packages recursively
gotestsum --packages="./..." -f testname --rerun-fails -- -count 1
```
### **Race Detection and Coverage**
```bash
# Run with race detection
gotestsum --packages="./communities" -f testname --rerun-fails -- -race -count 1
# Run with coverage
gotestsum --packages="./communities" -f testname --rerun-fails -- -cover -count 1
# Run with both race detection and coverage
gotestsum --packages="./communities" -f testname --rerun-fails -- -race -cover -count 1
```
## Key gotestsum Advantages
1. **Better Output Formatting**: Clean, colored output with test names
2. **Automatic Retry**: `--rerun-fails` reruns failed tests automatically
3. **JUnit XML Output**: `--junitfile=results.xml` for CI/CD integration
4. **JSON Output**: `--jsonfile=results.json` for parsing
5. **Watch Mode**: `--watch` to rerun tests on file changes
6. **Parallel Execution**: Better handling of parallel test output
## Complete Examples for Your Project
```bash
# Quick test of archive downloader functionality
gotestsum --packages="./communities" -f testname --rerun-fails -- -run "ArchiveDownloader" -count 1
# Full test suite with coverage
gotestsum --packages="./communities" -f testname --rerun-fails -- -cover -count 1
# Integration tests (when you have a Codex node running)
gotestsum --packages="./communities" -f testname --rerun-fails -- -tags=integration -timeout=60s -count 1
# Development workflow with watch mode
gotestsum --packages="./communities" -f testname --watch -- -count 1
```
The key difference from `go test` is that `gotestsum` provides much better visual feedback, automatic retry capabilities, and better CI/CD integration options while using the same underlying Go test infrastructure.
## Logs
In your tests you can include custom logs.
`go test -v` prints them without an issue, but for `gotestsum` to do the same you have to use `standard-verbose` format option:
```bash
gotestsum --packages="./communities" -f standard-verbose --rerun-fails -- -run "TestCodexArchiveDownloaderSuite" -v -count 1
=== RUN TestCodexArchiveDownloaderSuite
=== RUN TestCodexArchiveDownloaderSuite/TestBasicSingleArchive
codex_archive_downloader_testify_test.go:112: ✅ Basic single archive download test passed (testify version)
codex_archive_downloader_testify_test.go:113: - All mock expectations satisfied
codex_archive_downloader_testify_test.go:114: - Callback invoked: true
=== RUN TestCodexArchiveDownloaderSuite/TestMultipleArchives
codex_archive_downloader_testify_test.go:190: ✅ Multiple archives test passed (suite version)
codex_archive_downloader_testify_test.go:191: - Completed 3 out of 3 archives
--- PASS: TestCodexArchiveDownloaderSuite (0.20s)
--- PASS: TestCodexArchiveDownloaderSuite/TestBasicSingleArchive (0.10s)
--- PASS: TestCodexArchiveDownloaderSuite/TestMultipleArchives (0.10s)
PASS
ok go-codex-client/communities 0.205s
DONE 3 tests in 0.205s
```
Compare this output with `-f testname`:
```bash
gotestsum --packages="./communities" -f testname --rerun-fails -- -run "TestCodexArchiveDownloaderSuite" -count 1
PASS communities.TestCodexArchiveDownloaderSuite/TestBasicSingleArchive (0.10s)
PASS communities.TestCodexArchiveDownloaderSuite/TestMultipleArchives (0.10s)
PASS communities.TestCodexArchiveDownloaderSuite (0.20s)
PASS communities
DONE 3 tests in 0.205s
```
> Notice that the test suite itself is also counted as a test - this is why we see `DONE 3 tests` instead of `DONE 2 tests`.

---
related-to:
- "[[Team-NLBR Solution Proposal]]"
- "[[status-go publishing magnet links]]"
- "[[status-go processing magnet links]]"
- "[[status-go-codex integration - design notes]]"
- "[[Creating History Archives - InitHistoryArchiveTasks]]"
- "[[testing codex-status-go integration]]"
---
The `TorrentConfig` type provides the configuration for the BitTorrent-based History Archive management functionality:
```go
type TorrentConfig struct {
// Enabled set to true enables Community History Archive protocol
Enabled bool
	// Port number which the BitTorrent client will listen to for connections
Port int
// DataDir is the file system folder Status should use for message archive torrent data.
DataDir string
// TorrentDir is the file system folder Status should use for storing torrent metadata files.
TorrentDir string
}
```
The `DataDir` is where the History Archives for the controlled communities are stored. Then, `TorrentDir` is where the corresponding community torrent files are preserved.
In the `DataDir` folder, for each community there is a folder (named after community id) in which the history archive for that community is stored:
```bash
DataDir/
├── {communityID}/
│ ├── index # Archive index file (metadata)
│ └── data # Archive data file (actual messages)
└──
```
There is a one-to-one relationship between the community folder and the corresponding *torrent* file (BitTorrent metainfo):
```bash
TorrentDir/
├── {communityID}.torrent # Torrent metadata file
└──
```
## When Archives are created
The function central to archive creation is [InitHistoryArchiveTasks](https://github.com/status-im/status-go/blob/6322f22783585474803cfc8a6f0a914757d763b5/protocol/messenger_communities.go#L3783). This function is called in a number of situations, e.g. in [Messenger.Start](https://github.com/status-im/status-go/blob/6322f22783585474803cfc8a6f0a914757d763b5/protocol/messenger.go#L562), [Messenger.EditCommunity](https://github.com/status-im/status-go/blob/6322f22783585474803cfc8a6f0a914757d763b5/protocol/messenger_communities.go#L2807), [Messenger.ImportCommunity](https://github.com/status-im/status-go/blob/6322f22783585474803cfc8a6f0a914757d763b5/protocol/messenger_communities.go#L2865), [Messenger.EnableCommunityHistoryArchiveProtocol](https://github.com/status-im/status-go/blob/6322f22783585474803cfc8a6f0a914757d763b5/protocol/messenger_communities.go#L4136).
In `InitHistoryArchiveTasks`, for each community with the `HistoryArchiveSupportEnabled` option set to `true`:
- if the community torrent file already exists: call [ArchiveManager.SeedHistoryArchiveTorrent](https://github.com/status-im/status-go/blob/6322f22783585474803cfc8a6f0a914757d763b5/protocol/communities/manager_archive.go#L408) - see also [[What is Seeding (AI)]] and [[When are magnetlink messages sent]].
- determine if new archives need to be created based on the last archive end date and call [CreateAndSeedHistoryArchive](https://github.com/status-im/status-go/blob/6322f22783585474803cfc8a6f0a914757d763b5/protocol/communities/manager_archive.go#L314).
- start the periodic archive creation task by calling [StartHistoryArchiveTasksInterval](https://github.com/status-im/status-go/blob/6322f22783585474803cfc8a6f0a914757d763b5/protocol/communities/manager_archive.go#L323), which will in turn call [CreateAndSeedHistoryArchive](https://github.com/status-im/status-go/blob/6322f22783585474803cfc8a6f0a914757d763b5/protocol/communities/manager_archive.go#L314).
From `CreateAndSeedHistoryArchive`, via a chain of calls, we arrive at [createHistoryArchiveTorrent](https://github.com/status-im/status-go/blob/6322f22783585474803cfc8a6f0a914757d763b5/protocol/communities/manager_archive_file.go#L55), which is key to understanding how archives are created, and central to our proposal.
## Archive Creation
Archives are about archiving messages. Thus, we first need to find the messages relevant to the given (chat) community. This happens through *filters* and the `topics` connected to them. It is probably enough to say that there is some [[Unclarity about Waku filters and topics]], but for this discussion we can assume that the community messages are retrieved correctly. The `topics` are provided to `createHistoryArchiveTorrent`, which is where the archives are built.
Recall that for each community, status-go uses two files: `index` metadata file, and the `data`.
The `data` file stores *protobuf*-encoded *archives*. Each archive describes a period of time given by the `From` and `To` attributes (both Unix timestamps cast to `uint64`), which together with `ContentTopic` form the `Metadata` part of an archive:
```go
type WakuMessageArchiveMetadata struct {
From uint64
To uint64
ContentTopic [][]byte
}
```
> For clarity, we skip `protobuf`-specific fields and annotations.
In `createHistoryArchiveTorrent`, the messages are retrieved using [GetWakuMessagesByFilterTopic](https://github.com/status-im/status-go/blob/6322f22783585474803cfc8a6f0a914757d763b5/protocol/communities/persistence.go#L967). Then, the messages are bundled into chunks, each at most `30MB`, as given by the `maxArchiveSizeInBytes` constant. Messages bigger than `maxArchiveSizeInBytes` will not be archived.
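A minimal sketch of the chunking behaviour described above, under the stated `30MB` limit (the function and its signature are our own illustration, not the status-go implementation):

```go
package main

import "fmt"

// maxArchiveSizeInBytes mirrors the 30MB chunk limit described above.
const maxArchiveSizeInBytes = 30 * 1024 * 1024

// chunkMessages groups message sizes into chunks no larger than
// maxArchiveSizeInBytes; any single message exceeding the limit is
// skipped, i.e. not archived at all.
func chunkMessages(sizes []int) [][]int {
	var chunks [][]int
	var current []int
	currentSize := 0
	for _, s := range sizes {
		if s > maxArchiveSizeInBytes {
			continue // oversized messages are not archived
		}
		if currentSize+s > maxArchiveSizeInBytes && len(current) > 0 {
			chunks = append(chunks, current)
			current, currentSize = nil, 0
		}
		current = append(current, s)
		currentSize += s
	}
	if len(current) > 0 {
		chunks = append(chunks, current)
	}
	return chunks
}

func main() {
	mb := 1024 * 1024
	// The 40MB message is dropped; the rest fits into two chunks.
	chunks := chunkMessages([]int{10 * mb, 15 * mb, 12 * mb, 40 * mb, 5 * mb})
	fmt.Println(len(chunks)) // prints 2
}
```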
Now, for each message chunk, an instance of [WakuMessageArchive](https://github.com/status-im/status-go/blob/6322f22783585474803cfc8a6f0a914757d763b5/protocol/protobuf/communities.pb.go#L2152) (`wakuMessageArchive`) is created using [createWakuMessageArchive](https://github.com/status-im/status-go/blob/6322f22783585474803cfc8a6f0a914757d763b5/protocol/communities/manager_archive_file.go#L343). `WakuMessageArchive` has the following definition:
```go
type WakuMessageArchive struct {
Metadata *WakuMessageArchiveMetadata
Messages []*WakuMessage
}
```
> Again, we strip the `protobuf` attributes for clarity.
We see that the `Metadata` attribute of the `WakuMessageArchive` type is set to the `WakuMessageArchiveMetadata` defined above. For reference only, `WakuMessage` has the following definition:
```go
type WakuMessage struct {
Sig []byte
Timestamp uint64
Topic []byte
Payload []byte
Padding []byte
Hash []byte
ThirdPartyId string
}
```
The `WakuMessageArchive` is then encoded and encrypted, resulting in the final `encodedArchive`. The `rawSize := len(encodedArchive)` is then padded if necessary so that the archive size is aligned to the BitTorrent piece length (which is set to `100KiB`). The `encodedArchive` together with the padding information is then added to `encodedArchives` (`[]*EncodedArchiveData`). Finally, the resulting `size` and `padding` together with the current `offset` in the existing `data` file and the `Metadata` are used to create the corresponding archive index entry that later will be serialized to the `index` file:
```go
wakuMessageArchiveIndexMetadata := &protobuf.WakuMessageArchiveIndexMetadata{
Metadata: wakuMessageArchive.Metadata,
Offset: offset,
Size: uint64(size),
Padding: uint64(padding),
}
```
The archive index entry is encoded, and its hash is used as a key in the archive index map:
```go
wakuMessageArchiveIndexMetadataBytes, err := proto.Marshal(
wakuMessageArchiveIndexMetadata
)
archiveID := crypto.Keccak256Hash(wakuMessageArchiveIndexMetadataBytes).String()
wakuMessageArchiveIndex[archiveID] = wakuMessageArchiveIndexMetadata
```
> `wakuMessageArchiveIndex` is earlier initialized to contain existing archive index entries from the current `index` file. Here we are basically appending new archive meta to the archive index data structure.
We repeat the whole process for each message chunk in the given time period, adding more periods if needed (recall, each period is 7 days long).
After that we have a list of new archives (in `encodedArchives`) and a new archive index entries. They are ready to be encoded and serialized to the corresponding `data` (by appending) and `index` files.
Finally, the corresponding torrent file is (re)created, the `HistoryArchivesCreatedSignal` is emitted, and the last message archive end date is recorded in the persistence.
The diagram below shows the relationships between the datatypes described above:
![[team-nl-br-design-1.svg]]
And then in the following diagram we show how the `index` and `data` files are populated, the corresponding torrent file and the magnet link:
![[team-nl-br-design-2.svg]]
## Archive Distribution and Download
All the nodes that want to restore the message history first need to retrieve the `index` file; here, selective download of individual files from the torrent is used. Having the index, the nodes can find out which periods they need to retrieve. Using the `offset`, `size`, and `padding`, they use the BitTorrent library to selectively fetch only the torrent pieces they need. In our Codex integration proposal, we suggest taking advantage of Codex CIDs to formally decouple the archive index from the archive data.
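Assuming archives are padded to the `100KiB` piece length mentioned earlier, the piece selection can be sketched as follows (the helper is hypothetical, not a status-go API):

```go
package main

import "fmt"

// pieceLength is the BitTorrent piece length used for the archive torrents.
const pieceLength = 100 * 1024 // 100KiB

// piecesFor returns the first and last torrent piece a node must fetch
// to read one archive, given that archive's index entry. Because each
// archive is padded to a multiple of pieceLength before being appended,
// offsets are piece-aligned in practice.
func piecesFor(offset, size, padding uint64) (first, last uint64) {
	total := size + padding // assumed > 0
	first = offset / pieceLength
	last = (offset + total - 1) / pieceLength
	return
}

func main() {
	// An archive at offset 200KiB, 190KiB of data padded with 10KiB,
	// spans exactly pieces 2 and 3.
	first, last := piecesFor(200*1024, 190*1024, 10*1024)
	fmt.Println(first, last) // prints 2 3
}
```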
## Proposed Integration with Codex
First, we propose changing the `WakuMessageArchiveIndexMetadata` type in the following direction: instead of the `offset`, `size`, and `padding`, we suggest referring to an archive by a Codex CID. Thus, instead of:
```go
type WakuMessageArchiveIndexMetadata struct {
Metadata *WakuMessageArchiveMetadata
Offset uint64
Size uint64
Padding uint64
}
```
we would have something like:
```go
type WakuMessageArchiveIndexMetadata struct {
Metadata *WakuMessageArchiveMetadata
Cid CodexCid
}
```
Now, instead of appending each new archive to the data file, we stream each single archive to Codex via an API call. For each archive, we receive a CID back from Codex, which we then add to the corresponding archive index metadata entry as defined above.
After all archive entries are persisted, we then upload the resulting `index` to Codex under its own `index` CID. Instead of the magnet link, the community owner only publishes this `index` CID.
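The publishing flow above can be sketched as follows. `CodexClient`, its `Upload` method, and the line-oriented index encoding are illustrative assumptions; the real client API and the protobuf index encoding will differ.

```go
package main

import (
	"bytes"
	"fmt"
)

// CodexCid stands in for a Codex content identifier (assumed type).
type CodexCid string

// CodexClient is a hypothetical interface to a Codex node.
type CodexClient interface {
	Upload(data []byte) (CodexCid, error)
}

type WakuMessageArchiveMetadata struct {
	From, To uint64 // archive period boundaries
}

type WakuMessageArchiveIndexMetadata struct {
	Metadata *WakuMessageArchiveMetadata
	Cid      CodexCid
}

// publishArchives streams each encoded archive to Codex and collects the
// returned CIDs into index entries; the index itself is then uploaded under
// its own CID, which is what the community owner publishes.
func publishArchives(c CodexClient, archives [][]byte, metas []*WakuMessageArchiveMetadata) (CodexCid, []WakuMessageArchiveIndexMetadata, error) {
	index := make([]WakuMessageArchiveIndexMetadata, 0, len(archives))
	for i, a := range archives {
		cid, err := c.Upload(a)
		if err != nil {
			return "", nil, err
		}
		index = append(index, WakuMessageArchiveIndexMetadata{Metadata: metas[i], Cid: cid})
	}
	// Encode the index (a real implementation would use protobuf).
	var buf bytes.Buffer
	for _, e := range index {
		fmt.Fprintf(&buf, "%d-%d:%s\n", e.Metadata.From, e.Metadata.To, e.Cid)
	}
	indexCid, err := c.Upload(buf.Bytes())
	if err != nil {
		return "", nil, err
	}
	return indexCid, index, nil
}

// mockCodex is an in-memory stand-in for a Codex node.
type mockCodex struct{ n int }

func (m *mockCodex) Upload(data []byte) (CodexCid, error) {
	m.n++
	return CodexCid(fmt.Sprintf("cid-%d", m.n)), nil
}

func main() {
	c := &mockCodex{}
	indexCid, index, _ := publishArchives(c,
		[][]byte{[]byte("archive-1"), []byte("archive-2")},
		[]*WakuMessageArchiveMetadata{{From: 0, To: 7}, {From: 7, To: 14}})
	fmt.Println(indexCid, len(index)) // prints "cid-3 2"
}
```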
> If the system fails to publish the index, we assume the archive publishing was unsuccessful and we will start from scratch after restart (to be checked). In other words, we do not have to test whether a torrent file exists. If the previous publishing was successful, then the CIDs are advertised by Codex - they will already be stored in the Codex RepoStore.
In order to receive the historical messages for the given period (given by `from` and `to` in the `WakuMessageArchiveMetadata`), the receiving node first acquires the `index` using the `index` CID. For each entry in the `index` that the node has interest in, the node then downloads the corresponding archive directly using the `Cid` from this `index` entry.
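A minimal sketch of the receiving side; the `CodexClient.Download` call and the line-oriented index encoding are illustrative assumptions, not the real Codex API:

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

type CodexCid string

// CodexClient is a hypothetical download interface to a Codex node.
type CodexClient interface {
	Download(cid CodexCid) ([]byte, error)
}

type indexEntry struct {
	From, To uint64
	Cid      CodexCid
}

// parseIndex decodes a toy line-oriented index ("from-to:cid" per line);
// a real implementation would decode the protobuf index.
func parseIndex(data []byte) []indexEntry {
	var entries []indexEntry
	for _, line := range strings.Split(strings.TrimSpace(string(data)), "\n") {
		var e indexEntry
		var cid string
		if _, err := fmt.Sscanf(line, "%d-%d:%s", &e.From, &e.To, &cid); err == nil {
			e.Cid = CodexCid(cid)
			entries = append(entries, e)
		}
	}
	return entries
}

// fetchArchives downloads only the archives whose period overlaps [from, to].
func fetchArchives(c CodexClient, indexCid CodexCid, from, to uint64) ([][]byte, error) {
	raw, err := c.Download(indexCid)
	if err != nil {
		return nil, err
	}
	var archives [][]byte
	for _, e := range parseIndex(raw) {
		if e.To < from || e.From > to {
			continue // archive period does not overlap the requested one
		}
		a, err := c.Download(e.Cid)
		if err != nil {
			return nil, err
		}
		archives = append(archives, a)
	}
	return archives, nil
}

// mockCodex is an in-memory stand-in for a Codex node.
type mockCodex map[CodexCid][]byte

func (m mockCodex) Download(cid CodexCid) ([]byte, error) {
	if b, ok := m[cid]; ok {
		return b, nil
	}
	return nil, errors.New("unknown cid")
}

func main() {
	c := mockCodex{
		"cid-index": []byte("0-7:cid-a\n7-14:cid-b"),
		"cid-a":     []byte("archive-a"),
		"cid-b":     []byte("archive-b"),
	}
	got, _ := fetchArchives(c, "cid-index", 8, 10)
	fmt.Println(len(got)) // prints "1": only the 7-14 archive overlaps [8, 10]
}
```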
The diagram below shows the relationship between the new `index`, itself identified by a Codex CID, and the individual CIDs it uses to refer to each archive:
![[team-nl-br-design-3.svg]]
### Advantages
- clean and elegant solution - easier to maintain the archive(s) and the index,
- no dependency on the internals of the low level protocol used (like `padding`, `pieceLength`) - just nice and clean CIDs,
- reusing existing Codex protocol, no need to extend,
- Codex takes care of storage: no more `index` and `data` files, thus more reliable and less error prone.
### Disadvantages
- Each archive receives its own CID, which has to be announced on the DHT. If this is considered a problem, we may apply bundling, or use block ranges and publish the whole `data` under its own CID. Although less elegant, this still nicely decouples the `index` from the `data`, but in that case we may need to expose an API to retrieve a specific block index under a given `treeCid`.
## Deployment and Codex Library
In the first prototype, we suggest using the Codex API in order to validate the idea and discover potential design flaws early. After a successful PoC, or already in parallel, we suggest building a Codex protocol library (stripped of EC and marketplace), which will then be used to create Go bindings for the status-go integration. The same library should then also be used in the new Codex client.
Creating the Codex library has not only long been requested by IFT; it also brings an opportunity to rethink the system interfaces and work towards a more modular, "pluggable" design. In its first instance, the Codex library could consist of just the block-exchange protocol, the discovery module (DHT), and the *RepoStore* (block storage) - each of these could potentially be split out into its own sub-library. By providing bindings for various programming languages, we can better stimulate community clients, our original Codex client being one of them. The illustration below shows a high-level overview of the composition and use of the Codex library.
![[team-nl-br-design-4.svg]]
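A speculative sketch of what the composed library surface could look like; all module names and method signatures here are assumptions, not the actual Codex interfaces:

```go
package main

import (
	"errors"
	"fmt"
)

type Cid string

// RepoStore is local block storage.
type RepoStore interface {
	Put(c Cid, block []byte) error
	Get(c Cid) ([]byte, error)
}

// Discovery locates peers that advertise a CID (e.g. via the DHT).
type Discovery interface {
	Find(c Cid) ([]string, error) // returns peer addresses
}

// BlockExchange fetches blocks from remote peers.
type BlockExchange interface {
	Fetch(peer string, c Cid) ([]byte, error)
}

// Node composes the three modules; a client would embed or wrap this.
type Node struct {
	Store RepoStore
	Disc  Discovery
	Xchg  BlockExchange
}

// Resolve returns a block from the local store, falling back to the network.
func (n *Node) Resolve(c Cid) ([]byte, error) {
	if b, err := n.Store.Get(c); err == nil {
		return b, nil
	}
	peers, err := n.Disc.Find(c)
	if err != nil || len(peers) == 0 {
		return nil, errors.New("block not found")
	}
	b, err := n.Xchg.Fetch(peers[0], c)
	if err != nil {
		return nil, err
	}
	_ = n.Store.Put(c, b) // cache locally so we can serve it to others
	return b, nil
}

// memStore is an in-memory RepoStore stub for illustration.
type memStore map[Cid][]byte

func (m memStore) Put(c Cid, b []byte) error { m[c] = b; return nil }
func (m memStore) Get(c Cid) ([]byte, error) {
	if b, ok := m[c]; ok {
		return b, nil
	}
	return nil, errors.New("miss")
}

func main() {
	s := memStore{}
	s.Put("cid-1", []byte("hello"))
	n := &Node{Store: s}
	b, _ := n.Resolve("cid-1")
	fmt.Println(string(b)) // prints "hello"
}
```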
## Changes required in the status-go client
From the numerous code fragments presented above, we can already estimate the work that has to be done in the status-go code base. As discussed above, the general entry point is [InitHistoryArchiveTasks](https://github.com/status-im/status-go/blob/6322f22783585474803cfc8a6f0a914757d763b5/protocol/messenger_communities.go#L3783), which leads to most of the other relevant changes, most importantly [CreateAndSeedHistoryArchive](https://github.com/status-im/status-go/blob/6322f22783585474803cfc8a6f0a914757d763b5/protocol/communities/manager_archive.go#L314) and [createHistoryArchiveTorrent](https://github.com/status-im/status-go/blob/6322f22783585474803cfc8a6f0a914757d763b5/protocol/communities/manager_archive_file.go#L55). We expect most of the changes to be performed around [protocol/communities/manager_archive.go](https://github.com/status-im/status-go/blob/develop/protocol/communities/manager_archive.go) and [protocol/communities/manager_archive_file.go](https://github.com/status-im/status-go/blob/develop/protocol/communities/manager_archive_file.go).
## Long Term Durability Support
The current proposal builds on Codex, and it will naturally scale towards stronger durability requirements with the Codex Marketplace (and Erasure Coding). We consider this the long-term path. In the mid-term, we consider increasing the level of durability by applying some of the elements already captured in Ben's [Constellations](https://github.com/benbierens/constellations):
- 1 unchanging ID per community
- Taking care of CID dissemination
- Rough health metrics
- Owner/admin controls
- Useful for more projects than Status
> Since we created this document we gained a bit more understanding of how topics and filters work in status-go and waku: please check [[Filters, Topics, Channels, and Chat IDs in status-go and waku]].
This is the definition of the `ChatFilter` type:
```go
type ChatFilter struct {
chatID string
filterID string
identity string
pubsubTopic string
contentTopic ContentTopic
discovery bool
negotiated bool
listen bool
ephemeral bool
priority uint64
}
```
For each community, we can have a number of filters. This is how we get them:
```go
func (m *ArchiveManager) GetCommunityChatsFilters(communityID types.HexBytes) (messagingtypes.ChatFilters, error) {
chatIDs, err := m.persistence.GetCommunityChatIDs(communityID)
if err != nil {
return nil, err
}
filters := messagingtypes.ChatFilters{}
for _, cid := range chatIDs {
filter := m.messaging.ChatFilterByChatID(cid)
if filter != nil {
filters = append(filters, filter)
}
}
return filters, nil
}
```
Perhaps simplifying too much, a *filter* is basically a reference to a Waku pubsub channel where messages can be posted/received: all messages for all chats in that community. This seems to be related to [[Universal Topic Optimization in Waku (AI)]].
Associated with each filter is a `contentTopic`, and the filters are used to make sure we later get all relevant community messages. But here we encounter a number of strange inconsistencies. First, if there is supposed to be a single universal topic for the community, why do we still have multiple filters:
```go
filters, err := m.archiveManager.GetCommunityChatsFilters(c.ID())
if err != nil {
m.logger.Error("failed to get community chats filters for community", zap.Error(err))
continue
}
if len(filters) == 0 {
m.logger.Debug("no filters or chats for this community starting interval", zap.String("id", c.IDString()))
go m.archiveManager.StartHistoryArchiveTasksInterval(c, messageArchiveInterval)
continue
}
```
Then why is the universal filter only included later, and never mapped to the corresponding topics?
```go
topics := []messagingtypes.ContentTopic{}
for _, filter := range filters {
topics = append(topics, filter.ContentTopic())
}
filter := m.messaging.ChatFilterByChatID(c.UniversalChatID())
if filter != nil {
filters = append(filters, filter)
}
```
This is crucial, because the topics are effectively used to get the messages in [GetWakuMessagesByFilterTopic](https://github.com/status-im/status-go/blob/6322f22783585474803cfc8a6f0a914757d763b5/protocol/communities/manager_archive_file.go#L125) (check the body of this function [here](https://github.com/status-im/status-go/blob/6322f22783585474803cfc8a6f0a914757d763b5/protocol/communities/persistence.go#L967-L1000)).
> Since we created this document we gained a bit more understanding of how topics and filters work in status-go and waku: please check [[Filters, Topics, Channels, and Chat IDs in status-go and waku]].
There is [a comment in the code base](https://github.com/status-im/status-go/blob/6322f22783585474803cfc8a6f0a914757d763b5/protocol/communities/community.go#L1584-L1589):
```go
// Using Member updates channelID as chatID to act as a universal content-topic
// for all chats in the community as explained here
// https://forum.vac.dev/t/status-communities-review-and-proposed-usage-of-waku-content-topics/335
```
See: [https://forum.vac.dev/t/status-communities-review-and-proposed-usage-of-waku-content-topics/335](https://forum.vac.dev/t/status-communities-review-and-proposed-usage-of-waku-content-topics/335)
This comment refers to a **universal topic optimization** that Status implemented to improve efficiency in the Waku messaging protocol. Let me explain how this relates to Waku:
### What is a ContentTopic in Waku?
In Waku, a **ContentTopic** is like a messaging channel identifier - it's a 4-byte value that tells Waku which messages are intended for which application features. Think of it as the "address" where specific types of messages should be delivered.
From the code I found, here's how ContentTopic works:
```go
// From messaging/waku/common/topic.go
func (t TopicType) ContentTopic() string {
enc := hexutil.Encode(t[:])
return "/waku/1/" + enc + "/rfc26"
}
```
This creates a topic like `/waku/1/0x12345678/rfc26` where `0x12345678` is the 4-byte topic.
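For reference, the same formatting can be reproduced with the standard library alone (`hexutil.Encode` is go-ethereum's helper that hex-encodes with a `0x` prefix):

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// TopicType mirrors the 4-byte Waku topic shown above.
type TopicType [4]byte

// ContentTopic reproduces the formatting from messaging/waku/common/topic.go
// using encoding/hex instead of hexutil.
func (t TopicType) ContentTopic() string {
	return "/waku/1/0x" + hex.EncodeToString(t[:]) + "/rfc26"
}

func main() {
	fmt.Println(TopicType{0x12, 0x34, 0x56, 0x78}.ContentTopic())
	// prints "/waku/1/0x12345678/rfc26"
}
```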
### The Universal Topic Optimization
Originally, you might expect each community chat to have its own unique ContentTopic:
- `#general` → ContentTopic `0x12345678`
- `#random` → ContentTopic `0x87654321`
- `#dev` → ContentTopic `0xabcdef12`
But Status implemented a **universal topic optimization** where all chats in a community share the **same ContentTopic**. Here's how:
```go
// From protocol/communities/community.go lines 1584-1588
func (o *Community) UniversalChatID() string {
// Using Member updates channelID as chatID to act as a universal content-topic
// for all chats in the community as explained here
// https://forum.vac.dev/t/status-communities-review-and-proposed-usage-of-waku-content-topics/335
return o.MemberUpdateChannelID()
}
func (o *Community) MemberUpdateChannelID() string {
return o.IDString() + "-memberUpdate"
}
```
So instead of separate topics, **all community chats** use the same ContentTopic: `{communityID}-memberUpdate`.
### How This Works with Filters
When you join a community, Status creates **one Filter** (subscription) that receives messages for **all channels** in that community:
```go
// From messenger_community_chat_test.go
universalChatFilter := s.m.messaging.ChatFilterByChatID(community.UniversalChatID())
```
This single filter subscribes to the universal ContentTopic, and **all messages** from all channels in the community flow through this one subscription.
### Benefits of This Optimization
1. **Fewer Network Subscriptions**: Instead of subscribing to N different ContentTopics (one per channel), you subscribe to just 1
2. **Reduced Network Overhead**: Fewer pubsub subscriptions mean less network management
3. **Simplified Filter Management**: The FiltersManager only needs to track one filter per community instead of one per channel
4. **Better Message Delivery**: All community messages come through the same reliable channel
### How Messages Are Distinguished
Since all messages use the same ContentTopic, how does Status know which channel a message belongs to? The **message payload** contains the actual destination channel information. The universal topic is just for **transport optimization** - the application logic still correctly routes messages to the right channels.
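A minimal sketch of that demultiplexing; the `ChatID` field here is an illustrative stand-in for the channel information carried in the real decoded protobuf payload:

```go
package main

import "fmt"

// Message models a decoded message received on the shared universal
// ContentTopic; ChatID is the payload field that identifies the channel.
type Message struct {
	ChatID  string
	Payload string
}

// routeByChannel demultiplexes messages from the one shared topic back
// into their logical channels at the application layer.
func routeByChannel(msgs []Message) map[string][]Message {
	byChat := make(map[string][]Message)
	for _, m := range msgs {
		byChat[m.ChatID] = append(byChat[m.ChatID], m)
	}
	return byChat
}

func main() {
	msgs := []Message{
		{ChatID: "general", Payload: "hi"},
		{ChatID: "dev", Payload: "build ok"},
		{ChatID: "general", Payload: "hello"},
	}
	routed := routeByChannel(msgs)
	fmt.Println(len(routed["general"]), len(routed["dev"])) // prints "2 1"
}
```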
### Relation to Waku Protocol
This optimization is specifically about how Status uses Waku's **pubsub topics and content filtering**:
- **PubsubTopic**: The "physical" network topic (like a network shard)
- **ContentTopic**: The "logical" application channel (like an app feature)
- **Filter**: The subscription that tells Waku "I want messages from this ContentTopic"
By using one ContentTopic for all community channels, Status reduces the load on Waku's **content filtering system** while maintaining the logical separation of channels at the application layer.
This is a clever optimization that maintains user experience (separate channels) while improving network efficiency (shared transport).
---
related-to:
- "[[Running Unit Tests for status-go]]"
- "[[Running functional tests in status-go]]"
- "[[testing codex-status-go integration]]"
---
Here are some basic steps to follow:
Build status-backend:
```bash
make status-backend
```
Start status-backend (I am using port `45453` for all examples below):
```bash
./build/bin/status-backend -address localhost:45453
```
### Step 1: Initialize the application
```bash
curl -sS http://127.0.0.1:45453/statusgo/InitializeApplication \
-H 'Content-Type: application/json' \
-d '{"dataDir":"/tmp/status-go-test"}'
```
### Step 2: Create an account (if you don't have one) OR login
```bash
curl -sS http://127.0.0.1:45453/statusgo/CreateAccountAndLogin \
-H 'Content-Type: application/json' \
-d '{
"rootDataDir": "/tmp/status-go-test",
"displayName": "TestUser",
"password": "test123456",
"customizationColor": "blue"
}'
```
or if you already have account:
```bash
curl -sS http://127.0.0.1:45453/statusgo/LoginAccount -X POST -H 'Content-Type: application/json' -d '{
"rootDataDir": "/tmp/status-go-test",
"keyUid": "0x9b755874a92bcc2d20c730fc76d451f44b39868cc8fbd31f2ff74907e299e7fd",
"password": "test123456"
}'
```
If you see `{"error":""}` as the output it means the command was successful and there is no error.
If you try to call the API before login, you will get the following error (example):
```bash
curl -sS http://127.0.0.1:45453/statusgo/CallRPC -H 'Content-Type: application/json' -d '{"jsonrpc":"2.0","id":1,"method":"wakuext_getArchiveDistributionPreference","params":[]}'
{"jsonrpc":"2.0","id":1,"error":{"code":-32601,"message":"the method wakuext_getArchiveDistributionPreference does not exist/is not available"}}
```
I did not record the output of `CreateAccountAndLogin`, but in the returned response object you should be able to find the `keyUid` mentioned above. It is also included in the response from `InitializeApplication` for each existing account:
```bash
curl -sS http://127.0.0.1:45453/statusgo/InitializeApplication \
-H 'Content-Type: application/json' \
-d '{"dataDir":"/tmp/status-go-test"}'
{"accounts":[{"name":"TestUser","timestamp":1762236462,"identicon":"","colorHash":[[3,9],[5,18],[3,24],[5,22],[4,6],[5,23],[2,3],[4,24],[1,27],[5,21],[3,11]],"colorId":3,"customizationColor":"blue","keycard-pairing":"","key-uid":"0x9b755874a92bcc2d20c730fc76d451f44b39868cc8fbd31f2ff74907e299e7fd","images":null,"kdfIterations":3200,"hasAcceptedTerms":true}],"centralizedMetricsInfo":{"enabled":false,"userConfirmed":false,"userID":""}}
```
The `keyUid` is a unique identifier for each account in status-go. You can also find it by looking at the database filenames in the data directory.
When you create an account, status-go creates database files with a specific naming pattern:
```
<address><keyUid>-v4.db
<address><keyUid>-wallet.db
```
In my case:
```bash
ls -la /tmp/status-go-test/
```
Showed files like:
```
0x9b755874a92bcc2d20c730fc76d451f44b39868cc8fbd31f2ff74907e299e7fd-v4.db
0x9b755874a92bcc2d20c730fc76d451f44b39868cc8fbd31f2ff74907e299e7fd-wallet.db
```
The filename pattern is: `<ethereum-address><key-uid>-<db-type>.db`
Breaking it down:
- **Address**: `0x9b755874a92bcc2d20c730fc76d451f44b39868c` (42 characters - standard Ethereum address)
- **KeyUID**: `c8fbd31f2ff74907e299e7fd` (the remaining characters before the dash)
So the full keyUid is actually: `0x9b755874a92bcc2d20c730fc76d451f44b39868cc8fbd31f2ff74907e299e7fd` (combining both parts).
You can also query the accounts database:
```bash
sqlite3 /tmp/status-go-test/accounts.sql "SELECT keyUid, name FROM accounts;"
0x9b755874a92bcc2d20c730fc76d451f44b39868cc8fbd31f2ff74907e299e7fd|TestUser
```
The keyUid is essentially derived from the account's key material and is used to uniquely identify the account across the system.
### Step 3: Start the messenger (optional)
I initially thought this was necessary - but it turns out that logging in is sufficient. For reference:
```bash
curl -sS http://127.0.0.1:45453/statusgo/CallRPC \
-H 'Content-Type: application/json' \
-d '{"jsonrpc":"2.0","id":1,"method":"wakuext_startMessenger","params":[]}' | jq
```
### Step 4: Now you can call your method!
Some examples:
```bash
curl -sS http://127.0.0.1:45453/statusgo/CallRPC -H 'Content-Type: application/json' -d '{"jsonrpc":"2.0","id":1,"method":"wakuext_getArchiveDistributionPreference","params":[]}'
{"jsonrpc":"2.0","id":1,"result":"codex"}
```
```bash
curl -sS http://127.0.0.1:45453/statusgo/CallRPC -H 'Content-Type: application/json' -d '{"jsonrpc":"2.0","id":1,"method":"wakuext_setArchiveDistributionPreference","params":[{"preference":"codex"}]}'
{"jsonrpc":"2.0","id":1,"result":"codex"}
```
```bash
curl -sS http://127.0.0.1:45453/statusgo/CallRPC -H 'Content-Type: application/json' -d '{"jsonrpc":"2.0","id":1,"method":"wakuext_getMessageArchiveInterval","params":[]}'
{"jsonrpc":"2.0","id":1,"result":604800000000000}
```
```bash
curl -sS http://127.0.0.1:45453/statusgo/CallRPC -H 'Content-Type: application/json' -d '{"jsonrpc":"2.0","id":2,"method":"wakuext_updateMessageArchiveInterval","params":[60480]}'
{"jsonrpc":"2.0","id":1,"result":604800000000000}
```
Notice that the value to be set is provided in seconds, but the value returned in the result is in nanoseconds (to avoid potential problems with division).
### Enabling History Archives
Use the `EnableCodexCommunityHistoryArchiveProtocol` method to enable history archives for Codex. The method also accepts optional overrides of the default Codex node config.
#### without overrides
```bash
curl -sS http://127.0.0.1:45453/statusgo/CallRPC \
-H 'Content-Type: application/json' \
-d '{"jsonrpc":"2.0","id":1,"method":"wakuext_enableCodexCommunityHistoryArchiveProtocol","params":[{}]}'
# returns
{"jsonrpc":"2.0","id":1,"result":null}
```
#### with overrides (example DiscoveryPort + one bootstrap SPR)
```bash
curl -sS http://127.0.0.1:45453/statusgo/CallRPC \
-H 'Content-Type: application/json' \
-d '{
"jsonrpc":"2.0",
"id":1,
"method":"wakuext_enableCodexCommunityHistoryArchiveProtocol",
"params":[
{
"CodexNodeConfig.DiscoveryPort":"8091",
"CodexNodeConfig.BootstrapNodes":"[\"spr:CiUIAhIhAjOc4w87PAfj0XGMnqtYSgO8rwfPOxF7d8Y4-BXGVUJTEgIDARpJCicAJQgCEiECM5zjDzs8B-PRcYyeq1hKA7yvB887EXt3xjj4FcZVQlMQkfncyAYaCwoJBH8AAAGRAh-bGgsKCQSsEgAGkQIfmypGMEQCID4B7M6G5bEPQ_D_Z7YdPG6LHpXq3ghY2gkXtBxTExDeAiAFSOjwAem1PmbAIZlOq2hvT_LGQMwiEOEaVaoIJ1g-FQ\"]"
}
]
}'
# returns
{"jsonrpc":"2.0","id":1,"result":null}
```
To stop:
```bash
curl -sS http://127.0.0.1:45453/statusgo/CallRPC -H 'Content-Type: application/json' -d '{"jsonrpc":"2.0","id":1,"method":"wakuext_disableCommunityHistoryArchiveProtocol","params":[]}'
# returns
{"jsonrpc":"2.0","id":1,"result":null}
```
And to verify the current node configuration:
```bash
curl -sS http://127.0.0.1:45453/statusgo/GetNodeConfig \
-H 'Content-Type: application/json' \
-d '{}' | jq
```
Seeding is the BitTorrent process where a node that has complete archive files makes them available for other peers to download. In Status-go, this happens when:
1. **Community control nodes** create new history archives
2. **Any node** successfully downloads archives from other peers
## When Does Seeding Happen?
### 1. **After Creating New Archives (Control Nodes Only)**
In `SeedHistoryArchiveTorrent`, community control nodes seed archives after creating them:
```go
func (m *ArchiveManager) SeedHistoryArchiveTorrent(communityID types.HexBytes) error {
m.UnseedHistoryArchiveTorrent(communityID)
id := communityID.String()
torrentFile := torrentFile(m.torrentConfig.TorrentDir, id)
metaInfo, err := metainfo.LoadFromFile(torrentFile)
if err != nil {
return err
}
info, err := metaInfo.UnmarshalInfo()
if err != nil {
return err
}
hash := metaInfo.HashInfoBytes()
m.torrentTasks[id] = hash
if err != nil {
return err
}
torrent, err := m.torrentClient.AddTorrent(metaInfo)
if err != nil {
return err
}
torrent.DownloadAll()
m.publisher.publish(&Subscription{
HistoryArchivesSeedingSignal: &signal.HistoryArchivesSeedingSignal{
CommunityID: communityID.String(),
},
})
magnetLink := metaInfo.Magnet(nil, &info).String()
m.logger.Debug("seeding torrent", zap.String("id", id), zap.String("magnetLink", magnetLink))
return nil
}
```
This happens when:
- **Periodic archiving**: Control nodes regularly create archives of community message history
- **Manual archiving**: When explicitly triggered through `InitHistoryArchiveTasks`
### 2. **After Downloading Archives Successfully**
In [DownloadHistoryArchivesByMagnetlink](https://github.com/status-im/status-go/blob/6322f22783585474803cfc8a6f0a914757d763b5/protocol/communities/manager_archive.go#L486) around line 644, seeding starts after successful downloads:
```go
// After downloading all archives
m.publisher.publish(&Subscription{
HistoryArchivesSeedingSignal: &signal.HistoryArchivesSeedingSignal{
CommunityID: communityID.String(),
},
})
```
This happens when:
- **Regular members** download archives via magnetlinks
- **New community members** download historical messages
- **Nodes re-downloading** updated archives
### 3. **Seeding Signal Processing**
When seeding starts, it triggers magnetlink message distribution in `handleCommunitiesHistoryArchivesSubscription`:
```go
if sub.HistoryArchivesSeedingSignal != nil {
// Signal UI that seeding started
m.config.messengerSignalsHandler.HistoryArchivesSeeding(
sub.HistoryArchivesSeedingSignal.CommunityID
)
// Get community info
c, err := m.communitiesManager.GetByIDString(
sub.HistoryArchivesSeedingSignal.CommunityID
)
// Only control nodes send magnetlink messages
if c.IsControlNode() {
err := m.dispatchMagnetlinkMessage(
sub.HistoryArchivesSeedingSignal.CommunityID
)
// This broadcasts the magnetlink to community members
}
}
```
## Seeding Lifecycle
### **Start Seeding**
1. Archives are created or downloaded successfully
2. `HistoryArchivesSeedingSignal` is published
3. BitTorrent client starts serving the files
4. Control nodes broadcast magnetlink messages
### **Continue Seeding**
- Files remain available as long as the node is online
- Other peers can download from multiple seeders simultaneously
- BitTorrent automatically manages upload bandwidth
### **Stop Seeding**
In `UnseedHistoryArchiveTorrent`:
```go
func (m *ArchiveManager) UnseedHistoryArchiveTorrent(communityID types.HexBytes) {
// Remove torrent from client
torrent := m.torrentClient.Torrent(infoHash)
if torrent != nil {
torrent.Drop()
}
// Publish unseeding signal
m.publisher.publish(&Subscription{
HistoryArchivesUnseededSignal: &signal.HistoryArchivesUnseededSignal{
CommunityID: communityID.String(),
},
})
}
```
This happens when:
- **New archives replace old ones**: When fresher magnetlinks are received
- **Community leaves**: When a user leaves a community
- **Manual stop**: When archiving is disabled
## Key Points
- **Only control nodes send magnetlinks**, but **any node can seed** after downloading
- **Seeding is automatic** - happens immediately after successful archive creation or download
- **Multiple seeders improve availability** - BitTorrent's distributed nature means more seeders = better download speeds
- **Seeding triggers UI notifications** via `SendHistoryArchivesSeedin`.
This distributed seeding model ensures that community history archives remain available even if the original control node goes offline, making the system resilient and scalable.
Magnetlink messages are sent when a community's control node (the community owner) finishes seeding history archives. Here's the complete flow:
### 1. **Triggering Event: Archive Seeding Completion**
Magnetlink messages are sent when a `HistoryArchivesSeedingSignal` is triggered. This happens in two scenarios:
1. **After Creating New Archives**: In [SeedHistoryArchiveTorrent](https://github.com/status-im/status-go/blob/6322f22783585474803cfc8a6f0a914757d763b5/protocol/communities/manager_archive.go#L408) around line 438, when the control node finishes creating and starts seeding new history archives.
2. **After Downloading Archives**: In [DownloadHistoryArchivesByMagnetlink](https://github.com/status-im/status-go/blob/6322f22783585474803cfc8a6f0a914757d763b5/protocol/communities/manager_archive.go#L486) around line 645, when archives are successfully downloaded and seeding begins.
### 2. **Message Dispatch Logic**
The actual sending happens in `handleCommunitiesHistoryArchivesSubscription` at lines 246-259:
```go
if sub.HistoryArchivesSeedingSignal != nil {
m.config.messengerSignalsHandler.HistoryArchivesSeeding(
sub.HistoryArchivesSeedingSignal.CommunityID
)
c, err := m.communitiesManager.GetByIDString(
sub.HistoryArchivesSeedingSignal.CommunityID
)
if err != nil {
m.logger.Debug(
"failed to retrieve community by id string",
zap.Error(err)
)
}
if c.IsControlNode() {
err := m.dispatchMagnetlinkMessage(
sub.HistoryArchivesSeedingSignal.CommunityID
)
if err != nil {
m.logger.Debug(
"failed to dispatch magnetlink message",
zap.Error(err)
)
}
}
}
```
### 3. **Key Conditions**
- **Only Control Nodes**: Only the community owner can send magnetlink messages
- **After Seeding**: Messages are only sent after archives are successfully seeded and available for download
### 4. **Message Creation and Sending**
The `dispatchMagnetlinkMessage` function (lines 4093-4138):
1. **Gets the magnetlink**: Calls `m.archiveManager.GetHistoryArchiveMagnetlink(community.ID())` to generate the magnetlink from the torrent file
2. **Creates the message**: Builds a `CommunityMessageArchiveMagnetlink` protobuf message with current timestamp and magnetlink URI
3. **Sends publicly**: Broadcasts the message to the community's magnetlink channel using `m.messaging.SendPublic()`
4. **Updates clocks**: Updates both the community description and magnetlink message clocks
### 5. **Message Content**
The magnetlink message contains:
- **Clock**: Current timestamp
- **MagnetUri**: The BitTorrent magnetlink for downloading the archives
- **Message Type**: `COMMUNITY_MESSAGE_ARCHIVE_MAGNETLINK`
### Summary
Magnetlink messages are sent **automatically by community control nodes whenever they finish seeding new history archives**. This ensures that community members are immediately notified when new archive data becomes available for download via BitTorrent, enabling efficient peer-to-peer distribution of community message history.
A short summary of testify assertions.
## `assert.Equal()` Function Signature
```go
func Equal(t TestingT, expected, actual interface{}, msgAndArgs ...interface{}) bool
```
### Parameters Breakdown:
1. **`t TestingT`** - The testing interface (first parameter)
2. **`expected interface{}`** - What you expect the value to be
3. **`actual interface{}`** - The actual value you're testing
4. **`msgAndArgs ...interface{}`** - Optional custom message and formatting arguments
## What is `suite.T()`?
When you're using testify's **test suite pattern**, `suite.T()` returns the underlying `*testing.T` instance associated with the current test method.
```go
type CodexArchiveDownloaderTestifySuite struct {
suite.Suite // Embeds testify's Suite type
// ... other fields
}
func (suite *CodexArchiveDownloaderTestifySuite) TestBasicSingleArchive() {
// suite.T() returns the *testing.T for this specific test method
assert.Equal(suite.T(), 1, downloader.GetTotalArchivesCount())
}
```
## Comparison: Suite vs Function-Based
### In Test Suite (what you're seeing):
```go
assert.Equal(suite.T(), expected, actual, "optional message")
```
### In Regular Function-Based Test:
```go
func TestSomething(t *testing.T) {
assert.Equal(t, expected, actual, "optional message")
}
```
## Why `suite.T()` is Needed
The testify suite embeds a `*testing.T`, but assertions need direct access to it for:
- Reporting failures
- Marking tests as failed
- Logging output
- Integration with Go's test runner
## Other Common Assert Functions
```go
// Basic equality
assert.Equal(t, expected, actual)
assert.NotEqual(t, expected, actual)
// Boolean checks
assert.True(t, condition)
assert.False(t, condition)
// Nil checks
assert.Nil(t, value)
assert.NotNil(t, value)
// Collection checks
assert.Len(t, collection, expectedLength)
assert.Contains(t, collection, element)
assert.Empty(t, collection)
// Error checks
assert.NoError(t, err)
assert.Error(t, err)
// Type checks
assert.IsType(t, expectedType, actual)
// All with optional custom messages
assert.Equal(t, 42, result, "The calculation should return 42")
assert.True(t, isValid, "Validation should pass for input: %s", input)
```
## `require` vs `assert`
Both take the same parameters, but behave differently on failure:
```go
// assert continues test execution on failure
assert.Equal(t, 1, count)
assert.True(t, isReady) // This will still run even if above fails
// require stops test execution immediately on failure
require.Equal(t, 1, count)
require.True(t, isReady) // This won't run if above fails
```
## In Your Code Context
```go
// Line 90 you're looking at:
assert.Equal(suite.T(), 1, downloader.GetTotalArchivesCount(), "Total archives count should be 1")
```
This means:
- **`suite.T()`** - The testing context from the suite
- **`1`** - Expected value
- **`downloader.GetTotalArchivesCount()`** - Actual value being tested
- **`"Total archives count should be 1"`** - Custom error message if assertion fails
The assertion will fail if `GetTotalArchivesCount()` returns anything other than `1`, and it will display your custom message along with the expected vs actual values.
---
related-to:
- "[[testing codex-status-go integration]]"
- "[[Running Unit Tests for status-go]]"
- "[[Running functional tests in status-go]]"
---
The easiest way:
```bash
make lint
```
And the detailed manual commands:
```bash
golangci-lint --build-tags 'gowaku_no_rln lint' run ./...
```
and:
```bash
make lint-panics
```
The following bigger changes have been observed:
- added `context` to `HandleCommunityMessageArchiveMagnetlink` so that the signature now looks like this:
```go
func (m *Messenger) HandleCommunityMessageArchiveMagnetlink(ctx context.Context, state *ReceivedMessageState, message *protobuf.CommunityMessageArchiveMagnetlink, statusMessage *common.StatusMessage) error {
return m.HandleHistoryArchiveMagnetlinkMessage(state, state.CurrentMessageState.PublicKey, message.MagnetUri, message.Clock)
}
```
### Makefile
(1) extended the parts related to `libwaku` and `libsds`
(2) building the main target no longer refers to vendored modules:
```bash
.PHONY: $(GO_CMD_NAMES) $(GO_CMD_PATHS) $(GO_CMD_BUILDS)
$(GO_CMD_BUILDS): generate $(LIBWAKU) $(LIBSDS)
$(GO_CMD_BUILDS): ##@build Build any Go project from cmd folder
CGO_ENABLED=1 \
CGO_CFLAGS="$(CGO_CFLAGS)" \
CGO_LDFLAGS="$(CGO_LDFLAGS)" \
go build -v \
-tags '$(BUILD_TAGS)' $(BUILD_FLAGS) \
-o ./$@ ./cmd/$(notdir $@)
@echo "Compilation done."
@echo "Run \"build/bin/$(notdir $@) -h\" to view available commands."
```
You can see that they now use:
```bash
go build -v \
```
previously they had:
```bash
go build -mod=vendor -v \
```
This is in commit `1f18d4d326537e1da90faac6b29dee6aa3bdc6d1`:
```
chore: stop vendoring (#6951)
* chore: ignore go work
* chore: stop vendoring
ci: update github pr workflow
chore: update nix vendor hash
* feat(Makefile): vendor-hash
* fix(Makefile): colors
* chore: rm -rf vendor
* chore: update nix vendorHash
```
### nix
Nix has been slightly updated:
```nix
in mkShell {
name = "status-go-shell";
buildInputs = with pkgs;
lib.optionals (stdenv.isDarwin) [ xcodeWrapper llvmPackages.openmp ] ++ [
git jq which
go golangci-lint go-junit-report gopls codecov-cli
protobuf3_24 protoc-gen-go gotestsum openjdk openssl
rustc cargo
];
```
to:
```nix
in mkShell {
name = "status-go-shell";
buildInputs = with pkgs;
lib.optionals (stdenv.isDarwin) [ xcodeWrapper ] ++ [
git jq which
go golangci-lint go-junit-report gopls codecov-cli
protobuf3_24 protoc-gen-go gotestsum openjdk openssl
rustc cargo
nim
lib-sds-pkg
];
```
and then on top of that we have our changes:
```nix
in mkShell {
name = "status-go-shell";
buildInputs = with pkgs;
lib.optionals (stdenv.isDarwin) [ xcodeWrapper llvmPackages.openmp ] ++ [
git jq which
go golangci-lint go-junit-report gopls codecov-cli
protobuf3_24 protoc-gen-go gotestsum openjdk openssl
rustc cargo
nim
lib-sds-pkg
] ++ lib.optionals (!stdenv.isDarwin) [ gcc ];
```

---
related-to:
- "[[Team-NLBR Solution Proposal]]"
- "[[status-go publishing magnet links]]"
- "[[status-go processing magnet links]]"
- "[[status-go-codex integration - design notes]]"
- "[[Creating History Archives - InitHistoryArchiveTasks]]"
---
marcinczenko: What is the difference between `handleImportedMessages` and `handleRetrievedMessages`? How are they used?
GitHub Copilot: I'll examine the two methods `handleImportedMessages` and `handleRetrievedMessages` to understand their differences and usage.
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger.go#2900-2900), lines 2900 to 3100
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger.go#3100-3100), lines 3100 to 3300
Now let me search for where these methods are called to understand their usage patterns:
Searched text for `handleImportedMessages|handleRetrievedMessages` (`**//home/mc2/code/status-im/status-go/protocol/**`), 6 results
Let me examine the context where these methods are called:
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger.go#2670-2670), lines 2670 to 2700
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger_handler.go#1480-1480), lines 1480 to 1510
Now let me look at the specific message handling in both methods:
Searched text for `dispatchToHandler` (`**/messenger.go`), 2 results
Let me check the dispatchToHandler method to understand the difference:
Searched codebase for "dispatchToHandler method implementation", 15 results
Great! Now I can see the `dispatchToHandler` method is generated. Let me also check the specific handling in `handleImportedMessages`:
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger.go#3030-3030), lines 3030 to 3080
Now let me also check the context where these are used in messenger_handler.go:
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger_handler.go#1460-1460), lines 1460 to 1520
Let me check what method contains this logic by looking for the function signature:
Searched text for `func (m *Messenger)` (`**/messenger_handler.go`), 20 results
Let me look for the function that contains the separation logic:
Searched text for `ThirdPartyID` (`**/messenger_handler.go`), 6 results
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger_handler.go#1440-1440), lines 1440 to 1470
Perfect! Now I can see the complete picture. Based on my analysis of the code, I can explain the differences between `handleImportedMessages` and `handleRetrievedMessages`:
## Differences between `handleImportedMessages` and `handleRetrievedMessages`
### **1. Purpose and Context**
**`handleImportedMessages`:**
- Specifically designed for handling **imported/archived messages** (e.g., from Discord imports or community history archives)
- Used when messages have a `ThirdPartyID` (indicating they come from external sources)
- Part of the history import/archive restoration functionality
**`handleRetrievedMessages`:**
- General-purpose message handler for **regular incoming messages**
- Used for normal message flow from the messaging protocol
- Called by `RetrieveAll()` which is the main message retrieval mechanism
### **2. Message Separation Logic**
In `handleArchiveMessages()` method in messenger_handler.go, messages are separated based on `ThirdPartyID`:
```go
for filter, messages := range messagesToHandle {
for _, message := range messages {
if message.ThirdPartyID != "" {
importedMessages[filter] = append(importedMessages[filter], message) // → handleImportedMessages
} else {
otherMessages[filter] = append(otherMessages[filter], message) // → handleRetrievedMessages
}
}
}
```
### **3. Message Type Handling**
**`handleImportedMessages`:**
- **Limited scope**: Only handles specific message types:
- `CHAT_MESSAGE`
- `PIN_MESSAGE`
- Uses hardcoded switch statement for message type handling
- More restrictive and focused on basic message types
**`handleRetrievedMessages`:**
- **Full scope**: Handles all message types using `dispatchToHandler()`
- `dispatchToHandler()` is auto-generated and routes to appropriate handlers for all supported message types
- More comprehensive message processing
### **4. Feature Differences**
**`handleImportedMessages`:**
- **No mutex protection** during main processing (only for Discord-specific saves)
- **No notifications/activity center** updates
- **No community message storage** in Waku
- **No metrics tracking**
- **No blocked user filtering**
- **No datasync metadata handling**
- **No installation/shared secrets handling**
- Focuses on basic message persistence
**`handleRetrievedMessages`:**
- **Full mutex protection** (`m.handleMessagesMutex`)
- **Community message storage** for controlled communities
- **Metrics tracking** (`MetricsPushReceivedMessages`)
- **Blocked user filtering**
- **Datasync metadata handling**
- **Installation and shared secrets handling**
- **Activity center notifications**
- **Contact management**
- Returns a full `MessengerResponse`
### **5. Return Values**
**`handleImportedMessages`:**
- Returns `error` only
- No response object for UI updates
**`handleRetrievedMessages`:**
- Returns `(*MessengerResponse, error)`
- Provides full response for UI updates via `saveDataAndPrepareResponse()`
### **6. Usage Context**
**`handleImportedMessages`:**
- Used in archive restoration scenarios
- Called when processing historical/imported messages from external sources
- Part of the community history import functionality
**`handleRetrievedMessages`:**
- Used in normal message flow
- Called by `RetrieveAll()` which is invoked by `ProcessAllMessages()`
- Part of the regular message processing loop
### **Summary**
`handleImportedMessages` is a **lightweight, specialized handler** for processing historical/imported messages with minimal side effects, while `handleRetrievedMessages` is the **full-featured, comprehensive handler** for regular message processing with all the bells and whistles (notifications, community features, metrics, etc.).
The separation ensures that imported messages don't trigger unnecessary notifications or duplicate processing while still preserving the message content in the database.
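The routing logic quoted above can be tried out in isolation. The sketch below is a self-contained version of the `ThirdPartyID` split; `ChatFilter` and `ReceivedMessage` are simplified stand-ins for the real `messagingtypes` definitions, not the actual status-go types:

```go
package main

import "fmt"

// Simplified stand-ins for the real messagingtypes definitions.
type ChatFilter string

type ReceivedMessage struct {
	ID           string
	ThirdPartyID string // non-empty for externally imported messages (e.g. Discord)
}

// separateArchiveMessages mirrors the routing in handleArchiveMessages:
// messages with a ThirdPartyID go to handleImportedMessages, the rest to
// handleRetrievedMessages.
func separateArchiveMessages(all map[ChatFilter][]*ReceivedMessage) (imported, other map[ChatFilter][]*ReceivedMessage) {
	imported = make(map[ChatFilter][]*ReceivedMessage)
	other = make(map[ChatFilter][]*ReceivedMessage)
	for filter, messages := range all {
		for _, message := range messages {
			if message.ThirdPartyID != "" {
				imported[filter] = append(imported[filter], message)
			} else {
				other[filter] = append(other[filter], message)
			}
		}
	}
	return imported, other
}

func main() {
	all := map[ChatFilter][]*ReceivedMessage{
		"community-chat": {
			{ID: "m1", ThirdPartyID: "discord-123"},
			{ID: "m2"},
		},
	}
	imported, other := separateArchiveMessages(all)
	fmt.Println(len(imported["community-chat"]), len(other["community-chat"]))
}
```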
marcinczenko: ok, both functions are called by `handleArchiveMessages`, which is called by `importHistoryArchive`, which in turn is called by either `resumeHistoryArchivesImport` or `downloadAndImportHistoryArchives`. The `downloadAndImportHistoryArchives` is in turn called indirectly by `handleRetrievedMessages` - which looks like a circular dependency, which I can't really understand. I am trying to get a better picture of the process of archive history download...
GitHub Copilot: You're absolutely right to be confused about this circular dependency! Let me trace through the archive history download process to clarify what's happening. Let me examine the call chain you mentioned.
Searched codebase for "downloadAndImportHistoryArchives function implementation", 13 results
Now let me look at the specific methods to understand the flow better:
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger_handler.go#1410-1410), lines 1410 to 1450
Searched text for `downloadAndImportHistoryArchives` (`**/messenger_handler.go`), 6 results
Now let me search for where `handleRetrievedMessages` might indirectly call `downloadAndImportHistoryArchives`:
Searched text for `HandleHistoryArchiveMagnetlinkMessage` (`**/messenger_handler.go`), 4 results
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger_handler.go#3980-3980), lines 3980 to 4010
Let me search for what calls `HandleCommunityMessageArchiveMagnetlink`:
Searched text for `HandleCommunityMessageArchiveMagnetlink` (`**/*.go`), 8 results
Now let me check the `dispatchToHandler` generated file:
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger_handlers.go#130-130), lines 130 to 150
Perfect! Now I can see the complete picture. The apparent "circular dependency" you noticed is not actually circular, but rather involves **different message types flowing through the system**. Let me explain the complete archive history download process:
## **Archive History Download Process Flow**
### **1. Initial Trigger Points**
There are several ways archive downloads can be triggered:
**A. Via `resumeHistoryArchivesImport()`:**
- Called during startup to resume interrupted downloads
- Directly calls `importHistoryArchives()` to process already-downloaded archives
**B. Via received magnetlink messages:**
- `handleRetrievedMessages()` processes incoming `COMMUNITY_MESSAGE_ARCHIVE_MAGNETLINK` messages
- This goes through `dispatchToHandler()` → `handleCommunityMessageArchiveMagnetlinkProtobuf()` → `HandleCommunityMessageArchiveMagnetlink()` → `HandleHistoryArchiveMagnetlinkMessage()`
- Which then spawns a goroutine that calls `downloadAndImportHistoryArchives()`
**C. Via community join responses:**
- When joining a community, `HandleCommunityRequestToJoinResponse()` also calls `downloadAndImportHistoryArchives()`
### **2. The Download and Import Pipeline**
```
downloadAndImportHistoryArchives()
1. DownloadHistoryArchivesByMagnetlink() [downloads archives via torrent]
2. importHistoryArchives() [processes downloaded archives]
3. ExtractMessagesFromHistoryArchive() [extracts messages from archive files]
4. handleArchiveMessages() [processes the extracted messages]
5. Separates messages by ThirdPartyID:
- Messages WITH ThirdPartyID → handleImportedMessages()
- Messages WITHOUT ThirdPartyID → handleRetrievedMessages()
```
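The pipeline above can be sketched as a chain of stub functions. The names follow the real ones, but the bodies here only record the call order; the real implementations live in `messenger_communities.go` and `messenger_handler.go`:

```go
package main

import "fmt"

// trace records the order in which the pipeline stages run.
var trace []string

// Stubs named after the real pipeline stages; the bodies are placeholders.
func downloadHistoryArchivesByMagnetlink() { trace = append(trace, "download") }
func extractMessagesFromHistoryArchive()   { trace = append(trace, "extract") }
func handleArchiveMessages()               { trace = append(trace, "separate") }

func importHistoryArchives() {
	extractMessagesFromHistoryArchive()
	handleArchiveMessages()
}

func downloadAndImportHistoryArchives() {
	downloadHistoryArchivesByMagnetlink() // fetch archive files via torrent
	importHistoryArchives()               // extract and route the contained messages
}

func main() {
	downloadAndImportHistoryArchives()
	fmt.Println(trace)
}
```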
### **3. The Key Insight: Different Message Types**
The "circular dependency" you noticed is actually **different message flows**:
**Flow A: Regular Messages → Archive Download**
1. `handleRetrievedMessages()` processes regular incoming messages from the network
2. One of these messages might be a `COMMUNITY_MESSAGE_ARCHIVE_MAGNETLINK` message
3. This triggers `downloadAndImportHistoryArchives()` to download new archives
**Flow B: Archive Processing → Message Handling**
1. `downloadAndImportHistoryArchives()` downloads archive files
2. `importHistoryArchives()` extracts messages from archives
3. `handleArchiveMessages()` processes these extracted historical messages
4. These messages are then handled by `handleImportedMessages()` and `handleRetrievedMessages()` again
### **4. The Separation Logic**
In `handleArchiveMessages()`, messages are separated based on `ThirdPartyID`:
```go
for filter, messages := range messagesToHandle {
for _, message := range messages {
if message.ThirdPartyID != "" {
importedMessages[filter] = append(importedMessages[filter], message) // External imports (Discord, etc.)
} else {
otherMessages[filter] = append(otherMessages[filter], message) // Regular Status messages
}
}
}
```
### **5. Why This Design Makes Sense**
1. **Regular message processing** (`handleRetrievedMessages`) can trigger **archive downloads** when magnetlink messages arrive
2. **Archive processing** can extract both **imported messages** (Discord, etc.) and **regular Status messages**
3. **Different handlers** are used based on message origin and type:
- `handleImportedMessages()`: Lightweight processing for external imports
- `handleRetrievedMessages()`: Full-featured processing for Status messages
### **6. Complete Flow Diagram**
```
Network Messages → handleRetrievedMessages()
[If MAGNETLINK message] → downloadAndImportHistoryArchives()
Download Archive Files → importHistoryArchives()
Extract Messages → handleArchiveMessages()
Separate by ThirdPartyID:
├─ External Messages → handleImportedMessages()
└─ Status Messages → handleRetrievedMessages() [different context]
```
So it's not circular - it's a **layered system** where:
- **Network messages** can trigger **archive downloads**
- **Archive processing** extracts **historical messages**
- **Historical messages** are processed through appropriate handlers
The same function names are used, but they're processing **different types of messages** in **different contexts**.
marcinczenko: ok, so the MAGNETLINK message can only come from the network I guess, and not from the archive, right? Thus, when `handleRetrievedMessages` is called in response to the MAGNETLINK message, it will process it differently (it will not call downloadAndImportHistoryArchives). Is this correct?
GitHub Copilot: You're absolutely correct! That's a very insightful observation. Let me confirm this by examining the logic:
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger_handler.go#1450-1450), lines 1450 to 1500
Now let me check the parameters passed to `handleRetrievedMessages` in different contexts:
Searched text for `handleRetrievedMessages` (`**/*.go`), 6 results
Perfect! Looking at the function signature and calls:
```go
func (m *Messenger) handleRetrievedMessages(chatWithMessages map[messagingtypes.ChatFilter][]*messagingtypes.ReceivedMessage, storeWakuMessages bool, fromArchive bool) (*MessengerResponse, error)
```
**Yes, you are absolutely correct!** Here's the key distinction:
### **1. Network Messages (including MAGNETLINK):**
```go
// From RetrieveAll() -> called for regular network message processing
return m.handleRetrievedMessages(chatWithMessages, true, false)
// ^^^^ ^^^^^
// storeWaku fromArchive=false
```
### **2. Archive Messages:**
```go
// From handleArchiveMessages() -> called when processing extracted archive messages
response, err := m.handleRetrievedMessages(otherMessages, false, true)
// ^^^^^ ^^^^
// storeWaku fromArchive=true
```
### **3. The Key Parameter: `fromArchive`**
When `fromArchive=true`, the `dispatchToHandler` call passes this parameter:
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger.go#3240-3240), lines 3240 to 3250
Now let me check if the generated handlers use this `fromArchive` parameter:
Read [](file:///home/mc2/code/status-im/status-go/cmd/generate_handlers/generate_handlers_template.txt#15-15), lines 15 to 45
Let me check the specific handler for MAGNETLINK messages:
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger_handlers.go#1000-1000), lines 1000 to 1020
I notice that the magnetlink handler doesn't have the `fromArchive` parameter. Let me check if there's a configuration that controls this:
Searched text for `CommunityMessageArchiveMagnetlink.*fromArchive|fromArchive.*CommunityMessageArchiveMagnetlink` (`**/*.go`), no results
Let me check what determines if a handler gets the `fromArchive` parameter by looking at the handler generation:
Searched text for `FromArchiveArg` (`**/cmd/generate_handlers/**`), 5 results
Perfect! Now I can see the complete picture:
Read [](file:///home/mc2/code/status-im/status-go/cmd/generate_handlers/generate_handlers.go#65-65), lines 65 to 75
## **You are 100% correct!** Here's why:
### **1. MAGNETLINK Messages and Archive Context**
**MAGNETLINK messages (`CommunityMessageArchiveMagnetlink`) can ONLY come from the network**, not from archives, because:
1. **Archive files don't contain MAGNETLINK messages** - they contain the actual chat messages, pin messages, etc. that were archived
2. **MAGNETLINK messages are instructions to download archives** - they wouldn't be stored inside the archives themselves (that would be recursive)
3. **Only specific message types can come from archives**: According to the handler generation code, only `ChatMessage` and `PinMessage` get the `fromArchive` parameter
### **2. Handler Behavior Differences**
```go
// Only these message types get fromArchive parameter:
info.FromArchiveArg = protobufName == "ChatMessage" || protobufName == "PinMessage"
```
This means:
- **`CommunityMessageArchiveMagnetlink` handler** does NOT get `fromArchive` parameter
- **`ChatMessage` and `PinMessage` handlers** DO get `fromArchive` parameter and can behave differently
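The generator's gating rule is small enough to capture as a predicate; this mirrors the condition quoted above from `cmd/generate_handlers`:

```go
package main

import "fmt"

// fromArchiveArg mirrors the condition in cmd/generate_handlers:
// only ChatMessage and PinMessage handlers receive the fromArchive parameter.
func fromArchiveArg(protobufName string) bool {
	return protobufName == "ChatMessage" || protobufName == "PinMessage"
}

func main() {
	for _, name := range []string{"ChatMessage", "PinMessage", "CommunityMessageArchiveMagnetlink"} {
		fmt.Println(name, fromArchiveArg(name))
	}
}
```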
### **3. The Safety Mechanism**
When `handleRetrievedMessages` is called in different contexts:
**A. Network Messages (fromArchive=false):**
- MAGNETLINK messages can trigger `downloadAndImportHistoryArchives()`
- This is the normal behavior for network-received magnetlinks
**B. Archive Messages (fromArchive=true):**
- MAGNETLINK messages **cannot exist** in archives (they're not archived)
- Even if they did, the handler doesn't have access to `fromArchive` to change behavior
- But `ChatMessage` and `PinMessage` handlers **do** get `fromArchive=true` and can behave accordingly
### **4. Logical Flow Protection**
```
Network Message Flow:
MAGNETLINK arrives via network
→ handleRetrievedMessages(fromArchive=false)
→ HandleCommunityMessageArchiveMagnetlink()
→ downloadAndImportHistoryArchives() ✓ ALLOWED
Archive Processing Flow:
Archive extracted messages
→ handleRetrievedMessages(fromArchive=true)
→ Only ChatMessage/PinMessage handlers get fromArchive=true
→ MAGNETLINK messages don't exist in archives anyway ✓ SAFE
```
So yes, your understanding is perfect! The system naturally prevents infinite loops because:
1. **MAGNETLINK messages only come from the network**
2. **Archive files only contain chat content messages**
3. **The `fromArchive` parameter** allows handlers to behave differently when processing historical vs. live messages
4. **Only relevant message types** get the `fromArchive` parameter
This is a well-designed system that prevents the circular dependency issue you were concerned about!
marcinczenko: Can you show me with more details the flow of receiving the magnet link before and how it is connected to `RetrieveAll()`?
GitHub Copilot: Absolutely! Let me trace the complete flow from the very beginning of how a magnetlink message arrives and gets processed through to `RetrieveAll()`. This will show you the entire message reception pipeline.
Searched codebase for "RetrieveAll message retrieval from network waku", 26 results
Perfect! Now I can show you the complete detailed flow of how a magnetlink message travels from the network to `RetrieveAll()`. Let me trace this step by step:
## **Complete Magnetlink Message Flow - From Network to `RetrieveAll()`**
### **1. Network Layer - Message Reception**
```
Waku Network
OnNewEnvelope() [waku/nwaku.go or waku/gowaku.go]
postEvent(envelope)
msgQueue <- envelope [buffered channel]
```
**Details:**
- Messages arrive from the Waku network via `OnNewEnvelope()`
- Messages are queued in `msgQueue` (buffered channel with `messageQueueLimit`)
- `processQueueLoop()` continuously processes this queue
Read [](file:///home/mc2/code/status-im/status-go/messaging/waku/nwaku.go#1180-1180), lines 1180 to 1200
### **2. Message Store Layer**
```
processMessage(envelope)
filter.Messages.Add(message) [MemoryMessageStore]
Messages stored in filter-specific stores
```
**Details:**
- `processMessage()` routes messages to appropriate filters based on topic
- Each filter has its own `MemoryMessageStore` where messages accumulate
- Messages wait here until retrieved by `RetrieveAll()`
Read [](file:///home/mc2/code/status-im/status-go/messaging/waku/nwaku.go#1200-1200), lines 1200 to 1215
Read [](file:///home/mc2/code/status-im/status-go/messaging/waku/nwaku.go#1215-1215), lines 1215 to 1235
### **3. Message Retrieval Trigger**
The `RetrieveAll()` is called either:
**A. Periodically via the Message Loop:**
```
StartRetrieveMessagesLoop() [called at startup]
ticker.C (every tick interval)
ProcessAllMessages()
RetrieveAll()
```
**B. Manually by the application:**
```
Direct call to RetrieveAll()
```
### **4. The RetrieveAll() Chain**
```go
// messenger.go
func (m *Messenger) RetrieveAll() (*MessengerResponse, error) {
chatWithMessages, err := m.messaging.RetrieveRawAll() // Step A
if err != nil {
return nil, err
}
return m.handleRetrievedMessages(chatWithMessages, true, false) // Step B
}
```
### **Step A: `m.messaging.RetrieveRawAll()`**
Read [](file:///home/mc2/code/status-im/status-go/messaging/api.go#150-150), lines 150 to 165
This calls into the transport layer:
Read [](file:///home/mc2/code/status-im/status-go/messaging/layers/transport/transport.go#210-210), lines 210 to 240
The key call here is `t.api.GetFilterMessages(filter.FilterID)`:
Read [](file:///home/mc2/code/status-im/status-go/messaging/waku/api.go#344-344), lines 344 to 359
**This is where the stored messages are retrieved!** `f.Retrieve()` calls the filter's message store:
Read [](file:///home/mc2/code/status-im/status-go/messaging/waku/common/message.go#140-140), lines 140 to 155
### **Step B: `handleRetrievedMessages()`**
Now we have the messages collected from all filters. Let's trace what happens next:
```go
// messenger.go - handleRetrievedMessages()
func (m *Messenger) handleRetrievedMessages(chatWithMessages map[messagingtypes.ChatFilter][]*messagingtypes.ReceivedMessage, storeWakuMessages bool, fromArchive bool) (*MessengerResponse, error) {
// Process each message through the handler pipeline
for iterator.HasNext() {
filter, messages := iterator.Next()
for _, shhMessage := range messages {
// Decode the raw message
handleMessagesResponse, err := m.messaging.HandleReceivedMessages(shhMessage)
for _, msg := range statusMessages {
// Route to appropriate handler
err := m.dispatchToHandler(messageState, msg.ApplicationLayer.Payload, msg, filter, fromArchive)
// If this is a MAGNETLINK message, it goes to:
// dispatchToHandler() → handleCommunityMessageArchiveMagnetlinkProtobuf()
// → HandleCommunityMessageArchiveMagnetlink() → HandleHistoryArchiveMagnetlinkMessage()
// → downloadAndImportHistoryArchives()
}
}
}
}
```
### **5. Complete Flow Diagram**
```
┌─────────────────────────────────────────────────────────────────┐
│ NETWORK LAYER │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ WAKU MESSAGE RECEPTION │
│ Network → OnNewEnvelope() → postEvent() → msgQueue │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ MESSAGE STORE LAYER │
│ processQueueLoop() → processMessage() → filter.Messages.Add() │
│ │
│ Messages stored in MemoryMessageStore per filter │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ RETRIEVAL TRIGGER │
│ │
│ Timer Loop: │ Manual: │
│ StartRetrieveMessagesLoop() │ Direct RetrieveAll() │
│ ↓ │ call │
│ ProcessAllMessages() │ │
│ ↓ │ │
│ RetrieveAll() ←─────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ MESSAGE COLLECTION │
│ RetrieveAll() → messaging.RetrieveRawAll() │
│ ↓ │
│ transport.RetrieveRawAll() │
│ ↓ │
│ For each filter: api.GetFilterMessages() │
│ ↓ │
│ filter.Retrieve() → store.Pop() │
│ │
│ Returns: map[ChatFilter][]*ReceivedMessage │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ MESSAGE PROCESSING │
│ handleRetrievedMessages(messages, true, false) │
│ ↓ │
│ For each message: messaging.HandleReceivedMessages() │
│ ↓ │
│ dispatchToHandler() → route by message type │
│ │
│ IF MAGNETLINK MESSAGE: │
│ → handleCommunityMessageArchiveMagnetlinkProtobuf() │
│ → HandleCommunityMessageArchiveMagnetlink() │
│ → HandleHistoryArchiveMagnetlinkMessage() │
│ → downloadAndImportHistoryArchives() [ASYNC GOROUTINE] │
└─────────────────────────────────────────────────────────────────┘
```
And here is also a link to a diagram in excalidraw: [https://link.excalidraw.com/readonly/vSon9uiUhYJWrwXiKAsi](https://link.excalidraw.com/readonly/vSon9uiUhYJWrwXiKAsi).
### **6. Key Points About the Flow**
1. **Asynchronous Message Queue**: Messages from network are queued and processed asynchronously
2. **Per-Filter Storage**: Each chat filter has its own message store
3. **Batch Retrieval**: `RetrieveAll()` collects messages from all filters at once
4. **Cache Deduplication**: Transport layer checks cache to avoid reprocessing
5. **Message Type Routing**: `dispatchToHandler()` routes messages to type-specific handlers
6. **MAGNETLINK Processing**: MAGNETLINK messages trigger async archive downloads
7. **Regular Polling**: The retrieve loop runs continuously to process new messages
### **7. Why This Design Works Well**
- **Network messages flow continuously** into filter stores
- **RetrieveAll() periodically collects** all accumulated messages
- **MAGNETLINK messages** in this flow trigger downloads of historical archives
- **Archive processing** uses a separate flow (`handleArchiveMessages`) that eventually calls `handleRetrievedMessages` again but with `fromArchive=true`
- **No infinite loops** because MAGNETLINK messages only come from network, not archives
This shows how the live message processing naturally triggers the archive download process when MAGNETLINK messages arrive from community owners!

---
related-to:
- "[[Team-NLBR Solution Proposal]]"
- "[[status-go publishing magnet links]]"
- "[[status-go processing magnet links]]"
- "[[status-go-codex integration - design notes]]"
- "[[Creating History Archives - InitHistoryArchiveTasks]]"
---
See also previous conversation: [[When are magnetlink messages sent]]
Great question! Let me trace where and how magnetlink messages are published to the Waku network. This is the other side of the equation - how the community owners/admins send these MAGNETLINK messages that we just traced receiving.
Searched codebase for "publish magnetlink CommunityMessageArchiveMagnetlink send", 20 results
Perfect! Now I can show you exactly where and how magnetlink messages are published to the Waku network. Let me trace this step by step:
## **Where Magnetlink Messages are Published to Waku**
### **1. The Key Function: `dispatchMagnetlinkMessage()`**
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger_communities.go#4079-4079), lines 4079 to 4109
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger_communities.go#4109-4109), lines 4109 to 4124
This is the core function! Let me also check what calls this function:
Searched text for `dispatchMagnetlinkMessage` (`**/*.go`), 4 results
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger_communities.go#250-250), lines 250 to 270
Let me check what this subscription handler is:
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger_communities.go#220-220), lines 220 to 250
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger_communities.go#200-200), lines 200 to 220
Now I can see the complete picture! Let me trace the flow and also check what triggers the seeding signal:
Searched text for `HistoryArchivesSeedingSignal` (`**/*.go`), 20 results
Let me check where the seeding signal is triggered:
Read [](file:///home/mc2/code/status-im/status-go/protocol/communities/manager_archive.go#430-430), lines 430 to 440
Read [](file:///home/mc2/code/status-im/status-go/protocol/communities/manager_archive.go#420-420), lines 420 to 435
Read [](file:///home/mc2/code/status-im/status-go/protocol/communities/manager_archive.go#400-400), lines 400 to 420
Now let me check where `SeedHistoryArchiveTorrent` is called:
Searched text for `SeedHistoryArchiveTorrent` (`**/*.go`), 20 results
Perfect! Now I have the complete picture. Let me also check where the seeding happens from archive creation:
Read [](file:///home/mc2/code/status-im/status-go/protocol/communities/manager_archive.go#310-310), lines 310 to 325
## **Complete Magnetlink Publication Flow**
Here's the complete flow of where and how magnetlink messages are published to Waku:
### **1. Archive Creation and Seeding Triggers**
**A. Periodic Archive Creation (Community Owner):**
```
StartHistoryArchiveTasksInterval() [periodic scheduler]
CreateAndSeedHistoryArchive()
CreateHistoryArchiveTorrentFromDB() [creates .torrent files]
SeedHistoryArchiveTorrent() [starts BitTorrent seeding]
publisher.publish(HistoryArchivesSeedingSignal)
```
**B. Manual Archive Creation:**
```
Discord Import or other triggers
SeedHistoryArchiveTorrent() [directly]
publisher.publish(HistoryArchivesSeedingSignal)
```
**C. Archive Download Completion:**
```
DownloadHistoryArchivesByMagnetlink() [when downloading completes]
publisher.publish(HistoryArchivesSeedingSignal) [line 642]
```
### **2. The Publishing Pipeline**
```
SeedHistoryArchiveTorrent()
publisher.publish(HistoryArchivesSeedingSignal) [internal event]
handleCommunitiesHistoryArchivesSubscription() [event handler]
IF community.IsControlNode():
dispatchMagnetlinkMessage(communityID)
Create CommunityMessageArchiveMagnetlink protobuf
messaging.SendPublic(context, chatID, rawMessage)
[WAKU NETWORK]
```
### **3. Key Publication Details**
**The Message Structure:**
```go
magnetLinkMessage := &protobuf.CommunityMessageArchiveMagnetlink{
Clock: m.getTimesource().GetCurrentTime(),
MagnetUri: magnetlink, // BitTorrent magnetlink
}
rawMessage := messagingtypes.RawMessage{
LocalChatID: chatID, // community.MagnetlinkMessageChannelID()
Sender: community.PrivateKey(), // Signed by community owner
Payload: encodedMessage,
MessageType: protobuf.ApplicationMetadataMessage_COMMUNITY_MESSAGE_ARCHIVE_MAGNETLINK,
SkipGroupMessageWrap: true, // Direct message, no encryption wrapper
PubsubTopic: community.PubsubTopic(), // Community's pubsub topic
Priority: &messagingtypes.LowPriority,
}
```
**Where it's Published:**
- **Topic**: Community's pubsub topic (`community.PubsubTopic()`)
- **Chat ID**: `community.MagnetlinkMessageChannelID()` (special channel for magnetlinks)
- **Sender**: Community owner's private key
- **Network**: Public Waku network via `messaging.SendPublic()`
### **4. Who Can Publish Magnetlinks?**
Only **community control nodes** (owners) can publish magnetlink messages:
```go
if c.IsControlNode() {
err := m.dispatchMagnetlinkMessage(sub.HistoryArchivesSeedingSignal.CommunityID)
}
```
### **5. When are Magnetlinks Published?**
1. **After creating new archives** (periodic archive creation)
2. **After seeding existing archives** (on startup/reboot)
3. **After downloading and re-seeding archives** (when community owners download from other peers)
4. **After Discord imports** (when historical data is imported)
### **6. The Network Flow**
```
Community Owner Node:
Archive Creation → Seeding → HistoryArchivesSeedingSignal
dispatchMagnetlinkMessage()
messaging.SendPublic() → [WAKU NETWORK]
[All Community Members]
RetrieveAll() receives messages
handleRetrievedMessages()
HandleHistoryArchiveMagnetlinkMessage()
downloadAndImportHistoryArchives()
```
### **7. Complete Publication Locations Summary**
**Primary Publication Point:**
- `dispatchMagnetlinkMessage()` in messenger_communities.go
- Called when `HistoryArchivesSeedingSignal` is received
- Only for community control nodes
- Publishes to community's pubsub topic
**Network Layer:**
- `messaging.SendPublic()` → `sendCommunity()` → `dispatchCommunityMessage()` → `transport.SendCommunityMessage()`
- Eventually reaches Waku network via `transport.api.Post()`
**The magnetlink messages flow from community owners to all community members via the Waku network**, triggering automatic archive downloads when members receive them through their normal message processing loop!

---
related-to:
- "[[Team-NLBR Solution Proposal]]"
- "[[status-go publishing magnet links]]"
- "[[status-go processing magnet links]]"
- "[[status-go-codex integration - design notes]]"
- "[[Creating History Archives - InitHistoryArchiveTasks]]"
- "[[testing codex-status-go integration]]"
---
The notes in this document record our path towards integrating Codex with status-go. It is thus useful to capture the dilemmas we faced and some of our initial assumptions. The references are the spec document and the code.
## Codex for History Archives
As indicated in the [[Team-NLBR Solution Proposal]], the central entry point to the history management is [InitHistoryArchiveTasks](https://github.com/status-im/status-go/blob/6322f22783585474803cfc8a6f0a914757d763b5/protocol/messenger_communities.go#L3783). `InitHistoryArchiveTasks` is called from **two main places**:
- During `Messenger.Start()` (startup)
- When enabling archive protocol
In [[Creating History Archives - InitHistoryArchiveTasks]] we find the complete initialization flow:
```
System Startup
Messenger.Start()
Wait for Store Node Availability
InitHistoryArchiveTasks(controlledCommunities)
├─ For each community owner controls:
│ ├─ Check if archive support enabled
│ ├─ Seed existing torrents (if available)
| ├─ CreateAndSeedHistoryArchive
│ ├─ Get community topics and sync missed messages
│ ├─ Check when last archive was created
│ └─ Based on last archive timing:
│ ├─ No archives → StartHistoryArchiveTasksInterval() immediately
│ ├─ Recent archive → Seed + delayed CreateAndSeedHistoryArchive followed by StartHistoryArchiveTasksInterval()
│ └─ Old archive → Create new archive + CreateAndSeedHistoryArchive + StartHistoryArchiveTasksInterval()
└─ Each StartHistoryArchiveTasksInterval():
├─ Runs as background goroutine
├─ Creates ticker with 7-day interval
├─ Every 7 days: CreateAndSeedHistoryArchive()
├─ After seeding: publishes HistoryArchivesSeedingSignal
├─ Signal triggers: dispatchMagnetlinkMessage()
└─ Magnetlink sent to all community members via Waku
```
We will go through this flow step by step and apply our changes, noting where we need to diverge from the original implementation.
### BitTorrent - with or without
In the first pass we do not delete the BitTorrent-related code, but rather add Codex extensions next to it. This way it should be easier to move things around without being too destructive from the beginning.
### Seed existing torrents (if available)
This step is only needed for torrents. Codex has its own persistence and will start seeding immediately after it starts.
### CreateAndSeedHistoryArchive
The first function that asks for attention is `CreateAndSeedHistoryArchive`. It is part of the `ArchiveService` interface.
```go
func (m *ArchiveManager) CreateAndSeedHistoryArchive(communityID types.HexBytes, topics []messagingtypes.ContentTopic, startDate time.Time, endDate time.Time, partition time.Duration, encrypt bool) error {
m.UnseedHistoryArchiveTorrent(communityID)
_, err := m.ArchiveFileManager.CreateHistoryArchiveTorrentFromDB(communityID, topics, startDate, endDate, partition, encrypt)
if err != nil {
return err
}
return m.SeedHistoryArchiveTorrent(communityID)
}
```
It calls `CreateHistoryArchiveTorrentFromDB`, which then calls `createHistoryArchiveTorrent`:
```go
func (m *ArchiveFileManager) CreateHistoryArchiveTorrentFromDB(communityID types.HexBytes, topics []messagingtypes.ContentTopic, startDate time.Time, endDate time.Time, partition time.Duration, encrypt bool) ([]string, error) {
return m.createHistoryArchiveTorrent(communityID, make([]*messagingtypes.ReceivedMessage, 0), topics, startDate, endDate, partition, encrypt)
}
```
`createHistoryArchiveTorrent` (`ArchiveFileManager`) is where the work is done.
#### Protobuf messages
Here we list all the Protobuf messages that are relevant to message archives:
```protobuf
message CommunityMessageArchiveMagnetlink {
uint64 clock = 1;
string magnet_uri = 2;
}
message WakuMessage {
bytes sig = 1;
uint64 timestamp = 2;
bytes topic = 3;
bytes payload = 4;
bytes padding = 5;
bytes hash = 6;
string thirdPartyId = 7;
}
message WakuMessageArchiveMetadata {
uint32 version = 1;
uint64 from = 2;
uint64 to = 3;
repeated bytes contentTopic = 4;
}
message WakuMessageArchive {
uint32 version = 1;
WakuMessageArchiveMetadata metadata = 2;
repeated WakuMessage messages = 3;
}
message WakuMessageArchiveIndexMetadata {
uint32 version = 1;
WakuMessageArchiveMetadata metadata = 2;
uint64 offset = 3;
uint64 size = 4;
uint64 padding = 5;
}
message WakuMessageArchiveIndex {
map<string, WakuMessageArchiveIndexMetadata> archives = 1;
}
```
All of these live in `protocol/protobuf/communities.proto`. There is one more message, not directly related, that for some reason contains a `magnet_uri` field (to be checked later):
```protobuf
message CommunityRequestToJoinResponse {
uint64 clock = 1;
CommunityDescription community = 2 [deprecated = true];
bool accepted = 3;
bytes grant = 4;
bytes community_id = 5;
string magnet_uri = 6;
bytes protected_topic_private_key = 7;
Shard shard = 8;
// CommunityDescription protocol message with owner signature
bytes community_description_protocol_message = 9;
}
```
We see that most are independent from BitTorrent. The ones that are BitTorrent specific are:
- `CommunityMessageArchiveMagnetlink`
- `WakuMessageArchiveIndexMetadata`
- `WakuMessageArchiveIndex` (because it depends on `WakuMessageArchiveIndexMetadata`)
- `CommunityRequestToJoinResponse` (because of the `magnet_uri` field)
Now, starting with something simple (in the end we are building a PoC here): the Codex API operates on CIDs encoded as `base58btc` strings. In `WakuMessageArchiveIndexMetadata`, the `offset`, `size`, and `padding` fields are only relevant to the current BitTorrent-based implementation. For Codex we can use something simpler:
```protobuf
message CodexWakuMessageArchiveIndexMetadata {
uint32 version = 1;
WakuMessageArchiveMetadata metadata = 2;
string cid = 3;
}
message CodexWakuMessageArchiveIndex {
map<string, CodexWakuMessageArchiveIndexMetadata> archives = 1;
}
```
#### Appending the index file
> The final implementation proposal does not use files directly and delegates persistence fully to Codex. We also do not call Codex via its Rest API, but instead we use the Codex library (libcodex).
In a more production-ready version we would not operate on the local file system, but here, for simplicity, we use a physical index file and a separate file for each archive. Consequently, in the initial implementation a community owner does not query Codex for the current index file. If we wanted to, we could use the `http://localhost:8001/api/codex/v1/data/${CID}` API, which returns `404` when the file does not exist in the local store:
```bash
curl -s -D - -o /dev/null "http://localhost:8001/api/codex/v1/data/${CID}"
HTTP/1.1 404 Not Found
Connection: close
Server: nim-presto/0.0.3 (amd64/linux)
Content-Length: 74
Date: Thu, 25 Sep 2025 02:15:07 GMT
Content-Type: text/html; charset=utf-8
```
Instead, for this initial implementation, we will just read it from a local directory. For now we reuse the BitTorrent configuration. The BitTorrent config stores the index file under:
```go
path.Join(m.torrentConfig.DataDir, communityID, "index")
```
For Codex, we will store it under:
```go
path.Join(m.torrentConfig.DataDir, "codex", communityID, "index")
```
Similarly, for the individual archive to be uploaded we will use:
```go
path.Join(m.torrentConfig.DataDir, "codex", communityID, "data")
```
This data file is temporary and will be overwritten for each new archive created. With Codex we do not have to append, so we do not need the previous data file anymore. We just use a file for now because it is easier to start this way.
Now, just for convenience, let's recall the original data structures involved:
![[team-nl-br-design-1.svg]]
The data structures used with BitTorrent are:
```go
wakuMessageArchiveIndexProto := &protobuf.WakuMessageArchiveIndex{}
wakuMessageArchiveIndex := make(map[string]*protobuf.WakuMessageArchiveIndexMetadata)
```
The original BitTorrent index, stored in `wakuMessageArchiveIndexProto`, is initially populated using the `LoadHistoryArchiveIndexFromFile` function. After that, `wakuMessageArchiveIndex` is used as temporary storage so that we can conveniently extend it with new entries and serialize it to protobuf afterwards. We use the contents of `wakuMessageArchiveIndexProto` to set it up:
```go
for hash, metadata := range wakuMessageArchiveIndexProto.Archives {
offset = offset + metadata.Size
wakuMessageArchiveIndex[hash] = metadata
}
```
For the Codex extension we proceed in an analogous way:
![[replacing-bittorrent-with-codex-in-status-go-1.svg]]
![[replacing bittorrent with codex in status-go-2.svg]]
```go
codexWakuMessageArchiveIndexProto := &protobuf.CodexWakuMessageArchiveIndex{}
codexWakuMessageArchiveIndex := make(map[string]*protobuf.CodexWakuMessageArchiveIndexMetadata)
```
and then:
```go
for hash, metadata := range codexWakuMessageArchiveIndexProto.Archives {
codexWakuMessageArchiveIndex[hash] = metadata
}
```
Having those variables in place and initialized correctly, we enter the loop and start creating archives one by one.
Basically, we proceed in the same way as with BitTorrent - the `WakuMessageArchive` type does not change.
At some point, we arrive at:
```go
wakuMessageArchiveIndexMetadata := &protobuf.WakuMessageArchiveIndexMetadata{
Metadata: wakuMessageArchive.Metadata,
Offset: offset,
Size: uint64(size),
Padding: uint64(padding),
}
```
For the Codex extension we no longer have `offset`, `size`, and `padding`, as this is something Codex takes care of. But this is the moment we need to call into Codex to upload the archive and get the corresponding CID back, so that we can properly initialize the corresponding index entry:
```go
client := NewCodexClient("localhost", "8080") // make this configurable
cid, err := client.UploadArchive(encodedArchive)
if err != nil {
m.logger.Error("failed to upload to codex", zap.Error(err))
return codexArchiveIDs, err
}
m.logger.Debug("uploaded to codex", zap.String("cid", cid))
codexWakuMessageArchiveIndexMetadata := &protobuf.CodexWakuMessageArchiveIndexMetadata{
Metadata: wakuMessageArchive.Metadata,
Cid: cid,
}
codexWakuMessageArchiveIndexMetadataBytes, err := proto.Marshal(codexWakuMessageArchiveIndexMetadata)
if err != nil {
return codexArchiveIDs, err
}
codexArchiveID := crypto.Keccak256Hash(codexWakuMessageArchiveIndexMetadataBytes).String()
codexArchiveIDs = append(codexArchiveIDs, codexArchiveID)
codexWakuMessageArchiveIndex[codexArchiveID] = codexWakuMessageArchiveIndexMetadata
```
where `CodexClient` is a helper that encapsulates uploading arbitrary data to a Codex node via the `/api/codex/v1/data` API. The corresponding `curl` call would be similar to:
```bash
curl -X POST \
http://localhost:${PORT}/api/codex/v1/data \
-H 'Content-Type: application/octet-stream' \
-H 'Content-Disposition: filename="archive-data.bin"' \
-w '\n' \
-T archive-data.bin
zDvZRwzm22eSYNdLBuNHVi7jSTR2a4n48yy4Ur9qws4vHV6madiz
```
At this stage we have an individual archive uploaded to Codex (it should be safe there now). It is already being advertised, but nobody is looking for it yet, as we have not yet finished building the Codex-aware index file, which contains the CIDs for all the archives.
---
related-to:
- "[[status-go-codex integration - design notes]]"
---
In [[Running Unit Tests for status-go]] we provide general notes on running unit tests in the status-go project. And then we have a similar note about functional tests in [[Running functional tests in status-go]].
Also, to learn the history archive creation/upload and download/processing flows (recorded AI conversation with some edits), please check:
- archive creation/upload: [[When are magnetlink messages sent]]
- archive download/processing: [[status-go processing magnet links]]. In this document I am including a link to an excalidraw diagram that can be helpful - for convenience: [https://link.excalidraw.com/readonly/vSon9uiUhYJWrwXiKAsi](https://link.excalidraw.com/readonly/vSon9uiUhYJWrwXiKAsi)
To grasp the concept of topics and filters - check [[Filters, Topics, Channels, and Chat IDs in status-go and waku]].
In this document we focus on our Codex extension to status-go, and in particular on the related unit and integration tests.
> Notice that this is still a proof-of-concept and may deserve more exhaustive testing if it is decided to use it in production.
The most important tests we added:
1. [CodexClient](https://github.com/status-im/status-go/blob/feat/status-go-codex-integration/protocol/communities/codex_client.go) - the proxy to the Codex library `libcodex`:
- [protocol/communities/codex_client_test.go](https://github.com/status-im/status-go/blob/feat/status-go-codex-integration/protocol/communities/codex_client_test.go)
2. [CodexIndexDownloader](https://github.com/status-im/status-go/blob/feat/status-go-codex-integration/protocol/communities/codex_index_downloader.go):
- [protocol/communities/codex_index_downloader_test.go](https://github.com/status-im/status-go/blob/feat/status-go-codex-integration/protocol/communities/codex_index_downloader_test.go)
3. [CodexArchiveDownloader](https://github.com/status-im/status-go/blob/feat/status-go-codex-integration/protocol/communities/codex_archive_downloader.go):
- [protocol/communities/codex_archive_downloader_test.go](https://github.com/status-im/status-go/blob/feat/status-go-codex-integration/protocol/communities/codex_archive_downloader_test.go)
4. More end-to-end developer tests in [protocol/communities_messenger_token_permissions_test.go](https://github.com/status-im/status-go/blob/feat/status-go-codex-integration/protocol/communities_messenger_token_permissions_test.go):
- [TestUploadDownloadCodexHistoryArchives_withSharedCodexClient](https://github.com/status-im/status-go/blob/377d5c0df5e80e6543d2de816fbb39e684216b73/protocol/communities_messenger_token_permissions_test.go#L2386)
- [TestUploadDownloadCodexHistoryArchives](https://github.com/status-im/status-go/blob/377d5c0df5e80e6543d2de816fbb39e684216b73/protocol/communities_messenger_token_permissions_test.go#L2871)
5. Functional tests in [tests-functional/tests/test_wakuext_community_archives.py](https://github.com/status-im/status-go/blob/feat/status-go-codex-integration/tests-functional/tests/test_wakuext_community_archives.py)
In the remainder of the document we give some handy information on how to run those tests.
### Regenerating artifacts
There are two artifacts that need to be updated:
- the protobuf
- the mocks
For the first one - protobuf - you need two components:
1. **`protoc`** - the Protocol Buffer compiler itself
2. **`protoc-gen-go`** - the Go plugin for protoc that generates `.pb.go` files
#### Installing protoc
I have followed the instructions from [Protocol Buffer Compiler Installation](https://protobuf.dev/installation/).
The following bash script (Arch Linux) can come in handy:
```bash
#!/usr/bin/env bash
set -euo pipefail
echo "installing go..."
sudo pacman -S --noconfirm --needed go
echo "installing go protoc compiler"
PB_REL="https://github.com/protocolbuffers/protobuf/releases"
VERSION="32.1"
FILE="protoc-${VERSION}-linux-x86_64.zip"
# 1. create a temp dir
TMP_DIR="$(mktemp -d)"
# ensure cleanup on exit
trap 'rm -rf "$TMP_DIR"' EXIT
echo "Created temp dir: $TMP_DIR"
# 2. download file into temp dir
curl -L -o "$TMP_DIR/$FILE" "$PB_REL/download/v$VERSION/$FILE"
# 3. unzip into ~/.local/share/go
mkdir -p "$HOME/.local/share/go"
unzip -o "$TMP_DIR/$FILE" -d "$HOME/.local/share/go"
# 4. cleanup handled automatically by trap
echo "protoc $VERSION installed into $HOME/.local/share/go"
```
After that, make sure that `$HOME/.local/share/go/bin` is in your `PATH`; you should then get:
```bash
protoc --version
libprotoc 32.1
```
#### Installing protoc-gen-go
The `protoc-gen-go` plugin is required to generate Go code from `.proto` files.
Install it with:
```bash
go install google.golang.org/protobuf/cmd/protoc-gen-go@v1.34.1
```
Make sure `$(go env GOPATH)/bin` is in your `$PATH` so protoc can find the plugin.
Verify the installation:
```bash
which protoc-gen-go
protoc-gen-go --version
# Should output: protoc-gen-go v1.34.1
```
#### Installing mockgen
In order to regenerate mocks you will need `mockgen`.
You can install it with:
```bash
go install go.uber.org/mock/mockgen
```
> Also make sure you have `$(go env GOPATH)/bin` in your `PATH`. If it is not, add something like `export PATH="$PATH:$(go env GOPATH)/bin"` to your `~/.bashrc` (adjusted to your shell and OS). This should be part of your standard Go installation.
If everything works well, you should see something like:
```bash
which mockgen && mockgen -version
/home/<your-user-name>/go/bin/mockgen
v0.6.0
```
If everything seems to be under control, we can now proceed with actual generation.
The easiest way is to regenerate all in one go:
```bash
go generate ./...
```
If you just need to regenerate the mocks:
```bash
go generate ./protocol/communities
```
If you just need to regenerate the protobuf:
```bash
go generate ./protobuf
```
> If you run `make`, e.g. `make statusgo-library`, the correct `generate` commands for the protobuf will be run for you. So in practice, you may not need to run `go generate ./protobuf` manually yourself - but for reference, why not... let's break something ;).
### Environment variables to run unit tests
After the Codex library (`libcodex`) was added to the project, the build system depends on it. Thus, to run the tests directly using the `go test` or `gotestsum` commands, you need to make sure that the following environment variables are set:
```bash
export LIBS_DIR="$(realpath ./libs)"
export CGO_CFLAGS=-I$LIBS_DIR
export CGO_LDFLAGS="-L$LIBS_DIR -lcodex -Wl,-rpath,$LIBS_DIR"
```
Additionally, since status-go added the SDS library, there are even more environment variables that need to be set up. We have a handy script to do just that:
```bash
#!/usr/bin/env bash
# Source this file to set up environment variables for running tests with gotestsum
# Usage: source ./set-test-env.sh
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
export LIBS_DIR="$(realpath "$SCRIPT_DIR/libs")"
export NIM_SDS_LIB_DIR="$(realpath "$SCRIPT_DIR/../nim-sds/build")"
export NIM_SDS_INC_DIR="$(realpath "$SCRIPT_DIR/../nim-sds/library")"
export CGO_CFLAGS="-I$LIBS_DIR -I$NIM_SDS_INC_DIR"
export CGO_LDFLAGS="-L$LIBS_DIR -lcodex -Wl,-rpath,$LIBS_DIR -L$NIM_SDS_LIB_DIR -lsds"
# Detect OS and set library path accordingly
if [[ "$OSTYPE" == "darwin"* ]]; then
export DYLD_LIBRARY_PATH="$LIBS_DIR:$NIM_SDS_LIB_DIR:$DYLD_LIBRARY_PATH"
echo "Environment configured for macOS"
else
export LD_LIBRARY_PATH="$LIBS_DIR:$NIM_SDS_LIB_DIR:$LD_LIBRARY_PATH"
echo "Environment configured for Linux"
fi
echo "Test environment variables set:"
echo " LIBS_DIR=$LIBS_DIR"
echo " NIM_SDS_LIB_DIR=$NIM_SDS_LIB_DIR"
echo " NIM_SDS_INC_DIR=$NIM_SDS_INC_DIR"
echo ""
echo "You can now run tests with gotestsum, for example:"
echo ' gotestsum --packages="./protocol/communities" -f testname -- -count 1 -tags "gowaku_no_rln gowaku_skip_migrations" -run CodexArchiveManagerSuite'
```
Thus, just run:
```bash
source ./set-test-env.sh
```
in the top level directory and you should be good to go.
### Running developer-facing tests for Codex abstractions
**TL;DR**
Install `gotestsum`: `go install gotest.tools/gotestsum@v1.13.0`
Then to run the `./protocol/communities` tests:
```bash
gotestsum --packages="./protocol/communities" -f testname --rerun-fails -- -count 1 -timeout "5m" -tags "gowaku_no_rln gowaku_skip_migrations"
```
To run all `./protocol` tests:
```bash
gotestsum --packages="./protocol" -f testname --rerun-fails -- -count 1 -timeout "45m" -tags "gowaku_no_rln gowaku_skip_migrations"
```
Of course, you can also just run:
```bash
make test
```
I found running tests using `gotestsum` more reliable and more appropriate for development.
To be more selective, e.g. in order to run all the tests from
`CodexArchiveDownloaderSuite`, run:
```bash
gotestsum --packages="./protocol/communities" -f testname --rerun-fails -- -run CodexArchiveDownloader -count 1
```
or for an individual test from that suite:
```bash
gotestsum --packages="./protocol/communities" -f testname --rerun-fails -- -run CodexArchiveDownloaderSuite/TestCancellationDuringPolling -count 1
```
Notice that the `-run` flag accepts a regular expression that matches against the full test path, so you can be more concise in naming if necessary, e.g.:
```bash
gotestsum --packages="./protocol/communities" -f testname --rerun-fails -- -run CodexArchiveDownloader/Cancellation -count 1
```
This also applies to the native `go test` command, e.g.:
```bash
go test -v ./protocol/communities -count 1 -run CodexArchiveDownloaderSuite/TestCancellationDuringPolling
```
For more verbose output, including logs, use `-f standard-verbose`, e.g.:
```bash
gotestsum --packages="./protocol/communities" -f standard-verbose --rerun-fails -- -v -count 1 -run CodexArchiveDownloaderSuite/TestCancellationDuringPolling
```
Finally, to run the relevant Codex integration tests from the `./protocol` module:
```bash
gotestsum --packages="./protocol" -f testname -- -count 1 -tags "gowaku_no_rln gowaku_skip_migrations" -run TestMessengerCommunitiesTokenPermissionsSuite/Codex
PASS protocol.TestMessengerCommunitiesTokenPermissionsSuite/TestUploadDownloadCodexHistoryArchives (5.44s)
PASS protocol.TestMessengerCommunitiesTokenPermissionsSuite/TestUploadDownloadCodexHistoryArchives_withSharedCodexClient (5.50s)
PASS protocol.TestMessengerCommunitiesTokenPermissionsSuite (10.94s)
PASS protocol
DONE 3 tests in 10.968s
```
### Running functional tests
Here we use our convenience scripts. First we build the Docker image by running:
```bash
_assets/scripts/build_status_go_docker.sh
```
> We run all the commands from the top-level project directory
And then:
```bash
./_assets/scripts/run_functional_tests_dev.sh TestCommunityArchives
Using existing virtual environment
Installing dependencies
Discovering tests to be run...
Found 5 tests matching: TestCommunityArchives
Tests to execute:
1) test_community_archive_index_exists
2) test_community_archive_exists_for_default_chat
3) test_archive_is_not_created_without_messages
4) test_different_archives_are_created_with_multiple_messages
5) test_archive_is_downloaded_after_logout_login
Continue with execution? (y/n): y
...
tests-functional/tests/test_wakuext_community_archives.py::TestCommunityArchives::test_community_archive_exists_for_default_chat
tests-functional/tests/test_wakuext_community_archives.py::TestCommunityArchives::test_community_archive_index_exists
tests-functional/tests/test_wakuext_community_archives.py::TestCommunityArchives::test_archive_is_downloaded_after_logout_login
tests-functional/tests/test_wakuext_community_archives.py::TestCommunityArchives::test_different_archives_are_created_with_multiple_messages
tests-functional/tests/test_wakuext_community_archives.py::TestCommunityArchives::test_archive_is_not_created_without_messages
[gw1] [ 20%] PASSED tests-functional/tests/test_wakuext_community_archives.py::TestCommunityArchives::test_community_archive_exists_for_default_chat
[gw2] [ 40%] PASSED tests-functional/tests/test_wakuext_community_archives.py::TestCommunityArchives::test_archive_is_not_created_without_messages
[gw3] [ 60%] PASSED tests-functional/tests/test_wakuext_community_archives.py::TestCommunityArchives::test_different_archives_are_created_with_multiple_messages
[gw4] [ 80%] PASSED tests-functional/tests/test_wakuext_community_archives.py::TestCommunityArchives::test_archive_is_downloaded_after_logout_login
[gw0] [100%] PASSED tests-functional/tests/test_wakuext_community_archives.py::TestCommunityArchives::test_community_archive_index_exists
======================================================================== 5 passed in 141.69s (0:02:21) =========================================================================
Testing finished
Running cleanup...
```
You can also run a single test as follows:
```bash
./_assets/scripts/run_functional_tests_dev.sh test_community_archive_index_exists
```
The logs are in `tests-functional/logs`.
## More notes
For the Codex-related tests in `protocol/communities_messenger_token_permissions_test.go`, we add some more context below.
Our two "Codex"-related tests are based on `TestImportDecryptedArchiveMessages` test.
This test produces lots of output - with lots of warnings and errors - so judging success from the log alone would be a challenge. Yet, the test passes:
```bash
gotestsum --packages="./protocol" -f testname -- -run "TestMessengerCommunitiesTokenPermissionsSuite/TestImportDecryptedArchiveMessages" -count 1 -tags "gowaku_no_rln gowaku_skip_migrations"
PASS protocol.TestMessengerCommunitiesTokenPermissionsSuite/TestImportDecryptedArchiveMessages (1.88s)
PASS protocol.TestMessengerCommunitiesTokenPermissionsSuite (1.88s)
PASS protocol
DONE 2 tests in 1.900s
```
If you want to take a look at the logs you can use the more verbose version of the above command:
```bash
gotestsum --packages="./protocol" -f standard-verbose -- -run "TestMessengerCommunitiesTokenPermissionsSuite/TestImportDecryptedArchiveMessages" -v -count 1 -tags "gowaku_no_rln gowaku_skip_migrations"
```
and you can use `tee` to redirect all the output to a file:
```bash
gotestsum --packages="./protocol" -f standard-verbose -- -run "TestMessengerCommunitiesTokenPermissionsSuite/TestImportDecryptedArchiveMessages" -v -count 1 -tags "gowaku_no_rln gowaku_skip_migrations" | tee "test_1.log"
```
The test first creates a community and sets up the corresponding permissions. Then the community owner sends a message to the community and immediately retrieves it, so that it is recorded in the DB.
After that it prepares the archive parameters: `startDate`, `endDate`, `partition`, and the community `topics`. All of those will be passed to `CreateHistoryArchiveTorrentFromDB` - our entry point for creating the history archive torrent.
```go
// 1.1. Create community
community, chat := s.createCommunity()
// ...
// 1.2. Setup permissions
// ...
// 2. Owner: Send a message A
messageText1 := RandomLettersString(10)
message1 := s.sendChatMessage(s.owner, chat.ID, messageText1)
// 2.2. Retrieve own message (to make it stored in the archive later)
_, err = s.owner.RetrieveAll()
s.Require().NoError(err)
// 3. Owner: Create community archive
const partition = 2 * time.Minute
messageDate := time.UnixMilli(int64(message1.Timestamp))
startDate := messageDate.Add(-time.Minute)
endDate := messageDate.Add(time.Minute)
topic := messagingtypes.BytesToContentTopic(messaging.ToContentTopic(chat.ID))
communityCommonTopic := messagingtypes.BytesToContentTopic(messaging.ToContentTopic(community.UniversalChatID()))
topics := []messagingtypes.ContentTopic{topic, communityCommonTopic}
torrentConfig := params.TorrentConfig{
Enabled: true,
DataDir: os.TempDir() + "/archivedata",
TorrentDir: os.TempDir() + "/torrents",
Port: 0,
}
// Share archive directory between all users
s.owner.archiveManager.SetTorrentConfig(&torrentConfig)
s.bob.archiveManager.SetTorrentConfig(&torrentConfig)
s.owner.config.messengerSignalsHandler = &MessengerSignalsHandlerMock{}
s.bob.config.messengerSignalsHandler = &MessengerSignalsHandlerMock{}
```
Finally we call the `CreateHistoryArchiveTorrentFromDB`:
```go
archiveIDs, err := s.owner.archiveManager.CreateHistoryArchiveTorrentFromDB(community.ID(), topics, startDate, endDate, partition, community.Encrypted())
s.Require().NoError(err)
s.Require().Len(archiveIDs, 1)
```
Notice, there is one archive expected.
`CreateHistoryArchiveTorrentFromDB` is called directly here, bypassing the torrent seeding: in the normal flow, `CreateHistoryArchiveTorrentFromDB` is called from `CreateAndSeedHistoryArchive`, which, immediately after creating the archive, calls `SeedHistoryArchiveTorrent`. `CreateHistoryArchiveTorrentFromDB` calls `createHistoryArchiveTorrent`, which is central to archive creation.
The "Codex" version of the `CreateHistoryArchiveTorrentFromDB` is `CreateHistoryArchiveCodexFromDB` which will call `createHistoryArchiveCodex` - this is where archives are created and uploaded to Codex.
Another function that this test "touches" is `LoadHistoryArchiveIndexFromFile`, for which the "Codex" version is `CodexLoadHistoryArchiveIndex`. Notice that we do not need to load it from a *file* anymore - Codex takes care of that.
Thus, this test covers an important slice of the whole end-to-end flow and, in our case, where we use Codex, it also indirectly tests seeding and retrieving the archives from the network.
### Other places we can consider testing
The integration test described above does not cover the actual publishing of the generated archives over the Waku channel. This normally happens in `CreateAndSeedHistoryArchive`:
```
CreateAndSeedHistoryArchive
|- if distributionPreference == Torrent
|- CreateHistoryArchiveTorrentFromDB
|- SeedHistoryArchiveTorrent
|- if distributionPreference == Codex
|- CreateHistoryArchiveCodexFromDB
  |- SeedHistoryArchiveIndexCid (only in some cases: Codex takes care of seeding)
|- publish: HistoryArchiveSeedingSignal
```
Depending on the *distribution preference* (Codex or Torrent), we seed the index file and the archives either for torrent or for Codex, and when at least one publishing method succeeds, we do:
```go
m.publisher.publish(&Subscription{
HistoryArchivesSeedingSignal: &signal.HistoryArchivesSeedingSignal{
CommunityID: communityID.String(),
MagnetLink: archiveTorrentCreatedSuccessfully, // true if torrent created successfully
IndexCid: archiveCodexCreatedSuccessfully, // true if codex created successfully
},
})
```
This signal is subsequently received in `handleCommunitiesHistoryArchivesSubscription`, where we find:
```go
if c.IsControlNode() {
if sub.HistoryArchivesSeedingSignal.MagnetLink {
err := m.dispatchMagnetlinkMessage(sub.HistoryArchivesSeedingSignal.CommunityID)
if err != nil {
m.logger.Debug("failed to dispatch magnetlink message", zap.Error(err))
}
}
if sub.HistoryArchivesSeedingSignal.IndexCid {
err := m.dispatchIndexCidMessage(sub.HistoryArchivesSeedingSignal.CommunityID)
if err != nil {
m.logger.Debug("failed to dispatch index cid message", zap.Error(err))
}
}
}
```
`dispatchMagnetlinkMessage` and `dispatchIndexCidMessage` are where the dispatching happens. For Codex, the `MessageType` used is `protobuf.ApplicationMetadataMessage_COMMUNITY_MESSAGE_ARCHIVE_INDEX_CID`.
The message is sent as follows:
```go
chatID := community.UniversalChatID()
rawMessage := common.RawMessage{
LocalChatID: chatID,
Sender: community.PrivateKey(),
Payload: encodedMessage,
MessageType: protobuf.ApplicationMetadataMessage_COMMUNITY_MESSAGE_ARCHIVE_INDEX_CID,
SkipGroupMessageWrap: true,
PubsubTopic: community.PubsubTopic(),
Priority: &messagingtypes.LowPriority,
}
_, err = m.messaging.SendPublic(context.Background(), chatID, rawMessage)
if err != nil {
return err
}
err = m.communitiesManager.UpdateCommunityDescriptionIndexCidMessageClock(community.ID(), indexCidMessage.Clock)
if err != nil {
return err
}
return m.communitiesManager.UpdateIndexCidMessageClock(community.ID(), indexCidMessage.Clock)
```
Notice the call to `UpdateCommunityDescriptionIndexCidMessageClock`: the clocks are also recorded in the community description. The community description is sent over Waku to other community members periodically (every 5 minutes) or on certain events. This is how it is sent in `publishOrg` in `protocol/messenger_communities.go`:
```go
rawMessage := common.RawMessage{
Payload: payload,
Sender: org.PrivateKey(),
// we don't want to wrap in an encryption layer message
SkipEncryptionLayer: true,
CommunityID: org.ID(),
MessageType: protobuf.ApplicationMetadataMessage_COMMUNITY_DESCRIPTION,
PubsubTopic: org.PubsubTopic(), // TODO: confirm if it should be sent in community pubsub topic
Priority: &messagingtypes.HighPriority,
}
if org.Encrypted() {
members := org.GetMemberPubkeys()
if err != nil {
return err
}
rawMessage.CommunityKeyExMsgType = messagingtypes.KeyExMsgRekey
// This should be the one that it was used to encrypt this community
rawMessage.HashRatchetGroupID = org.ID()
rawMessage.Recipients = members
}
messageID, err := m.messaging.SendPublic(context.Background(), org.IDString(), rawMessage)
```
When community members receive the community description, they compare the clocks in it with the clocks of the most recent magnet link or index CID messages. If the clocks in the community description are newer, they update the local copies, so that when a new magnet link or index CID message arrives, they know to ignore outdated messages.
To read more, check: [[History Archives and Community Description]] (AI session recording).
This naturally brought us to the reception of the magnet link/index Cid messages (see also [[status-go processing magnet links]] and [[status-go publishing magnet links]]).
Coming from Waku through the layers of status-go, the accumulated messages are periodically collected by the `RetrieveAll` functions, which pass the messages to `handleRetrievedMessages` for processing.
For each message, `dispatchToHandler` is called (in the generated file `message_handler.go`, produced with `go generate ./cmd/generate_handlers/`), where the magnet link/index Cid messages are forwarded to their respective handlers: `handleCommunityMessageArchiveMagnetlinkProtobuf` and `handleCommunityMessageArchiveIndexCidProtobuf`. These end up in `HandleHistoryArchiveMagnetlinkMessage` and `HandleHistoryArchiveIndexCidMessage`, respectively, which trigger asynchronous processing in goroutines calling `downloadAndImportHistoryArchives` and `downloadAndImportCodexHistoryArchives`. Just after the respective goroutines are spawned, the clocks are updated with `m.communitiesManager.UpdateMagnetlinkMessageClock(id, clock)` and `m.communitiesManager.UpdateIndexCidMessageClock(id, clock)`.
But before even starting a goroutine, we check the distribution preference with `GetArchiveDistributionPreference`, the last seen index Cid via `m.communitiesManager.GetLastSeenIndexCid(id)` (it is updated after the archives are downloaded but before they are processed in `downloadAndImportCodexHistoryArchives`), and the corresponding clock via `GetIndexCidMessageClock`, to make sure we are not processing outdated archives.
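The gatekeeping performed before spawning the download goroutine can be sketched like this. This is a hedged illustration, not the actual handler: the `manager` struct, its fields, and `handleIndexCidMessage` are hypothetical names modeling the checks described above (`GetArchiveDistributionPreference`, `GetLastSeenIndexCid`, `GetIndexCidMessageClock`).

```go
package main

import "fmt"

// manager is a hypothetical stand-in for the communities manager
// state consulted before processing an index Cid message.
type manager struct {
	distributionPreference string // e.g. "codex" or "torrent"
	lastSeenIndexCid       string // updated after archives are downloaded
	indexCidMessageClock   uint64 // newest clock recorded so far
}

// handleIndexCidMessage reports whether a download goroutine would
// be spawned for the incoming message.
func (m *manager) handleIndexCidMessage(cid string, clock uint64) bool {
	if m.distributionPreference != "codex" {
		return false // archives are not distributed via Codex
	}
	if cid == m.lastSeenIndexCid || clock <= m.indexCidMessageClock {
		return false // already seen, or outdated relative to our clock
	}
	// go m.downloadAndImportCodexHistoryArchives(cid) // async processing
	m.indexCidMessageClock = clock // clock updated just after spawning
	return true
}

func main() {
	m := &manager{distributionPreference: "codex", indexCidMessageClock: 5}
	fmt.Println(m.handleIndexCidMessage("cid-A", 7)) // newer clock: true
	fmt.Println(m.handleIndexCidMessage("cid-A", 7)) // repeated clock: false
	fmt.Println(m.handleIndexCidMessage("cid-B", 3)) // outdated clock: false
}
```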
We can see that the above "integration test", in the case of Codex, effectively triggers the archive download from Codex by calling `RetrieveAll`, yet it does not cover the dissemination of the index Cid: the test assumes that the index file has already been stored. Thus, if we want to cover more, we need to go beyond that test.
Perhaps, to cover the whole flow, it is best to build status-desktop against our status-go library and test it from there.