some dev docs added

Commit 6a0e313688 (parent 6e3bc29a92) by Marcin Czenko, 2025-09-25 06:08:55 +02:00. GPG Key ID: A0449219BDBA98AE.
6 changed files with 1321 additions and 4 deletions.


---
related-to:
- "[[Team-NLBR Solution Proposal]]"
- "[[status-go publishing magnet links]]"
- "[[status-go processing magnet links]]"
- "[[status-go-codex integration - design notes]]"
- "[[Creating History Archives - InitHistoryArchiveTasks]]"
---
Great question! Let me trace the connection between `StartHistoryArchiveTasksInterval` and `InitHistoryArchiveTasks` to show you how the periodic archive creation system is initialized and started.
Searched codebase for "InitHistoryArchiveTasks StartHistoryArchiveTasksInterval", 13 results
Perfect! Now I can show you the complete connection between `InitHistoryArchiveTasks` and `StartHistoryArchiveTasksInterval`. Let me trace through this step by step:
## **Connection Flow: `InitHistoryArchiveTasks` → `StartHistoryArchiveTasksInterval`**
### **1. Where `InitHistoryArchiveTasks` is Called**
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger.go#640-640), lines 640 to 655
`InitHistoryArchiveTasks` is called from **two main places**:
**A. During `Messenger.Start()` (startup):**
```go
// In messenger.go Start() method
controlledCommunities, err := m.communitiesManager.Controlled()
if err != nil {
	return nil, err
}

if m.archiveManager.IsReady() {
	go func() {
		// Wait for store node to be available
		select {
		case <-m.ctx.Done():
			return
		case <-m.messaging.OnStorenodeAvailable():
		}
		m.InitHistoryArchiveTasks(controlledCommunities) // CALLED HERE
	}()
}
```
**B. When enabling archive protocol:**
```go
// In EnableCommunityHistoryArchiveProtocol()
controlledCommunities, err := m.communitiesManager.Controlled()
if err != nil {
	return err
}

if len(controlledCommunities) > 0 {
	go m.InitHistoryArchiveTasks(controlledCommunities) // CALLED HERE
}
```
### **2. What `InitHistoryArchiveTasks` Does**
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger_communities.go#3770-3770), lines 3770 to 3810
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger_communities.go#3810-3810), lines 3810 to 3840
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger_communities.go#3840-3840), lines 3840 to 3880
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger_communities.go#3880-3880), lines 3880 to 3895
### **3. How `StartHistoryArchiveTasksInterval` Works**
Read [](file:///home/mc2/code/status-im/status-go/protocol/communities/manager_archive.go#320-320), lines 320 to 345
## **Complete Connection Flow**
Here's the detailed connection between `InitHistoryArchiveTasks` and `StartHistoryArchiveTasksInterval`:
### **1. Initialization Chain**
```
Messenger.Start() [startup]
Wait for store node availability
InitHistoryArchiveTasks(controlledCommunities)
For each controlled community:
├─ Check if archive support enabled
├─ Seed existing torrents if available
├─ Get community chat filters/topics
├─ Request missed messages from store nodes
├─ Check last archive timestamp
└─ Decide archive strategy:
├─ Case 1: No archives yet → StartHistoryArchiveTasksInterval()
├─ Case 2: Recent archive → Seed + delayed start → StartHistoryArchiveTasksInterval()
└─ Case 3: Old archive → Create new + StartHistoryArchiveTasksInterval()
```
### **2. The Three Archive Strategy Cases**
**Case 1: No Previous Archives (`lastArchiveEndDateTimestamp == 0`)**
```go
// No prior messages to be archived, so we just kick off the archive creation loop
// for future archives
go m.archiveManager.StartHistoryArchiveTasksInterval(c, messageArchiveInterval)
```
**Case 2: Recent Archive (`durationSinceLastArchive < messageArchiveInterval`)**
```go
// Last archive is less than `interval` old, wait until `interval` is complete,
// then create archive and kick off archive creation loop for future archives
// Seed current archive in the meantime
err := m.archiveManager.SeedHistoryArchiveTorrent(c.ID())
timeToNextInterval := messageArchiveInterval - durationSinceLastArchive
time.AfterFunc(timeToNextInterval, func() {
	err := m.archiveManager.CreateAndSeedHistoryArchive(c.ID(), topics, lastArchiveEndDate, to.Add(timeToNextInterval), messageArchiveInterval, c.Encrypted())
	go m.archiveManager.StartHistoryArchiveTasksInterval(c, messageArchiveInterval)
})
```
**Case 3: Old Archive (`durationSinceLastArchive >= messageArchiveInterval`)**
```go
// Looks like the last archive was generated more than `interval`
// ago, so lets create a new archive now and then schedule the archive
// creation loop
err := m.archiveManager.CreateAndSeedHistoryArchive(c.ID(), topics, lastArchiveEndDate, to, messageArchiveInterval, c.Encrypted())
go m.archiveManager.StartHistoryArchiveTasksInterval(c, messageArchiveInterval)
```
### **3. What `StartHistoryArchiveTasksInterval` Does**
```go
func (m *ArchiveManager) StartHistoryArchiveTasksInterval(community *Community, interval time.Duration) {
	// 1. Check if already running for this community
	if _, exists := m.historyArchiveTasks.Load(id); exists {
		return // Already running
	}

	// 2. Set up cancellation and tracking
	cancel := make(chan struct{})
	m.historyArchiveTasks.Store(id, cancel)
	m.historyArchiveTasksWaitGroup.Add(1)

	// 3. Start periodic ticker
	ticker := time.NewTicker(interval) // default: 7 days (messageArchiveInterval)

	// 4. Archive creation loop
	for {
		select {
		case <-ticker.C:
			// Every interval, create and seed new archive
			err = m.CreateAndSeedHistoryArchive(community.ID(), topics, lastArchiveEndDate, to, interval, community.Encrypted())
		case <-cancel:
			// Cleanup and exit
			m.UnseedHistoryArchiveTorrent(community.ID())
			m.historyArchiveTasks.Delete(id)
			m.historyArchiveTasksWaitGroup.Done()
			return
		}
	}
}
```
### **4. The Interval Constants**
Searched text for `messageArchiveInterval` (`**/messenger_communities.go`), 20 results
The default interval is **7 days** (`7 * 24 * time.Hour`).
### **5. Complete Flow Summary**
```
System Startup
Messenger.Start()
Wait for Store Node Availability
InitHistoryArchiveTasks(controlledCommunities)
├─ For each community owner controls:
│ ├─ Check if archive support enabled
│ ├─ Seed existing torrents (if available)
│ ├─ CreateAndSeedHistoryArchive
│ ├─ Get community topics and sync missed messages
│ ├─ Check when last archive was created
│ └─ Based on last archive timing:
│ ├─ No archives → StartHistoryArchiveTasksInterval() immediately
│ ├─ Recent archive → Seed + delayed CreateAndSeedHistoryArchive followed by StartHistoryArchiveTasksInterval()
│ └─ Old archive → Create new archive + CreateAndSeedHistoryArchive + StartHistoryArchiveTasksInterval()
└─ Each StartHistoryArchiveTasksInterval():
├─ Runs as background goroutine
├─ Creates ticker with 7-day interval
├─ Every 7 days: CreateAndSeedHistoryArchive()
├─ After seeding: publishes HistoryArchivesSeedingSignal
├─ Signal triggers: dispatchMagnetlinkMessage()
└─ Magnetlink sent to all community members via Waku
```
So `InitHistoryArchiveTasks` is the **initialization and setup function** that analyzes each community's archive state and decides how to start the periodic archive creation, while `StartHistoryArchiveTasksInterval` is the **long-running background worker** that actually performs the periodic archive creation every 7 days.
The connection is that `InitHistoryArchiveTasks` **always ends up calling** `StartHistoryArchiveTasksInterval` for each community (either immediately, delayed, or after creating a catch-up archive), making it the entry point that bootstraps the entire periodic archive system.


Plus, and what the original documentation does not say, you will need `gotestsum`:
```bash
go install gotest.tools/gotestsum@latest
```
or, to use a specific version (`v1.13.0` was the most recent while writing this doc):
```bash
go install gotest.tools/gotestsum@v1.13.0
```
You can check it is available by running:
```bash
gotestsum --version
gotestsum version dev
```
The `dev` version comes from using `@latest` when installing `gotestsum`. If you installed a concrete version, you will see:
```bash
gotestsum --version
gotestsum version v1.13.0
```
You may also need to manually install the Protobuf compiler, `protoc`. I followed the instructions from [Protocol Buffer Compiler Installation](https://protobuf.dev/installation/).
The following bash script (Arch Linux) can come in handy:
```bash
#!/usr/bin/env bash
set -euo pipefail
echo "installing go..."
sudo pacman -S --noconfirm --needed go
echo "installing go protoc compiler"
PB_REL="https://github.com/protocolbuffers/protobuf/releases"
VERSION="32.1"
FILE="protoc-${VERSION}-linux-x86_64.zip"
# 1. create a temp dir
TMP_DIR="$(mktemp -d)"
# ensure cleanup on exit
trap 'rm -rf "$TMP_DIR"' EXIT
echo "Created temp dir: $TMP_DIR"
# 2. download file into temp dir
curl -L -o "$TMP_DIR/$FILE" "$PB_REL/download/v$VERSION/$FILE"
# 3. unzip into ~/.local/share/go
mkdir -p "$HOME/.local/share/go"
unzip -o "$TMP_DIR/$FILE" -d "$HOME/.local/share/go"
# 4. cleanup handled automatically by trap
echo "protoc $VERSION installed into $HOME/.local/share/go"
```
After that, make sure that `$HOME/.local/share/go/bin` is in your `PATH`, and you should get:
```bash
protoc --version
libprotoc 32.1
```
### Building backend and the libs
Just to check that everything is set up correctly, let's build `status-backend` (a wrapper over status-go that provides a web API - handy for testing), and then the status-go static and shared libraries:


---
related-to:
- "[[Team-NLBR Solution Proposal]]"
- "[[status-go publishing magnet links]]"
- "[[status-go processing magnet links]]"
- "[[status-go-codex integration - design notes]]"
- "[[Creating History Archives - InitHistoryArchiveTasks]]"
---
The `TorrentConfig` type provides the configuration for the BitTorrent-based History Archive management functionality.

```go
wakuMessageArchiveIndex[archiveID] = wakuMessageArchiveIndexMetadata
```
> `wakuMessageArchiveIndex` is earlier initialized to contain existing archive index entries from the current `index` file. Here we are basically appending new archive meta to the archive index data structure.
We repeat the whole process for each message chunk in the given time period, adding more periods if needed (recall, each period is 7 days long).
After that we have a list of new archives (in `encodedArchives`) and new archive index entries. They are ready to be encoded and serialized to the corresponding `data` (by appending) and `index` files.
Finally, the corresponding torrent file is (re)created, the `HistoryArchivesCreatedSignal` is emitted, and the last message archive end date is recorded in the persistence.
```go
type WakuMessageArchiveIndexMetadata struct {
	// ...
}
```
Now, instead of appending each new archive to the data file, we stream each individual archive to Codex via an API call. For each archive, we receive a CID back from Codex, which we then add to the corresponding archive index metadata entry as defined above.
After all archive entries are persisted, we then upload the resulting `index` to Codex under its own `index` CID. Instead of the magnet link, the community owner only publishes this `index` CID.
> If the system fails to publish the index, we assume the archive publishing was unsuccessful and we will start from scratch after restart (to be checked). In other words, we do not have to test whether the torrent file exists. If the previous publishing was successful, the CIDs are advertised by Codex - they will already be stored in the Codex RepoStore.
In order to receive the historical messages for a given period (given by `from` and `to` in the `WakuMessageArchiveMetadata`), the receiving node first acquires the `index` using the `index` CID. For each entry in the `index` that the node is interested in, it then downloads the corresponding archive directly using the `Cid` from that `index` entry.
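That receiving flow can be sketched as follows. Note the assumptions: the `codexClient` interface and the `indexEntry` shape are hypothetical stand-ins, since these notes do not pin down the actual Codex API:

```go
package main

import "fmt"

// codexClient is a hypothetical stand-in for whatever the real Codex API
// client looks like; only content-addressed download is assumed.
type codexClient interface {
	Download(cid string) ([]byte, error)
}

// indexEntry is a simplified stand-in for WakuMessageArchiveIndexMetadata
// extended with the per-archive Cid described above.
type indexEntry struct {
	Cid  string
	From uint64
	To   uint64
}

// fetchArchivesInRange downloads only the archives whose period overlaps [from, to].
func fetchArchivesInRange(c codexClient, index []indexEntry, from, to uint64) ([][]byte, error) {
	var archives [][]byte
	for _, e := range index {
		if e.To < from || e.From > to {
			continue // entry is outside the period of interest
		}
		data, err := c.Download(e.Cid)
		if err != nil {
			return nil, err
		}
		archives = append(archives, data)
	}
	return archives, nil
}

// fakeCodex serves downloads from an in-memory map, for demonstration only.
type fakeCodex map[string][]byte

func (f fakeCodex) Download(cid string) ([]byte, error) { return f[cid], nil }

func main() {
	c := fakeCodex{"cid-1": []byte("a1"), "cid-2": []byte("a2")}
	index := []indexEntry{{"cid-1", 0, 100}, {"cid-2", 101, 200}}
	got, _ := fetchArchivesInRange(c, index, 0, 100)
	fmt.Println(len(got)) // only cid-1 overlaps the requested range
}
```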


---
related-to:
- "[[Team-NLBR Solution Proposal]]"
- "[[status-go publishing magnet links]]"
- "[[status-go processing magnet links]]"
- "[[status-go-codex integration - design notes]]"
- "[[Creating History Archives - InitHistoryArchiveTasks]]"
---
marcinczenko: What is the difference in handleImportedMessages and handleRetrievedMessages. How are they used?
GitHub Copilot: I'll examine the two methods `handleImportedMessages` and `handleRetrievedMessages` to understand their differences and usage.
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger.go#2900-2900), lines 2900 to 3100
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger.go#3100-3100), lines 3100 to 3300
Now let me search for where these methods are called to understand their usage patterns:
Searched text for `handleImportedMessages|handleRetrievedMessages` (`**//home/mc2/code/status-im/status-go/protocol/**`), 6 results
Let me examine the context where these methods are called:
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger.go#2670-2670), lines 2670 to 2700
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger_handler.go#1480-1480), lines 1480 to 1510
Now let me look at the specific message handling in both methods:
Searched text for `dispatchToHandler` (`**/messenger.go`), 2 results
Let me check the dispatchToHandler method to understand the difference:
Searched codebase for "dispatchToHandler method implementation", 15 results
Great! Now I can see the `dispatchToHandler` method is generated. Let me also check the specific handling in `handleImportedMessages`:
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger.go#3030-3030), lines 3030 to 3080
Now let me also check the context where these are used in messenger_handler.go:
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger_handler.go#1460-1460), lines 1460 to 1520
Let me check what method contains this logic by looking for the function signature:
Searched text for `func (m *Messenger)` (`**/messenger_handler.go`), 20 results
Let me look for the function that contains the separation logic:
Searched text for `ThirdPartyID` (`**/messenger_handler.go`), 6 results
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger_handler.go#1440-1440), lines 1440 to 1470
Perfect! Now I can see the complete picture. Based on my analysis of the code, I can explain the differences between `handleImportedMessages` and `handleRetrievedMessages`:
## Differences between `handleImportedMessages` and `handleRetrievedMessages`
### **1. Purpose and Context**
**`handleImportedMessages`:**
- Specifically designed for handling **imported/archived messages** (e.g., from Discord imports or community history archives)
- Used when messages have a `ThirdPartyID` (indicating they come from external sources)
- Part of the history import/archive restoration functionality
**`handleRetrievedMessages`:**
- General-purpose message handler for **regular incoming messages**
- Used for normal message flow from the messaging protocol
- Called by `RetrieveAll()` which is the main message retrieval mechanism
### **2. Message Separation Logic**
In the `handleArchiveMessages()` method in messenger_handler.go, messages are separated based on `ThirdPartyID`:
```go
for filter, messages := range messagesToHandle {
	for _, message := range messages {
		if message.ThirdPartyID != "" {
			importedMessages[filter] = append(importedMessages[filter], message) // → handleImportedMessages
		} else {
			otherMessages[filter] = append(otherMessages[filter], message) // → handleRetrievedMessages
		}
	}
}
```
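A minimal runnable version of this split, with `message` as a pared-down stand-in for the real received-message type (only the field relevant to the separation is kept):

```go
package main

import "fmt"

// message keeps only the field that drives the split in handleArchiveMessages.
type message struct {
	ID           string
	ThirdPartyID string
}

// splitByOrigin reproduces the separation: messages carrying a ThirdPartyID
// (external imports, e.g. Discord) go one way, regular Status messages the other.
func splitByOrigin(messages []message) (imported, other []message) {
	for _, m := range messages {
		if m.ThirdPartyID != "" {
			imported = append(imported, m)
		} else {
			other = append(other, m)
		}
	}
	return imported, other
}

func main() {
	msgs := []message{
		{ID: "1", ThirdPartyID: "discord-42"}, // external import
		{ID: "2"},                             // regular Status message
	}
	imported, other := splitByOrigin(msgs)
	fmt.Println(len(imported), len(other)) // 1 1
}
```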
### **3. Message Type Handling**
**`handleImportedMessages`:**
- **Limited scope**: Only handles specific message types:
- `CHAT_MESSAGE`
- `PIN_MESSAGE`
- Uses hardcoded switch statement for message type handling
- More restrictive and focused on basic message types
**`handleRetrievedMessages`:**
- **Full scope**: Handles all message types using `dispatchToHandler()`
- `dispatchToHandler()` is auto-generated and routes to appropriate handlers for all supported message types
- More comprehensive message processing
### **4. Feature Differences**
**`handleImportedMessages`:**
- **No mutex protection** during main processing (only for Discord-specific saves)
- **No notifications/activity center** updates
- **No community message storage** in Waku
- **No metrics tracking**
- **No blocked user filtering**
- **No datasync metadata handling**
- **No installation/shared secrets handling**
- Focuses on basic message persistence
**`handleRetrievedMessages`:**
- **Full mutex protection** (`m.handleMessagesMutex`)
- **Community message storage** for controlled communities
- **Metrics tracking** (`MetricsPushReceivedMessages`)
- **Blocked user filtering**
- **Datasync metadata handling**
- **Installation and shared secrets handling**
- **Activity center notifications**
- **Contact management**
- Returns a full `MessengerResponse`
### **5. Return Values**
**`handleImportedMessages`:**
- Returns `error` only
- No response object for UI updates
**`handleRetrievedMessages`:**
- Returns `(*MessengerResponse, error)`
- Provides full response for UI updates via `saveDataAndPrepareResponse()`
### **6. Usage Context**
**`handleImportedMessages`:**
- Used in archive restoration scenarios
- Called when processing historical/imported messages from external sources
- Part of the community history import functionality
**`handleRetrievedMessages`:**
- Used in normal message flow
- Called by `RetrieveAll()` which is invoked by `ProcessAllMessages()`
- Part of the regular message processing loop
### **Summary**
`handleImportedMessages` is a **lightweight, specialized handler** for processing historical/imported messages with minimal side effects, while `handleRetrievedMessages` is the **full-featured, comprehensive handler** for regular message processing with all the bells and whistles (notifications, community features, metrics, etc.).
The separation ensures that imported messages don't trigger unnecessary notifications or duplicate processing while still preserving the message content in the database.
marcinczenko: ok, both functions are called by `handleArchiveMessages`, which is called by `importHistoryArchive`, which in turn is called by either `resumeHistoryArchivesImport` or `downloadAndImportHistoryArchives`. The `downloadAndImportHistoryArchives` is in turn called indirectly by `handleRetrievedMessages` - which looks like a circular dependency, which I can't really understand. I am trying to get a better picture of the process of archive history download...
GitHub Copilot: You're absolutely right to be confused about this circular dependency! Let me trace through the archive history download process to clarify what's happening. Let me examine the call chain you mentioned.
Searched codebase for "downloadAndImportHistoryArchives function implementation", 13 results
Now let me look at the specific methods to understand the flow better:
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger_handler.go#1410-1410), lines 1410 to 1450
Searched text for `downloadAndImportHistoryArchives` (`**/messenger_handler.go`), 6 results
Now let me search for where `handleRetrievedMessages` might indirectly call `downloadAndImportHistoryArchives`:
Searched text for `HandleHistoryArchiveMagnetlinkMessage` (`**/messenger_handler.go`), 4 results
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger_handler.go#3980-3980), lines 3980 to 4010
Let me search for what calls `HandleCommunityMessageArchiveMagnetlink`:
Searched text for `HandleCommunityMessageArchiveMagnetlink` (`**/*.go`), 8 results
Now let me check the `dispatchToHandler` generated file:
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger_handlers.go#130-130), lines 130 to 150
Perfect! Now I can see the complete picture. The apparent "circular dependency" you noticed is not actually circular, but rather involves **different message types flowing through the system**. Let me explain the complete archive history download process:
## **Archive History Download Process Flow**
### **1. Initial Trigger Points**
There are several ways archive downloads can be triggered:
**A. Via `resumeHistoryArchivesImport()`:**
- Called during startup to resume interrupted downloads
- Directly calls `importHistoryArchives()` to process already-downloaded archives
**B. Via received magnetlink messages:**
- `handleRetrievedMessages()` processes incoming `COMMUNITY_MESSAGE_ARCHIVE_MAGNETLINK` messages
- This goes through `dispatchToHandler()` → `handleCommunityMessageArchiveMagnetlinkProtobuf()` → `HandleCommunityMessageArchiveMagnetlink()` → `HandleHistoryArchiveMagnetlinkMessage()`
- Which then spawns a goroutine that calls `downloadAndImportHistoryArchives()`
**C. Via community join responses:**
- When joining a community, `HandleCommunityRequestToJoinResponse()` also calls `downloadAndImportHistoryArchives()`
### **2. The Download and Import Pipeline**
```
downloadAndImportHistoryArchives()
1. DownloadHistoryArchivesByMagnetlink() [downloads archives via torrent]
2. importHistoryArchives() [processes downloaded archives]
3. ExtractMessagesFromHistoryArchive() [extracts messages from archive files]
4. handleArchiveMessages() [processes the extracted messages]
5. Separates messages by ThirdPartyID:
- Messages WITH ThirdPartyID → handleImportedMessages()
- Messages WITHOUT ThirdPartyID → handleRetrievedMessages()
```
### **3. The Key Insight: Different Message Types**
The "circular dependency" you noticed is actually **different message flows**:
**Flow A: Regular Messages → Archive Download**
1. `handleRetrievedMessages()` processes regular incoming messages from the network
2. One of these messages might be a `COMMUNITY_MESSAGE_ARCHIVE_MAGNETLINK` message
3. This triggers `downloadAndImportHistoryArchives()` to download new archives
**Flow B: Archive Processing → Message Handling**
1. `downloadAndImportHistoryArchives()` downloads archive files
2. `importHistoryArchives()` extracts messages from archives
3. `handleArchiveMessages()` processes these extracted historical messages
4. These messages are then handled by `handleImportedMessages()` and `handleRetrievedMessages()` again
### **4. The Separation Logic**
In `handleArchiveMessages()`, messages are separated based on `ThirdPartyID`:
```go
for filter, messages := range messagesToHandle {
	for _, message := range messages {
		if message.ThirdPartyID != "" {
			importedMessages[filter] = append(importedMessages[filter], message) // External imports (Discord, etc.)
		} else {
			otherMessages[filter] = append(otherMessages[filter], message) // Regular Status messages
		}
	}
}
```
### **5. Why This Design Makes Sense**
1. **Regular message processing** (`handleRetrievedMessages`) can trigger **archive downloads** when magnetlink messages arrive
2. **Archive processing** can extract both **imported messages** (Discord, etc.) and **regular Status messages**
3. **Different handlers** are used based on message origin and type:
- `handleImportedMessages()`: Lightweight processing for external imports
- `handleRetrievedMessages()`: Full-featured processing for Status messages
### **6. Complete Flow Diagram**
```
Network Messages → handleRetrievedMessages()
[If MAGNETLINK message] → downloadAndImportHistoryArchives()
Download Archive Files → importHistoryArchives()
Extract Messages → handleArchiveMessages()
Separate by ThirdPartyID:
├─ External Messages → handleImportedMessages()
└─ Status Messages → handleRetrievedMessages() [different context]
```
So it's not circular - it's a **layered system** where:
- **Network messages** can trigger **archive downloads**
- **Archive processing** extracts **historical messages**
- **Historical messages** are processed through appropriate handlers
The same function names are used, but they're processing **different types of messages** in **different contexts**.
marcinczenko: ok, so the MAGNETLINK message can only come from the network I guess, and not from the archive, right? Thus, when `handleRetrievedMessages` is called in response to the MAGNETLINK message, it will process it differently (it will not call downloadAndImportHistoryArchives). Is this correct?
GitHub Copilot: You're absolutely correct! That's a very insightful observation. Let me confirm this by examining the logic:
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger_handler.go#1450-1450), lines 1450 to 1500
Now let me check the parameters passed to `handleRetrievedMessages` in different contexts:
Searched text for `handleRetrievedMessages` (`**/*.go`), 6 results
Perfect! Looking at the function signature and calls:
```go
func (m *Messenger) handleRetrievedMessages(chatWithMessages map[messagingtypes.ChatFilter][]*messagingtypes.ReceivedMessage, storeWakuMessages bool, fromArchive bool) (*MessengerResponse, error)
```
**Yes, you are absolutely correct!** Here's the key distinction:
### **1. Network Messages (including MAGNETLINK):**
```go
// From RetrieveAll() -> called for regular network message processing
return m.handleRetrievedMessages(chatWithMessages, true, false)
// storeWakuMessages=true, fromArchive=false
```
### **2. Archive Messages:**
```go
// From handleArchiveMessages() -> called when processing extracted archive messages
response, err := m.handleRetrievedMessages(otherMessages, false, true)
// storeWakuMessages=false, fromArchive=true
```
### **3. The Key Parameter: `fromArchive`**
When `fromArchive=true`, the `dispatchToHandler` call passes this parameter:
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger.go#3240-3240), lines 3240 to 3250
Now let me check if the generated handlers use this `fromArchive` parameter:
Read [](file:///home/mc2/code/status-im/status-go/cmd/generate_handlers/generate_handlers_template.txt#15-15), lines 15 to 45
Let me check the specific handler for MAGNETLINK messages:
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger_handlers.go#1000-1000), lines 1000 to 1020
I notice that the magnetlink handler doesn't have the `fromArchive` parameter. Let me check if there's a configuration that controls this:
Searched text for `CommunityMessageArchiveMagnetlink.*fromArchive|fromArchive.*CommunityMessageArchiveMagnetlink` (`**/*.go`), no results
Let me check what determines if a handler gets the `fromArchive` parameter by looking at the handler generation:
Searched text for `FromArchiveArg` (`**/cmd/generate_handlers/**`), 5 results
Perfect! Now I can see the complete picture:
Read [](file:///home/mc2/code/status-im/status-go/cmd/generate_handlers/generate_handlers.go#65-65), lines 65 to 75
## **You are 100% correct!** Here's why:
### **1. MAGNETLINK Messages and Archive Context**
**MAGNETLINK messages (`CommunityMessageArchiveMagnetlink`) can ONLY come from the network**, not from archives, because:
1. **Archive files don't contain MAGNETLINK messages** - they contain the actual chat messages, pin messages, etc. that were archived
2. **MAGNETLINK messages are instructions to download archives** - they wouldn't be stored inside the archives themselves (that would be recursive)
3. **Only specific message types can come from archives**: According to the handler generation code, only `ChatMessage` and `PinMessage` get the `fromArchive` parameter
### **2. Handler Behavior Differences**
```go
// Only these message types get fromArchive parameter:
info.FromArchiveArg = protobufName == "ChatMessage" || protobufName == "PinMessage"
```
This means:
- **`CommunityMessageArchiveMagnetlink` handler** does NOT get `fromArchive` parameter
- **`ChatMessage` and `PinMessage` handlers** DO get `fromArchive` parameter and can behave differently
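The quoted generator line is small enough to restate as a standalone predicate (same logic; the function name is ours):

```go
package main

import "fmt"

// fromArchiveArg mirrors the generator predicate quoted above: only these two
// protobuf types get the fromArchive parameter in their generated handlers.
func fromArchiveArg(protobufName string) bool {
	return protobufName == "ChatMessage" || protobufName == "PinMessage"
}

func main() {
	fmt.Println(fromArchiveArg("ChatMessage"))                        // true
	fmt.Println(fromArchiveArg("PinMessage"))                         // true
	fmt.Println(fromArchiveArg("CommunityMessageArchiveMagnetlink"))  // false
}
```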
### **3. The Safety Mechanism**
When `handleRetrievedMessages` is called in different contexts:
**A. Network Messages (fromArchive=false):**
- MAGNETLINK messages can trigger `downloadAndImportHistoryArchives()`
- This is the normal behavior for network-received magnetlinks
**B. Archive Messages (fromArchive=true):**
- MAGNETLINK messages **cannot exist** in archives (they're not archived)
- Even if they did, the handler doesn't have access to `fromArchive` to change behavior
- But `ChatMessage` and `PinMessage` handlers **do** get `fromArchive=true` and can behave accordingly
### **4. Logical Flow Protection**
```
Network Message Flow:
MAGNETLINK arrives via network
→ handleRetrievedMessages(fromArchive=false)
→ HandleCommunityMessageArchiveMagnetlink()
→ downloadAndImportHistoryArchives() ✓ ALLOWED
Archive Processing Flow:
Archive extracted messages
→ handleRetrievedMessages(fromArchive=true)
→ Only ChatMessage/PinMessage handlers get fromArchive=true
→ MAGNETLINK messages don't exist in archives anyway ✓ SAFE
```
So yes, your understanding is perfect! The system naturally prevents infinite loops because:
1. **MAGNETLINK messages only come from the network**
2. **Archive files only contain chat content messages**
3. **The `fromArchive` parameter** allows handlers to behave differently when processing historical vs. live messages
4. **Only relevant message types** get the `fromArchive` parameter
This is a well-designed system that prevents the circular dependency issue you were concerned about!
marcinczenko: Can you show me with more details the flow of receiving the magnet link before and how it is connected to `RetrieveAll()`?
GitHub Copilot: Absolutely! Let me trace the complete flow from the very beginning of how a magnetlink message arrives and gets processed through to `RetrieveAll()`. This will show you the entire message reception pipeline.
Searched codebase for "RetrieveAll message retrieval from network waku", 26 results
Perfect! Now I can show you the complete detailed flow of how a magnetlink message travels from the network to `RetrieveAll()`. Let me trace this step by step:
## **Complete Magnetlink Message Flow - From Network to `RetrieveAll()`**
### **1. Network Layer - Message Reception**
```
Waku Network
OnNewEnvelope() [waku/nwaku.go or waku/gowaku.go]
postEvent(envelope)
msgQueue <- envelope [buffered channel]
```
**Details:**
- Messages arrive from the Waku network via `OnNewEnvelope()`
- Messages are queued in `msgQueue` (buffered channel with `messageQueueLimit`)
- `processQueueLoop()` continuously processes this queue
Read [](file:///home/mc2/code/status-im/status-go/messaging/waku/nwaku.go#1180-1180), lines 1180 to 1200
### **2. Message Store Layer**
```
processMessage(envelope)
filter.Messages.Add(message) [MemoryMessageStore]
Messages stored in filter-specific stores
```
**Details:**
- `processMessage()` routes messages to appropriate filters based on topic
- Each filter has its own `MemoryMessageStore` where messages accumulate
- Messages wait here until retrieved by `RetrieveAll()`
Read [](file:///home/mc2/code/status-im/status-go/messaging/waku/nwaku.go#1200-1200), lines 1200 to 1215
Read [](file:///home/mc2/code/status-im/status-go/messaging/waku/nwaku.go#1215-1215), lines 1215 to 1235
### **3. Message Retrieval Trigger**
`RetrieveAll()` is called either:
**A. Periodically via the Message Loop:**
```
StartRetrieveMessagesLoop() [called at startup]
ticker.C (every tick interval)
ProcessAllMessages()
RetrieveAll()
```
**B. Manually by the application:**
```
Direct call to RetrieveAll()
```
### **4. The RetrieveAll() Chain**
```go
// messenger.go
func (m *Messenger) RetrieveAll() (*MessengerResponse, error) {
chatWithMessages, err := m.messaging.RetrieveRawAll() // Step A
if err != nil {
return nil, err
}
return m.handleRetrievedMessages(chatWithMessages, true, false) // Step B
}
```
### **Step A: `m.messaging.RetrieveRawAll()`**
Read [](file:///home/mc2/code/status-im/status-go/messaging/api.go#150-150), lines 150 to 165
This calls into the transport layer:
Read [](file:///home/mc2/code/status-im/status-go/messaging/layers/transport/transport.go#210-210), lines 210 to 240
The key call here is `t.api.GetFilterMessages(filter.FilterID)`:
Read [](file:///home/mc2/code/status-im/status-go/messaging/waku/api.go#344-344), lines 344 to 359
**This is where the stored messages are retrieved!** `f.Retrieve()` calls the filter's message store:
Read [](file:///home/mc2/code/status-im/status-go/messaging/waku/common/message.go#140-140), lines 140 to 155
### **Step B: `handleRetrievedMessages()`**
Now we have the messages collected from all filters. Let's trace what happens next:
```go
// messenger.go - handleRetrievedMessages()
func (m *Messenger) handleRetrievedMessages(chatWithMessages map[messagingtypes.ChatFilter][]*messagingtypes.ReceivedMessage, storeWakuMessages bool, fromArchive bool) (*MessengerResponse, error) {
// Process each message through the handler pipeline
for iterator.HasNext() {
filter, messages := iterator.Next()
for _, shhMessage := range messages {
// Decode the raw message
handleMessagesResponse, err := m.messaging.HandleReceivedMessages(shhMessage)
for _, msg := range statusMessages {
// Route to appropriate handler
err := m.dispatchToHandler(messageState, msg.ApplicationLayer.Payload, msg, filter, fromArchive)
// If this is a MAGNETLINK message, it goes to:
// dispatchToHandler() → handleCommunityMessageArchiveMagnetlinkProtobuf()
// → HandleCommunityMessageArchiveMagnetlink() → HandleHistoryArchiveMagnetlinkMessage()
// → downloadAndImportHistoryArchives()
}
}
}
}
```
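The routing step above boils down to a switch on the message type. This sketch models just that branch point; in status-go the dispatch is over `ApplicationMetadataMessage` types and the MAGNETLINK branch ends in `downloadAndImportHistoryArchives()`, started as a goroutine so retrieval is never blocked.

```go
package main

import "fmt"

// MessageType models the protobuf ApplicationMetadataMessage type enum.
type MessageType int

const (
	ChatMessage MessageType = iota
	CommunityMessageArchiveMagnetlink
)

type StatusMessage struct {
	Type    MessageType
	Payload string
}

// dispatchToHandler returns the name of the handler a message routes to
// (the real function invokes the handler instead of naming it).
func dispatchToHandler(msg StatusMessage, fromArchive bool) string {
	switch msg.Type {
	case CommunityMessageArchiveMagnetlink:
		// This path ends in the async archive download.
		return "HandleHistoryArchiveMagnetlinkMessage"
	default:
		return "HandleChatMessage"
	}
}

func main() {
	msg := StatusMessage{Type: CommunityMessageArchiveMagnetlink}
	fmt.Println(dispatchToHandler(msg, false)) // prints HandleHistoryArchiveMagnetlinkMessage
}
```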
### **5. Complete Flow Diagram**
```
┌─────────────────────────────────────────────────────────────────┐
│ NETWORK LAYER │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ WAKU MESSAGE RECEPTION │
│ Network → OnNewEnvelope() → postEvent() → msgQueue │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ MESSAGE STORE LAYER │
│ processQueueLoop() → processMessage() → filter.Messages.Add() │
│ │
│ Messages stored in MemoryMessageStore per filter │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ RETRIEVAL TRIGGER │
│ │
│ Timer Loop: │ Manual: │
│ StartRetrieveMessagesLoop() │ Direct RetrieveAll() │
│ ↓ │ call │
│ ProcessAllMessages() │ │
│ ↓ │ │
│ RetrieveAll() ←─────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ MESSAGE COLLECTION │
│ RetrieveAll() → messaging.RetrieveRawAll() │
│ ↓ │
│ transport.RetrieveRawAll() │
│ ↓ │
│ For each filter: api.GetFilterMessages() │
│ ↓ │
│ filter.Retrieve() → store.Pop() │
│ │
│ Returns: map[ChatFilter][]*ReceivedMessage │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ MESSAGE PROCESSING │
│ handleRetrievedMessages(messages, true, false) │
│ ↓ │
│ For each message: messaging.HandleReceivedMessages() │
│ ↓ │
│ dispatchToHandler() → route by message type │
│ │
│ IF MAGNETLINK MESSAGE: │
│ → handleCommunityMessageArchiveMagnetlinkProtobuf() │
│ → HandleCommunityMessageArchiveMagnetlink() │
│ → HandleHistoryArchiveMagnetlinkMessage() │
│ → downloadAndImportHistoryArchives() [ASYNC GOROUTINE] │
└─────────────────────────────────────────────────────────────────┘
```
### **6. Key Points About the Flow**
1. **Asynchronous Message Queue**: Messages from network are queued and processed asynchronously
2. **Per-Filter Storage**: Each chat filter has its own message store
3. **Batch Retrieval**: `RetrieveAll()` collects messages from all filters at once
4. **Cache Deduplication**: Transport layer checks cache to avoid reprocessing
5. **Message Type Routing**: `dispatchToHandler()` routes messages to type-specific handlers
6. **MAGNETLINK Processing**: MAGNETLINK messages trigger async archive downloads
7. **Regular Polling**: The retrieve loop runs continuously to process new messages
### **7. Why This Design Works Well**
- **Network messages flow continuously** into filter stores
- **RetrieveAll() periodically collects** all accumulated messages
- **MAGNETLINK messages** in this flow trigger downloads of historical archives
- **Archive processing** uses a separate flow (`handleArchiveMessages`) that eventually calls `handleRetrievedMessages` again but with `fromArchive=true`
- **No infinite loops** because MAGNETLINK messages only come from network, not archives
This shows how the live message processing naturally triggers the archive download process when MAGNETLINK messages arrive from community owners!


---
related-to:
- "[[Team-NLBR Solution Proposal]]"
- "[[status-go publishing magnet links]]"
- "[[status-go processing magnet links]]"
- "[[status-go-codex integration - design notes]]"
- "[[Creating History Archives - InitHistoryArchiveTasks]]"
---
See also previous conversation: [[When are magnetlink messages sent]]
Great question! Let me trace where and how magnetlink messages are published to the Waku network. This is the other side of the equation - how the community owners/admins send these MAGNETLINK messages that we just traced receiving.
Searched codebase for "publish magnetlink CommunityMessageArchiveMagnetlink send", 20 results
Perfect! Now I can show you exactly where and how magnetlink messages are published to the Waku network. Let me trace this step by step:
## **Where Magnetlink Messages are Published to Waku**
### **1. The Key Function: `dispatchMagnetlinkMessage()`**
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger_communities.go#4079-4079), lines 4079 to 4109
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger_communities.go#4109-4109), lines 4109 to 4124
This is the core function! Let me also check what calls this function:
Searched text for `dispatchMagnetlinkMessage` (`**/*.go`), 4 results
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger_communities.go#250-250), lines 250 to 270
Let me check what this subscription handler is:
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger_communities.go#220-220), lines 220 to 250
Read [](file:///home/mc2/code/status-im/status-go/protocol/messenger_communities.go#200-200), lines 200 to 220
Now I can see the complete picture! Let me trace the flow and also check what triggers the seeding signal:
Searched text for `HistoryArchivesSeedingSignal` (`**/*.go`), 20 results
Let me check where the seeding signal is triggered:
Read [](file:///home/mc2/code/status-im/status-go/protocol/communities/manager_archive.go#430-430), lines 430 to 440
Read [](file:///home/mc2/code/status-im/status-go/protocol/communities/manager_archive.go#420-420), lines 420 to 435
Read [](file:///home/mc2/code/status-im/status-go/protocol/communities/manager_archive.go#400-400), lines 400 to 420
Now let me check where `SeedHistoryArchiveTorrent` is called:
Searched text for `SeedHistoryArchiveTorrent` (`**/*.go`), 20 results
Perfect! Now I have the complete picture. Let me also check where the seeding happens from archive creation:
Read [](file:///home/mc2/code/status-im/status-go/protocol/communities/manager_archive.go#310-310), lines 310 to 325
## **Complete Magnetlink Publication Flow**
Here's the complete flow of where and how magnetlink messages are published to Waku:
### **1. Archive Creation and Seeding Triggers**
**A. Periodic Archive Creation (Community Owner):**
```
StartHistoryArchiveTasksInterval() [periodic scheduler]
CreateAndSeedHistoryArchive()
CreateHistoryArchiveTorrentFromDB() [creates .torrent files]
SeedHistoryArchiveTorrent() [starts BitTorrent seeding]
publisher.publish(HistoryArchivesSeedingSignal)
```
**B. Manual Archive Creation:**
```
Discord Import or other triggers
SeedHistoryArchiveTorrent() [directly]
publisher.publish(HistoryArchivesSeedingSignal)
```
**C. Archive Download Completion:**
```
DownloadHistoryArchivesByMagnetlink() [when downloading completes]
publisher.publish(HistoryArchivesSeedingSignal) [line 642]
```
### **2. The Publishing Pipeline**
```
SeedHistoryArchiveTorrent()
publisher.publish(HistoryArchivesSeedingSignal) [internal event]
handleCommunitiesHistoryArchivesSubscription() [event handler]
IF community.IsControlNode():
dispatchMagnetlinkMessage(communityID)
Create CommunityMessageArchiveMagnetlink protobuf
messaging.SendPublic(context, chatID, rawMessage)
[WAKU NETWORK]
```
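The internal hop from seeding signal to dispatch can be sketched as a small pub/sub loop. All names here are illustrative stand-ins for the status-go ones; the point is the shape: seeding publishes a signal, a subscription handler reacts, and only control nodes dispatch.

```go
package main

import "fmt"

// Subscription models the HistoryArchivesSeedingSignal payload.
type Subscription struct{ CommunityID string }

type Messenger struct {
	signals       chan Subscription
	isControlNode map[string]bool // models community.IsControlNode()
	dispatched    []string
}

// publishSeedingSignal is what SeedHistoryArchiveTorrent does at the end.
func (m *Messenger) publishSeedingSignal(communityID string) {
	m.signals <- Subscription{CommunityID: communityID}
}

// handleSubscriptions models handleCommunitiesHistoryArchivesSubscription.
func (m *Messenger) handleSubscriptions() {
	for sub := range m.signals {
		if m.isControlNode[sub.CommunityID] { // owners only
			m.dispatchMagnetlinkMessage(sub.CommunityID)
		}
	}
}

func (m *Messenger) dispatchMagnetlinkMessage(communityID string) {
	m.dispatched = append(m.dispatched, communityID) // real code calls SendPublic
}

func main() {
	m := &Messenger{
		signals:       make(chan Subscription),
		isControlNode: map[string]bool{"owned": true, "joined": false},
	}
	done := make(chan struct{})
	go func() { m.handleSubscriptions(); close(done) }()
	m.publishSeedingSignal("owned")
	m.publishSeedingSignal("joined") // ignored: not a control node
	close(m.signals)
	<-done
	fmt.Println(m.dispatched) // prints [owned]
}
```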
### **3. Key Publication Details**
**The Message Structure:**
```go
magnetLinkMessage := &protobuf.CommunityMessageArchiveMagnetlink{
Clock: m.getTimesource().GetCurrentTime(),
MagnetUri: magnetlink, // BitTorrent magnetlink
}
rawMessage := messagingtypes.RawMessage{
LocalChatID: chatID, // community.MagnetlinkMessageChannelID()
Sender: community.PrivateKey(), // Signed by community owner
Payload: encodedMessage,
MessageType: protobuf.ApplicationMetadataMessage_COMMUNITY_MESSAGE_ARCHIVE_MAGNETLINK,
SkipGroupMessageWrap: true, // Direct message, no encryption wrapper
PubsubTopic: community.PubsubTopic(), // Community's pubsub topic
Priority: &messagingtypes.LowPriority,
}
```
**Where it's Published:**
- **Topic**: Community's pubsub topic (`community.PubsubTopic()`)
- **Chat ID**: `community.MagnetlinkMessageChannelID()` (special channel for magnetlinks)
- **Sender**: Community owner's private key
- **Network**: Public Waku network via `messaging.SendPublic()`
### **4. Who Can Publish Magnetlinks?**
Only **community control nodes** (owners) can publish magnetlink messages:
```go
if c.IsControlNode() {
err := m.dispatchMagnetlinkMessage(sub.HistoryArchivesSeedingSignal.CommunityID)
}
```
### **5. When are Magnetlinks Published?**
1. **After creating new archives** (periodic archive creation)
2. **After seeding existing archives** (on startup/reboot)
3. **After downloading and re-seeding archives** (when community owners download from other peers)
4. **After Discord imports** (when historical data is imported)
### **6. The Network Flow**
```
Community Owner Node:
Archive Creation → Seeding → HistoryArchivesSeedingSignal
dispatchMagnetlinkMessage()
messaging.SendPublic() → [WAKU NETWORK]
[All Community Members]
RetrieveAll() receives messages
handleRetrievedMessages()
HandleHistoryArchiveMagnetlinkMessage()
downloadAndImportHistoryArchives()
```
### **7. Complete Publication Locations Summary**
**Primary Publication Point:**
- `dispatchMagnetlinkMessage()` in messenger_communities.go
- Called when `HistoryArchivesSeedingSignal` is received
- Only for community control nodes
- Publishes to community's pubsub topic
**Network Layer:**
- `messaging.SendPublic()``sendCommunity()``dispatchCommunityMessage()``transport.SendCommunityMessage()`
- Eventually reaches Waku network via `transport.api.Post()`
**The magnetlink messages flow from community owners to all community members via the Waku network**, triggering automatic archive downloads when members receive them through their normal message processing loop!


---
related-to:
- "[[Team-NLBR Solution Proposal]]"
- "[[status-go publishing magnet links]]"
- "[[status-go processing magnet links]]"
- "[[status-go-codex integration - design notes]]"
- "[[Creating History Archives - InitHistoryArchiveTasks]]"
---
## Codex for History Archives
As indicated in the [[Team-NLBR Solution Proposal]], the central entry point to history management is [InitHistoryArchiveTasks](https://github.com/status-im/status-go/blob/6322f22783585474803cfc8a6f0a914757d763b5/protocol/messenger_communities.go#L3783). `InitHistoryArchiveTasks` is called from **two main places**:
- During `Messenger.Start()` (startup)
- When enabling archive protocol
In [[Creating History Archives - InitHistoryArchiveTasks]] we find the complete initialization flow:
```
System Startup
Messenger.Start()
Wait for Store Node Availability
InitHistoryArchiveTasks(controlledCommunities)
├─ For each community owner controls:
│ ├─ Check if archive support enabled
│ ├─ Seed existing torrents (if available)
| ├─ CreateAndSeedHistoryArchive
│ ├─ Get community topics and sync missed messages
│ ├─ Check when last archive was created
│ └─ Based on last archive timing:
│ ├─ No archives → StartHistoryArchiveTasksInterval() immediately
│ ├─ Recent archive → Seed + delayed CreateAndSeedHistoryArchive followed by StartHistoryArchiveTasksInterval()
│ └─ Old archive → Create new archive + CreateAndSeedHistoryArchive + StartHistoryArchiveTasksInterval()
└─ Each StartHistoryArchiveTasksInterval():
├─ Runs as background goroutine
├─ Creates ticker with 7-day interval
├─ Every 7 days: CreateAndSeedHistoryArchive()
├─ After seeding: publishes HistoryArchivesSeedingSignal
├─ Signal triggers: dispatchMagnetlinkMessage()
└─ Magnetlink sent to all community members via Waku
```
We will go step by step through this flow and apply the changes, noting the places where we need to diverge.
### BitTorrent - with or without
In the first pass we do not delete the BitTorrent-related code, but rather try to add Codex extensions next to it. This way it should be easier to move things around without being too destructive from the beginning.
### Seed existing torrents (if available)
This step is only needed for torrents. Codex has its own persistence and will start seeding immediately after it starts.
### CreateAndSeedHistoryArchive
The first function that requires attention is `CreateAndSeedHistoryArchive`. It is part of the `ArchiveService` interface.
```go
func (m *ArchiveManager) CreateAndSeedHistoryArchive(communityID types.HexBytes, topics []messagingtypes.ContentTopic, startDate time.Time, endDate time.Time, partition time.Duration, encrypt bool) error {
m.UnseedHistoryArchiveTorrent(communityID)
_, err := m.ArchiveFileManager.CreateHistoryArchiveTorrentFromDB(communityID, topics, startDate, endDate, partition, encrypt)
if err != nil {
return err
}
return m.SeedHistoryArchiveTorrent(communityID)
}
```
It calls `CreateHistoryArchiveTorrentFromDB`, which then calls `createHistoryArchiveTorrent`:
```go
func (m *ArchiveFileManager) CreateHistoryArchiveTorrentFromDB(communityID types.HexBytes, topics []messagingtypes.ContentTopic, startDate time.Time, endDate time.Time, partition time.Duration, encrypt bool) ([]string, error) {
return m.createHistoryArchiveTorrent(communityID, make([]*messagingtypes.ReceivedMessage, 0), topics, startDate, endDate, partition, encrypt)
}
```
`createHistoryArchiveTorrent` (`ArchiveFileManager`) is where the work is done.
#### Protobuf messages
Here we list all the Protobuf messages that are relevant to message archives:
```protobuf
message CommunityMessageArchiveMagnetlink {
uint64 clock = 1;
string magnet_uri = 2;
}
message WakuMessage {
bytes sig = 1;
uint64 timestamp = 2;
bytes topic = 3;
bytes payload = 4;
bytes padding = 5;
bytes hash = 6;
string thirdPartyId = 7;
}
message WakuMessageArchiveMetadata {
uint32 version = 1;
uint64 from = 2;
uint64 to = 3;
repeated bytes contentTopic = 4;
}
message WakuMessageArchive {
uint32 version = 1;
WakuMessageArchiveMetadata metadata = 2;
repeated WakuMessage messages = 3;
}
message WakuMessageArchiveIndexMetadata {
uint32 version = 1;
WakuMessageArchiveMetadata metadata = 2;
uint64 offset = 3;
uint64 size = 4;
uint64 padding = 5;
}
message WakuMessageArchiveIndex {
map<string, WakuMessageArchiveIndexMetadata> archives = 1;
}
```
All of these live in `protocol/protobuf/communities.proto`. There is one more message, not directly related, that for some reason contains a `magnet_uri` field (to be checked later):
```protobuf
message CommunityRequestToJoinResponse {
uint64 clock = 1;
CommunityDescription community = 2 [deprecated = true];
bool accepted = 3;
bytes grant = 4;
bytes community_id = 5;
string magnet_uri = 6;
bytes protected_topic_private_key = 7;
Shard shard = 8;
// CommunityDescription protocol message with owner signature
bytes community_description_protocol_message = 9;
}
```
We see that most are independent from BitTorrent. The ones that are BitTorrent specific are:
- `CommunityMessageArchiveMagnetlink`
- `WakuMessageArchiveIndexMetadata`
- `WakuMessageArchiveIndex` (because it depends on `WakuMessageArchiveIndexMetadata`)
- `CommunityRequestToJoinResponse` (because of the `magnet_uri` field)
Now, starting with something simple (in the end we are building a PoC here): we know that the Codex API operates on CIDs encoded as `base58btc` strings. In `WakuMessageArchiveIndexMetadata`, the `offset`, `size`, and `padding` fields are only relevant to the current BitTorrent-based implementation. For Codex we can use something simpler:
```protobuf
message CodexWakuMessageArchiveIndexMetadata {
uint32 version = 1;
WakuMessageArchiveMetadata metadata = 2;
string cid = 3;
}
message CodexWakuMessageArchiveIndex {
map<string, CodexWakuMessageArchiveIndexMetadata> archives = 1;
}
```
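For illustration, the Go shapes that the proposed Codex protobuf messages would roughly generate, hand-written here (real code would come from `protoc`, with different field tags and accessors):

```go
package main

import "fmt"

// WakuMessageArchiveMetadata mirrors the existing protobuf message.
type WakuMessageArchiveMetadata struct {
	Version      uint32
	From, To     uint64
	ContentTopic [][]byte
}

// CodexWakuMessageArchiveIndexMetadata replaces offset/size/padding
// with a single base58btc-encoded Codex CID.
type CodexWakuMessageArchiveIndexMetadata struct {
	Version  uint32
	Metadata *WakuMessageArchiveMetadata
	Cid      string
}

type CodexWakuMessageArchiveIndex struct {
	Archives map[string]*CodexWakuMessageArchiveIndexMetadata
}

// addArchive records a freshly uploaded archive under its ID.
func (i *CodexWakuMessageArchiveIndex) addArchive(archiveID, cid string, from, to uint64) {
	if i.Archives == nil {
		i.Archives = map[string]*CodexWakuMessageArchiveIndexMetadata{}
	}
	i.Archives[archiveID] = &CodexWakuMessageArchiveIndexMetadata{
		Version:  1,
		Metadata: &WakuMessageArchiveMetadata{Version: 1, From: from, To: to},
		Cid:      cid,
	}
}

func main() {
	idx := &CodexWakuMessageArchiveIndex{}
	idx.addArchive("archive-1", "zDvZExampleCid", 0, 604800) // placeholder CID
	fmt.Println(idx.Archives["archive-1"].Cid)
}
```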
#### Appending the index file
In a more production-ready version we would not operate on the local file system; here, for simplicity, we will use a physical index file and a separate file for each archive. Consequently, in the initial implementation, a community owner will not query Codex for the current index file. If we did, we could use the `http://localhost:8001/api/codex/v1/data/${CID}` API, which returns `404` when the file does not exist in the local store:
```bash
curl -s -D - -o /dev/null "http://localhost:8001/api/codex/v1/data/${CID}"
HTTP/1.1 404 Not Found
Connection: close
Server: nim-presto/0.0.3 (amd64/linux)
Content-Length: 74
Date: Thu, 25 Sep 2025 02:15:07 GMT
Content-Type: text/html; charset=utf-8
```
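For later, the 404-based existence check could look like this in Go. The endpoint path comes from the curl example above, but treat the exact status-code semantics as an assumption to verify against a running Codex node; like the curl call, this is a GET whose body is discarded, not a HEAD request.

```go
package main

import (
	"fmt"
	"net/http"
)

// interpretDataStatus maps the Codex data endpoint's status code to
// "present locally or not" (assumed semantics: 200 = present, 404 = absent).
func interpretDataStatus(code int) (bool, error) {
	switch code {
	case http.StatusOK:
		return true, nil
	case http.StatusNotFound:
		return false, nil
	default:
		return false, fmt.Errorf("unexpected status %d", code)
	}
}

// codexHasData checks whether the local Codex node already stores the CID.
func codexHasData(baseURL, cid string) (bool, error) {
	resp, err := http.Get(baseURL + "/api/codex/v1/data/" + cid)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close() // discard the body, we only need the status
	return interpretDataStatus(resp.StatusCode)
}

func main() {
	// Requires a local Codex node; with none running this prints the error.
	ok, err := codexHasData("http://localhost:8001", "zDvZExampleCid") // placeholder CID
	fmt.Println(ok, err)
}
```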
Instead, for this initial implementation, we will just read it from a local directory. For now we will reuse the BitTorrent configuration, which stores the index file under:
```go
path.Join(m.torrentConfig.DataDir, communityID, "index")
```
For Codex, we will store it under:
```go
path.Join(m.torrentConfig.DataDir, "codex", communityID, "index")
```
Similarly, for the individual archive to be uploaded we will use:
```go
path.Join(m.torrentConfig.DataDir, "codex", communityID, "data")
```
This data file is temporary and will be overwritten for each new archive created. With Codex we do not have to append, so we no longer need the previous data file. We use a plain file for now because it is the easiest way to start. The index file is loaded with `LoadHistoryArchiveIndexFromFile`.
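The path layout above, as tiny helpers (`torrentDataDir` stands in for `m.torrentConfig.DataDir`; the helper names are ours, not status-go's):

```go
package main

import (
	"fmt"
	"path"
)

// codexIndexPath is where the Codex variant keeps the community's index file.
func codexIndexPath(torrentDataDir, communityID string) string {
	return path.Join(torrentDataDir, "codex", communityID, "index")
}

// codexDataPath is the temporary file holding the archive being uploaded.
func codexDataPath(torrentDataDir, communityID string) string {
	return path.Join(torrentDataDir, "codex", communityID, "data")
}

func main() {
	fmt.Println(codexIndexPath("/var/status/archivedata", "0xdeadbeef")) // prints /var/status/archivedata/codex/0xdeadbeef/index
	fmt.Println(codexDataPath("/var/status/archivedata", "0xdeadbeef"))  // prints /var/status/archivedata/codex/0xdeadbeef/data
}
```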
## Testing
There are a number of tests that will need to be adjusted or fixed. But there is one test with a slightly more end-to-end nature. It is from the `protocol` package:
```
protocol/communities_messenger_token_permissions_test.go
```
This test calls a couple of important functions, which gives a good indication of which functions will need to be taken into account.
The test - `TestImportDecryptedArchiveMessages` - first creates a community and sets up the corresponding permissions. Then the community owner sends a message to the community and then immediately retrieves it so that it is now recorded in the DB.
After that it prepares the archive parameters: `startDate`, `endDate`, `partition`, and the community `topics`. All of these will be passed to `CreateHistoryArchiveTorrentFromDB`, our entry point for creating the history archive torrent.
```go
// 1.1. Create community
community, chat := s.createCommunity()
// ...
// 1.2. Setup permissions
// ...
// 2. Owner: Send a message A
messageText1 := RandomLettersString(10)
message1 := s.sendChatMessage(s.owner, chat.ID, messageText1)
// 2.2. Retrieve own message (to make it stored in the archive later)
_, err = s.owner.RetrieveAll()
s.Require().NoError(err)
// 3. Owner: Create community archive
const partition = 2 * time.Minute
messageDate := time.UnixMilli(int64(message1.Timestamp))
startDate := messageDate.Add(-time.Minute)
endDate := messageDate.Add(time.Minute)
topic := messagingtypes.BytesToContentTopic(messaging.ToContentTopic(chat.ID))
communityCommonTopic := messagingtypes.BytesToContentTopic(messaging.ToContentTopic(community.UniversalChatID()))
topics := []messagingtypes.ContentTopic{topic, communityCommonTopic}
torrentConfig := params.TorrentConfig{
Enabled: true,
DataDir: os.TempDir() + "/archivedata",
TorrentDir: os.TempDir() + "/torrents",
Port: 0,
}
// Share archive directory between all users
s.owner.archiveManager.SetTorrentConfig(&torrentConfig)
s.bob.archiveManager.SetTorrentConfig(&torrentConfig)
s.owner.config.messengerSignalsHandler = &MessengerSignalsHandlerMock{}
s.bob.config.messengerSignalsHandler = &MessengerSignalsHandlerMock{}
```
Finally we call the `CreateHistoryArchiveTorrentFromDB`:
```go
archiveIDs, err := s.owner.archiveManager.CreateHistoryArchiveTorrentFromDB(community.ID(), topics, startDate, endDate, partition, community.Encrypted())
s.Require().NoError(err)
s.Require().Len(archiveIDs, 1)
```
Notice that exactly one archive is expected.
`CreateHistoryArchiveTorrentFromDB` is called directly here, bypassing the torrent seeding: in the normal flow, `CreateHistoryArchiveTorrentFromDB` is called from `CreateAndSeedHistoryArchive`, which, immediately after creating the archive, calls `SeedHistoryArchiveTorrent`. `CreateHistoryArchiveTorrentFromDB` calls `createHistoryArchiveTorrent`, which is central to the archive creation.