74 Commits

Author SHA1 Message Date
Eric
4006b0024b
fix: undeclared identifier 2025-11-11 12:05:09 +11:00
Eric
e47186db90
fix: decrease block retry count when no peers found
`downloadInternal` has a limitation: when the node has no peers, it loops indefinitely looking for peers while attempting to download a block. Currently, the block retry counter is only decremented when peers exist and a download attempt is made; otherwise the loop continues until peers are found, ad infinitum. This fix decrements the retry counter on every loop iteration in which no peers are found.
2025-11-11 12:04:50 +11:00
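The behaviour described in this commit can be illustrated with a minimal Python sketch. It is not the Nim code from `downloadInternal`; the names `download_with_retries`, `find_peers`, `fetch`, and `RetriesExhausted` are hypothetical and chosen only for illustration.

```python
import asyncio
from typing import Awaitable, Callable, Optional, Sequence


class RetriesExhausted(Exception):
    """Raised when the retry budget runs out before the block arrives."""


async def download_with_retries(
    find_peers: Callable[[], Awaitable[Sequence[str]]],  # hypothetical peer lookup
    fetch: Callable[[str], Awaitable[Optional[bytes]]],  # hypothetical block request
    retries: int = 3,
    retry_delay: float = 0.0,
) -> bytes:
    while retries > 0:
        peers = await find_peers()
        if not peers:
            # The fix: an empty peer set still consumes one retry, so the
            # loop is bounded even when no peer is ever found.
            retries -= 1
            await asyncio.sleep(retry_delay)
            continue
        block = await fetch(peers[0])
        if block is not None:
            return block
        # A failed download attempt also consumes one retry.
        retries -= 1
    raise RetriesExhausted("no block after exhausting retries")
```

Without the decrement in the no-peers branch, the loop would only terminate once a peer appeared, which is the situation the commit message describes.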
Eric
46413c8dff
catch CancelledError in BlockExchange 2025-11-10 21:03:24 +11:00
Chrysostomos Nanakos
8257406e82
refactor(blockexchange): extract batched want list sending to helper proc
Part of https://github.com/codex-storage/nim-codex/issues/974

Signed-off-by: Chrysostomos Nanakos <chris@include.gr>
2025-11-03 17:24:03 +02:00
Chrysostomos Nanakos
86ce276a90
fix(blockexchange): cleanup stale lastSentWants entries to prevent memory leak
Stale entries in peer.lastSentWants were not removed when blocks were
resolved or cancelled, causing unbounded memory growth. This adds
incremental cleanup during refreshBlockKnowledge, removing up to 2048
stale entries per refresh cycle with proper event loop yielding.

Part of https://github.com/codex-storage/nim-codex/issues/974

Signed-off-by: Chrysostomos Nanakos <chris@include.gr>
2025-11-03 16:49:31 +02:00
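A rough Python sketch of incremental cleanup with a per-cycle cap, as described in this commit. Only the cap of 2048 comes from the commit message; the function name, the `still_pending` set, and the yielding cadence are assumptions, and the real Nim code in `refreshBlockKnowledge` may be structured differently.

```python
import asyncio
from typing import Hashable, Set

MAX_REMOVALS_PER_REFRESH = 2048  # cap taken from the commit message


async def prune_stale_wants(
    last_sent_wants: Set[Hashable],
    still_pending: Set[Hashable],  # hypothetical: addresses still wanted
    limit: int = MAX_REMOVALS_PER_REFRESH,
    yield_every: int = 256,  # hypothetical yielding cadence
) -> int:
    """Remove up to `limit` stale entries and return how many were removed."""
    removed = 0
    # Iterate over a snapshot so the set can be mutated safely.
    for address in list(last_sent_wants):
        if removed >= limit:
            break
        if address not in still_pending:
            last_sent_wants.discard(address)
            removed += 1
            if removed % yield_every == 0:
                # Hand control back to the event loop between chunks.
                await asyncio.sleep(0)
    return removed
```

Capping the work per refresh cycle spreads a large backlog over several cycles instead of blocking the event loop in a single pass.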
Chrysostomos Nanakos
ff7ac829c4
fix(blockexchange): handle evicted peer in download retry loop
Part of https://github.com/codex-storage/nim-codex/issues/974

Signed-off-by: Chrysostomos Nanakos <chris@include.gr>
2025-11-03 14:10:57 +02:00
Chrysostomos Nanakos
ad4d3b5d62
feat(node): implement sliding window for block prefetching
Now fetchBatched maintains a sliding window of batchSize blocks in-flight.
When 75% complete, adds next chunk to maintain constant window size.
This ensures blocks are already pending or have been fetched when
StoreStream needs them.

Part of https://github.com/codex-storage/nim-codex/issues/974

Signed-off-by: Chrysostomos Nanakos <chris@include.gr>
2025-11-03 14:10:55 +02:00
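One possible reading of the sliding window described in this commit, sketched in Python rather than Nim. The 75% threshold comes from the commit message; `fetch_sliding_window`, its parameters, and the refill bookkeeping are hypothetical and do not reproduce `fetchBatched`.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable, Set, Tuple


async def fetch_sliding_window(
    indices: Iterable[int],
    fetch: Callable[[int], Awaitable[bytes]],  # hypothetical block fetcher
    batch_size: int = 8,
    refill_ratio: float = 0.75,  # refill once 75% of a window has completed
) -> Tuple[Dict[int, bytes], int]:
    """Fetch all indices; return the results and the peak number in flight."""
    remaining = iter(indices)
    in_flight: Set[asyncio.Task] = set()
    results: Dict[int, bytes] = {}
    peak = 0
    completed_since_refill = 0
    refill_threshold = max(1, int(batch_size * refill_ratio))

    async def fetch_one(index: int) -> Tuple[int, bytes]:
        return index, await fetch(index)

    def top_up() -> None:
        # Start new requests until the window is full or the input runs out.
        while len(in_flight) < batch_size:
            index = next(remaining, None)
            if index is None:
                return
            in_flight.add(asyncio.create_task(fetch_one(index)))

    top_up()
    while in_flight:
        peak = max(peak, len(in_flight))
        done, _ = await asyncio.wait(in_flight, return_when=asyncio.FIRST_COMPLETED)
        in_flight -= done
        for task in done:
            index, data = task.result()
            results[index] = data
        completed_since_refill += len(done)
        if completed_since_refill >= refill_threshold:
            completed_since_refill = 0
            top_up()
    return results, peak
```

The number of requests in flight never exceeds `batch_size`, and new requests start before the current window has fully drained, so a consumer reading blocks in order is less likely to wait on a block that has not been requested yet.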
Chrysostomos Nanakos
aea9337ddc
feat(blockexchange): implement delta WantList updates with batching
Implements delta-based WantList updates to reduce network traffic during
block exchange. Only sends newly added blocks instead of resending the
entire WantList on every refresh.

Also some network related fixes:

- Add TCP_NODELAY flag to prevent Nagle's algorithm delays
- Clear sendConn on stream reset to allow garbage collection
- Improve error handling in NetworkPeer.send()

Part of https://github.com/codex-storage/nim-codex/issues/974

Signed-off-by: Chrysostomos Nanakos <chris@include.gr>
2025-11-03 14:10:55 +02:00
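The delta idea in this commit can be sketched in a few lines of Python. This is an illustration only: `want_list_delta`, the per-peer `last_sent` set, and the batch size are assumptions and not the API of the Nim block exchange.

```python
from typing import Hashable, List, Sequence, Set


def want_list_delta(
    current_wants: Sequence[Hashable],
    last_sent: Set[Hashable],  # hypothetical per-peer record of sent entries
    batch_size: int = 50,  # hypothetical cap on entries per message
) -> List[List[Hashable]]:
    """Return batches holding only the entries not yet sent to this peer."""
    new_entries = [addr for addr in current_wants if addr not in last_sent]
    batches = [
        new_entries[i : i + batch_size]
        for i in range(0, len(new_entries), batch_size)
    ]
    # Remember what was sent so the next refresh only carries additions.
    last_sent.update(new_entries)
    return batches
```

For example, with an empty `last_sent`, the wants `["a", "b", "c"]` and `batch_size=2` yield `[["a", "b"], ["c"]]`. A later refresh with `["a", "b", "c", "d"]` yields only `[["d"]]`.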
Chrysostomos Nanakos
6f378b3c46
perf: add time-based yielding to hot loops
Part of https://github.com/codex-storage/nim-codex/issues/974

Signed-off-by: Chrysostomos Nanakos <chris@include.gr>
2025-11-03 14:10:54 +02:00
Chrysostomos Nanakos
8812d98271
fix: assign selectPeer field in BlockExcEngine ctor
Part of https://github.com/codex-storage/nim-codex/issues/974

Signed-off-by: Chrysostomos Nanakos <chris@include.gr>
2025-11-03 14:10:53 +02:00
Chrysostomos Nanakos
0636adf5e8
refactor: make markRequested idempotent
Returns false on duplicate marking attempts instead of logging errors,
eliminating duplicate marking loop in blockPresenceHandler and
preventing duplicate block requests across concurrent flows.

Part of https://github.com/codex-storage/nim-codex/issues/974
2025-11-03 14:10:51 +02:00
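A minimal Python sketch of an idempotent marking call that reports duplicates through its return value, as this commit describes for `markRequested`. The `PendingBlocks` class and its fields here are hypothetical stand-ins, not the Nim types.

```python
from typing import Dict, Hashable


class PendingBlocks:
    """Tracks which block addresses have already been requested."""

    def __init__(self) -> None:
        self._requested: Dict[Hashable, str] = {}

    def mark_requested(self, address: Hashable, peer: str) -> bool:
        """Return True on the first marking, False on any duplicate."""
        if address in self._requested:
            # A duplicate is an expected outcome, not an error to log.
            return False
        self._requested[address] = peer
        return True


pending = PendingBlocks()
if pending.mark_requested("block-1", "peer-a"):
    pass  # only the caller that wins the marking would send the request
```

Because the check and the marking happen in one call, concurrent flows can use the return value to decide which of them issues the request.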
Chrysostomos Nanakos
dddf7424b4
fix: resolve stuck peer refresh state preventing block discovery
This prevents peers from becoming permanently invisible to block discovery when
they fail to respond to WantHave requests.

Part of https://github.com/codex-storage/nim-codex/issues/974
2025-11-03 14:10:50 +02:00
Chrysostomos Nanakos
4abe8c4d97
feat: add strategic runtime metrics for block exchange monitoring
- Add codex_block_exchange_discovery_requests_total counter to track peer
  discovery frequency
- Add codex_block_exchange_peer_timeouts_total counter to monitor peer
  reliability issues
- Add codex_block_exchange_requests_failed_total counter to track request
  failure rates

Part of https://github.com/codex-storage/nim-codex/issues/974
2025-11-03 14:10:49 +02:00
Chrysostomos Nanakos
9af510316a
perf: optimize block batch size from 500 to 50 blocks per message
Achieves significant memory reduction with equivalent network
performance. The reduced batch size prevents memory pressure
while preserving transfer efficiency, improving overall system
resource utilization.

Part of https://github.com/codex-storage/nim-codex/issues/974
2025-11-03 14:10:49 +02:00
Chrysostomos Nanakos
0fbca17269
feat: implement weighted random peer selection for load balancing
Use probabilistic distribution based on peer quality scores, giving all peers
opportunity while favoring better-performing ones. Selection probability is
inversely proportional to score.

Part of https://github.com/codex-storage/nim-codex/issues/974
2025-11-03 14:10:46 +02:00
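Weighted random selection with probability inversely proportional to score can be sketched as follows in Python. The sketch assumes that a lower score means a better peer; the function name, the score dictionary, and the `epsilon` guard are hypothetical and not taken from the Nim code.

```python
import random
from typing import Dict, Optional


def select_peer(
    scores: Dict[str, float],  # hypothetical: lower score means a better peer
    rng: Optional[random.Random] = None,
    epsilon: float = 1e-9,  # guards against division by zero
) -> str:
    """Pick a peer with probability proportional to 1 / score."""
    if not scores:
        raise ValueError("no peers to select from")
    rng = rng or random.Random()
    peers = list(scores)
    weights = [1.0 / max(scores[peer], epsilon) for peer in peers]
    return rng.choices(peers, weights=weights, k=1)[0]
```

With scores `{"fast": 1.0, "slow": 9.0}` the weights are 1 and 1/9, so "fast" is chosen about 90% of the time while "slow" still receives roughly one request in ten.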
gmega
8e8d9f8e60
fix: randomize block refresh time, optimize context store checks 2025-11-03 14:10:46 +02:00
gmega
1fdb14f092
feat: add block knowledge request mechanism, implement tests 2025-11-03 14:10:45 +02:00
gmega
9114a620a3
feat: add stopgap "adaptive" refresh 2025-11-03 14:10:44 +02:00
gmega
908d527dc9
fix: fix testdiscovery so it works with stricter block protocol 2025-11-03 14:10:44 +02:00
gmega
544ec123c7
fix: fix block exchange test to stricter protocol; minor refactor 2025-11-03 14:10:43 +02:00
gmega
50ab785662
feat: drop peer on activity timeout 2025-11-03 14:10:42 +02:00
gmega
d91fd053e7
feat: modify retry mechanism; add DHT guard rails; improve block cancellation handling 2025-11-03 14:10:42 +02:00
gmega
d875022ec3
feat: allow futures to be returned out-of-order to decrease memory consumption 2025-11-03 14:10:40 +02:00
gmega
d0466ccf80
feat: remove quadratic joins in cancelBlocks; use SafeAsyncIterator for getBlocks; limit memory usage for fetchBatched when used as prefetcher 2025-11-03 14:10:39 +02:00
gmega
b0b1c45376
fix: refresh timestamp before issuing request to prevent flood of knowledge updates 2025-11-03 14:10:38 +02:00
gmega
3c43d57497
optimize remaining list joins so they're not quadratic 2025-11-03 14:10:36 +02:00
gmega
096eb118f9
replace list operations with sets 2025-11-03 14:10:35 +02:00
gmega
9088566632
update engine tests; add BlockAddress hashing tests 2025-11-03 14:10:34 +02:00
gmega
1135a513d4
feat: cap how many blocks we can pack in a single message 2025-11-03 14:10:33 +02:00
gmega
475d31bef2
feat: add dataset request batching 2025-11-03 14:10:31 +02:00
Chrysostomos Nanakos
baff902137
fix: resolve shared block request cancellation conflicts (#1284) 2025-06-24 15:05:25 +00:00
Marcin Czenko
748830570a
checked exceptions in stores (#1179)
* checked exceptions in stores

* makes asynciter as much exception safe as it gets

* introduce "SafeAsyncIter" that uses Results and limits exceptions to cancellations

* adds {.push raises: [].} to errors

* uses SafeAsyncIter in "listBlocks" and in "getBlockExpirations"

* simplifies safeasynciter (magic of auto)

* gets rid of ugly casts

* tiny fix in the way we create raising futures in tests of safeasynciter

* Removes two more casts caused by using checked exceptions

* adds an extended explanation of one more complex SafeAsyncIter test

* adds missing "finishOnErr" param in slice constructor of SafeAsyncIter

* better fix for "Error: Exception can raise an unlisted exception: Exception" error.

---------

Co-authored-by: Dmitriy Ryajov <dryajov@gmail.com>
2025-05-21 21:17:04 +00:00
munna0908
3a312596bf
deps: upgrade libp2p & constantine (#1167)
* upgrade libp2p and constantine

* fix libp2p update issues

* add missing vendor package

* add missing vendor package
2025-03-20 19:11:00 -07:00
Dmitriy Ryajov
1cac3e2a11
Fix/rework async exceptions (#1130)
* cleanup imports and logs

* add BlockHandle type

* revert deps

* refactor: async error handling and future tracking improvements

- Update async procedures to use explicit raises annotation
- Modify TrackedFutures to handle futures with no raised exceptions
- Replace `asyncSpawn` with explicit future tracking
- Update test suites to use `unittest2`
- Standardize error handling across network and async components
- Remove deprecated error handling patterns

This commit introduces a more robust approach to async error handling and future management, improving type safety and reducing potential runtime errors.

* bump nim-serde

* remove asyncSpawn

* rework background downloads and prefetch

* improve logging

* refactor: enhance async procedures with error handling and raise annotations

* misc cleanup

* misc

* refactor: implement allFinishedFailed to aggregate future results with success and failure tracking

* refactor: update error handling in reader procedures to raise ChunkerError and CancelledError

* refactor: improve error handling in wantListHandler and accountHandler procedures

* refactor: simplify LPStreamReadError creation by consolidating parameters

* refactor: enhance error handling in AsyncStreamWrapper to catch unexpected errors

* refactor: enhance error handling in advertiser and discovery loops to improve resilience

* misc

* refactor: improve code structure and readability

* remove cancellation from addSlotToQueue

* refactor: add assertion for unexpected errors in local store checks

* refactor: prevent tracking of finished futures and improve test assertions

* refactor: improve error handling in local store checks

* remove usage of msgDetail

* feat: add initial implementation of discovery engine and related components

* refactor: improve task scheduling logic by removing unnecessary break statement

* break after scheduling a task

* make taskHandler cancelable

* refactor: update async handlers to raise CancelledError

* refactor(advertiser): streamline error handling and improve task flow in advertise loops

* fix: correct spelling of "divisible" in error messages and comments

* refactor(discovery): simplify discovery task loop and improve error handling

* refactor(engine): filter peers before processing in cancelBlocks procedure
2025-03-13 07:33:15 -07:00
Dmitriy Ryajov
a609baea26
Add basic retry functionality (#1119)
* adding basic retry functionality

* avoid duplicate requests and batch them

* fix cancelling blocks

* properly resolve blocks

* minor cleanup - use `self`

* avoid useless asyncSpawn

* track retries

* limit max inflight and set libp2p maxIncomingStreams

* cleanup

* add basic yield in readLoop

* use tuple instead of object

* cleanup imports and logs

* increase defaults

* wip

* fix prefetch batching

* cleanup

* decrease timeouts to speedup tests

* remove outdated test

* add retry tests

* should track retries

* remove useless test

* use correct block address (index was off by 1)

* remove duplicate noop proc

* add BlockHandle type

* Use BlockHandle type

* add fetchLocal to control batching from local store

* add format target

* revert deps

* adjust quotaMaxBytes

* cleanup imports and logs

* revert deps

* cleanup blocks on cancelled

* terminate erasure and prefetch jobs on stream end

* split storing and retrieving data into separate tests

* track `b.discoveryLoop` future

* misc

* remove useless check
2025-02-24 21:01:23 +00:00
Marcin Czenko
8880ad9cd4
fix linting in "codex/blockexchange/engine/engine.nim" (#1107) 2025-02-11 10:47:25 +00:00
Dmitriy Ryajov
17d3f99f45
use a case-of instead of if for better readability (#1063) 2025-02-06 21:36:35 +00:00
Adam Uhlíř
e5df8c50d3
style: nph formatting (#1067)
* style: nph setup

* chore: formats codex/ and tests/ folder with nph 0.6.1
2025-01-21 20:54:46 +00:00
Ben Bierens
caed3c07a3
Fix sending of WantBlocks messages and tracking of peerWants (#1019)
* sends wantBlock to peers with block. wantHave to everyone else

* Cleanup cheapestPeer. Fixes test for peers lists

* Fixes issue where peerWants are only stored for type wantBlock.

* Review comments by Dmitriy

* consistent logging of addresses

* prevents duplicate scheduling. Fixes cancellation

* fast

* Marks cancel-presence situation with todo comment.

* fixtest: testsales enable logging

* Review by Dmitriy: Remember peerWants only if we don't have them.

* rework `wantListHandler` handling

---------

Co-authored-by: Dmitriy Ryajov <dryajov@gmail.com>
2025-01-09 22:44:02 +00:00
Eric
8645d336ff
refactor(trackedfutures): remove return of future from tracked futures api (#1046)
- cleans up all instances of `.track` to use the `module.trackedfutures.track(future)` procedure, for better readability
- removes the `track` override that is no longer used in the codebase
2024-12-18 07:39:03 +00:00
Eric
b0cc27f563
fix(blockexchange): ensures futures are asyncSpawned (#1037)
* fix(blockexchange): asyncSpawn advertising of local store blocks

* fix(blockexchange): asyncSpawn discoveryQueueLoop

- prevents silently swallowing async errors

* fix(blockexchange): asyncSpawns block exchange tasks

- prevents silently swallowing future exceptions
2024-12-16 06:01:49 +00:00
Ben Bierens
8e29939cf8
Send pluralized wantBlock messages (#1016)
* don't unroll wantCids when sending wantBlock message in blockPresenceHandler

* workaround logging

* Fixes logformatting upraises for sequences.

* Applies upraises rule for setProperty of textmode for sequences.

* Replaces upraises with raises

* Removes redundant log in sendWantHave
2024-12-04 13:33:48 +00:00
Ben Bierens
1e2ad95659
Update advertising (#862)
* Setting up advertiser

* Wires up advertiser

* cleanup

* test compiles

* tests pass

* setting up test for advertiser

* Finishes advertiser tests

* fixes commonstore tests

* Review comments by Giuliano

* Race condition found by Giuliano

* Review comment by Dmitriy

Co-authored-by: Dmitriy Ryajov <dryajov@gmail.com>
Signed-off-by: Ben Bierens <39762930+benbierens@users.noreply.github.com>

* fixes tests

---------

Signed-off-by: Ben Bierens <39762930+benbierens@users.noreply.github.com>
Co-authored-by: Dmitriy Ryajov <dryajov@gmail.com>
2024-08-26 13:18:59 +00:00
Ben Bierens
a0b12e85bf
Update logging for download (#799)
* Updates logging for file upload

* Restores trace for placing block and proof in repo store

* Reduces logging while transmitting blocks

* unnecessary formatter

* Clean up some more download related traces

* much better

* Review comment by dryajov
2024-05-16 10:06:12 -07:00
Ben Bierens
c7bc28d723
Reduce logging during file upload (#792)
* Removes warning

* Updates logging for file upload

* Restores trace for placing block and proof in repo store
2024-04-30 09:31:06 +00:00
Ben Bierens
3041f5ff5f
Announce to DHT only tree and manifest CIDs (#788)
* announce only tree and manifest cids

* wip

* Adds tests for selecting which CIDs are announced

* newline

* Review comments by Tomasz
2024-04-24 07:30:02 +00:00
Giuliano Mega
d33804f700
fixes double lookups when block does not exist (#739)
* fixes double lookups when block does not exist

* handle timeouts on requestBlock

* fix indentation which was causing an integration test to fail
2024-03-15 21:50:56 +00:00
markspanbroek
b3e57a37e2
Wire up prover (#736)
* wire prover into node

* stricter case object checks

* return correct proof

* misc renames

* adding useful traces

* fix nodes and tolerance to match expected params

* format challenges in logs

* add circom compat to solidity groth16 conversion

* update

* bump time to give nodes time to load with all circom artifacts

* misc

* misc

* use correct dataset geometry in erasure

* make errors more searchable

* use parens around `=? (await...)` calls

* styling

* styling

* use push raises

* fix to match constructor arguments

* merge master

* merge master

* integration: fix proof parameters for a test

Increased times due to ZK proof generation.
Increased storage requirement because we're now hosting
5 slots instead of 1.

* sales: calculate initial proof at start of period

reason: this ensures that the period (and therefore
the challenge) doesn't change while we're calculating
the proof

* integration: fix proof parameters for tests

Increased times due to waiting on next period.
Fixed data to be of right size.
Updated expected payout due to hosting 5 slots.

* sales: wait for stable proof challenge

When the block pointer is nearing the
wrap-around point, we wait another period
before calculating a proof.

* fix merge conflict

---------

Co-authored-by: Dmitriy Ryajov <dryajov@gmail.com>
Co-authored-by: Eric <5089238+emizzle@users.noreply.github.com>
2024-03-12 12:10:14 +00:00
Ben Bierens
5e7ce52fbe
Fix block retransmit (#651)
* Applies peer-scoped lock to peer task handler.

* Replace async lock with delete-first approach.

* Cleanup some logging

* Adds inFlight flag to WantListEntry

* Clears inflight flag when local retrieval fails.

* Adds test for setting of in-flight

* Adds test for clearing in-flight when lookup fails

* Review comments by Tomasz

---------

Co-authored-by: gmega <giuliano.mega@gmail.com>
2024-02-29 07:37:12 +00:00
Giuliano Mega
457567531f
Fixes active cancellation for pending want requests (#714)
* add block cancellation support + tests

* tie issueCancellations into resolveBlocks for proper exception tracking, address comments

* pull cancellation as separate primitive in BlockExcNetwork

* use allFutures, rename issueBlockCancellations -> cancelBlocks

* use trc instead of wrn to register send error

* do not log peer IDs
2024-02-22 14:54:45 +00:00