69 Commits

Author SHA1 Message Date
Chrysostomos Nanakos
ff7ac829c4
fix(blockexchange): handle evicted peer in download retry loop
Part of https://github.com/codex-storage/nim-codex/issues/974

Signed-off-by: Chrysostomos Nanakos <chris@include.gr>
2025-11-03 14:10:57 +02:00
Chrysostomos Nanakos
ad4d3b5d62
feat(node): implement sliding window for block prefetching
Now fetchBatched maintains a sliding window of batchSize blocks in-flight.
When 75% complete, adds next chunk to maintain constant window size.
This ensures blocks are already pending or have been fetched when
StoreStream needs them.

Part of https://github.com/codex-storage/nim-codex/issues/974

Signed-off-by: Chrysostomos Nanakos <chris@include.gr>
2025-11-03 14:10:55 +02:00
Chrysostomos Nanakos
aea9337ddc
feat(blockexchange): implement delta WantList updates with batching
Implements delta-based WantList updates to reduce network traffic during
block exchange. Only sends newly added blocks instead of resending the
entire WantList on every refresh.

Also some network related fixes:

- Add TCP_NODELAY flag to prevent Nagle's algorithm delays
- Clear sendConn on stream reset to allow garbage collection
- Improve error handling in NetworkPeer.send()

Part of https://github.com/codex-storage/nim-codex/issues/974

Signed-off-by: Chrysostomos Nanakos <chris@include.gr>
2025-11-03 14:10:55 +02:00
Chrysostomos Nanakos
6f378b3c46
perf: add time-based yielding to hot loops
Part of https://github.com/codex-storage/nim-codex/issues/974

Signed-off-by: Chrysostomos Nanakos <chris@include.gr>
2025-11-03 14:10:54 +02:00
Chrysostomos Nanakos
8812d98271
fix: assign selectPeer field in BlockExcEngine ctor
Part of https://github.com/codex-storage/nim-codex/issues/974

Signed-off-by: Chrysostomos Nanakos <chris@include.gr>
2025-11-03 14:10:53 +02:00
Chrysostomos Nanakos
0636adf5e8
refactor: make markRequested idempotent
Returns false on duplicate marking attempts instead of logging errors,
eliminating duplicate marking loop in blockPresenceHandler and
preventing duplicate block requests across concurrent flows.

Part of https://github.com/codex-storage/nim-codex/issues/974
2025-11-03 14:10:51 +02:00
Chrysostomos Nanakos
dddf7424b4
fix: resolve stuck peer refresh state preventing block discovery
This prevents peers from becoming permanently invisible to block discovery when
they fail to respond to WantHave requests.

Part of https://github.com/codex-storage/nim-codex/issues/974
2025-11-03 14:10:50 +02:00
Chrysostomos Nanakos
4abe8c4d97
feat: add strategic runtime metrics for block exchange monitoring
- Add codex_block_exchange_discovery_requests_total counter to track peer
  discovery frequency
- Add codex_block_exchange_peer_timeouts_total counter to monitor peer
  reliability issues
- Add codex_block_exchange_requests_failed_total counter to track request
  failure rates

Part of https://github.com/codex-storage/nim-codex/issues/974
2025-11-03 14:10:49 +02:00
Chrysostomos Nanakos
9af510316a
perf: optimize block batch size from 500 to 50 blocks per message
Achieves significant memory reduction with equivalent network
performance. The reduced batch size prevents memory pressure
while preserving transfer efficiency, improving overall system
resource utilization.

Part of https://github.com/codex-storage/nim-codex/issues/974
2025-11-03 14:10:49 +02:00
Chrysostomos Nanakos
0fbca17269
feat: implement weighted random peer selection for load balancing
Use probabilistic distribution based on peer quality scores, giving all peers
opportunity while favoring better-performing ones. Selection probability is
inversely proportional to score.

Part of https://github.com/codex-storage/nim-codex/issues/974
2025-11-03 14:10:46 +02:00
gmega
8e8d9f8e60
fix: randomize block refresh time, optimize context store checks 2025-11-03 14:10:46 +02:00
gmega
1fdb14f092
feat: add block knowledge request mechanism, implement tests 2025-11-03 14:10:45 +02:00
gmega
9114a620a3
feat: add stopgap "adaptive" refresh 2025-11-03 14:10:44 +02:00
gmega
908d527dc9
fix: fix testdiscovery so it works with stricter block protocol 2025-11-03 14:10:44 +02:00
gmega
544ec123c7
fix: fix block exchange test to stricter protocol; minor refactor 2025-11-03 14:10:43 +02:00
gmega
50ab785662
feat: drop peer on activity timeout 2025-11-03 14:10:42 +02:00
gmega
d91fd053e7
feat: modify retry mechanism; add DHT guard rails; improve block cancellation handling 2025-11-03 14:10:42 +02:00
gmega
d875022ec3
feat: allow futures to be returned out-of-order to decrease memory consumption 2025-11-03 14:10:40 +02:00
gmega
d0466ccf80
feat: remove quadratic joins in cancelBlocks; use SafeAsyncIterator for getBlocks; limit memory usage for fetchBatched when used as prefetcher 2025-11-03 14:10:39 +02:00
gmega
b0b1c45376
fix: refresh timestamp before issuing request to prevent flood of knowledge updates 2025-11-03 14:10:38 +02:00
gmega
3c43d57497
optimize remaining list joins so they're not quadratic 2025-11-03 14:10:36 +02:00
gmega
096eb118f9
replace list operations with sets 2025-11-03 14:10:35 +02:00
gmega
9088566632
update engine tests; add BlockAddress hashing tests 2025-11-03 14:10:34 +02:00
gmega
1135a513d4
feat: cap how many blocks we can pack in a single message 2025-11-03 14:10:33 +02:00
gmega
475d31bef2
feat: add dataset request batching 2025-11-03 14:10:31 +02:00
Chrysostomos Nanakos
baff902137
fix: resolve shared block request cancellation conflicts (#1284) 2025-06-24 15:05:25 +00:00
Marcin Czenko
748830570a
checked exceptions in stores (#1179)
* checked exceptions in stores

* makes asynciter as much exception safe as it gets

* introduce "SafeAsyncIter" that uses Results and limits exceptions to cancellations

* adds {.push raises: [].} to errors

* uses SafeAsyncIter in "listBlocks" and in "getBlockExpirations"

* simplifies safeasynciter (magic of auto)

* gets rid of ugly casts

* tiny fix in hte way we create raising futures in tests of safeasynciter

* Removes two more casts caused by using checked exceptions

* adds an extended explanation of one more complex SafeAsyncIter test

* adds missing "finishOnErr" param in slice constructor of SafeAsyncIter

* better fix for "Error: Exception can raise an unlisted exception: Exception" error.

---------

Co-authored-by: Dmitriy Ryajov <dryajov@gmail.com>
2025-05-21 21:17:04 +00:00
munna0908
3a312596bf
deps: upgrade libp2p & constantine (#1167)
* upgrade libp2p and constantine

* fix libp2p update issues

* add missing vendor package

* add missing vendor package
2025-03-20 19:11:00 -07:00
Dmitriy Ryajov
1cac3e2a11
Fix/rework async exceptions (#1130)
* cleanup imports and logs

* add BlockHandle type

* revert deps

* refactor: async error handling and future tracking improvements

- Update async procedures to use explicit raises annotation
- Modify TrackedFutures to handle futures with no raised exceptions
- Replace `asyncSpawn` with explicit future tracking
- Update test suites to use `unittest2`
- Standardize error handling across network and async components
- Remove deprecated error handling patterns

This commit introduces a more robust approach to async error handling and future management, improving type safety and reducing potential runtime errors.

* bump nim-serde

* remove asyncSpawn

* rework background downloads and prefetch

* imporove logging

* refactor: enhance async procedures with error handling and raise annotations

* misc cleanup

* misc

* refactor: implement allFinishedFailed to aggregate future results with success and failure tracking

* refactor: update error handling in reader procedures to raise ChunkerError and CancelledError

* refactor: improve error handling in wantListHandler and accountHandler procedures

* refactor: simplify LPStreamReadError creation by consolidating parameters

* refactor: enhance error handling in AsyncStreamWrapper to catch unexpected errors

* refactor: enhance error handling in advertiser and discovery loops to improve resilience

* misc

* refactor: improve code structure and readability

* remove cancellation from addSlotToQueue

* refactor: add assertion for unexpected errors in local store checks

* refactor: prevent tracking of finished futures and improve test assertions

* refactor: improve error handling in local store checks

* remove usage of msgDetail

* feat: add initial implementation of discovery engine and related components

* refactor: improve task scheduling logic by removing unnecessary break statement

* break after scheduling a task

* make taskHandler cancelable

* refactor: update async handlers to raise CancelledError

* refactor(advertiser): streamline error handling and improve task flow in advertise loops

* fix: correct spelling of "divisible" in error messages and comments

* refactor(discovery): simplify discovery task loop and improve error handling

* refactor(engine): filter peers before processing in cancelBlocks procedure
2025-03-13 07:33:15 -07:00
Dmitriy Ryajov
a609baea26
Add basic retry functionality (#1119)
* adding basic retry functionality

* avoid duplicate requests and batch them

* fix cancelling blocks

* properly resolve blocks

* minor cleanup - use `self`

* avoid useless asyncSpawn

* track retries

* limit max inflight and set libp2p maxIncomingStreams

* cleanup

* add basic yield in readLoop

* use tuple instead of object

* cleanup imports and logs

* increase defaults

* wip

* fix prefetch batching

* cleanup

* decrease timeouts to speedup tests

* remove outdated test

* add retry tests

* should track retries

* remove useless test

* use correct block address (index was off by 1)

* remove duplicate noop proc

* add BlockHandle type

* Use BlockHandle type

* add fetchLocal to control batching from local store

* add format target

* revert deps

* adjust quotaMaxBytes

* cleanup imports and logs

* revert deps

* cleanup blocks on cancelled

* terminate erasure and prefetch jobs on stream end

* split storing and retrieving data into separate tests

* track `b.discoveryLoop` future

* misc

* remove useless check
2025-02-24 21:01:23 +00:00
Marcin Czenko
8880ad9cd4
fix linting in "codex/blockexchange/engine/engine.nim" (#1107) 2025-02-11 10:47:25 +00:00
Dmitriy Ryajov
17d3f99f45
use a case-of instead of if for better readability (#1063) 2025-02-06 21:36:35 +00:00
Adam Uhlíř
e5df8c50d3
style: nph formatting (#1067)
* style: nph setup

* chore: formates codex/ and tests/ folder with nph 0.6.1
2025-01-21 20:54:46 +00:00
Ben Bierens
caed3c07a3
Fix sending of WantBlocks messages and tracking of peerWants (#1019)
* sends wantBlock to peers with block. wantHave to everyone else

* Cleanup cheapestPeer. Fixes test for peers lists

* Fixes issue where peerWants are only stored for type wantBlock.

* Review comments by Dmitriy

* consistent logging of addresses

* prevents duplicate scheduling. Fixes cancellation

* fast

* Marks cancel-presence situation with todo comment.

* fixtest: testsales enable logging

* Review by Dmitriy: Remember peerWants only if we don't have them.

* rework `wantListHandler` handling

---------

Co-authored-by: Dmitriy Ryajov <dryajov@gmail.com>
2025-01-09 22:44:02 +00:00
Eric
8645d336ff
refactor(trackedfutures): remove return of future from tracked futures api (#1046)
- cleans up all instances of `.track` to use the `module.trackedfutures.track(future)` procedure, for better readability
- removes the `track` override that is no longer used in the codebase
2024-12-18 07:39:03 +00:00
Eric
b0cc27f563
fix(blockexchange): ensures futures are asyncSpawned (#1037)
* fix(blockexchange): asyncSpawn advertising of local store blocks

* fix(blockexchange): asyncSpawn discoveryQueueLoop

- prevents silently swallowing async errors

* fix(blockexchange): asyncSpawns block exchange tasks

- prevents silently swallow future exceptions
2024-12-16 06:01:49 +00:00
Ben Bierens
8e29939cf8
Send pluralized wantBlock messages (#1016)
* don't unroll wantCids when sending wantBlock message in blockPresenceHandler

* workaround logging

* Fixes logformatting upraises for sequences.

* Applies upraises rule for setProperty of textmode for sequences.

* Replaces upraises with raises

* Removes redundant log in sendWantHave
2024-12-04 13:33:48 +00:00
Ben Bierens
1e2ad95659
Update advertising (#862)
* Setting up advertiser

* Wires up advertiser

* cleanup

* test compiles

* tests pass

* setting up test for advertiser

* Finishes advertiser tests

* fixes commonstore tests

* Review comments by Giuliano

* Race condition found by Giuliano

* Review comment by Dmitriy

Co-authored-by: Dmitriy Ryajov <dryajov@gmail.com>
Signed-off-by: Ben Bierens <39762930+benbierens@users.noreply.github.com>

* fixes tests

---------

Signed-off-by: Ben Bierens <39762930+benbierens@users.noreply.github.com>
Co-authored-by: Dmitriy Ryajov <dryajov@gmail.com>
2024-08-26 13:18:59 +00:00
Ben Bierens
a0b12e85bf
Update logging for download (#799)
* Updates logging for file upload

* Restores trace for placing block and proof in repo store

* Reduces logging while transmitting blocks

* unnecessary formatter

* Clean up some more download related traces

* much better

* Review comment by dryajov
2024-05-16 10:06:12 -07:00
Ben Bierens
c7bc28d723
Reduce logging during file upload (#792)
* Removes warning

* Updates logging for file upload

* Restores trace for placing block and proof in repo store
2024-04-30 09:31:06 +00:00
Ben Bierens
3041f5ff5f
Announce to DHT only tree and manifest CIDs (#788)
* announce only tree and manifest cids

* wip

* Adds tests for selecting which CIDs are announced

* newline

* Review comments by Tomasz
2024-04-24 07:30:02 +00:00
Giuliano Mega
d33804f700
fixes double lookups when block does not exist (#739)
* fixes double lookups when block does not exist

* handle timeouts on requestBlock

* fix indentation which was causing an integration test to fail
2024-03-15 21:50:56 +00:00
markspanbroek
b3e57a37e2
Wire up prover (#736)
* wire prover into node

* stricter case object checks

* return correct proof

* misc renames

* adding usefull traces

* fix nodes and tolerance to match expected params

* format challenges in logs

* add circom compat to solidity groth16 convertion

* update

* bump time to give nodes time to load with all circom artifacts

* misc

* misc

* use correct dataset geometry in erasure

* make errors more searchable

* use parens around `=? (await...)` calls

* styling

* styling

* use push raises

* fix to match constructor arguments

* merge master

* merge master

* integration: fix proof parameters for a test

Increased times due to ZK proof generation.
Increased storage requirement because we're now hosting
5 slots instead of 1.

* sales: calculate initial proof at start of period

reason: this ensures that the period (and therefore
the challenge) doesn't change while we're calculating
the proof

* integration: fix proof parameters for tests

Increased times due to waiting on next period.
Fixed data to be of right size.
Updated expected payout due to hosting 5 slots.

* sales: wait for stable proof challenge

When the block pointer is nearing the
wrap-around point, we wait another period
before calculating a proof.

* fix merge conflict

---------

Co-authored-by: Dmitriy Ryajov <dryajov@gmail.com>
Co-authored-by: Eric <5089238+emizzle@users.noreply.github.com>
2024-03-12 12:10:14 +00:00
Ben Bierens
5e7ce52fbe
Fix block retransmit (#651)
* Applies peer-scoped lock to peer task handler.

* Replace async lock with delete-first approach.

* Cleanup some logging

* Adds inFlight flag to WantListEntry

* Clears inflight flag when local retrieval fails.

* Adds test for setting of in-flight

* Adds test for clearing in-flight when lookup fails

* Review comments by Tomasz

---------

Co-authored-by: gmega <giuliano.mega@gmail.com>
2024-02-29 07:37:12 +00:00
Giuliano Mega
457567531f
Fixes active cancellation for pending want requests (#714)
* add block cancellation support + tests

* tie issueCancellations into resolveBlocks for proper exception tracking, address comments

* pull cancellation as separate primitive in BlockExcNetwork

* use allFutures, rename issueBlockCancellations -> cancelBlocks

* use trc instead of wrn to register send error

* do not log peer IDs
2024-02-22 14:54:45 +00:00
Eric
de88fd2c53
feat: create logging proxy (#663)
* implement a logging proxy

The logging proxy:
- prevents the need to import chronicles (as well as export except toJson),
- prevents the need to override `writeValue` or use or import nim-json-seralization elsewhere in the codebase, allowing for sole use of utils/json for de/serialization,
- and handles json formatting correctly in chronicles json sinks

* Rename logging -> logutils to avoid ambiguity with common names

* clean up

* add setProperty for JsonRecord, remove nim-json-serialization conflict

* Allow specifying textlines and json format separately

Not specifying a LogFormat will apply the formatting to both textlines and json sinks.

Specifying a LogFormat will apply the formatting to only that sink.

* remove unneeded usages of std/json

We only need to import utils/json instead of std/json

* move serialization from rest/json to utils/json so it can be shared

* fix NoColors ambiguity

Was causing unit tests to fail on Windows.

* Remove nre usage to fix Windows error

Windows was erroring with `could not load: pcre64.dll`. Instead of fixing that error, remove the pcre usage :)

* Add logutils module doc

* Shorten logutils.formatIt for `NBytes`

Both json and textlines formatIt were not needed, and could be combined into one formatIt

* remove debug integration test config

debug output and logformat of json for integration test logs

* Use ## module doc to support docgen

* bump nim-poseidon2 to export fromBytes

Before the changes in this branch, fromBytes was likely being resolved by nim-stew, or other dependency. With the changes in this branch, that dependency was removed and fromBytes could no longer be resolved. By exporting fromBytes from nim-poseidon, the correct resolution is now happening.

* fixes to get compiling after rebasing master

* Add support for Result types being logged using formatIt
2024-01-22 23:35:03 -08:00
Dmitriy Ryajov
72da534856
Wire sampler (#676)
* Setting up testfixture for proof datasampler

* Sets up calculating number of cells in a slot

* Sets up tests for bitwise modulo

* Implements cell index collection

* setting up slot blocks module

* Implements getting treeCID from slot

* implements getting slot blocks by index

* Implements out-of-range check for slot index

* cleanup

* Sets up getting sample from block

* Implements selecting a cell sample from a block

* Implements building a minitree for block cells

* Adds method to get dataset block index from slot block index

* It's running

* splits up indexing

* almost there

* Fixes test. Implementation is now functional

* Refactoring to object-oriented

* Cleanup

* Lining up output type with updated reference code.

* setting up

* Updates expected samples

* Updates proof checking test to match new format

* move builder to own dir

* move sampler to own dir

* fix paths

* various changes to add support for the sampler

* wip sampler implementation

* don't use upraises

* wip sampler integration

* misc

* move tests around

* Various fixes to select correct slot and block index

* removing old tests

* cleanup

* misc

fix tests that work with correct cell indices

* remove unused file

* fixup logging

* add logscope

* truncate entropy to 31 bytes, otherwise it might be > than mod

* forwar getCidAndProof to local store

* misc

* Adds missing test for initial-proving state

* reverting back to correct slot/block indexing

* fix tests for revert

* misc

* misc

---------

Co-authored-by: benbierens <thatbenbierens@gmail.com>
2024-01-17 11:24:34 -08:00
Dmitriy Ryajov
fffb674bba
Integrate slot builder (#666)
* rework merkle tree support

* rename merkletree -> codexmerkletree

* treed and proof encoding/decoding

* style

* adding codex merkle and coders tests

* use default hash codec

* proof size changed

* add from nodes test

* shorte file names

* wip poseidon tree

* shorten file names

* root returns a result

* import poseidon tests

* fix merge issues and cleanup a few warnings

* setting up slot builder

* Getting cids in slot

* ensures blocks are devisable by number of slots

* wip

* Implements indexing strategies

* Swaps in indexing strategy into erasure.

* wires slot and indexing tests up

* Fixes issue where indexing strategy stepped gives wrong values for smallest of ranges

* debugs indexing strategies

* Can select slot blocks

* finding number of pad cells

* Implements building slot tree

* finishes implementing slot builder

* Adds check that block size is a multiple of cell size

* Cleanup slotbuilder

* Review comments by Tomasz

* Fixes issue where ecK was used as numberOfSlots.

* rework merkle tree support

* deps

* rename merkletree -> codexmerkletree

* treed and proof encoding/decoding

* style

* adding codex merkle and coders tests

* remove new codecs for now

* proof size changed

* add from nodes test

* shorte file names

* wip poseidon tree

* shorten file names

* fix bad `elements` iter

* bump

* bump

* wip

* reworking slotbuilder

* move out of manifest

* expose getCidAndProof

* import index strat...

* remove getMHash

* remove unused artifacts

* alias zero

* add digest for multihash

* merge issues

* remove unused hashes

* add option to result converter

* misc

* fix tests

* add helper to derive EC block count

* rename method

* misc

* bump

* extract slot root building into own proc

* revert to manifest to accessor

---------

Co-authored-by: benbierens <thatbenbierens@gmail.com>
2024-01-08 14:52:46 -08:00
Dmitriy Ryajov
52c5578c46
Rework merkle tree (#654)
* rework merkle tree support

* deps

* rename merkletree -> codexmerkletree

* treed and proof encoding/decoding

* small change to invoke proof verification

* rename merkletree to codexmerkletree

* style

* adding codex merkle and coders tests

* fixup imports

* remove new codecs for now

* bump deps

* adding trace statement

* properly serde of manifest block codecs

* use default hash codec

* add more trace logging to aid debugging

* misc

* remove double import

* revert un-needded change

* proof size changed

* bump poseidon2

* add from nodes test

* shorte file names

* remove upraises

* wip poseidon tree

* adjust file names

* misc

* shorten file names

* fix bad `elements` iter

* don't do asserts

* add fromNodes and converters

* root and getProof now return result

* add poseidon2 tree tests

* root now returns result

* misc

* had to make merkletree a ref, because nim blows up otherwise

* root returns a result

* root returns a result

* import poseidon tests

* bump

* merkle poseidon2 digest

* misc

* add merkle digest tests

* bump

* don't use checksuite

* Update tests/codex/merkletree/generictreetests.nim

Co-authored-by: markspanbroek <mark@spanbroek.net>
Signed-off-by: Dmitriy Ryajov <dryajov@gmail.com>

* Update codex/merkletree/merkletree.nim

Co-authored-by: markspanbroek <mark@spanbroek.net>
Signed-off-by: Dmitriy Ryajov <dryajov@gmail.com>

* Update codex/merkletree/merkletree.nim

Co-authored-by: markspanbroek <mark@spanbroek.net>
Signed-off-by: Dmitriy Ryajov <dryajov@gmail.com>

* Update tests/codex/merkletree/generictreetests.nim

Co-authored-by: markspanbroek <mark@spanbroek.net>
Signed-off-by: Dmitriy Ryajov <dryajov@gmail.com>

* missing return

* make toBool private (it's still needed otherwise comparison won't work)

* added `digestTree` that returns a tree and `digest` for root

* test against both poseidon trees - codex and poseidon2

* shorten merkle tree names

* don't compare trees - it's going to be too slow

* move comparison to mekrle helper

* remove merkle utils

---------

Signed-off-by: Dmitriy Ryajov <dryajov@gmail.com>
Co-authored-by: markspanbroek <mark@spanbroek.net>
2023-12-21 06:41:43 +00:00
Eric
f293082ae9
Reverts logging-proxy, commit 27f585eb6f380a6de2fbbef721fdf1a16bd5f600 (#660) 2023-12-20 13:24:40 +11:00