* fix(sales): handle cancellation of slot cleanup
Ensures that processing slots from the slot queue
continues even when cleanup of a slot is cancelled.
Co-Authored-By: Eric <5089238+emizzle@users.noreply.github.com>
* chore(reservations): add more `raises` annotations
* Fix cleanup cancellation
* Add remove-agent to trackedfutures instead of the cleanup function
* Increase the timeout to match the request expiry
* Enable logs to debug on CI
* Remove useless except and do not return when adding an item back to the slot queue fails
* Reduce poll interval to detect sale cancelled state
* Avoid cancelling cleanup routine
* Do not cancel creating reservation in order to avoid inconsistent state
* Remove useless try except
---------
Co-authored-by: Mark Spanbroek <mark@spanbroek.net>
Co-authored-by: Eric <5089238+emizzle@users.noreply.github.com>
* checked exceptions in stores
* makes asynciter as exception-safe as it gets
* introduce "SafeAsyncIter" that uses Results and limits exceptions to cancellations
* adds {.push raises: [].} to errors
* uses SafeAsyncIter in "listBlocks" and in "getBlockExpirations"
* simplifies safeasynciter (magic of auto)
* gets rid of ugly casts
* tiny fix in the way we create raising futures in tests of safeasynciter
* Removes two more casts caused by using checked exceptions
* adds an extended explanation of one more complex SafeAsyncIter test
* adds missing "finishOnErr" param in slice constructor of SafeAsyncIter
* better fix for "Error: Exception can raise an unlisted exception: Exception" error.
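As a rough, self-contained sketch of the idea behind `SafeAsyncIter` described above (not the project's actual type): failures travel as `Result` values, so consumers only ever need to deal with cancellation.

```nim
import pkg/chronos
import pkg/results

# each step yields a Result, so only CancelledError can escape the proc
proc nextBlock(i: int): Future[Result[string, string]] {.
    async: (raises: [CancelledError]).} =
  await sleepAsync(1.milliseconds)
  if i mod 2 == 0:
    return Result[string, string].ok("block " & $i)
  else:
    return Result[string, string].err("failed to read block " & $i)

when isMainModule:
  for i in 0 ..< 4:
    let res = waitFor nextBlock(i)
    if res.isOk:
      echo res.value
    else:
      echo "skipping: ", res.error
```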
---------
Co-authored-by: Dmitriy Ryajov <dryajov@gmail.com>
* cleanup imports and logs
* add BlockHandle type
* revert deps
* refactor: async error handling and future tracking improvements
- Update async procedures to use explicit raises annotation
- Modify TrackedFutures to handle futures with no raised exceptions
- Replace `asyncSpawn` with explicit future tracking
- Update test suites to use `unittest2`
- Standardize error handling across network and async components
- Remove deprecated error handling patterns
This commit introduces a more robust approach to async error handling and future management, improving type safety and reducing potential runtime errors.
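A minimal sketch of the pattern these commits describe (illustrative only, not the project's code): an explicit raises annotation so that nothing but cancellation can escape an async loop, and a kept future handle instead of `asyncSpawn` so the loop can be tracked and cancelled.

```nim
import pkg/chronos

proc pollLoop(): Future[void] {.async: (raises: [CancelledError]).} =
  while true:
    try:
      await sleepAsync(100.milliseconds)
      # ... one unit of work ...
    except CancelledError as e:
      raise e                  # cancellation must propagate so the loop stops
    except CatchableError as e:
      echo "recovered from unexpected error: ", e.msg

when isMainModule:
  let fut = pollLoop()             # keep the future instead of asyncSpawn-ing it
  waitFor fut.cancelAndWait()      # so it can be cancelled on stop
```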
* bump nim-serde
* remove asyncSpawn
* rework background downloads and prefetch
* improve logging
* refactor: enhance async procedures with error handling and raise annotations
* misc cleanup
* misc
* refactor: implement allFinishedFailed to aggregate future results with success and failure tracking
* refactor: update error handling in reader procedures to raise ChunkerError and CancelledError
* refactor: improve error handling in wantListHandler and accountHandler procedures
* refactor: simplify LPStreamReadError creation by consolidating parameters
* refactor: enhance error handling in AsyncStreamWrapper to catch unexpected errors
* refactor: enhance error handling in advertiser and discovery loops to improve resilience
* misc
* refactor: improve code structure and readability
* remove cancellation from addSlotToQueue
* refactor: add assertion for unexpected errors in local store checks
* refactor: prevent tracking of finished futures and improve test assertions
* refactor: improve error handling in local store checks
* remove usage of msgDetail
* feat: add initial implementation of discovery engine and related components
* refactor: improve task scheduling logic by removing unnecessary break statement
* break after scheduling a task
* make taskHandler cancelable
* refactor: update async handlers to raise CancelledError
* refactor(advertiser): streamline error handling and improve task flow in advertise loops
* fix: correct spelling of "divisible" in error messages and comments
* refactor(discovery): simplify discovery task loop and improve error handling
* refactor(engine): filter peers before processing in cancelBlocks procedure
* adding basic retry functionality
* avoid duplicate requests and batch them
* fix cancelling blocks
* properly resolve blocks
* minor cleanup - use `self`
* avoid useless asyncSpawn
* track retries
* limit max inflight and set libp2p maxIncomingStreams
* cleanup
* add basic yield in readLoop
* use tuple instead of object
* cleanup imports and logs
* increase defaults
* wip
* fix prefetch batching
* cleanup
* decrease timeouts to speedup tests
* remove outdated test
* add retry tests
* should track retries
* remove useless test
* use correct block address (index was off by 1)
* remove duplicate noop proc
* add BlockHandle type
* Use BlockHandle type
* add fetchLocal to control batching from local store
* add format target
* revert deps
* adjust quotaMaxBytes
* cleanup imports and logs
* revert deps
* cleanup blocks on cancelled
* terminate erasure and prefetch jobs on stream end
* split storing and retrieving data into separate tests
* track `b.discoveryLoop` future
* misc
* remove useless check
* fix(statemachine): do not raise from state.run
* fix rebase
* fix exception handling in SaleProvingSimulated.prove
- re-raise CancelledError
- don't return State on CatchableError
- expect the Proofs_InvalidProof custom error instead of checking a string
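A small illustration of the last bullet (the proc and the error definition here are stand-ins): matching the error by type removes the brittle string comparison.

```nim
type Proofs_InvalidProof = object of CatchableError   # stand-in for the contract error

proc submitProof(valid: bool) {.raises: [Proofs_InvalidProof].} =
  if not valid:
    raise newException(Proofs_InvalidProof, "proof verification failed")

try:
  submitProof(false)
except Proofs_InvalidProof:
  echo "rejected proof recognised by its type, no string matching needed"
```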
* asyncSpawn salesagent.onCancelled
This was swallowing a KeyError in one of the tests (fixed in the previous commit)
* remove error handling states in asyncstatemachine
* revert unneeded changes
* formatting
* PR feedback, logging updates
* Move to version 2.0.6
* Update nim-confutils submodule to latest version
* Update dependencies
* Update Nim version to 2.0.12
* Add gcsafe pragma
* Add missing import
* Update specific conf for Nim 2.x
* Fix method signatures
* Revert erasure coding attempt to fix bug
* More gcsafe pragma
* Duplicate code from libp2p because it is not exported anymore
* Fix camelcase function names
* Use alreadySeen because need is not a bool anymore
* newLPStreamReadError does not exist anymore so use another error
* Replace ValidIpAddress by IpAddress
* Add gcsafe pragma
* Restore maintenance parameter deleted by mistake when removing the erasure coding fix attempt code
* Update method signatures
* Copy LPStreamReadError code from libp2p which was removed
* Fix camel case
* Fix enums in tests
* Fix camel case
* Extract node components to a variable to make Nim 2 happy
* Update the tests using ValidIpAddress to IpAddress
* Fix cast for value which is already an option
* Set nim version to 2.0.x for CI
* Set nim version to 2.0.x for CI
* Move to miniupnp version 2.2.4 to avoid symlink error
* Set core.symlinks to false for Windows for miniupnp >= 2.2.5 support
* Update to Nim 2.0.14
* Update CI nim versions to 2.0.14
* Try with GCC 14
* Replace apt-fast by apt-get
* Update ubuntu runner to latest
* Use Ubuntu 20.04 for coverage
* Disable CI cache for coverage
* Add coverage property description
* Remove commented test
* Check the node value of seen instead of using alreadySeen
* Fix the merge. The taskpool work was reverted.
* Update nim-ethers submodule
* Remove deprecated ValidIpAddress. Fix missing case and imports.
* Fix a weird issue where nim-confutils cannot find NatAny
* Fix tests and remove useless static keyword
- cleans up all instances of `.track` to use the `module.trackedfutures.track(future)` procedure, for better readability
- removes the `track` override that is no longer used in the codebase
- adds a break in scheduler when CancelledError is caught
- tracks asyncSpawned state.run, so that it can be cancelled during stop
- removes usages of `then`
- ensures that no exceptions are leaked from async procs
- removes usage of `then`, simplifying the logic, and allowing `then` to be removed completely
- updates annotations to reflect that all procs (sync and async) raise no exceptions
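A self-contained sketch of the tracking shape referred to above; the real TrackedFutures lives in the codebase, and this stand-in only shows the `trackedFutures.track(future)` call and cancellation during stop.

```nim
import pkg/chronos

type TrackedFutures = ref object
  futures: seq[FutureBase]

proc track(self: TrackedFutures, fut: FutureBase) =
  if not fut.finished():            # never track already-finished futures
    self.futures.add fut

proc cancelTracked(self: TrackedFutures) {.async.} =
  for fut in self.futures:
    await fut.cancelAndWait()       # cancel everything during stop
  self.futures.setLen(0)

when isMainModule:
  let tracked = TrackedFutures()
  tracked.track(sleepAsync(1.seconds))
  waitFor tracked.cancelTracked()
```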
* chore: bump dependencies, including nim-ethers with chronos v4 support
Bumps the following dependencies:
- nim-ethers to commit 507ac6a4cc71cec9be7693fa393db4a49b52baf9, which contains a pinned nim-eth version. This is to be replaced by a versioned library, so it will be pinned to a particular version. This version of ethers contains a crucial fix for nonce management, which was causing issues in the Codex testnet.
- nim-json-rpc to v0.4.4
- nim-json-serialization to v0.2.8
- nim-serde to v1.2.2
- nim-serialization to v0.2.4
Currently, one of the integration tests is failing.
* fix integration test
- When a state's run was cancelled, it was being caught as an error due to catching all CatchableErrors. This caused a state transition to SaleErrored; however, cancellation of run is not actually an error. Handling this correctly fixed the issue.
- Stopping of the clock was moved to after `HostInteractions` (sales), which avoided an assertion failure around getting the time when the clock was not started.
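Roughly, the distinction from the first bullet looks like this (outcome names are illustrative): cancellation of `run` ends the sale quietly instead of transitioning to an error state.

```nim
import pkg/chronos

type SaleOutcome = enum
  SaleFinished, SaleCancelled, SaleErrored   # illustrative outcome values

proc superviseRun(run: Future[void]): Future[SaleOutcome] {.async.} =
  try:
    await run
    return SaleFinished
  except CancelledError:
    return SaleCancelled    # normal shutdown path, not an error
  except CatchableError:
    return SaleErrored      # only genuine failures reach the error state
```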
* bump ethers to include nonce fix and filter not found fix
* bump ethers: fixes missing symbol not exported in ethers
* Fix cirdl test imports/exports
* Debugging in ci
* Handle CancelledErrors for state.run in one place only
* Rename `config` to `configuration`
There was a symbol clash preventing compilation and it was easiest to rename `config` to `configuration` in the contracts. Not even remotely ideal, but it was the only way.
* bump ethers to latest
Prevents an issue where the `JsonNode.items` symbol could not be found
* More changes to support `config` -> `configuration`
* cleanup
* testing to see if this fixes failure in ci
* bumps contracts
- ensures slot is free before allowing reservation
- renames config to configuration to avoid symbol clash
* Avoid cancelling states when slot is filled
* improve logging
Improves logging for situations where a Sale should be ignored instead of being considered an error, including when reservation is not allowed and when a slot was filled by another host.
* remove onSlotFilled unit tests from states
* Rework AsyncIter
* Add tests for finishing iter on error
* Improved error handling and additional tests
* Use new style of constructors
* Handle future cancellation
* Docs for constructors
* json > nim-serde bump
Should wait until serde is integrated into nim-ethers before making these changes, as fewer import exceptions will be required.
* bump nim-serde
* change func to proc due to chronicles side effects
* import serde into utils/json, use as proxy
import nim-serde into utils/json and use utils/json as a proxy for serde functions, including overloading `%` and `fromJson` for application types.
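The proxy shape described above, sketched with std/json purely for illustration (the real utils/json wraps nim-serde): one module owns the `%` and `fromJson` overloads for application types, and everything else imports only that module.

```nim
import std/json

type Cid = distinct string    # illustrative application type

proc `%`(cid: Cid): JsonNode =
  % cid.string                # serialize via the underlying string

proc fromJson(_: type Cid, node: JsonNode): Cid =
  Cid(node.getStr)

when isMainModule:
  let node = %Cid("zb2rhe-example")
  echo node                             # "zb2rhe-example"
  echo Cid.fromJson(node).string        # round-trips back to the string
```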
* update tests to use serde
* bump serde to latest
* remove testjson -- no longer needed
* bump serde in nimble
* updates to reconcile rebase with master
* wire prover into node
* stricter case object checks
* return correct proof
* misc renames
* adding useful traces
* fix nodes and tolerance to match expected params
* format challenges in logs
* add circom compat to solidity groth16 conversion
* update
* bump time to give nodes time to load with all circom artifacts
* misc
* misc
* use correct dataset geometry in erasure
* make errors more searchable
* use parens around `=? (await...)` calls
* styling
* styling
* use push raises
* fix to match constructor arguments
* merge master
* merge master
* integration: fix proof parameters for a test
Increased times due to ZK proof generation.
Increased storage requirement because we're now hosting
5 slots instead of 1.
* sales: calculate initial proof at start of period
reason: this ensures that the period (and therefore
the challenge) doesn't change while we're calculating
the proof
* integration: fix proof parameters for tests
Increased times due to waiting on next period.
Fixed data to be of right size.
Updated expected payout due to hosting 5 slots.
* sales: wait for stable proof challenge
When the block pointer is nearing the
wrap-around point, we wait another period
before calculating a proof.
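A hedged sketch of the waiting rule above; every name and number here is hypothetical, and it only shows the shape of "wait another period while the pointer is near the wrap-around point".

```nim
import pkg/chronos

const unstableMargin = 64      # hypothetical margin near the wrap-around point

proc pointerIsStable(pointer: int): bool =
  # assume the pointer wraps at 256; values inside the margin count as unstable
  pointer < 256 - unstableMargin

proc waitForStableChallenge(getPointer: proc(): int {.gcsafe, raises: [].},
                            period: Duration) {.async.} =
  while not pointerIsStable(getPointer()):
    await sleepAsync(period)   # wait another period, then re-check
```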
* fix merge conflict
---------
Co-authored-by: Dmitriy Ryajov <dryajov@gmail.com>
Co-authored-by: Eric <5089238+emizzle@users.noreply.github.com>
* implement a logging proxy
The logging proxy:
- prevents the need to import chronicles (as well as `export chronicles except toJson`),
- prevents the need to override `writeValue` or to use or import nim-json-serialization elsewhere in the codebase, allowing for sole use of utils/json for de/serialization,
- and handles json formatting correctly in chronicles json sinks
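The project's logutils builds on chronicles; as a rough sketch of the underlying mechanism it wraps (the type and rendering here are illustrative), a single `formatIt` decides how a type renders in every sink:

```nim
import pkg/chronicles

type NBytes = distinct int64    # illustrative stand-in

# one formatIt definition is reused by both textlines and json sinks
chronicles.formatIt(NBytes):
  $int64(it) & " bytes"

when isMainModule:
  info "block stored", size = NBytes(65536)
```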
* Rename logging -> logutils to avoid ambiguity with common names
* clean up
* add setProperty for JsonRecord, remove nim-json-serialization conflict
* Allow specifying textlines and json format separately
Not specifying a LogFormat will apply the formatting to both textlines and json sinks.
Specifying a LogFormat will apply the formatting to only that sink.
* remove unneeded usages of std/json
We only need to import utils/json instead of std/json
* move serialization from rest/json to utils/json so it can be shared
* fix NoColors ambiguity
Was causing unit tests to fail on Windows.
* Remove nre usage to fix Windows error
Windows was erroring with `could not load: pcre64.dll`. Instead of fixing that error, remove the pcre usage :)
* Add logutils module doc
* Shorten logutils.formatIt for `NBytes`
Both json and textlines formatIt were not needed, and could be combined into one formatIt
* remove debug integration test config
debug output and logformat of json for integration test logs
* Use ## module doc to support docgen
* bump nim-poseidon2 to export fromBytes
Before the changes in this branch, fromBytes was likely being resolved by nim-stew or another dependency. With the changes in this branch, that dependency was removed and fromBytes could no longer be resolved. By exporting fromBytes from nim-poseidon2, the correct resolution now happens.
* fixes to get compiling after rebasing master
* Add support for Result types being logged using formatIt
* Setting up testfixture for proof datasampler
* Sets up calculating number of cells in a slot
* Sets up tests for bitwise modulo
* Implements cell index collection
* setting up slot blocks module
* Implements getting treeCID from slot
* implements getting slot blocks by index
* Implements out-of-range check for slot index
* cleanup
* Sets up getting sample from block
* Implements selecting a cell sample from a block
* Implements building a minitree for block cells
* Adds method to get dataset block index from slot block index
* It's running
* splits up indexing
* almost there
* Fixes test. Implementation is now functional
* Refactoring to object-oriented
* Cleanup
* Lining up output type with updated reference code.
* setting up
* Updates expected samples
* Updates proof checking test to match new format
* move builder to own dir
* move sampler to own dir
* fix paths
* various changes to add support for the sampler
* wip sampler implementation
* don't use upraises
* wip sampler integration
* misc
* move tests around
* Various fixes to select correct slot and block index
* removing old tests
* cleanup
* misc
fix tests that work with correct cell indices
* remove unused file
* fixup logging
* add logscope
* truncate entropy to 31 bytes, otherwise it might be greater than the modulus
* forward getCidAndProof to local store
* misc
* Adds missing test for initial-proving state
* reverting back to correct slot/block indexing
* fix tests for revert
* misc
* misc
---------
Co-authored-by: benbierens <thatbenbierens@gmail.com>
* rework merkle tree support
* rename merkletree -> codexmerkletree
* tree and proof encoding/decoding
* style
* adding codex merkle and coders tests
* use default hash codec
* proof size changed
* add from nodes test
* shorten file names
* wip poseidon tree
* shorten file names
* root returns a result
* import poseidon tests
* fix merge issues and cleanup a few warnings
* setting up slot builder
* Getting cids in slot
* ensures blocks are divisible by the number of slots
* wip
* Implements indexing strategies
* Swaps the indexing strategy into erasure.
* wires slot and indexing tests up
* Fixes issue where the stepped indexing strategy gives wrong values for the smallest of ranges
* debugs indexing strategies
* Can select slot blocks
* finding number of pad cells
* Implements building slot tree
* finishes implementing slot builder
* Adds check that block size is a multiple of cell size
* Cleanup slotbuilder
* Review comments by Tomasz
* Fixes issue where ecK was used as numberOfSlots.
* rework merkle tree support
* deps
* rename merkletree -> codexmerkletree
* tree and proof encoding/decoding
* style
* adding codex merkle and coders tests
* remove new codecs for now
* proof size changed
* add from nodes test
* shorten file names
* wip poseidon tree
* shorten file names
* fix bad `elements` iter
* bump
* bump
* wip
* reworking slotbuilder
* move out of manifest
* expose getCidAndProof
* import index strat...
* remove getMHash
* remove unused artifacts
* alias zero
* add digest for multihash
* merge issues
* remove unused hashes
* add option to result converter
* misc
* fix tests
* add helper to derive EC block count
* rename method
* misc
* bump
* extract slot root building into own proc
* revert to manifest accessor
---------
Co-authored-by: benbierens <thatbenbierens@gmail.com>
* implement a logging proxy
The logging proxy:
- prevents the need to import chronicles (as well as `export chronicles except toJson`),
- prevents the need to override `writeValue` or to use or import nim-json-serialization elsewhere in the codebase, allowing for sole use of utils/json for de/serialization,
- and handles json formatting correctly in chronicles json sinks
* Rename logging -> logutils to avoid ambiguity with common names
* clean up
* add setProperty for JsonRecord, remove nim-json-serialization conflict
* Allow specifying textlines and json format separately
Not specifying a LogFormat will apply the formatting to both textlines and json sinks.
Specifying a LogFormat will apply the formatting to only that sink.
* remove unneeded usages of std/json
We only need to import utils/json instead of std/json
* move serialization from rest/json to utils/json so it can be shared
* fix NoColors ambiguity
Was causing unit tests to fail on Windows.
* Remove nre usage to fix Windows error
Windows was erroring with `could not load: pcre64.dll`. Instead of fixing that error, remove the pcre usage :)
* Add logutils module doc
* Shorten logutils.formatIt for `NBytes`
Both json and textlines formatIt were not needed, and could be combined into one formatIt
* remove debug integration test config
debug output and logformat of json for integration test logs
* Use ## module doc to support docgen
* Add get active slot /slots/{slotId} to REST api, use utils/json
- Add endpoint /slots/{slotId} to get an active SalesAgent from the Sales module. Used in integration tests to test when a sale has reached a certain state. Those integration test changes will be included in a larger PR, coming later.
- Add OpenAPI changes for new endpoint and associated components
- Use utils/json instead of nim-json-serialization. Required exemption of imports from several packages that export nim-json-serialization by default.
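A hypothetical call against the new endpoint (the base URL, port, and slot id below are made up for illustration):

```nim
import std/httpclient

let
  baseUrl = "http://localhost:8080/api/codex/v1"   # assumed base path
  slotId  = "0x1234abcd"                           # illustrative slot id

let client = newHttpClient()
try:
  let resp = client.get(baseUrl & "/slots/" & slotId)
  echo resp.status    # expect 200 for an active slot
  echo resp.body      # JSON description of the active SalesAgent's slot
finally:
  client.close()
```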
* Only except `toJson` from import/export of chronicles
* use str on JString types, `$` will preserve `"`
* Adding enum support
* deserialize cid test
* make enum deserializer public
* unify fromJson for objects and refs
* add enum deserialization testing
* Blockexchange uses merkle root and index to fetch blocks
* Links the network store getTree to the local store.
* Update codex/stores/repostore.nim
Co-authored-by: Dmitriy Ryajov <dryajov@gmail.com>
Signed-off-by: Tomasz Bekas <tomasz.bekas@gmail.com>
* Rework erasure.nim to include recent cleanup
* Revert accidental changes to lib versions
* Addressing review comments
* Storing proofs instead of trees
* Fix a comment
* Fix broken tests
* Fix for broken testerasure.nim
* Addressing PR comments
---------
Signed-off-by: Tomasz Bekas <tomasz.bekas@gmail.com>
Co-authored-by: benbierens <thatbenbierens@gmail.com>
Co-authored-by: Dmitriy Ryajov <dryajov@gmail.com>
## Problem
When Availabilities are created, the number of bytes in the Availability is reserved in the repo, so those bytes on disk cannot otherwise be written to. When a request for storage is received by a node, if a previously created Availability is matched, an attempt will be made to fill a slot in the request (more accurately, the request's slots are added to the SlotQueue, and eventually those slots will be processed). During download, bytes that were reserved for the Availability were released (as they were written to disk). To prevent more bytes from being released than were reserved in the Availability, the Availability was marked as used during the download, so that no other requests would match the Availability, and therefore no new downloads (and byte releases) would begin. The unfortunate downside to this is that the number of Availabilities a node has determines its download concurrency capacity. If, for example, a node creates a single Availability that covers all available disk space the operator is willing to use, that single Availability would mean that only one download could occur at a time, and the node could potentially miss out on storage opportunities.
## Solution
To alleviate the concurrency issue, each time a slot is processed, a Reservation is created, which takes size (aka reserved bytes) away from the Availability and stores them in the Reservation object. This can be done as many times as needed as long as there are enough bytes remaining in the Availability. Therefore, concurrent downloads are no longer limited by the number of Availabilities. Instead, they would more likely be limited to the SlotQueue's `maxWorkers`.
From a database design perspective, an Availability has zero or more Reservations.
Reservations are persisted in the RepoStore's metadata, along with Availabilities. The metadata store key path for Reservations is `meta / sales / reservations / <availabilityId> / <reservationId>`, while Availabilities are stored one level up, e.g. `meta / sales / reservations / <availabilityId>`, allowing all Reservations for an Availability to be queried (this is not currently needed, but may be useful when work to restore Availability size is implemented; more on this later).
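As a sketch of that key layout (the ids are illustrative), a prefix query on the availability's key returns all of its reservations:

```nim
const reservationsPrefix = "meta/sales/reservations"

proc availabilityKey(availabilityId: string): string =
  reservationsPrefix & "/" & availabilityId

proc reservationKey(availabilityId, reservationId: string): string =
  availabilityKey(availabilityId) & "/" & reservationId

when isMainModule:
  echo availabilityKey("availA")          # meta/sales/reservations/availA
  echo reservationKey("availA", "res1")   # meta/sales/reservations/availA/res1
```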
### Lifecycle
When a reservation is created, its size is deducted from the Availability, and when a reservation is deleted, any remaining size (bytes not written to disk) is returned to the Availability. If the request finishes, is cancelled (expired), or an error occurs, the Reservation is deleted (and any undownloaded bytes are returned to the Availability). In addition, when the Sales module starts, any Reservations that are not actively being used in a filled slot are deleted.
Having a Reservation persisted until after a storage request is completed will allow the originally set Availability size to be reclaimed once a request contract has been completed. This feature is yet to be implemented; however, the work in this PR is a step toward enabling it.
### Unknowns
Reservation size is determined by `StorageAsk.slotSize`. If, during download, more bytes than `slotSize` are attempted to be downloaded, the Reservation update will fail, and the state machine will move to a `SaleErrored` state, deleting the Reservation. This will likely prevent the slot from being filled.
### Notes
Based on #514
* [docs] fix two client scenario: add missing collateral
* [integration] separate step to wait for node to be started
* [cli] add option to specify ethereum private key
* Remove unused imports
* Fix warnings
* [integration] move type definitions to correct place
* [integration] wait a bit longer for a node to start in debug mode
When running against e.g. the Taiko testnet RPC, node startup takes longer
* [integration] simplify handling of codex node and client
* [integration] add Taiko integration test
* [contracts] await token approval confirmation before next tx
* [contracts] deployment address of marketplace on Taiko
* [cli] --eth-private-key now takes a file name
Instead of supplying the private key on the command line,
expect the private key to be in a file with the correct
permissions.
* [utils] Fixes undeclared `activeChroniclesStream` on Windows
* [build] update nim-ethers to include PR #52
Co-authored-by: Eric Mastro <eric.mastro@gmail.com>
* [cli] Better error messages when reading eth private key
Co-authored-by: Eric Mastro <eric.mastro@gmail.com>
* [integration] simplify reading of cmd line arguments
Co-authored-by: Eric Mastro <eric.mastro@gmail.com>
* [build] update to latest version of nim-ethers
* [contracts] updated contract address for Taiko L2
* [build] update codex contracts to latest version
---------
Co-authored-by: Eric Mastro <eric.mastro@gmail.com>
* Improve integration testing client (CodexClient) and json serialization
The current client used for integration testing against the REST endpoints for Codex accepts and passes primitive types. This caused a hard-to-diagnose bug where a `uint` was not being deserialized correctly.
In addition, the json de/serializing done between the CodexClient and REST client was not easy to read and was not tested.
These changes bring non-primitive types to most of the CodexClient functions, allowing us to lean on the compiler to ensure we're providing correct typings. More importantly, a json de/serialization util was created as a drop-in replacement for the std/json lib, the two main differences being that field serialization is opt-in (instead of opt-out, as is the case with json_serialization) and that serialization errors are captured and logged, making debugging serialization issues much easier.
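To illustrate the opt-in idea (the type, fields, and hand-rolled `%` below are stand-ins; the real util derives this from field annotations):

```nim
import std/json

type RequestParams = object
  duration: uint64        # opted in below
  reward: uint64          # opted in below
  internalNote: string    # never serialized, because it is never opted in

proc `%`(params: RequestParams): JsonNode =
  %*{ "duration": params.duration, "reward": params.reward }

when isMainModule:
  echo %RequestParams(duration: 100, reward: 42, internalNote: "hidden")
  # => {"duration":100,"reward":42}
```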
* Update integration test to use nodes=2 and tolerance=1
* clean up
* extra utilities and tweaks
* add atlas lock
* update ignores
* break build into its own script
* update url rules
* base off codexdht's
* compile fixes for Nim 1.6.14
* update submodules
* convert mapFailure to procs to work around type resolution issues
* add toml parser for multiaddress
* change error type on keyutils
* bump nimbus build to use 1.6.14
* update gitignore
* adding new deps submodules
* bump nim ci version
* even more fixes
* more libp2p changes
* update keys
* fix eventually function
* adding coverage test file
* move coverage to build.nims
* use nimcache/coverage
* move libp2p import for tests into helper.nim
* remove named bin
* bug fixes for networkpeers (from Dmitriy)
---------
Co-authored-by: Dmitriy Ryajov <dryajov@gmail.com>