311 Commits

Author SHA1 Message Date
gmega
96156b16c3
half-baked integration of new profiler branch 2023-12-07 09:17:46 -03:00
gmega
b6321dc91c
revert memory leak fix so we can run tests with new DHT 2023-12-07 09:08:28 -03:00
gmega
a3ae613fff
add failing test case for memory leak 2023-12-07 09:08:28 -03:00
gmega
256d0e7270
add some hacks to allow enabling profiling on specific threads, and guarding against enabling it on multiple 2023-12-07 09:07:00 -03:00
gmega
3c6ef3019f
allow configuration of profiler output volume from CLI option 2023-12-07 09:06:59 -03:00
gmega
f048404cb7
add callback to eliminate the need for an async timer in metric updates 2023-12-07 09:06:59 -03:00
gmega
29d36b51f2
modify metrics collector so it uses standard gauges 2023-12-07 09:06:58 -03:00
gmega
b50469f6cc
rename metrics to follow codex conventions 2023-12-07 09:06:58 -03:00
gmega
c809af7dc7
add tests to main test suite, add global async profiler info collector 2023-12-07 09:06:58 -03:00
gmega
c4adc65823
rename ProfilingCollector => AsyncProfilerInfo to match nim-metrics naming convention 2023-12-07 09:06:57 -03:00
gmega
f16bccfcb6
add labeled top-k slowest async procs to prometheus collector 2023-12-07 09:06:57 -03:00
gmega
613e4c4038
add basic prometheus profiling metrics tracker 2023-12-07 09:06:57 -03:00
gmega
aa6d8d7b56
fix test file name and remove old test file 2023-12-07 09:06:57 -03:00
gmega
19d90191d8
order api results by descending totalExecTime (with option of adding a query parameter) 2023-12-07 09:06:56 -03:00
gmega
f99a516203
add simple profiling API 2023-12-07 09:06:51 -03:00
gmega
91d186b717
WiP 2023-12-07 09:04:47 -03:00
Jaremy Creechley
28593bed1d
push changes 2023-12-07 09:03:33 -03:00
Jaremy Creechley
8622befaf2
import profiler utils 2023-12-07 09:03:31 -03:00
Eric
3907ca4095
Add get active slot /slots/{slotId} to REST api, use utils/json (#645)
* Add get active slot /slots/{slotId} to REST api, use utils/json

- Add endpoint /slots/{slotId} to get an active SalesAgent from the Sales module. Used in integration tests to test when a sale has reached a certain state. Those integration test changes will be included in a larger PR, coming later.
- Add OpenAPI changes for new endpoint and associated components
- Use utils/json instead of nim-json-serialization. Required exemption of imports from several packages that export nim-json-serialization by default.

* Only except `toJson` from import/export of chronicles
2023-12-07 12:16:36 +11:00
Adam Uhlíř
b38146d3f7
fix: check expiration is before request end (#641) 2023-12-05 14:25:28 +01:00
Adam Uhlíř
0e4387d1b3
refactor: move expiry update from fetchBatched (#634) 2023-11-28 22:04:11 +01:00
Dmitriy Ryajov
22c31046a7
Don't dial self (#633)
* don't dial self

* revert style changes

* filter out self on dial

* add helpers to `peerId` and `isSelf`

* don't fire up discovery eagerly

* allow excluding multiple peers in sendWantHave

* revert style changes

* move self check to discovery.find

* re-add eager dht lookup, which is required in some cases

* revert style changes

* misc

* drop peer first, before queueing a dht lookup

* moar style changes

* use isSelf
2023-11-27 10:25:53 -08:00
markspanbroek
b62ccf5a8a
Update Request: remove PoR, add merkle root (#590)
* [build] update codex-contracts-eth

* [contracts] update request parameters

- Remove PoR parameters and totalChunks
- Add merkleRoot
2023-11-27 12:25:01 +00:00
Adam Uhlíř
79fce39dbf
fix: require expiry from api (#629)
Co-authored-by: markspanbroek <mark@spanbroek.net>
2023-11-22 11:35:26 +00:00
Adam Uhlíř
8681a40ee7
feat: update expiry when data downloaded and slot filled (#619)
Co-authored-by: Eric <5089238+emizzle@users.noreply.github.com>
Co-authored-by: markspanbroek <mark@spanbroek.net>
2023-11-22 10:09:12 +00:00
Tomasz Bekas
4d546f9ace
Bump questionable and test for tree init failure (#630) 2023-11-21 19:30:14 +01:00
Dmitriy Ryajov
ec8d0c98b2
Fix REST endpoints semantics (#612)
* Fix REST endpoints semantics

* update endpoint description

* update, operation id

* Adding enum support

* make enum deserializer public

* add support for listing manifests

* test `/data` endpoint to list local manifests

* debug leftovers

* remove commented out line
2023-11-20 16:14:06 -08:00
Ben Bierens
bece1b88a1
Feat/bump questionable (#627)
* Bumps questionable version to 0.10.12

* removes unnecessary questionable bindings.

* Fixes tests

* unnecessary whitespaces
2023-11-17 13:49:45 +01:00
Dmitriy Ryajov
06bb21bfc7
backport changes from fix-erasure-tests (#618) 2023-11-14 08:53:06 -08:00
Dmitriy Ryajov
a1725594a8
Fix json serialization for cid and enums (#615)
* use str on JString types, `$` will preserve `"`

* Adding enum support

* deserialize cid test

* make enum deserializer public

* unify fromJson for objects and refs

* add enum deserialization testing
2023-11-14 07:23:50 -08:00
Tomasz Bekas
2396c4d76d
Blockexchange uses merkle root and index to fetch blocks (#566)
* Blockexchange uses merkle root and index to fetch blocks

* Links the network store getTree to the local store.

* Update codex/stores/repostore.nim

Co-authored-by: Dmitriy Ryajov <dryajov@gmail.com>
Signed-off-by: Tomasz Bekas <tomasz.bekas@gmail.com>

* Rework erasure.nim to include recent cleanup

* Revert accidental changes to lib versions

* Addressing review comments

* Storing proofs instead of trees

* Fix a comment

* Fix broken tests

* Fix for broken testerasure.nim

* Addressing PR comments

---------

Signed-off-by: Tomasz Bekas <tomasz.bekas@gmail.com>
Co-authored-by: benbierens <thatbenbierens@gmail.com>
Co-authored-by: Dmitriy Ryajov <dryajov@gmail.com>
2023-11-14 13:02:17 +01:00
Ben Bierens
cb02962231
Adds endpoint for listing files (manifests) in node. Useful for demo UI. (#599)
* Adds endpoint for listing files (manifests) in node. Useful for demo UI.

* Moves upload/download/files into content API calls.

* Cleans up json serialization for manifest

* Cleans up some more json serialization

* Moves block iteration and decoding to node.nim

* Moves api methods into their own init procs.

* Applies RestContent api object.

* Replaces format methods with Rest objects in json.nim

* Unused import

* Review comments by Adam

* Fixes issue where content/local endpoint clashes with content/cid.

* faulty merge resolution

* Renames content API to data.

* Fixes faulty rebase

* Adds test for data/local API

* Renames local and download api.
2023-11-09 08:47:09 +00:00
Eric
7d4ea878d2
Support logging to file (#558)
* Support logging to file

* Log the entire config and fix build error

* Downgrade log level for "starting codex node" config output

* bump ethers to prevent nonce gaps

* fix tests
2023-11-09 16:35:55 +11:00
Adam Uhlíř
c0bec2f899
feat: ensure block expiry (#597)
* feat: update block expiry

* chore: feedback implementation

* chore: feedback implementation

* chore: feedback implementation
2023-11-06 08:10:30 +00:00
Adam Uhlíř
0014ffdef5
test: integration tests listen on localhost (#596) 2023-10-24 16:52:06 +02:00
Adam Uhlíř
2fc71cf81b
feat: partial payouts for cancelled requests (#561) 2023-10-24 10:12:54 +00:00
Dmitriy Ryajov
8a7d74e6b2
removing old por proofs implementation (#593) 2023-10-23 07:58:07 -07:00
markspanbroek
a77d0cdcec
[sales] Fix intermittently failing test (#591) 2023-10-19 15:46:21 +02:00
Eric
570a1f7b67
[marketplace] Availability improvements (#535)
## Problem
When an Availability is created, its size in bytes is reserved in the repo, so those bytes on disk cannot be written to otherwise. When a node receives a request for storage and a previously created Availability matches, an attempt is made to fill a slot in the request (more accurately, the request's slots are added to the SlotQueue, and eventually those slots are processed). During download, bytes that were reserved for the Availability were released as they were written to disk. To prevent more bytes from being released than were reserved in the Availability, the Availability was marked as used during the download, so that no other requests would match it and therefore no new downloads (and byte releases) would begin. The unfortunate downside is that the number of Availabilities a node has determines its download concurrency capacity. For example, if a node creates a single Availability that covers all the disk space the operator is willing to use, only one download can occur at a time, so the node could miss out on storage opportunities.

## Solution
To alleviate the concurrency issue, each time a slot is processed, a Reservation is created, which takes size (aka reserved bytes) away from the Availability and stores it in the Reservation object. This can be done as many times as needed as long as there are enough bytes remaining in the Availability. Therefore, concurrent downloads are no longer limited by the number of Availabilities. Instead, they are more likely to be limited by the SlotQueue's `maxWorkers`.

From a database design perspective, an Availability has zero or more Reservations.

Reservations are persisted in the RepoStore's metadata, along with Availabilities. The metadata store key path for Reservations is `meta / sales / reservations / <availabilityId> / <reservationId>`, while Availabilities are stored one level up, e.g. `meta / sales / reservations / <availabilityId>`, allowing all Reservations for an Availability to be queried (this is not currently needed, but may be useful when work to restore Availability size is implemented; more on this later).
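
As an illustration of the key layout described above, here is a minimal Python sketch (not the Nim code from this PR). The `/` separator, the function names, and the use of a plain list of strings in place of the metadata store are all assumptions made for the example; only the path segments come from the description.

```python
# Hypothetical helpers: only the path segments are taken from the PR text.
ROOT = ("meta", "sales", "reservations")
SEP = "/"  # assumed separator


def availability_key(availability_id: str) -> str:
    """Key under which an Availability would be stored."""
    return SEP.join(ROOT + (availability_id,))


def reservation_key(availability_id: str, reservation_id: str) -> str:
    """Key for a Reservation, one level below its Availability."""
    return SEP.join(ROOT + (availability_id, reservation_id))


def reservations_for(keys, availability_id: str):
    """Return all Reservation keys that belong to one Availability."""
    # The trailing separator keeps "a1" from also matching "a10".
    prefix = availability_key(availability_id) + SEP
    return [k for k in keys if k.startswith(prefix)]
```

Because every Reservation key starts with its Availability's key, a prefix query is enough to list an Availability's Reservations.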

### Lifecycle
When a Reservation is created, its size is deducted from the Availability, and when a Reservation is deleted, any remaining size (bytes not written to disk) is returned to the Availability. If the request finishes, is cancelled (expired), or an error occurs, the Reservation is deleted (and any undownloaded bytes are returned to the Availability). In addition, when the Sales module starts, any Reservations that are not actively being used in a filled slot are deleted.
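
The bookkeeping in this lifecycle can be sketched as follows. This is a simplified in-memory Python model, not the RepoStore-backed Nim implementation; the class and method names are hypothetical, and persistence and the sales state machine are left out.

```python
class Availability:
    """In-memory model of an Availability and its Reservations (illustrative only)."""

    def __init__(self, size: int):
        self.size = size          # bytes still available for new Reservations
        self.reservations = {}    # reservation id -> bytes not yet written to disk
        self._next_id = 0

    def create_reservation(self, size: int) -> int:
        """Deduct `size` bytes from the Availability and track them in a Reservation."""
        if size > self.size:
            raise ValueError("not enough bytes remaining in the Availability")
        self.size -= size
        rid = self._next_id
        self._next_id += 1
        self.reservations[rid] = size
        return rid

    def release(self, rid: int, written: int) -> None:
        """Record that `written` bytes of a Reservation were written to disk."""
        remaining = self.reservations[rid]
        if written > remaining:
            # Mirrors the failure case where more bytes arrive than were reserved.
            raise ValueError("cannot release more bytes than were reserved")
        self.reservations[rid] = remaining - written

    def delete_reservation(self, rid: int) -> None:
        """Delete a Reservation and return its unwritten bytes to the Availability."""
        self.size += self.reservations.pop(rid)
```

Two Reservations can be open against the same Availability at once, which is what lifts the one-download-per-Availability limit described in the Problem section.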

Persisting a Reservation until after a storage request is completed will allow the originally set Availability size to be reclaimed once a request contract has been completed. This feature is yet to be implemented; however, the work in this PR is a step towards enabling it.

### Unknowns
Reservation size is determined by the `StorageAsk.slotSize`. If, during download, an attempt is made to download more bytes than `slotSize`, the Reservation update will fail and the state machine will move to a `SaleErrored` state, deleting the Reservation. This will likely prevent the slot from being filled.

### Notes
Based on #514
2023-09-29 14:33:08 +10:00
Adam Uhlíř
2f1c778d02
fix: unknown state goes to payout when slot state is finished (#555) 2023-09-27 15:57:41 +02:00
Dmitriy Ryajov
25ea7fd0b2
Erasure cleanup (#545)
* cleanup erasure coding

* moar cleanup

* fix off by 1 issues in tests

* style

* consolidate decoding data code

* simplify tuple unpacking

* fix retrieve purchase

We don't support single blocks for now

* Apply suggestions from code review

Co-authored-by: Eric <5089238+emizzle@users.noreply.github.com>
Signed-off-by: Dmitriy Ryajov <dryajov@gmail.com>

---------

Signed-off-by: Dmitriy Ryajov <dryajov@gmail.com>
Co-authored-by: Eric <5089238+emizzle@users.noreply.github.com>
2023-09-25 07:31:10 -07:00
markspanbroek
71cd35112b
Taiko L2 (#483)
* [docs] fix two client scenario: add missing collateral

* [integration] separate step to wait for node to be started

* [cli] add option to specify ethereum private key

* Remove unused imports

* Fix warnings

* [integration] move type definitions to correct place

* [integration] wait a bit longer for a node to start in debug mode

When e.g. running against Taiko test net rpc, the node start
takes longer

* [integration] simplify handling of codex node and client

* [integration] add Taiko integration test

* [contracts] await token approval confirmation before next tx

* [contracts] deployment address of marketplace on Taiko

* [cli] --eth-private-key now takes a file name

Instead of supplying the private key on the command line,
expect the private key to be in a file with the correct
permissions.

* [utils] Fixes undeclared `activeChroniclesStream` on Windows

* [build] update nim-ethers to include PR #52

Co-authored-by: Eric Mastro <eric.mastro@gmail.com>

* [cli] Better error messages when reading eth private key

Co-authored-by: Eric Mastro <eric.mastro@gmail.com>

* [integration] simplify reading of cmd line arguments

Co-authored-by: Eric Mastro <eric.mastro@gmail.com>

* [build] update to latest version of nim-ethers

* [contracts] updated contract address for Taiko L2

* [build] update codex contracts to latest version
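
For the `--eth-private-key` item above, reading a key from a file with a permission check could look roughly like this Python sketch. The commit does not say which permissions count as "correct"; treating any group or other access as an error is an assumption, and `read_private_key` is a hypothetical name rather than the Nim function from this PR.

```python
import os
import stat


def read_private_key(path: str) -> str:
    """Read a key from `path`, refusing files that group or others can access."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # Assumed policy: only the owner may have any permission bits set.
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        raise PermissionError(
            f"{path} has mode {mode:o}; expected owner-only access (e.g. 600)"
        )
    with open(path, "r", encoding="utf-8") as f:
        return f.read().strip()
```

Keeping the key in a file rather than on the command line avoids exposing it through shell history and process listings.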

---------

Co-authored-by: Eric Mastro <eric.mastro@gmail.com>
2023-09-13 16:17:56 +02:00
Adam Uhlíř
ae89db1eea
fix: sales concurrency bug (#537) 2023-09-05 16:47:29 +02:00
markspanbroek
d3a22a7b7b
Fix slot queue push (#542)
* [sales] remove availability check before adding to slot queue

* [sales] add missing return statement

* [tests] remove 'eventuallyCheck' helper

* [sales] remove reservations from slot queue

* [tests] rename module `eventually` -> `always`

* [sales] increase slot queue size

Because it will now also hold items for which we haven't
checked availability yet.
2023-09-04 16:42:09 +02:00
Ben Bierens
545e0d47e1
Slows down block maintenance iteration (#543)
* Slows down block maintenance iteration

* Fixes unstable integration test.
2023-09-04 11:12:14 +02:00
Eric
37b3d99c3d
Improve integration testing client (CodexClient) and json serialization (#514)
* Improve integration testing client (CodexClient) and json serialization

The current client used for integration testing against the REST endpoints for Codex accepts and passes primitive types. This caused a hard-to-diagnose bug where a `uint` was not being deserialized correctly.

In addition, the json de/serializing done between the CodexClient and REST client was not easy to read and was not tested.

These changes bring non-primitive types to most of the CodexClient functions, allowing us to lean on the compiler to ensure we're providing correct typings. More importantly, a json de/serialization util was created as a drop-in replacement for the std/json lib, the two main differences being that field serialization is opt-in (instead of opt-out as in the case of json_serialization) and that serialization errors are captured and logged, making serialization issues much easier to debug.

* Update integration test to use nodes=2 and tolerance=1

* clean up
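
To illustrate the two properties described in the first item (opt-in field serialization, and errors that are captured and logged), here is a small Python sketch. It is not the Nim `utils/json` module: the `serialize` metadata key, the `Slot` type, and the function names are hypothetical.

```python
import json
import logging
from dataclasses import dataclass, field, fields

log = logging.getLogger("json_util")


@dataclass
class Slot:
    # Only fields explicitly marked with serialize=True are written out.
    slot_id: str = field(metadata={"serialize": True})
    size: int = field(metadata={"serialize": True})
    secret: str = field(default="")  # not marked, so never serialized


def to_json(obj) -> str:
    """Serialize only the opted-in fields of a dataclass instance."""
    data = {
        f.name: getattr(obj, f.name)
        for f in fields(obj)
        if f.metadata.get("serialize")
    }
    return json.dumps(data)


def from_json(cls, text: str):
    """Deserialize opted-in fields; log and return None instead of raising."""
    try:
        data = json.loads(text)
        kwargs = {
            f.name: data[f.name]
            for f in fields(cls)
            if f.metadata.get("serialize")
        }
        return cls(**kwargs)
    except (ValueError, TypeError, KeyError) as err:
        log.error("failed to deserialize %s: %r", cls.__name__, err)
        return None
```

With opt-in serialization, adding a new field to a type cannot leak it into the REST output by accident; it has to be marked first.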
2023-09-01 15:44:41 +10:00
Adam Uhlíř
f459a2c6f6
refactor: merging proving module into sales (#469)
Co-authored-by: Eric <5089238+emizzle@users.noreply.github.com>
2023-08-21 12:26:43 +02:00
Eric
9cecb68520
[repostore] Retrieve empty blocks (#513)
Add handling of empty blocks in the RepoStore.

* Add empty block handling to repostore for put, del, has
Also added tests for all empty block handling blockstore operations. This showed there was an ambiguous identifier present for `hasBlock`, so one of the two `hasBlock` definitions was removed in `repostore`.

* Change CacheStore to RepoStore in testerasure
As CacheStore is not used in the node, update the Datastore used in the erasure coding tests to be a RepoStore. This ensures that the K > 1 cases are being tested, where they will produce empty padding blocks in the erasure-coded manifests.
2023-08-21 12:51:04 +10:00
Tomasz Bekas
e8601274b9
Merkle tree construction (#504)
* Building a merkle tree

* Obtaining merkle proof from a tree
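
The two steps above (building a tree, obtaining a proof from it) can be sketched generically in Python. This is a textbook construction, not the one added in this PR: SHA-256, duplicating the last node on odd-sized levels, and all function names are assumptions.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(leaves):
    """Return all levels of the tree, from hashed leaves up to the root."""
    if not leaves:
        raise ValueError("cannot build a tree from zero leaves")
    level = [_h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]  # assumed padding: duplicate last node
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def get_proof(levels, index):
    """Collect the sibling hash at each level for the leaf at `index`."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # node was paired with its own duplicate
        proof.append(level[sibling])
        index //= 2
    return proof


def verify(leaf, index, proof, root):
    """Recompute the root from a leaf and its proof, and compare."""
    node = _h(leaf)
    for sibling in proof:
        node = _h(node + sibling) if index % 2 == 0 else _h(sibling + node)
        index //= 2
    return node == root
```

A proof holds one hash per level, so its size grows with the logarithm of the number of leaves rather than with the dataset size.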

---------

Co-authored-by: benbierens <thatbenbierens@gmail.com>
2023-08-15 13:23:35 +02:00
Adam Uhlíř
39efac1a97
fix: load slots on sales module start (#510) 2023-08-15 11:39:49 +02:00