A classic encoding of a merkle tree is to store the layers consecutively
in memory, breadth-first (see the sketch after the list below). This
encoding has several advantages:
* Good performance for accessing successive nodes, such as when
constructing the tree or serializing it
* Significantly lower memory usage - avoids the per-node allocation
overhead which otherwise more than doubles the memory usage for
"regular" 32-byte hashes
* Less memory management - a single memory allocation can reserve memory
  for the whole tree, meaning there are fewer allocations to keep track of
* Simplified buffer lifetimes - with all memory allocated up-front,
there's no need for cross-thread memory management or transfers
While we're here, we can clean up a few other things in the
implementation:
* Move async implementation to `merkletree` so that it doesn't have to
be repeated
* Factor tree construction into preparation and computation - the latter
  is the part offloaded onto a different thread (see the sketch after
  this list)
* Simplify task posting - `threadpools` already creates a "task" from
the worker function call
* Deprecate several high-overhead accessors that presumably are only
needed in tests
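
A hedged sketch of how that split might look, reusing the illustrative
`Hash` type from the sketch above - `TreeBuffer`, `prepare` and `compute`
are made-up names, not the actual API:

```nim
type
  TreeBuffer = object
    nodes: seq[Hash]   # the whole tree in one up-front allocation
    leafCount: int

proc nodeCount(leaves: int): int =
  ## Total node count across all layers, carrying odd nodes up unchanged.
  var width = leaves
  result = width
  while width > 1:
    width = (width + 1) div 2
    result += width

proc prepare(leaves: openArray[Hash]): TreeBuffer =
  ## Calling thread: does all allocation up front and copies the leaves in.
  result.leafCount = leaves.len
  result.nodes = newSeq[Hash](nodeCount(leaves.len))
  for i, leaf in leaves:
    result.nodes[i] = leaf

proc compute(buf: ptr TreeBuffer) {.gcsafe.} =
  ## Worker thread: would fill the inner layers in place, pairwise-hashing
  ## each layer into the next (as in the sketch further up). Only a raw
  ## `ptr` crosses the thread boundary, so nothing here allocates or
  ## touches reference counts.
  discard
```
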
For minimal correctness, we must ensure that buffers that cross the
thread boundary are allocated and deallocated within the same thread and
that there is no reference counting going on during the computation.
To get there with minimal changes:
* Preallocate a buffer for the outcome of the merkle tree computation
* Pass pointers instead of `ref` types between threads
* Avoid relying on isolation - this is an ORC-only feature
* Add `SharedBuf` as a simple "view" type that allows working with a set
  of values while avoiding allocations and refcounts - the view checks for
  out-of-bounds access much like a seq, but the user is responsible for
  managing the lifetime (which in this case is simple, since all that
  needs to happen is for the task to complete) - see the sketch after
  this list
* To avoid upsetting the code too much, use a simple linear packer for
  the hashes that copies the values back and forth
* Block cancellation and panic if the thread signalling mechanism fails
- cancelling the task itself would require inserting cancellation points
in the computation
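
A minimal sketch of what such a view type can look like - the name
`SharedBuf` matches the text above, but the fields and procs here are
illustrative, not the actual implementation:

```nim
type
  SharedBuf*[T] = object
    ## Non-owning view over a caller-managed buffer: no allocation,
    ## no reference counting.
    data: ptr UncheckedArray[T]
    len: int

proc view*[T](s: var seq[T]): SharedBuf[T] =
  ## The caller must keep `s` alive (and its length unchanged) until the
  ## task using the view has completed.
  if s.len > 0:
    result.data = cast[ptr UncheckedArray[T]](addr s[0])
  result.len = s.len

proc `[]`*[T](v: SharedBuf[T], i: int): T =
  ## Bounds-checked read, much like a seq.
  if i < 0 or i >= v.len:
    raise newException(IndexDefect, "SharedBuf index out of bounds")
  v.data[i]

proc `[]=`*[T](v: SharedBuf[T], i: int, val: T) =
  ## Bounds-checked write.
  if i < 0 or i >= v.len:
    raise newException(IndexDefect, "SharedBuf index out of bounds")
  v.data[i] = val
```
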
The worker task relies on a nuance, namely that calling a closure
procedure does not count as a reference-counting event - while this
works, it can be brittle in "general" code since it's easy to make a copy
of the closure itself by accident. The refactoring necessary to address
this point is, however, beyond the scope of this change.
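
A tiny self-contained illustration of that distinction (assumed semantics
under ARC/ORC; not code from this change):

```nim
proc makeCounter(): proc (): int =
  var n = 0                    # lives in the closure's environment (a ref)
  result = proc (): int =
    inc n
    n

let counter = makeCounter()
discard counter()   # invoking the closure is not a copy: no refcount event
let alias = counter # assigning it copies the environment ref - this is the
                    # kind of accidental copy the paragraph warns about
discard alias()
```
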
* Move to version 2.0.6
* Update nim-confutils submodule to latest version
* Update dependencies
* Update Nim version to 2.0.12
* Add gcsafe pragma
* Add missing import
* Update specific conf for Nim 2.x
* Fix method signatures
* Revert erasure coding attempt to fix bug
* More gcsafe pragma
* Duplicate code from libp2p because it is not exported anymore
* Fix camelcase function names
* Use alreadySeen because need is not a bool anymore
* newLPStreamReadError does not exist anymore so use another error
* Replace ValidIpAddress by IpAddress
* Add gcsafe pragma
* Restore maintenance parameter deleted by mistake when removing erasure coding fix attempt code
* Update method signatures
* Copy LPStreamReadError code from libp2p which was removed
* Fix camel case
* Fix enums in tests
* Fix camel case
* Extract node components to a variable to make Nim 2 happy
* Update the tests using ValidIpAddress to IpAddress
* Fix cast for value which is already an option
* Set nim version to 2.0.x for CI
* Set nim version to 2.0.x for CI
* Move to miniupnp version 2.2.4 to avoid symlink error
* Set core.symlinks to false for Windows for miniupnp >= 2.2.5 support
* Update to Nim 2.0.14
* Update CI nim versions to 2.0.14
* Try with GCC 14
* Replace apt-fast by apt-get
* Update ubuntu runner to latest
* Use Ubuntu 20.04 for coverage
* Disable CI cache for coverage
* Add coverage property description
* Remove commented test
* Check the node value of seen instead of using alreadySeen
* Fix the merge. The taskpool work was reverted.
* Update nim-ethers submodule
* Remove deprecated ValidIpAddress. Fix missing case and imports.
* Fix a weird issue where nim-confutils cannot find NatAny
* Fix tests and remove useless static keyword
* rework merkle tree support
* rename merkletree -> codexmerkletree
* tree and proof encoding/decoding
* style
* adding codex merkle and coders tests
* use default hash codec
* proof size changed
* add from nodes test
* shorten file names
* wip poseidon tree
* shorten file names
* root returns a result
* import poseidon tests
* fix merge issues and cleanup a few warnings
* setting up slot builder
* Getting cids in slot
* ensures blocks are divisible by number of slots
* wip
* Implements indexing strategies
* Swaps in indexing strategy into erasure.
* wires slot and indexing tests up
* Fixes issue where indexing strategy stepped gives wrong values for smallest of ranges
* debugs indexing strategies
* Can select slot blocks
* finding number of pad cells
* Implements building slot tree
* finishes implementing slot builder
* Adds check that block size is a multiple of cell size
* Cleanup slotbuilder
* Review comments by Tomasz
* Fixes issue where ecK was used as numberOfSlots.
* rework merkle tree support
* deps
* rename merkletree -> codexmerkletree
* tree and proof encoding/decoding
* style
* adding codex merkle and coders tests
* remove new codecs for now
* proof size changed
* add from nodes test
* shorten file names
* wip poseidon tree
* shorten file names
* fix bad `elements` iter
* bump
* bump
* wip
* reworking slotbuilder
* move out of manifest
* expose getCidAndProof
* import index strat...
* remove getMHash
* remove unused artifacts
* alias zero
* add digest for multihash
* merge issues
* remove unused hashes
* add option to result converter
* misc
* fix tests
* add helper to derive EC block count
* rename method
* misc
* bump
* extract slot root building into own proc
* revert to manifest to accessor
---------
Co-authored-by: benbierens <thatbenbierens@gmail.com>