Chrysostomos Nanakos bbc98ee5c8
feat(bench): add --random flag for random order block reads
Signed-off-by: Chrysostomos Nanakos <chris@include.gr>
2026-01-06 14:27:24 +02:00
2026-01-05 03:05:14 +02:00
2026-01-05 03:05:14 +02:00
2026-01-05 03:05:14 +02:00
2026-01-05 03:05:14 +02:00
2026-01-05 03:05:14 +02:00
2026-01-05 03:05:14 +02:00

nim-blockstore

License: MIT Stability: experimental nim

A content-addressed block storage library for Nim with configurable hash algorithms, codecs, and merkle tree proofs for block verification.

Features

  • Content-addressed storage with CIDv1 identifiers
  • Configurable hash functions and codecs via BlockHashConfig
  • Merkle tree proofs for block verification
  • Multiple storage backends:
    • Block storage: sharded files (bbSharded) or packed files (bbPacked)
    • Merkle tree storage: embedded proofs (mbEmbeddedProofs), LevelDB (mbLevelDb), or packed files (mbPacked)
    • Blockmap storage: LevelDB (bmLevelDb) or files (bmFile)
  • Direct I/O support (ioDirect) for crash consistency and OS cache bypass
  • Buffered I/O mode (ioBuffered) with configurable batch sync
  • Storage quota management
  • Background garbage collection with deletion worker
  • Metadata stored in LevelDB
  • Async file chunking
  • Dataset management with manifests

Usage

Building a Dataset from a File

import blockstore
import taskpools
import std/options

proc buildDataset() {.async.} =
  # Create a shared thread pool for async file I/O
  let pool = Taskpool.new(numThreads = 4)
  defer: pool.shutdown()

  let store = newDatasetStore("./db", "./blocks").get()

  # Start building a dataset with 64KB chunks
  let builder = store.startDataset(64 * 1024, some("myfile.txt")).get()

  # Chunk the file
  let stream = (await builder.chunkFile(pool)).get()

  while true:
    let blockOpt = await stream.nextBlock()
    if blockOpt.isNone:
      break
    let blockResult = blockOpt.get()
    if blockResult.isErr:
      echo "Error: ", blockResult.error
      break
    discard await builder.addBlock(blockResult.value)

  stream.close()

  # Finalize and get the dataset
  let dataset = (await builder.finalize()).get()
  echo "Tree CID: ", dataset.treeCid
  echo "Manifest CID: ", dataset.manifestCid

waitFor buildDataset()

Retrieving Blocks with Proofs

import blockstore

proc getBlockWithProof(store: DatasetStore, treeCid: Cid, index: int) {.async.} =
  let datasetOpt = (await store.getDataset(treeCid)).get()
  if datasetOpt.isNone:
    echo "Dataset not found"
    return

  let dataset = datasetOpt.get()
  let blockOpt = (await dataset.getBlock(index)).get()

  if blockOpt.isSome:
    let (blk, proof) = blockOpt.get()
    echo "Block: ", blk
    echo "Proof index: ", proof.index
    echo "Proof path length: ", proof.path.len

API Reference

newDatasetStore

Creates a new dataset store with configurable backends and I/O modes:

proc newDatasetStore*(
  dbPath: string,                                    # Path to LevelDB database
  blocksDir: string,                                 # Directory for block storage
  quota: uint64 = 0,                                 # Storage quota (0 = unlimited)
  blockHashConfig: BlockHashConfig = defaultBlockHashConfig(),
  merkleBackend: MerkleBackend = mbPacked,           # Merkle tree storage backend
  blockBackend: BlockBackend = bbSharded,            # Block storage backend
  blockmapBackend: BlockmapBackend = bmLevelDb,      # Blockmap storage backend
  ioMode: IOMode = ioDirect,                         # I/O mode
  syncBatchSize: int = 0,                            # Batch size for sync (buffered mode)
  pool: Taskpool = nil                               # Thread pool for deletion worker
): BResult[DatasetStore]

Storage Backends

BlockBackend

Value Description
bbSharded Sharded directory structure (default). One file per block with 2-level sharding.
bbPacked Packed file format. All blocks for a dataset in a single file.

MerkleBackend

Controls how merkle proofs are stored and generated.

Value Description
mbEmbeddedProofs Proofs computed during build (tree in memory) and embedded in block references in LevelDB. Tree discarded after finalize. Good for smaller datasets.
mbLevelDb Tree nodes stored in LevelDB. Proofs generated on-demand from stored tree.
mbPacked Tree nodes in packed files (default). One file per tree. Proofs generated on-demand. Efficient for large datasets.

BlockmapBackend

Value Description
bmLevelDb LevelDB storage (default). Shared with metadata.
bmFile File-based storage. One file per blockmap.

I/O Modes

Value Description
ioDirect Direct I/O (default). Bypasses OS cache, data written directly to disk. Provides crash consistency.
ioBuffered Buffered I/O. Uses OS cache. Use syncBatchSize to control sync frequency if needed.

BlockHashConfig

Configuration for block hashing and CID generation:

Field Type Description
hashFunc HashFunc Hash function proc(data: openArray[byte]): HashDigest
hashCode MultiCodec Multicodec identifier for the hash (e.g., Sha256Code)
blockCodec MultiCodec Codec for blocks (e.g., LogosStorageBlock)
treeCodec MultiCodec Codec for merkle tree CIDs (e.g., LogosStorageTree)

Running Tests

nimble test

Code Coverage

Generate HTML coverage reports:

# All tests
nimble coverage

# Individual test suites
nimble coverage_merkle
nimble coverage_block
nimble coverage_chunker

# Clean coverage data
nimble coverage_clean

Stability

This library is in experimental status and may have breaking changes between versions until it stabilizes.

License

nim-blockstore is licensed and distributed under either of:

at your option. The contents of this repository may not be copied, modified, or distributed except according to those terms.

Description
No description provided
Readme
Languages
Nim 100%