Add testnet image build flow and runner docs

This commit is contained in:
andrussal 2025-11-26 12:15:07 +01:00
parent e04af1441a
commit 92e855741a
46 changed files with 3543 additions and 63 deletions

book/book.toml Normal file

@@ -0,0 +1,13 @@
[book]
authors = ["Nomos Testing"]
language = "en"
multilingual = false
src = "src"
title = "Nomos Testing Book"
[build]
# Keep book output in target/ to avoid polluting the workspace root.
build-dir = "../target/book"
[output.html]
default-theme = "light"

book/combined.md Normal file

@@ -0,0 +1,549 @@
# Nomos Testing Framework — Combined Reference
## Project Context Primer
This book focuses on the Nomos Testing Framework. It assumes familiarity with
the Nomos architecture, but for completeness, here is a short primer.
- **Nomos** is a modular blockchain protocol composed of validators, executors,
and a data-availability (DA) subsystem.
- **Validators** participate in consensus and produce blocks.
- **Executors** run application logic or off-chain computations referenced by
blocks.
- **Data Availability (DA)** ensures that data referenced in blocks is
published and retrievable, including blobs or channel data used by workloads.
These roles interact tightly, which is why meaningful testing must be performed
in multi-node environments that include real networking, timing, and DA
interaction.
## What You Will Learn
This book gives you a clear mental model for Nomos multi-node testing, shows how
to author scenarios that pair realistic workloads with explicit expectations,
and guides you to run them across local, containerized, and cluster environments
without changing the plan.
## Part I — Foundations
### Introduction
The Nomos Testing Framework is a purpose-built toolkit for exercising Nomos in
realistic, multi-node environments. It solves the gap between small, isolated
tests and full-system validation by letting teams describe a cluster layout,
drive meaningful traffic, and assert the outcomes in one coherent plan.
It is for protocol engineers, infrastructure operators, and QA teams who need
repeatable confidence that validators, executors, and data-availability
components work together under network and timing constraints.
Multi-node integration testing is required because many Nomos behaviors—block
progress, data availability, liveness under churn—only emerge when several
roles interact over real networking and time. This framework makes those checks
declarative, observable, and portable across environments.
### Architecture Overview
The framework follows a clear flow: **Topology → Scenario → Runner → Workloads → Expectations**.
- **Topology** describes the cluster: how many nodes, their roles, and the high-level network and data-availability parameters they should follow.
- **Scenario** combines that topology with the activities to run and the checks to perform, forming a single plan.
- **Deployer/Runner** pair turns the plan into a live environment on the chosen backend (local processes, Docker Compose, or Kubernetes) and brokers readiness.
- **Workloads** generate traffic and conditions that exercise the system.
- **Expectations** observe the run and judge success or failure once activity completes.
Conceptual diagram:
```
Topology  →  Scenario  →  Runner          →  Workloads  →  Expectations
(shape       (plan)       (deploy            (drive         (verify
 cluster)                 & orchestrate)     traffic)       outcomes)
```
Mermaid view:
```mermaid
flowchart LR
A(Topology<br/>shape cluster) --> B(Scenario<br/>plan)
B --> C(Deployer/Runner<br/>deploy & orchestrate)
C --> D(Workloads<br/>drive traffic)
D --> E(Expectations<br/>verify outcomes)
```
Each layer has a narrow responsibility so that cluster shape, deployment choice, traffic generation, and health checks can evolve independently while fitting together predictably.
### Testing Philosophy
- **Declarative over imperative**: describe the desired cluster shape, traffic, and success criteria; let the framework orchestrate the run.
- **Observable health signals**: prefer liveness and inclusion signals that reflect real user impact instead of internal debug state.
- **Determinism first**: default scenarios aim for repeatable outcomes with fixed topologies and traffic rates; variability is opt-in.
- **Targeted non-determinism**: introduce randomness (e.g., restarts) only when probing resilience or operational robustness.
- **Protocol time, not wall time**: reason in blocks and protocol-driven intervals to reduce dependence on host speed or scheduler noise.
- **Minimum run window**: always allow enough block production to make assertions meaningful; very short runs risk false confidence.
- **Use chaos with intent**: chaos workloads are for recovery and fault-tolerance validation, not for baseline functional checks.
### Scenario Lifecycle (Conceptual)
1. **Build the plan**: Declare a topology, attach workloads and expectations, and set the run window. The plan is the single source of truth for what will happen.
2. **Deploy**: Hand the plan to a runner. It provisions the environment on the chosen backend and waits for nodes to signal readiness.
3. **Drive workloads**: Start traffic and behaviors (transactions, data-availability activity, restarts) for the planned duration.
4. **Observe blocks and signals**: Track block progression and other high-level metrics during or after the run window to ground assertions in protocol time.
5. **Evaluate expectations**: Once activity stops (and optional cooldown completes), check liveness and workload-specific outcomes to decide pass or fail.
6. **Cleanup**: Tear down resources so successive runs start fresh and do not inherit leaked state.
Conceptual lifecycle diagram:
```
Plan → Deploy → Readiness → Drive Workloads → Observe → Evaluate → Cleanup
```
Mermaid view:
```mermaid
flowchart LR
P[Plan<br/>topology + workloads + expectations] --> D[Deploy<br/>runner provisions]
D --> R[Readiness<br/>wait for nodes]
R --> W[Drive Workloads]
W --> O[Observe<br/>blocks/metrics]
O --> E[Evaluate Expectations]
E --> C[Cleanup]
```
### Design Rationale
- **Modular crates** keep configuration, orchestration, workloads, and runners decoupled so each can evolve without breaking the others.
- **Pluggable runners** let the same scenario run on a laptop, a Docker host, or a Kubernetes cluster, making validation portable across environments.
- **Separated workloads and expectations** clarify intent: what traffic to generate versus how to judge success. This simplifies review and reuse.
- **Declarative topology** makes cluster shape explicit and repeatable, reducing surprise when moving between CI and developer machines.
- **Maintainability through predictability**: a clear flow from plan to deployment to verification lowers the cost of extending the framework and interpreting failures.
## Part II — User Guide
### Workspace Layout
The workspace focuses on multi-node integration testing and sits alongside a `nomos-node` checkout. Its crates separate concerns to keep scenarios repeatable and portable:
- **Configs**: prepares high-level node, network, tracing, and wallet settings used across test environments.
- **Core scenario orchestration**: the engine that holds topology descriptions, scenario plans, runtimes, workloads, and expectations.
- **Workflows**: ready-made workloads (transactions, data-availability, chaos) and reusable expectations assembled into a user-facing DSL.
- **Runners**: deployment backends for local processes, Docker Compose, and Kubernetes, all consuming the same scenario plan.
- **Test workflows**: example scenarios and integration checks that show how the pieces fit together.
This split keeps configuration, orchestration, reusable traffic patterns, and deployment adapters loosely coupled while sharing one mental model for tests.
### Annotated Tree
High-level view of the workspace and how pieces relate:
```
nomos-testing/
├─ testing-framework/
│ ├─ configs/ # shared configuration helpers
│ ├─ core/ # scenario model, runtime, topology
│ ├─ workflows/ # workloads, expectations, DSL extensions
│ └─ runners/ # local, compose, k8s deployment backends
├─ tests/ # integration scenarios using the framework
└─ scripts/ # supporting setup utilities (e.g., assets)
```
Each area maps to a responsibility: describe configs, orchestrate scenarios, package common traffic and assertions, adapt to environments, and demonstrate end-to-end usage.
### Authoring Scenarios
Creating a scenario is a declarative exercise:
1. **Shape the topology**: decide how many validators and executors to run, and what high-level network and data-availability characteristics matter for the test.
2. **Attach workloads**: pick traffic generators that align with your goals (transactions, data-availability blobs, or chaos for resilience probes).
3. **Define expectations**: specify the health signals that must hold when the run finishes (e.g., consensus liveness, inclusion of submitted activity; see [Core Content: Workloads & Expectations](workloads.md)).
4. **Set duration**: choose a run window long enough to observe meaningful block progression and the effects of your workloads.
5. **Choose a runner**: target local processes for fast iteration, Docker Compose for reproducible multi-node stacks, or Kubernetes for cluster-grade validation. For environment considerations, see [Operations](operations.md).
Keep scenarios small and explicit: make the intended behavior and the success criteria clear so failures are easy to interpret and act upon.
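As a concrete sketch of these five steps, here is what a plan might look like in the fluent builder style. Every method name below (`ScenarioBuilder`, `with_validators`, `with_transactions_per_block`, `expect_consensus_liveness`, `with_run_duration`) is an illustrative assumption, not the framework's confirmed API; the real helpers live in the workflows DSL.
```rust
use std::time::Duration;

// Hypothetical builder API for illustration only; names and signatures
// may differ from the actual workflows DSL.
fn smoke_plan() -> Scenario {
    ScenarioBuilder::new()
        .with_validators(2)                          // 1. shape the topology
        .with_transactions_per_block(4)              // 2. attach a workload
        .expect_consensus_liveness()                 // 3. define expectations
        .with_run_duration(Duration::from_secs(120)) // 4. set the run window
        .build()                                     // 5. hand off to a runner
}
```
The resulting plan stays runner-agnostic, which is what lets the same scenario move between local, compose, and k8s backends.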
### Core Content: Workloads & Expectations
Workloads describe the activity a scenario generates; expectations describe the signals that must hold when that activity completes. Both are pluggable so scenarios stay readable and purpose-driven.
#### Workloads
- **Transaction workload**: submits user-level transactions at a configurable rate and can limit how many distinct actors participate.
- **Data-availability workload**: drives blob and channel activity to exercise data-availability paths.
- **Chaos workload**: triggers controlled node restarts to test resilience and recovery behaviors (requires a runner that can control nodes).
#### Expectations
- **Consensus liveness**: verifies the system continues to produce blocks in line with the planned workload and timing window.
- **Workload-specific checks**: each workload can attach its own success criteria (e.g., inclusion of submitted activity) so scenarios remain concise.
Together, workloads and expectations let you express both the pressure applied to the system and the definition of “healthy” for that run.
Workload pipeline (conceptual):
```
Inputs (topology + wallets + rates)
                ↓
Workload init → Drive traffic → Collect signals
                ↓
        Expectations evaluate
```
Mermaid view:
```mermaid
flowchart TD
I["Inputs<br/>(topology + wallets + rates)"] --> Init[Workload init]
Init --> Drive[Drive traffic]
Drive --> Collect[Collect signals]
Collect --> Eval[Expectations evaluate]
```
### Core Content: ScenarioBuilderExt Patterns
Patterns that keep scenarios readable and reusable:
- **Topology-first**: start by shaping the cluster (counts, layout) so later steps inherit a clear foundation.
- **Bundle defaults**: use the DSL helpers to attach common expectations (like liveness) whenever you add a matching workload, reducing forgotten checks.
- **Intentional rates**: express traffic in per-block terms to align with protocol timing rather than wall-clock assumptions.
- **Opt-in chaos**: enable restart patterns only in scenarios meant to probe resilience; keep functional smoke tests deterministic.
- **Wallet clarity**: seed only the number of actors you need; it keeps transaction scenarios deterministic and interpretable.
These patterns make scenario definitions self-explanatory while staying aligned with the framework's block-oriented timing model.
### Best Practices
- **State your intent**: document the goal of each scenario (throughput, DA validation, resilience) so expectation choices are obvious.
- **Keep runs meaningful**: choose durations that allow multiple blocks and make timing-based assertions trustworthy.
- **Separate concerns**: start with deterministic workloads for functional checks; add chaos in dedicated resilience scenarios to avoid noisy failures.
- **Reuse patterns**: standardize on shared topology and workload presets so results are comparable across environments and teams.
- **Observe first, tune second**: rely on liveness and inclusion signals to interpret outcomes before tweaking rates or topology.
- **Environment fit**: pick runners that match the feedback loop you need—local for speed, compose for reproducible stacks, k8s for cluster-grade fidelity.
- **Minimal surprises**: seed only necessary wallets and keep configuration deltas explicit when moving between CI and developer machines.
### Examples
Concrete scenario shapes that illustrate how to combine topologies, workloads, and expectations. Adjust counts, rates, and durations to fit your environment.
#### Simple 2-validator transaction workload
- **Topology**: two validators.
- **Workload**: transaction submissions at a modest per-block rate with a small set of wallet actors.
- **Expectations**: consensus liveness and inclusion of submitted activity.
- **When to use**: smoke tests for consensus and transaction flow on minimal hardware.
#### DA + transaction workload
- **Topology**: validators plus executors if available.
- **Workloads**: data-availability blobs/channels and transactions running together to stress both paths.
- **Expectations**: consensus liveness and workload-level inclusion/availability checks.
- **When to use**: end-to-end coverage of transaction and DA layers in one run.
#### Chaos + liveness check
- **Topology**: validators (optionally executors) with node control enabled.
- **Workloads**: baseline traffic (transactions or DA) plus chaos restarts on selected roles.
- **Expectations**: consensus liveness to confirm the system keeps progressing despite restarts; workload-specific inclusion if traffic is present.
- **When to use**: resilience validation and operational readiness drills.
### Advanced & Artificial Examples
These illustrative scenarios stretch the framework to show how to build new workloads, expectations, deployers, and topology tricks. They are intentionally “synthetic” to teach capabilities rather than prescribe production tests.
#### Synthetic Delay Workload (Network Latency Simulation)
- **Idea**: inject fake latency between node interactions using internal timers, not OS-level tooling.
- **Demonstrates**: sequencing control inside a workload, verifying protocol progression under induced lag, using timers to pace submissions.
- **Shape**: wrap submissions in delays that mimic slow peers; ensure the expectation checks blocks still progress.
#### Oscillating Load Workload (Traffic Waves)
- **Idea**: traffic rate changes every block or N seconds (e.g., blocks 1-3 low, 4-5 high, 6-7 zero, repeat).
- **Demonstrates**: dynamic, stateful workloads that use `RunMetrics` to time phases; modeling real-world burstiness.
- **Shape**: schedule per-phase rates; confirm inclusion/liveness across peaks and troughs.
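A minimal sketch of this oscillating pattern, reusing the `Workload` trait shape from the Rust example later in this book; the phase schedule, the `first()` accessor, and the `health_check` stand-in for a real submission are assumptions:
```rust
use std::time::Duration;

use async_trait::async_trait;
use testing_framework_core::scenario::{
    DynError, Expectation, RunContext, RunMetrics, Workload,
};
use testing_framework_core::topology::GeneratedTopology;

/// Traffic waves: each phase pairs a length with a submission count.
pub struct OscillatingLoad {
    phases: Vec<(Duration, u32)>,
}

#[async_trait]
impl Workload for OscillatingLoad {
    fn name(&self) -> &'static str {
        "oscillating_load"
    }

    fn expectations(&self) -> Vec<Box<dyn Expectation>> {
        Vec::new() // pair with a liveness expectation in the scenario
    }

    fn init(
        &mut self,
        _topology: &GeneratedTopology,
        _metrics: &RunMetrics,
    ) -> Result<(), DynError> {
        Ok(())
    }

    async fn start(&self, ctx: &RunContext) -> Result<(), DynError> {
        let client = ctx
            .clients()
            .validators()
            .first()
            .ok_or("no validator client")?;
        // Assumption: the runtime stops workloads when the run window
        // closes, so cycling through the schedule forever is safe.
        loop {
            for (window, rate) in &self.phases {
                for _ in 0..*rate {
                    // Stand-in action; a real workload would submit a
                    // transaction or blob here.
                    if let Err(e) = client.health_check().await {
                        return Err(e.into());
                    }
                }
                tokio::time::sleep(*window).await;
            }
        }
    }
}
```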
#### Byzantine Behavior Mock
- **Idea**: a workload that drops half its planned submissions, sometimes double-submits, and intentionally triggers expectation failures.
- **Demonstrates**: negative testing, resilience checks, and the value of clear expectations when behavior is adversarial by design.
- **Shape**: parameterize drop/double-submit probabilities; pair with an expectation that documents what “bad” looks like.
#### Custom Expectation: Block Finality Drift
- **Idea**: assert the last few blocks differ and block time stays within a tolerated drift budget.
- **Demonstrates**: consuming `BlockFeed` or time-series metrics to validate protocol cadence; crafting post-run assertions around block diversity and timing.
- **Shape**: collect recent blocks, confirm no duplicates, and compare observed intervals to a drift threshold.
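A sketch of that expectation; `ObservedBlock` and the `recent_blocks` helper are assumptions standing in for whatever the block feed actually exposes:
```rust
use async_trait::async_trait;
use testing_framework_core::scenario::{DynError, Expectation, RunContext};

/// Assumed shape of one observation taken from the block feed.
pub struct ObservedBlock {
    pub hash: [u8; 32],
    pub secs: f64, // observation time in seconds
}

// Stub for illustration: a real expectation would read these from the
// runtime's block feed.
fn recent_blocks(_ctx: &RunContext) -> Vec<ObservedBlock> {
    Vec::new()
}

pub struct FinalityDrift {
    pub max_interval_secs: f64,
}

#[async_trait]
impl Expectation for FinalityDrift {
    fn name(&self) -> &str {
        "block_finality_drift"
    }

    async fn evaluate(&mut self, ctx: &RunContext) -> Result<(), DynError> {
        let blocks = recent_blocks(ctx);
        for pair in blocks.windows(2) {
            if pair[0].hash == pair[1].hash {
                return Err("duplicate block in recent window".into());
            }
            let interval = pair[1].secs - pair[0].secs;
            if interval > self.max_interval_secs {
                return Err(format!(
                    "block interval {interval:.1}s exceeds drift budget"
                )
                .into());
            }
        }
        Ok(())
    }
}
```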
#### Custom Deployer: Dry-Run Deployer
- **Idea**: a deployer that never starts nodes; it emits configs, simulates readiness, and provides fake blockfeed/metrics.
- **Demonstrates**: full power of the deployer interface for CI dry-runs, config verification, and ultra-fast feedback without Nomos binaries.
- **Shape**: produce logs/artifacts, stub readiness, and feed synthetic blocks so expectations can still run.
#### Stochastic Topology Generator
- **Idea**: topology parameters change at runtime (random validators, DA settings, network shapes).
- **Demonstrates**: randomized property testing and fuzzing approaches to topology building.
- **Shape**: pick roles and network layouts randomly per run; keep expectations tolerant to variability while still asserting core liveness.
#### Multi-Phase Scenario (“Pipelines”)
- **Idea**: scenario runs in phases (e.g., phase 1 transactions, phase 2 DA, phase 3 restarts, phase 4 sync check).
- **Demonstrates**: multi-stage tests, modular scenario assembly, and deliberate lifecycle control.
- **Shape**: drive phase-specific workloads/expectations sequentially; enforce clear boundaries and post-phase checks.
### Running Scenarios
Running a scenario follows the same conceptual flow regardless of environment:
1. Select or author a scenario plan that pairs a topology with workloads, expectations, and a suitable run window.
2. Choose a runner aligned with your environment (local, compose, or k8s) and ensure its prerequisites are available.
3. Deploy the plan through the runner; wait for readiness signals before starting workloads.
4. Let workloads drive activity for the planned duration; keep observability signals visible so you can correlate outcomes.
5. Evaluate expectations and capture results as the primary pass/fail signal.
Use the same plan across different runners to compare behavior between local development and CI or cluster settings. For environment prerequisites and flags, see [Operations](operations.md).
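In code, the whole flow reduces to a couple of lines; the runner type and its `run` entry point below are hypothetical stand-ins for whichever backend you pick:
```rust
// Hypothetical runner API; the real backends live under
// testing-framework/runners.
async fn run_smoke() -> Result<(), DynError> {
    let plan = smoke_plan(); // the same plan works on local, compose, or k8s
    // deploy → readiness → drive workloads → evaluate → cleanup
    LocalRunner::default().run(plan).await
}
```
Swapping `LocalRunner` for a compose or k8s runner changes only that one line, not the plan.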
### Runners
Runners turn a scenario plan into a live environment while keeping the plan unchanged. Choose based on feedback speed, reproducibility, and fidelity. For environment and operational considerations, see [Operations](operations.md).
#### Local runner
- Launches node processes directly on the host.
- Fastest feedback loop and minimal orchestration overhead.
- Best for development-time iteration and debugging.
#### Docker Compose runner
- Starts nodes in containers to provide a reproducible multi-node stack on a single machine.
- Discovers service ports and wires observability for convenient inspection.
- Good balance between fidelity and ease of setup.
#### Kubernetes runner
- Deploys nodes onto a cluster for higher-fidelity, longer-running scenarios.
- Suits CI or shared environments where cluster behavior and scheduling matter.
#### Common expectations
- All runners require at least one validator and, for transaction scenarios, access to seeded wallets.
- Readiness probes gate workload start so traffic begins only after nodes are reachable.
- Environment flags can relax timeouts or increase tracing when diagnostics are needed.
Runner abstraction:
```
Scenario Plan
      ↓
Runner (local | compose | k8s)
      │ provisions env + readiness
      ↓
Runtime + Observability
      ↓
Workloads / Expectations execute
```
Mermaid view:
```mermaid
flowchart TD
Plan[Scenario Plan] --> RunSel{"Runner<br/>(local | compose | k8s)"}
RunSel --> Provision[Provision & readiness]
Provision --> Runtime[Runtime + observability]
Runtime --> Exec[Workloads & Expectations execute]
```
### Operations
Operational readiness focuses on prerequisites, environment fit, and clear signals:
- **Prerequisites**: keep a sibling `nomos-node` checkout available; ensure the chosen runner's platform needs are met (local binaries for host runs, Docker for compose, cluster access for k8s).
- **Artifacts**: some scenarios depend on prover or circuit assets; fetch them ahead of time with the provided helper scripts when needed.
- **Environment flags**: use slow-environment toggles to relax timeouts, enable tracing when debugging, and adjust observability ports to avoid clashes.
- **Readiness checks**: verify runners report node readiness before starting workloads; this avoids false negatives from starting too early.
- **Failure triage**: map failures to missing prerequisites (wallet seeding, node control availability), runner platform issues, or unmet expectations. Start with liveness signals, then dive into workload-specific assertions.
Treat operational hygiene—assets present, prerequisites satisfied, observability reachable—as the first step to reliable scenario outcomes.
Metrics and observability flow:
```
Runner exposes endpoints/ports
              ↓
Runtime collects block/health signals
              ↓
Expectations consume signals to decide pass/fail
              ↓
Operators inspect logs/metrics when failures arise
```
Mermaid view:
```mermaid
flowchart TD
Expose[Runner exposes endpoints/ports] --> Collect[Runtime collects block/health signals]
Collect --> Consume[Expectations consume signals<br/>decide pass/fail]
Consume --> Inspect[Operators inspect logs/metrics<br/>when failures arise]
```
## Part III — Developer Reference
### Scenario Model (Developer Level)
The scenario model defines clear, composable responsibilities:
- **Topology**: a declarative description of the cluster—how many nodes, their roles, and the broad network and data-availability characteristics. It represents the intended shape of the system under test.
- **Scenario**: a plan combining topology, workloads, expectations, and a run window. Building a scenario validates prerequisites (like seeded wallets) and ensures the run lasts long enough to observe meaningful block progression.
- **Workloads**: asynchronous tasks that generate traffic or conditions. They use shared context to interact with the deployed cluster and may bundle default expectations.
- **Expectations**: post-run assertions. They can capture baselines before workloads start and evaluate success once activity stops.
- **Runtime**: coordinates workloads and expectations for the configured duration, enforces cooldowns when control actions occur, and ensures cleanup so runs do not leak resources.
Developers extending the model should keep these boundaries strict: topology describes, scenarios assemble, runners deploy, workloads drive, and expectations judge outcomes. For guidance on adding new capabilities, see [Extending the Framework](extending.md).
### Extending the Framework
#### Adding a workload
1) Implement the workload contract: provide a name, optional bundled expectations, validate prerequisites up front, and drive asynchronous activity against the deployed cluster.
2) Export it through the workflows layer and consider adding DSL helpers for ergonomic wiring.
#### Adding an expectation
1) Implement the expectation contract: capture baselines if needed and evaluate outcomes after workloads finish; report meaningful errors to aid debugging.
2) Expose reusable expectations from the workflows layer so scenarios can attach them declaratively.
#### Adding a runner
1) Implement the deployer contract for the target backend, producing a runtime context with client access, metrics endpoints, and optional node control.
2) Preserve cleanup guarantees so resources are reclaimed even when runs fail; mirror readiness and observation signals used by existing runners for consistency.
#### Adding topology helpers
Extend the topology description with new layouts or presets while keeping defaults safe and predictable; favor declarative inputs over ad hoc logic so scenarios stay reviewable.
### Internal Crate Reference
High-level roles of the crates that make up the framework:
- **Configs**: prepares reusable configuration primitives for nodes, networking, tracing, data availability, and wallets, shared by all scenarios and runners.
- **Core scenario orchestration**: houses the topology and scenario model, runtime coordination, node clients, and readiness/health probes.
- **Workflows**: packages workloads and expectations into reusable building blocks and offers a fluent DSL to assemble them.
- **Runners**: implements deployment backends (local host, Docker Compose, Kubernetes) that all consume the same scenario plan.
- **Test workflows**: example scenarios and integration checks that exercise the framework end to end and serve as living documentation.
Use this map to locate where to add new capabilities: configuration primitives in configs, orchestration changes in core, reusable traffic/assertions in workflows, environment adapters in runners, and demonstrations in tests.
### Example: New Workload & Expectation (Rust)
A minimal, end-to-end illustration of adding a custom workload and matching expectation. This shows the shape of the traits and where to plug into the framework; expand the logic to fit your real test.
#### Workload: simple reachability probe
Key ideas:
- **name**: identifies the workload in logs.
- **expectations**: workloads can bundle defaults so callers don't forget checks.
- **init**: derive inputs from the generated topology (e.g., pick a target node).
- **start**: drive async activity using the shared `RunContext`.
```rust
use async_trait::async_trait;
use testing_framework_core::scenario::{
    DynError, Expectation, RunContext, RunMetrics, Workload,
};
use testing_framework_core::topology::GeneratedTopology;

pub struct ReachabilityWorkload {
    target_idx: usize,
}

impl ReachabilityWorkload {
    pub fn new(target_idx: usize) -> Self {
        Self { target_idx }
    }
}

#[async_trait]
impl Workload for ReachabilityWorkload {
    fn name(&self) -> &'static str {
        "reachability_workload"
    }

    fn expectations(&self) -> Vec<Box<dyn Expectation>> {
        // Rebuild the bundled default: boxed trait objects are not `Clone`.
        vec![Box::new(ReachabilityExpectation::new(self.target_idx))]
    }

    fn init(
        &mut self,
        topology: &GeneratedTopology,
        _metrics: &RunMetrics,
    ) -> Result<(), DynError> {
        if topology.validators().get(self.target_idx).is_none() {
            return Err("no validator at requested index".into());
        }
        Ok(())
    }

    async fn start(&self, ctx: &RunContext) -> Result<(), DynError> {
        let client = ctx
            .clients()
            .validators()
            .get(self.target_idx)
            .ok_or("missing target client")?;
        // Pseudo-action: issue a lightweight RPC to prove reachability.
        client.health_check().await.map_err(|e| e.into())
    }
}
```
#### Expectation: confirm the target stayed reachable
Key ideas:
- **start_capture**: snapshot baseline if needed (not used here).
- **evaluate**: assert the condition after workloads finish.
```rust
use async_trait::async_trait;
use testing_framework_core::scenario::{DynError, Expectation, RunContext};

pub struct ReachabilityExpectation {
    target_idx: usize,
}

impl ReachabilityExpectation {
    pub fn new(target_idx: usize) -> Self {
        Self { target_idx }
    }
}

#[async_trait]
impl Expectation for ReachabilityExpectation {
    fn name(&self) -> &str {
        "target_reachable"
    }

    async fn evaluate(&mut self, ctx: &RunContext) -> Result<(), DynError> {
        let client = ctx
            .clients()
            .validators()
            .get(self.target_idx)
            .ok_or("missing target client")?;
        client.health_check().await.map_err(|e| {
            format!("target became unreachable during run: {e}").into()
        })
    }
}
```
#### How to wire it
- Build your scenario as usual and call `.with_workload(ReachabilityWorkload::new(0))`.
- The bundled expectation is attached automatically; you can add more with `.with_expectation(...)` if needed.
- Keep the logic minimal and fast for smoke tests; grow it into richer probes for deeper scenarios.
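Putting those steps together, a hedged sketch of the wiring (`ScenarioBuilder` and `ConsensusLiveness` are illustrative assumptions; `ReachabilityWorkload` is the type defined above):
```rust
use std::time::Duration;

// `ScenarioBuilder` and `ConsensusLiveness` are illustrative assumptions.
fn wired_plan() -> Scenario {
    ScenarioBuilder::new()
        .with_validators(1)
        .with_workload(ReachabilityWorkload::new(0)) // bundles target_reachable
        .with_expectation(ConsensusLiveness::default()) // optional extra check
        .with_run_duration(Duration::from_secs(60))
        .build()
}
```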
## Part IV — Appendix
### DSL Cheat Sheet
The framework offers a fluent builder style to keep scenarios readable. Common knobs:
- **Topology shaping**: set validator and executor counts, pick a network layout style, and adjust high-level data-availability traits.
- **Wallet seeding**: define how many users participate and the total funds available for transaction workloads.
- **Workload tuning**: configure transaction rates, data-availability channel and blob rates, and whether chaos restarts should include validators, executors, or both.
- **Expectations**: attach liveness and workload-specific checks so success is explicit.
- **Run window**: set a minimum duration long enough for multiple blocks to be observed and verified.
Use these knobs to express intent clearly, keeping scenario definitions concise and consistent across teams.
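A compact sketch mapping each knob onto a hypothetical fluent call (every method name is an illustrative assumption; check the workflows DSL for the real surface):
```rust
use std::time::Duration;

// Illustrative method names only, one per knob above.
fn tuned_plan() -> Scenario {
    ScenarioBuilder::new()
        .with_validators(3)
        .with_executors(1)                           // topology shaping
        .with_wallets(10, 1_000_000)                 // wallet seeding
        .with_transactions_per_block(5)              // workload tuning
        .with_da_blobs_per_block(2)
        .with_chaos_restarts_on_validators()
        .expect_consensus_liveness()                 // expectations
        .with_run_duration(Duration::from_secs(300)) // run window
        .build()
}
```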
### Troubleshooting Scenarios
Common symptoms and likely causes:
- **No or slow block progression**: runner started workloads before readiness, insufficient run window, or environment too slow—extend duration or enable slow-environment tuning.
- **Transactions not included**: missing or insufficient wallet seeding, misaligned transaction rate with block cadence, or network instability—reduce rate and verify wallet setup.
- **Chaos stalls the run**: node control not available for the chosen runner or restart cadence too aggressive—enable control capability and widen restart intervals.
- **Observability gaps**: metrics or logs unreachable because ports clash or services are not exposed—adjust observability ports and confirm runner wiring.
- **Flaky behavior across runs**: mixing chaos with functional smoke tests or inconsistent topology between environments—separate deterministic and chaos scenarios and standardize topology presets.
### FAQ
**Why block-oriented timing?**
Using block cadence reduces dependence on host speed and keeps assertions aligned with protocol behavior.
**Can I reuse the same scenario across runners?**
Yes. The plan stays the same; swap runners (local, compose, k8s) to target different environments.
**When should I enable chaos workloads?**
Only when testing resilience or operational recovery; keep functional smoke tests deterministic.
**How long should runs be?**
Long enough for multiple blocks so liveness and inclusion checks are meaningful; very short runs risk false confidence.
**Do I always need seeded wallets?**
Only for transaction scenarios. Data-availability or pure chaos scenarios may not require them, but liveness checks still need validators producing blocks.
**What if expectations fail but workloads “look fine”?**
Trust expectations first—they capture the intended success criteria. Use the observability signals and runner logs to pinpoint why the system missed the target.
### Glossary
- **Validator**: node role responsible for participating in consensus and block production.
- **Executor**: node role that processes transactions or workloads delegated by validators.
- **DA (Data Availability)**: subsystem ensuring blobs or channel data are published and retrievable for validation.
- **Workload**: traffic or behavior generator that exercises the system during a scenario run.
- **Expectation**: post-run assertion that judges whether the system met the intended success criteria.
- **Topology**: declarative description of the cluster shape, roles, and high-level parameters for a scenario.
- **Blockfeed**: stream of block observations used for liveness or inclusion signals during a run.
- **Control capability**: the ability for a runner to start, stop, or restart nodes, used by chaos workloads.

File diff suppressed because it is too large

book/src/SUMMARY.md Normal file

@@ -0,0 +1,31 @@
# Summary
- [Project Context Primer](project-context-primer.md)
- [What You Will Learn](what-you-will-learn.md)
- [Part I — Foundations](part-i.md)
  - [Introduction](introduction.md)
  - [Architecture Overview](architecture-overview.md)
  - [Testing Philosophy](testing-philosophy.md)
  - [Scenario Lifecycle (Conceptual)](scenario-lifecycle.md)
  - [Design Rationale](design-rationale.md)
- [Part II — User Guide](part-ii.md)
  - [Workspace Layout](workspace-layout.md)
  - [Annotated Tree](annotated-tree.md)
  - [Authoring Scenarios](authoring-scenarios.md)
  - [Core Content: Workloads & Expectations](workloads.md)
  - [Core Content: ScenarioBuilderExt Patterns](scenario-builder-ext-patterns.md)
  - [Best Practices](best-practices.md)
  - [Examples](examples.md)
  - [Advanced & Artificial Examples](examples-advanced.md)
  - [Running Scenarios](running-scenarios.md)
  - [Runners](runners.md)
  - [Operations](operations.md)
- [Part III — Developer Reference](part-iii.md)
  - [Scenario Model (Developer Level)](scenario-model.md)
  - [Extending the Framework](extending.md)
  - [Example: New Workload & Expectation (Rust)](custom-workload-example.md)
  - [Internal Crate Reference](internal-crate-reference.md)
- [Part IV — Appendix](part-iv.md)
  - [DSL Cheat Sheet](dsl-cheat-sheet.md)
  - [Troubleshooting Scenarios](troubleshooting.md)
  - [FAQ](faq.md)
  - [Glossary](glossary.md)

book/src/annotated-tree.md Normal file

@@ -0,0 +1,17 @@
# Annotated Tree
High-level view of the workspace and how pieces relate:
```
nomos-testing/
├─ testing-framework/
│ ├─ configs/ # shared configuration helpers
│ ├─ core/ # scenario model, runtime, topology
│ ├─ workflows/ # workloads, expectations, DSL extensions
│ └─ runners/ # local, compose, k8s deployment backends
├─ tests/ # integration scenarios using the framework
└─ scripts/ # supporting setup utilities (e.g., assets)
```
Each area maps to a responsibility: describe configs, orchestrate scenarios,
package common traffic and assertions, adapt to environments, and demonstrate
end-to-end usage.

book/src/architecture-overview.md Normal file

@@ -0,0 +1,29 @@
# Architecture Overview
The framework follows a clear flow: **Topology → Scenario → Runner → Workloads → Expectations**.
- **Topology** describes the cluster: how many nodes, their roles, and the high-level network and data-availability parameters they should follow.
- **Scenario** combines that topology with the activities to run and the checks to perform, forming a single plan.
- **Deployer/Runner** pair turns the plan into a live environment on the chosen backend (local processes, Docker Compose, or Kubernetes) and brokers readiness.
- **Workloads** generate traffic and conditions that exercise the system.
- **Expectations** observe the run and judge success or failure once activity completes.
Conceptual diagram:
```
Topology  →  Scenario  →  Runner          →  Workloads  →  Expectations
(shape       (plan)       (deploy            (drive         (verify
 cluster)                 & orchestrate)     traffic)       outcomes)
```
Mermaid view:
```mermaid
flowchart LR
A(Topology<br/>shape cluster) --> B(Scenario<br/>plan)
B --> C(Deployer/Runner<br/>deploy & orchestrate)
C --> D(Workloads<br/>drive traffic)
D --> E(Expectations<br/>verify outcomes)
```
Each layer has a narrow responsibility so that cluster shape, deployment choice,
traffic generation, and health checks can evolve independently while fitting
together predictably.

book/src/authoring-scenarios.md Normal file

@@ -0,0 +1,20 @@
# Authoring Scenarios
Creating a scenario is a declarative exercise:
1. **Shape the topology**: decide how many validators and executors to run, and
what high-level network and data-availability characteristics matter for the
test.
2. **Attach workloads**: pick traffic generators that align with your goals
(transactions, data-availability blobs, or chaos for resilience probes).
3. **Define expectations**: specify the health signals that must hold when the
run finishes (e.g., consensus liveness, inclusion of submitted activity; see
[Core Content: Workloads & Expectations](workloads.md)).
4. **Set duration**: choose a run window long enough to observe meaningful
block progression and the effects of your workloads.
5. **Choose a runner**: target local processes for fast iteration, Docker
Compose for reproducible multi-node stacks, or Kubernetes for cluster-grade
validation. For environment considerations, see [Operations](operations.md).
Keep scenarios small and explicit: make the intended behavior and the success
criteria clear so failures are easy to interpret and act upon.
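As a concrete sketch of these five steps, here is what a plan might look like in the fluent builder style. Every method name below (`ScenarioBuilder`, `with_validators`, `with_transactions_per_block`, `expect_consensus_liveness`, `with_run_duration`) is an illustrative assumption, not the framework's confirmed API; the real helpers live in the workflows DSL.
```rust
use std::time::Duration;

// Hypothetical builder API for illustration only; names and signatures
// may differ from the actual workflows DSL.
fn smoke_plan() -> Scenario {
    ScenarioBuilder::new()
        .with_validators(2)                          // 1. shape the topology
        .with_transactions_per_block(4)              // 2. attach a workload
        .expect_consensus_liveness()                 // 3. define expectations
        .with_run_duration(Duration::from_secs(120)) // 4. set the run window
        .build()                                     // 5. hand off to a runner
}
```
The resulting plan stays runner-agnostic, which is what lets the same scenario move between local, compose, and k8s backends.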

book/src/best-practices.md Normal file

@@ -0,0 +1,16 @@
# Best Practices
- **State your intent**: document the goal of each scenario (throughput, DA
validation, resilience) so expectation choices are obvious.
- **Keep runs meaningful**: choose durations that allow multiple blocks and make
timing-based assertions trustworthy.
- **Separate concerns**: start with deterministic workloads for functional
checks; add chaos in dedicated resilience scenarios to avoid noisy failures.
- **Reuse patterns**: standardize on shared topology and workload presets so
results are comparable across environments and teams.
- **Observe first, tune second**: rely on liveness and inclusion signals to
interpret outcomes before tweaking rates or topology.
- **Environment fit**: pick runners that match the feedback loop you need—local
for speed, compose for reproducible stacks, k8s for cluster-grade fidelity.
- **Minimal surprises**: seed only necessary wallets and keep configuration
deltas explicit when moving between CI and developer machines.

book/src/custom-workload-example.md Normal file

@@ -0,0 +1,116 @@
# Example: New Workload & Expectation (Rust)
A minimal, end-to-end illustration of adding a custom workload and matching
expectation. This shows the shape of the traits and where to plug into the
framework; expand the logic to fit your real test.
## Workload: simple reachability probe
Key ideas:
- **name**: identifies the workload in logs.
- **expectations**: workloads can bundle defaults so callers don't forget checks.
- **init**: derive inputs from the generated topology (e.g., pick a target node).
- **start**: drive async activity using the shared `RunContext`.
```rust
use async_trait::async_trait;
use testing_framework_core::scenario::{
    DynError, Expectation, RunContext, RunMetrics, Workload,
};
use testing_framework_core::topology::GeneratedTopology;

pub struct ReachabilityWorkload {
    target_idx: usize,
}

impl ReachabilityWorkload {
    pub fn new(target_idx: usize) -> Self {
        Self { target_idx }
    }
}

#[async_trait]
impl Workload for ReachabilityWorkload {
    fn name(&self) -> &'static str {
        "reachability_workload"
    }

    fn expectations(&self) -> Vec<Box<dyn Expectation>> {
        // Rebuild the bundled default: boxed trait objects are not `Clone`.
        vec![Box::new(ReachabilityExpectation::new(self.target_idx))]
    }

    fn init(
        &mut self,
        topology: &GeneratedTopology,
        _metrics: &RunMetrics,
    ) -> Result<(), DynError> {
        if topology.validators().get(self.target_idx).is_none() {
            return Err("no validator at requested index".into());
        }
        Ok(())
    }

    async fn start(&self, ctx: &RunContext) -> Result<(), DynError> {
        let client = ctx
            .clients()
            .validators()
            .get(self.target_idx)
            .ok_or("missing target client")?;
        // Pseudo-action: issue a lightweight RPC to prove reachability.
        client.health_check().await.map_err(|e| e.into())
    }
}
```
## Expectation: confirm the target stayed reachable
Key ideas:
- **start_capture**: snapshot baseline if needed (not used here).
- **evaluate**: assert the condition after workloads finish.
```rust
use async_trait::async_trait;
use testing_framework_core::scenario::{DynError, Expectation, RunContext};

pub struct ReachabilityExpectation {
    target_idx: usize,
}

impl ReachabilityExpectation {
    pub fn new(target_idx: usize) -> Self {
        Self { target_idx }
    }
}

#[async_trait]
impl Expectation for ReachabilityExpectation {
    fn name(&self) -> &str {
        "target_reachable"
    }

    async fn evaluate(&mut self, ctx: &RunContext) -> Result<(), DynError> {
        let client = ctx
            .clients()
            .validators()
            .get(self.target_idx)
            .ok_or("missing target client")?;
        client.health_check().await.map_err(|e| {
            format!("target became unreachable during run: {e}").into()
        })
    }
}
```
## How to wire it
- Build your scenario as usual and call `.with_workload(ReachabilityWorkload::new(0))`.
- The bundled expectation is attached automatically; you can add more with
`.with_expectation(...)` if needed.
- Keep the logic minimal and fast for smoke tests; grow it into richer probes
for deeper scenarios.
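Putting those steps together, a hedged sketch of the wiring (`ScenarioBuilder` and `ConsensusLiveness` are illustrative assumptions; `ReachabilityWorkload` is the type defined above):
```rust
use std::time::Duration;

// `ScenarioBuilder` and `ConsensusLiveness` are illustrative assumptions.
fn wired_plan() -> Scenario {
    ScenarioBuilder::new()
        .with_validators(1)
        .with_workload(ReachabilityWorkload::new(0)) // bundles target_reachable
        .with_expectation(ConsensusLiveness::default()) // optional extra check
        .with_run_duration(Duration::from_secs(60))
        .build()
}
```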

book/src/design-rationale.md Normal file

@@ -0,0 +1,7 @@
# Design Rationale
- **Modular crates** keep configuration, orchestration, workloads, and runners decoupled so each can evolve without breaking the others.
- **Pluggable runners** let the same scenario run on a laptop, a Docker host, or a Kubernetes cluster, making validation portable across environments.
- **Separated workloads and expectations** clarify intent: what traffic to generate versus how to judge success. This simplifies review and reuse.
- **Declarative topology** makes cluster shape explicit and repeatable, reducing surprise when moving between CI and developer machines.
- **Maintainability through predictability**: a clear flow from plan to deployment to verification lowers the cost of extending the framework and interpreting failures.

book/src/dsl-cheat-sheet.md Normal file

@@ -0,0 +1,19 @@
# DSL Cheat Sheet
The framework offers a fluent builder style to keep scenarios readable. Common
knobs:
- **Topology shaping**: set validator and executor counts, pick a network layout
style, and adjust high-level data-availability traits.
- **Wallet seeding**: define how many users participate and the total funds
available for transaction workloads.
- **Workload tuning**: configure transaction rates, data-availability channel
and blob rates, and whether chaos restarts should include validators,
executors, or both.
- **Expectations**: attach liveness and workload-specific checks so success is
explicit.
- **Run window**: set a minimum duration long enough for multiple blocks to be
observed and verified.
Use these knobs to express intent clearly, keeping scenario definitions concise
and consistent across teams.
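A compact sketch mapping each knob onto a hypothetical fluent call (every method name is an illustrative assumption; check the workflows DSL for the real surface):
```rust
use std::time::Duration;

// Illustrative method names only, one per knob above.
fn tuned_plan() -> Scenario {
    ScenarioBuilder::new()
        .with_validators(3)
        .with_executors(1)                           // topology shaping
        .with_wallets(10, 1_000_000)                 // wallet seeding
        .with_transactions_per_block(5)              // workload tuning
        .with_da_blobs_per_block(2)
        .with_chaos_restarts_on_validators()
        .expect_consensus_liveness()                 // expectations
        .with_run_duration(Duration::from_secs(300)) // run window
        .build()
}
```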

book/src/examples-advanced.md Normal file

@@ -0,0 +1,62 @@
# Advanced & Artificial Examples
These illustrative scenarios stretch the framework to show how to build new
workloads, expectations, deployers, and topology tricks. They are intentionally
“synthetic” to teach capabilities rather than prescribe production tests.
## Synthetic Delay Workload (Network Latency Simulation)
- **Idea**: inject fake latency between node interactions using internal timers,
not OS-level tooling.
- **Demonstrates**: sequencing control inside a workload, verifying protocol
progression under induced lag, using timers to pace submissions.
- **Shape**: wrap submissions in delays that mimic slow peers; ensure the
expectation checks blocks still progress.
## Oscillating Load Workload (Traffic Waves)
- **Idea**: traffic rate changes every block or N seconds (e.g., blocks 1-3 low,
4-5 high, 6-7 zero, repeat).
- **Demonstrates**: dynamic, stateful workloads that use `RunMetrics` to time
phases; modeling real-world burstiness.
- **Shape**: schedule per-phase rates; confirm inclusion/liveness across peaks
and troughs.
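A minimal sketch of this oscillating pattern, reusing the `Workload` trait shape from the [custom workload example](custom-workload-example.md); the phase schedule, the `first()` accessor, and the `health_check` stand-in for a real submission are assumptions:
```rust
use std::time::Duration;

use async_trait::async_trait;
use testing_framework_core::scenario::{
    DynError, Expectation, RunContext, RunMetrics, Workload,
};
use testing_framework_core::topology::GeneratedTopology;

/// Traffic waves: each phase pairs a length with a submission count.
pub struct OscillatingLoad {
    phases: Vec<(Duration, u32)>,
}

#[async_trait]
impl Workload for OscillatingLoad {
    fn name(&self) -> &'static str {
        "oscillating_load"
    }

    fn expectations(&self) -> Vec<Box<dyn Expectation>> {
        Vec::new() // pair with a liveness expectation in the scenario
    }

    fn init(
        &mut self,
        _topology: &GeneratedTopology,
        _metrics: &RunMetrics,
    ) -> Result<(), DynError> {
        Ok(())
    }

    async fn start(&self, ctx: &RunContext) -> Result<(), DynError> {
        let client = ctx
            .clients()
            .validators()
            .first()
            .ok_or("no validator client")?;
        // Assumption: the runtime stops workloads when the run window
        // closes, so cycling through the schedule forever is safe.
        loop {
            for (window, rate) in &self.phases {
                for _ in 0..*rate {
                    // Stand-in action; a real workload would submit a
                    // transaction or blob here.
                    if let Err(e) = client.health_check().await {
                        return Err(e.into());
                    }
                }
                tokio::time::sleep(*window).await;
            }
        }
    }
}
```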
## Byzantine Behavior Mock
- **Idea**: a workload that drops half its planned submissions, sometimes
double-submits, and intentionally triggers expectation failures.
- **Demonstrates**: negative testing, resilience checks, and the value of clear
expectations when behavior is adversarial by design.
- **Shape**: parameterize drop/double-submit probabilities; pair with an
expectation that documents what “bad” looks like.
## Custom Expectation: Block Finality Drift
- **Idea**: assert the last few blocks differ and block time stays within a
tolerated drift budget.
- **Demonstrates**: consuming `BlockFeed` or time-series metrics to validate
protocol cadence; crafting post-run assertions around block diversity and
timing.
- **Shape**: collect recent blocks, confirm no duplicates, and compare observed
intervals to a drift threshold.
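A sketch of that expectation; `ObservedBlock` and the `recent_blocks` helper are assumptions standing in for whatever the block feed actually exposes:
```rust
use async_trait::async_trait;
use testing_framework_core::scenario::{DynError, Expectation, RunContext};

/// Assumed shape of one observation taken from the block feed.
pub struct ObservedBlock {
    pub hash: [u8; 32],
    pub secs: f64, // observation time in seconds
}

// Stub for illustration: a real expectation would read these from the
// runtime's block feed.
fn recent_blocks(_ctx: &RunContext) -> Vec<ObservedBlock> {
    Vec::new()
}

pub struct FinalityDrift {
    pub max_interval_secs: f64,
}

#[async_trait]
impl Expectation for FinalityDrift {
    fn name(&self) -> &str {
        "block_finality_drift"
    }

    async fn evaluate(&mut self, ctx: &RunContext) -> Result<(), DynError> {
        let blocks = recent_blocks(ctx);
        for pair in blocks.windows(2) {
            if pair[0].hash == pair[1].hash {
                return Err("duplicate block in recent window".into());
            }
            let interval = pair[1].secs - pair[0].secs;
            if interval > self.max_interval_secs {
                return Err(format!(
                    "block interval {interval:.1}s exceeds drift budget"
                )
                .into());
            }
        }
        Ok(())
    }
}
```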
## Custom Deployer: Dry-Run Deployer
- **Idea**: a deployer that never starts nodes; it emits configs, simulates
readiness, and provides fake blockfeed/metrics.
- **Demonstrates**: full power of the deployer interface for CI dry-runs,
config verification, and ultra-fast feedback without Nomos binaries.
- **Shape**: produce logs/artifacts, stub readiness, and feed synthetic blocks
so expectations can still run.
## Stochastic Topology Generator
- **Idea**: topology parameters change at runtime (random validators, DA
settings, network shapes).
- **Demonstrates**: randomized property testing and fuzzing approaches to
topology building.
- **Shape**: pick roles and network layouts randomly per run; keep expectations
tolerant to variability while still asserting core liveness.
## Multi-Phase Scenario (“Pipelines”)
- **Idea**: scenario runs in phases (e.g., phase 1 transactions, phase 2 DA,
phase 3 restarts, phase 4 sync check).
- **Demonstrates**: multi-stage tests, modular scenario assembly, and deliberate
lifecycle control.
- **Shape**: drive phase-specific workloads/expectations sequentially; enforce
clear boundaries and post-phase checks.

book/src/examples.md Normal file

@@ -0,0 +1,28 @@
# Examples
Concrete scenario shapes that illustrate how to combine topologies, workloads,
and expectations. Adjust counts, rates, and durations to fit your environment.
## Simple 2-validator transaction workload
- **Topology**: two validators.
- **Workload**: transaction submissions at a modest per-block rate with a small
set of wallet actors.
- **Expectations**: consensus liveness and inclusion of submitted activity.
- **When to use**: smoke tests for consensus and transaction flow on minimal
hardware.
## DA + transaction workload
- **Topology**: validators plus executors if available.
- **Workloads**: data-availability blobs/channels and transactions running
together to stress both paths.
- **Expectations**: consensus liveness and workload-level inclusion/availability
checks.
- **When to use**: end-to-end coverage of transaction and DA layers in one run.
## Chaos + liveness check
- **Topology**: validators (optionally executors) with node control enabled.
- **Workloads**: baseline traffic (transactions or DA) plus chaos restarts on
selected roles.
- **Expectations**: consensus liveness to confirm the system keeps progressing
despite restarts; workload-specific inclusion if traffic is present.
- **When to use**: resilience validation and operational readiness drills.

book/src/extending.md Normal file

@@ -0,0 +1,31 @@
# Extending the Framework
## Adding a workload
1) Implement `testing_framework_core::scenario::Workload`:
- Provide a name and any bundled expectations.
- In `init`, derive inputs from `GeneratedTopology` and `RunMetrics`; fail
fast if prerequisites are missing (e.g., wallet data, node addresses).
- In `start`, drive async traffic using the `RunContext` clients.
2) Expose the workload from a module under `testing-framework/workflows` and
consider adding a DSL helper for ergonomic wiring.
## Adding an expectation
1) Implement `testing_framework_core::scenario::Expectation`:
- Use `start_capture` to snapshot baseline metrics.
- Use `evaluate` to assert outcomes after workloads finish; return all errors
so the runner can aggregate them.
2) Export it from `testing-framework/workflows` if it is reusable.
## Adding a runner
1) Implement `testing_framework_core::scenario::Deployer` for your backend.
- Produce a `RunContext` with `NodeClients`, metrics endpoints, and optional
`NodeControlHandle`.
- Guard cleanup with `CleanupGuard` to reclaim resources even on failures.
2) Mirror the readiness and block-feed probes used by the existing runners so
workloads can rely on consistent signals.
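A skeleton of the deployer contract, borrowing the "dry run" idea from the advanced examples; the exact `Deployer` signature is an assumption inferred from the steps above:
```rust
use async_trait::async_trait;
use testing_framework_core::scenario::{Deployer, DynError, RunContext};
use testing_framework_core::topology::GeneratedTopology;

/// Never starts nodes: emits configs and simulated readiness so
/// expectations can run against synthetic signals.
pub struct DryRunDeployer;

#[async_trait]
impl Deployer for DryRunDeployer {
    // Assumed signature: deploy a generated topology, return a RunContext.
    async fn deploy(
        &self,
        _topology: &GeneratedTopology,
    ) -> Result<RunContext, DynError> {
        // 1. Write node configs to disk for inspection.
        // 2. Report readiness immediately (no processes to wait for).
        // 3. Hand back a RunContext with stub NodeClients, a synthetic
        //    block feed, and a no-op CleanupGuard.
        todo!("construct stub clients, metrics endpoints, and cleanup guard")
    }
}
```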
## Adding topology helpers
- Extend `testing_framework_core::topology::TopologyBuilder` with new layouts or
configuration presets (e.g., specialized DA parameters). Keep defaults safe:
ensure at least one participant and clamp dispersal factors as the current
helpers do.

book/src/faq.md Normal file

@@ -0,0 +1,26 @@
# FAQ
**Why block-oriented timing?**
Using block cadence reduces dependence on host speed and keeps assertions aligned
with protocol behavior.
**Can I reuse the same scenario across runners?**
Yes. The plan stays the same; swap runners (local, compose, k8s) to target
different environments.
**When should I enable chaos workloads?**
Only when testing resilience or operational recovery; keep functional smoke
tests deterministic.
**How long should runs be?**
Long enough for multiple blocks so liveness and inclusion checks are
meaningful; very short runs risk false confidence.
**Do I always need seeded wallets?**
Only for transaction scenarios. Data-availability or pure chaos scenarios may
not require them, but liveness checks still need validators producing blocks.
**What if expectations fail but workloads “look fine”?**
Trust expectations first—they capture the intended success criteria. Use the
observability signals and runner logs to pinpoint why the system missed the
target.

book/src/glossary.md Normal file

@@ -0,0 +1,18 @@
# Glossary
- **Validator**: node role responsible for participating in consensus and block
production.
- **Executor**: node role that processes transactions or workloads delegated by
validators.
- **DA (Data Availability)**: subsystem ensuring blobs or channel data are
published and retrievable for validation.
- **Workload**: traffic or behavior generator that exercises the system during a
scenario run.
- **Expectation**: post-run assertion that judges whether the system met the
intended success criteria.
- **Topology**: declarative description of the cluster shape, roles, and
high-level parameters for a scenario.
- **Blockfeed**: stream of block observations used for liveness or inclusion
signals during a run.
- **Control capability**: the ability for a runner to start, stop, or restart
nodes, used by chaos workloads.

book/src/internal-crate-reference.md Normal file

@@ -0,0 +1,18 @@
# Internal Crate Reference
High-level roles of the crates that make up the framework:
- **Configs**: prepares reusable configuration primitives for nodes, networking,
tracing, data availability, and wallets, shared by all scenarios and runners.
- **Core scenario orchestration**: houses the topology and scenario model,
runtime coordination, node clients, and readiness/health probes.
- **Workflows**: packages workloads and expectations into reusable building
blocks and offers a fluent DSL to assemble them.
- **Runners**: implements deployment backends (local host, Docker Compose,
Kubernetes) that all consume the same scenario plan.
- **Test workflows**: example scenarios and integration checks that exercise the
framework end to end and serve as living documentation.
Use this map to locate where to add new capabilities: configuration primitives
in configs, orchestration changes in core, reusable traffic/assertions in
workflows, environment adapters in runners, and demonstrations in tests.

book/src/introduction.md Normal file

@@ -0,0 +1,15 @@
# Introduction
The Nomos Testing Framework is a purpose-built toolkit for exercising Nomos in
realistic, multi-node environments. It solves the gap between small, isolated
tests and full-system validation by letting teams describe a cluster layout,
drive meaningful traffic, and assert the outcomes in one coherent plan.
It is for protocol engineers, infrastructure operators, and QA teams who need
repeatable confidence that validators, executors, and data-availability
components work together under network and timing constraints.
Multi-node integration testing is required because many Nomos behaviors—block
progress, data availability, liveness under churn—only emerge when several
roles interact over real networking and time. This framework makes those checks
declarative, observable, and portable across environments.

book/src/operations.md Normal file

@@ -0,0 +1,42 @@
# Operations
Operational readiness focuses on prerequisites, environment fit, and clear
signals:
- **Prerequisites**: keep a sibling `nomos-node` checkout available; ensure the
chosen runner's platform needs are met (local binaries for host runs, Docker
for compose, cluster access for k8s).
- **Artifacts**: some scenarios depend on prover or circuit assets; fetch them
ahead of time with the provided helper scripts when needed.
- **Environment flags**: use slow-environment toggles to relax timeouts, enable
tracing when debugging, and adjust observability ports to avoid clashes.
- **Readiness checks**: verify runners report node readiness before starting
workloads; this avoids false negatives from starting too early.
- **Failure triage**: map failures to missing prerequisites (wallet seeding,
node control availability), runner platform issues, or unmet expectations.
Start with liveness signals, then dive into workload-specific assertions.
Treat operational hygiene—assets present, prerequisites satisfied, observability
reachable—as the first step to reliable scenario outcomes.
Metrics and observability flow:
```
Runner exposes endpoints/ports
              ↓
Runtime collects block/health signals
              ↓
Expectations consume signals to decide pass/fail
              ↓
Operators inspect logs/metrics when failures arise
```
Mermaid view:
```mermaid
flowchart TD
Expose[Runner exposes endpoints/ports] --> Collect[Runtime collects block/health signals]
Collect --> Consume[Expectations consume signals<br/>decide pass/fail]
Consume --> Inspect[Operators inspect logs/metrics<br/>when failures arise]
```

book/src/part-i.md Normal file

@@ -0,0 +1,4 @@
# Part I — Foundations
Conceptual chapters that establish the mental model for the framework and how
it approaches multi-node testing.

book/src/part-ii.md Normal file

@@ -0,0 +1,4 @@
# Part II — User Guide
Practical guidance for shaping scenarios, combining workloads and expectations,
and running them across different environments.

book/src/part-iii.md Normal file

@@ -0,0 +1,4 @@
# Part III — Developer Reference
Deep dives for contributors who extend the framework, evolve its abstractions,
or maintain the crate set.

book/src/part-iv.md Normal file

@@ -0,0 +1,4 @@
# Part IV — Appendix
Quick-reference material and supporting guidance to keep scenarios discoverable,
debuggable, and consistent.

book/src/project-context-primer.md Normal file

@@ -0,0 +1,16 @@
# Project Context Primer
This book focuses on the Nomos Testing Framework. It assumes familiarity with
the Nomos architecture, but for completeness, here is a short primer.
- **Nomos** is a modular blockchain protocol composed of validators, executors,
and a data-availability (DA) subsystem.
- **Validators** participate in consensus and produce blocks.
- **Executors** run application logic or off-chain computations referenced by
blocks.
- **Data Availability (DA)** ensures that data referenced in blocks is
published and retrievable, including blobs or channel data used by workloads.
These roles interact tightly, which is why meaningful testing must be performed
in multi-node environments that include real networking, timing, and DA
interaction.

book/src/runners.md Normal file

@@ -0,0 +1,51 @@
# Runners
Runners turn a scenario plan into a live environment while keeping the plan
unchanged. Choose based on feedback speed, reproducibility, and fidelity. For
environment and operational considerations, see [Operations](operations.md).
## Local runner
- Launches node processes directly on the host.
- Fastest feedback loop and minimal orchestration overhead.
- Best for development-time iteration and debugging.
## Docker Compose runner
- Starts nodes in containers to provide a reproducible multi-node stack on a
single machine.
- Discovers service ports and wires observability for convenient inspection.
- Good balance between fidelity and ease of setup.
## Kubernetes runner
- Deploys nodes onto a cluster for higher-fidelity, longer-running scenarios.
- Suits CI or shared environments where cluster behavior and scheduling matter.
### Common expectations
- All runners require at least one validator and, for transaction scenarios,
access to seeded wallets.
- Readiness probes gate workload start so traffic begins only after nodes are
reachable.
- Environment flags can relax timeouts or increase tracing when diagnostics are
needed.
Runner abstraction:
```
Scenario Plan
      ↓
Runner (local | compose | k8s)
      │ provisions env + readiness
      ↓
Runtime + Observability
      ↓
Workloads / Expectations execute
```
Mermaid view:
```mermaid
flowchart TD
Plan[Scenario Plan] --> RunSel{"Runner<br/>(local | compose | k8s)"}
RunSel --> Provision[Provision & readiness]
Provision --> Runtime[Runtime + observability]
Runtime --> Exec[Workloads & Expectations execute]
```

book/src/running-scenarios.md Normal file

@@ -0,0 +1,17 @@
# Running Scenarios
Running a scenario follows the same conceptual flow regardless of environment:
1. Select or author a scenario plan that pairs a topology with workloads,
expectations, and a suitable run window.
2. Choose a runner aligned with your environment (local, compose, or k8s) and
ensure its prerequisites are available.
3. Deploy the plan through the runner; wait for readiness signals before
starting workloads.
4. Let workloads drive activity for the planned duration; keep observability
signals visible so you can correlate outcomes.
5. Evaluate expectations and capture results as the primary pass/fail signal.
Use the same plan across different runners to compare behavior between local
development and CI or cluster settings. For environment prerequisites and
flags, see [Operations](operations.md).
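In code, the whole flow reduces to a couple of lines; the runner type and its `run` entry point below are hypothetical stand-ins for whichever backend you pick, and `smoke_plan` is the sketch from [Authoring Scenarios](authoring-scenarios.md):
```rust
// Hypothetical runner API; the real backends live under
// testing-framework/runners.
async fn run_smoke() -> Result<(), DynError> {
    let plan = smoke_plan(); // the same plan works on local, compose, or k8s
    // deploy → readiness → drive workloads → evaluate → cleanup
    LocalRunner::default().run(plan).await
}
```
Swapping `LocalRunner` for a compose or k8s runner changes only that one line, not the plan.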

book/src/scenario-builder-ext-patterns.md Normal file

@@ -0,0 +1,17 @@
# Core Content: ScenarioBuilderExt Patterns
Patterns that keep scenarios readable and reusable:
- **Topology-first**: start by shaping the cluster (counts, layout) so later
steps inherit a clear foundation.
- **Bundle defaults**: use the DSL helpers to attach common expectations (like
liveness) whenever you add a matching workload, reducing forgotten checks.
- **Intentional rates**: express traffic in per-block terms to align with
protocol timing rather than wall-clock assumptions.
- **Opt-in chaos**: enable restart patterns only in scenarios meant to probe
resilience; keep functional smoke tests deterministic.
- **Wallet clarity**: seed only the number of actors you need; it keeps
transaction scenarios deterministic and interpretable.
These patterns make scenario definitions self-explanatory while staying aligned
with the framework's block-oriented timing model.
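A hedged sketch of how these patterns might read in the DSL; every method name
below is illustrative, chosen only to mirror the patterns above:
```rust
// Hypothetical DSL sketch; method names are assumptions, not the crate's API.
let plan = ScenarioBuilder::topology(|t| t.validators(3).executors(1)) // topology-first
    .seed_wallets(4)              // wallet clarity: only the actors you need
    .transactions_per_block(2)    // intentional rates, expressed in protocol time
    .with_default_liveness()      // bundled default expectation, never forgotten
    // .with_random_restarts(..)  // opt-in chaos: leave out of smoke tests
    .run_for_blocks(30)
    .build()?;
```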

View File

@ -0,0 +1,24 @@
# Scenario Lifecycle (Conceptual)
1. **Build the plan**: Declare a topology, attach workloads and expectations, and set the run window. The plan is the single source of truth for what will happen.
2. **Deploy**: Hand the plan to a runner. It provisions the environment on the chosen backend and waits for nodes to signal readiness.
3. **Drive workloads**: Start traffic and behaviors (transactions, data-availability activity, restarts) for the planned duration.
4. **Observe blocks and signals**: Track block progression and other high-level metrics during or after the run window to ground assertions in protocol time.
5. **Evaluate expectations**: Once activity stops (and optional cooldown completes), check liveness and workload-specific outcomes to decide pass or fail.
6. **Cleanup**: Tear down resources so successive runs start fresh and do not inherit leaked state.
Conceptual lifecycle diagram:
```
Plan → Deploy → Readiness → Drive Workloads → Observe → Evaluate → Cleanup
```
Mermaid view:
```mermaid
flowchart LR
P[Plan<br/>topology + workloads + expectations] --> D[Deploy<br/>runner provisions]
D --> R[Readiness<br/>wait for nodes]
R --> W[Drive Workloads]
W --> O[Observe<br/>blocks/metrics]
O --> E[Evaluate Expectations]
E --> C[Cleanup]
```
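The same ordering as a hedged Rust sketch; every method name here is an
assumption, and only the sequence mirrors the steps above:
```rust
// Illustrative lifecycle driver; not the crate's actual API.
async fn run_scenario(plan: ScenarioPlan, runner: impl Deployer) -> anyhow::Result<()> {
    let ctx = runner.deploy(&plan).await?;   // 2. provision and await readiness
    plan.capture_baselines(&ctx).await?;     //    expectations record a baseline
    plan.drive_workloads(&ctx).await?;       // 3. traffic for the run window
    plan.cooldown(&ctx).await;               //    settle after control actions
    plan.evaluate_expectations(&ctx).await?; // 5. the pass/fail decision
    Ok(())                                   // 6. cleanup guards fire on drop
}
```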

View File

@ -0,0 +1,23 @@
# Scenario Model (Developer Level)
The scenario model defines clear, composable responsibilities:
- **Topology**: a declarative description of the cluster—how many nodes, their
roles, and the broad network and data-availability characteristics. It
represents the intended shape of the system under test.
- **Scenario**: a plan combining topology, workloads, expectations, and a run
window. Building a scenario validates prerequisites (like seeded wallets) and
ensures the run lasts long enough to observe meaningful block progression.
- **Workloads**: asynchronous tasks that generate traffic or conditions. They
use shared context to interact with the deployed cluster and may bundle
default expectations.
- **Expectations**: post-run assertions. They can capture baselines before
workloads start and evaluate success once activity stops.
- **Runtime**: coordinates workloads and expectations for the configured
duration, enforces cooldowns when control actions occur, and ensures cleanup
so runs do not leak resources.
Developers extending the model should keep these boundaries strict: topology
describes, scenarios assemble, runners deploy, workloads drive, and expectations
judge outcomes. For guidance on adding new capabilities, see
[Extending the Framework](extending.md).
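These boundaries suggest trait shapes roughly like the sketch below. The trait
names and signatures are assumptions; only `RunContext` is a name that appears
in the runner code:
```rust
// Conceptual sketch only; signatures are assumptions, not the crate's API.
#[async_trait::async_trait]
trait Workload {
    /// Drive traffic against the deployed cluster via the shared context.
    async fn drive(&self, ctx: &RunContext) -> anyhow::Result<()>;
    /// Workloads may bundle default expectations so checks are not forgotten.
    fn default_expectations(&self) -> Vec<Box<dyn Expectation>> { Vec::new() }
}

#[async_trait::async_trait]
trait Expectation {
    /// Capture a baseline before workloads start.
    async fn baseline(&mut self, ctx: &RunContext) -> anyhow::Result<()>;
    /// Judge the outcome once activity stops.
    async fn evaluate(&self, ctx: &RunContext) -> anyhow::Result<()>;
}
```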

View File

@ -0,0 +1,9 @@
# Testing Philosophy
- **Declarative over imperative**: describe the desired cluster shape, traffic, and success criteria; let the framework orchestrate the run.
- **Observable health signals**: prefer liveness and inclusion signals that reflect real user impact instead of internal debug state.
- **Determinism first**: default scenarios aim for repeatable outcomes with fixed topologies and traffic rates; variability is opt-in.
- **Targeted non-determinism**: introduce randomness (e.g., restarts) only when probing resilience or operational robustness.
- **Protocol time, not wall time**: reason in blocks and protocol-driven intervals to reduce dependence on host speed or scheduler noise.
- **Minimum run window**: always allow enough block production to make assertions meaningful; very short runs risk false confidence.
- **Use chaos with intent**: chaos workloads are for recovery and fault-tolerance validation, not for baseline functional checks.

View File

@ -0,0 +1,9 @@
# Troubleshooting Scenarios
Common symptoms and likely causes:
- **No or slow block progression**: runner started workloads before readiness, insufficient run window, or environment too slow—extend duration or enable slow-environment tuning.
- **Transactions not included**: missing or insufficient wallet seeding, misaligned transaction rate with block cadence, or network instability—reduce rate and verify wallet setup.
- **Chaos stalls the run**: node control not available for the chosen runner or restart cadence too aggressive—enable control capability and widen restart intervals.
- **Observability gaps**: metrics or logs unreachable because ports clash or services are not exposed—adjust observability ports and confirm runner wiring.
- **Flaky behavior across runs**: mixing chaos with functional smoke tests or inconsistent topology between environments—separate deterministic and chaos scenarios and standardize topology presets.

View File

@ -0,0 +1,7 @@
# Usage Patterns
- **Shape a topology, pick a runner**: choose local for quick iteration, compose for reproducible multi-node stacks with observability, or k8s for cluster-grade validation.
- **Compose workloads deliberately**: pair transactions and data-availability traffic for end-to-end coverage; add chaos only when assessing recovery and resilience.
- **Align expectations with goals**: use liveness-style checks to confirm the system keeps up with planned activity, and add workload-specific assertions for inclusion or availability.
- **Reuse plans across environments**: keep the scenario constant while swapping runners to compare behavior between developer machines and CI clusters.
- **Iterate with clear signals**: treat expectation outcomes as the primary pass/fail indicator, and adjust topology or workloads based on what those signals reveal.

View File

@ -0,0 +1,6 @@
# What You Will Learn
This book gives you a clear mental model for Nomos multi-node testing, shows how
to author scenarios that pair realistic workloads with explicit expectations,
and guides you to run them across local, containerized, and cluster environments
without changing the plan.

42
book/src/workloads.md Normal file
View File

@ -0,0 +1,42 @@
# Core Content: Workloads & Expectations
Workloads describe the activity a scenario generates; expectations describe the
signals that must hold when that activity completes. Both are pluggable so
scenarios stay readable and purpose-driven.
## Workloads
- **Transaction workload**: submits user-level transactions at a configurable
rate and can limit how many distinct actors participate.
- **Data-availability workload**: drives blob and channel activity to exercise
data-availability paths.
- **Chaos workload**: triggers controlled node restarts to test resilience and
recovery behaviors (requires a runner that can control nodes).
## Expectations
- **Consensus liveness**: verifies the system continues to produce blocks in
line with the planned workload and timing window.
- **Workload-specific checks**: each workload can attach its own success
criteria (e.g., inclusion of submitted activity) so scenarios remain concise.
Together, workloads and expectations let you express both the pressure applied
to the system and the definition of “healthy” for that run.
Workload pipeline (conceptual):
```
Inputs (topology + wallets + rates)
        │
Workload init → Drive traffic → Collect signals
        │
Expectations evaluate
```
Mermaid view:
```mermaid
flowchart TD
    I["Inputs<br/>(topology + wallets + rates)"] --> Init[Workload init]
Init --> Drive[Drive traffic]
Drive --> Collect[Collect signals]
Collect --> Eval[Expectations evaluate]
```
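To make the per-block pacing concrete, here is a hedged sketch of a transaction
driver keyed to block arrival rather than wall time; `BlockFeed`, `Wallet`, and
their methods are illustrative assumptions:
```rust
// Illustrative sketch: submit `rate` transactions per observed block.
async fn drive_transactions(
    feed: &mut BlockFeed,
    wallets: &[Wallet],
    rate: usize,
) -> anyhow::Result<()> {
    let mut next = 0usize;
    while let Some(_block) = feed.next_block().await {
        for _ in 0..rate {
            // Rotate through the seeded actors deterministically.
            wallets[next % wallets.len()].submit_transfer().await?;
            next += 1;
        }
    }
    Ok(())
}
```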

View File

@ -0,0 +1,19 @@
# Workspace Layout
The workspace focuses on multi-node integration testing and sits alongside a
`nomos-node` checkout. Its crates separate concerns to keep scenarios
repeatable and portable:
- **Configs**: prepares high-level node, network, tracing, and wallet settings
used across test environments.
- **Core scenario orchestration**: the engine that holds topology descriptions,
scenario plans, runtimes, workloads, and expectations.
- **Workflows**: ready-made workloads (transactions, data-availability, chaos)
and reusable expectations assembled into a user-facing DSL.
- **Runners**: deployment backends for local processes, Docker Compose, and
Kubernetes, all consuming the same scenario plan.
- **Test workflows**: example scenarios and integration checks that show how
the pieces fit together.
This split keeps configuration, orchestration, reusable traffic patterns, and
deployment adapters loosely coupled while sharing one mental model for tests.
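As a rough map of that split (directory names here are illustrative, not the
actual crate names):
```
workspace/
├── configs/          # node, network, tracing, and wallet settings
├── core/             # topology, scenario plans, runtime, workload/expectation traits
├── workflows/        # ready-made workloads and the user-facing DSL
├── runners/          # local, compose, and k8s deployment backends
└── test-workflows/   # example scenarios and integration checks
```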

View File

@ -21,6 +21,20 @@ if [ ! -d "$CIRCUITS_DIR" ]; then
exit 1
fi
system_gmp_package() {
local multiarch
multiarch="$(gcc -print-multiarch 2>/dev/null || echo aarch64-linux-gnu)"
local lib_path="/usr/lib/${multiarch}/libgmp.a"
if [ ! -f "$lib_path" ]; then
echo "system libgmp.a not found at $lib_path" >&2
return 1
fi
mkdir -p depends/gmp/package_aarch64/lib depends/gmp/package_aarch64/include
cp "$lib_path" depends/gmp/package_aarch64/lib/
# Headers are small; copy the public ones the build expects.
cp /usr/include/gmp*.h depends/gmp/package_aarch64/include/ || true
}
case "$TARGET_ARCH" in
arm64 | aarch64)
;;
@ -41,12 +55,23 @@ git submodule update --init --recursive >&2
if [ "${RAPIDSNARK_BUILD_GMP:-1}" = "1" ]; then
GMP_TARGET="${RAPIDSNARK_GMP_TARGET:-aarch64}"
./build_gmp.sh "$GMP_TARGET" >&2
else
echo "Using system libgmp to satisfy rapidsnark dependencies" >&2
system_gmp_package
fi
MAKE_TARGET="${RAPIDSNARK_MAKE_TARGET:-host_arm64}"
PACKAGE_DIR="${RAPIDSNARK_PACKAGE_DIR:-package_arm64}"
make "$MAKE_TARGET" -j"$(nproc)" >&2
rm -rf build_prover_arm64
mkdir build_prover_arm64
cd build_prover_arm64
cmake .. \
-DTARGET_PLATFORM=aarch64 \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_INSTALL_PREFIX="../${PACKAGE_DIR}" \
-DBUILD_SHARED_LIBS=OFF >&2
cmake --build . --target prover verifier -- -j"$(nproc)" >&2
install -m 0755 "${PACKAGE_DIR}/bin/prover" "$CIRCUITS_DIR/prover"
install -m 0755 "src/prover" "$CIRCUITS_DIR/prover"
install -m 0755 "src/verifier" "$CIRCUITS_DIR/verifier"
echo "rapidsnark prover installed to $CIRCUITS_DIR/prover" >&2

View File

@ -121,7 +121,7 @@ download_release() {
print_error "Please check that version ${VERSION} exists for platform ${platform}"
print_error "Available releases: https://github.com/${REPO}/releases"
rm -rf "$temp_dir"
exit 1
return 1
fi
print_success "Download complete"
@ -132,7 +132,7 @@ download_release() {
if ! tar -xzf "${temp_dir}/${artifact}" -C "$INSTALL_DIR" --strip-components=1; then
print_error "Failed to extract archive"
rm -rf "$temp_dir"
exit 1
return 1
fi
rm -rf "$temp_dir"
@ -171,8 +171,18 @@ main() {
# Check existing installation
check_existing_installation
# Download and extract
download_release "$platform"
# Download and extract (retry with x86_64 bundle on aarch64 if needed)
if ! download_release "$platform"; then
if [[ "$platform" == linux-aarch64 ]]; then
print_warning "Falling back to linux-x86_64 circuits bundle; will rebuild prover for aarch64."
rm -rf "$INSTALL_DIR"
if ! download_release "linux-x86_64"; then
exit 1
fi
else
exit 1
fi
fi
# Handle macOS quarantine if needed
if [[ "$platform" == macos-* ]]; then

View File

@ -82,7 +82,7 @@ pub fn create_executor_config(config: GeneralConfig) -> ExecutorConfig {
// non-string keys and keep services alive.
recovery_file: PathBuf::new(),
bootstrap: chain_service::BootstrapConfig {
prolonged_bootstrap_period: Duration::from_secs(3),
prolonged_bootstrap_period: config.bootstrapping_config.prolonged_bootstrap_period,
force_bootstrap: false,
offline_grace_period: chain_service::OfflineGracePeriodConfig {
grace_period: Duration::from_secs(20 * 60),

View File

@ -204,7 +204,8 @@ fn build_values(topology: &GeneratedTopology) -> HelmValues {
let validators = topology
.validators()
.iter()
.map(|validator| {
.enumerate()
.map(|(index, validator)| {
let mut env = BTreeMap::new();
env.insert(
"CFG_NETWORK_PORT".into(),
@ -225,6 +226,8 @@ fn build_values(topology: &GeneratedTopology) -> HelmValues {
.port()
.to_string(),
);
env.insert("CFG_HOST_KIND".into(), "validator".into());
env.insert("CFG_HOST_IDENTIFIER".into(), format!("validator-{index}"));
NodeValues {
api_port: validator.general.api_config.address.port(),
@ -237,7 +240,8 @@ fn build_values(topology: &GeneratedTopology) -> HelmValues {
let executors = topology
.executors()
.iter()
.map(|executor| {
.enumerate()
.map(|(index, executor)| {
let mut env = BTreeMap::new();
env.insert(
"CFG_NETWORK_PORT".into(),
@ -258,6 +262,8 @@ fn build_values(topology: &GeneratedTopology) -> HelmValues {
.port()
.to_string(),
);
env.insert("CFG_HOST_KIND".into(), "executor".into());
env.insert("CFG_HOST_IDENTIFIER".into(), format!("executor-{index}"));
NodeValues {
api_port: executor.general.api_config.address.port(),

View File

@ -22,7 +22,7 @@ use crate::{
helm::{HelmError, install_release},
host::node_host,
logs::dump_namespace_logs,
wait::{ClusterPorts, ClusterWaitError, NodeConfigPorts, wait_for_cluster_ready},
wait::{ClusterPorts, ClusterReady, ClusterWaitError, NodeConfigPorts, wait_for_cluster_ready},
};
pub struct K8sRunner {
@ -66,6 +66,7 @@ struct ClusterEnvironment {
executor_api_ports: Vec<u16>,
executor_testing_ports: Vec<u16>,
prometheus_port: u16,
port_forwards: Vec<std::process::Child>,
}
impl ClusterEnvironment {
@ -75,6 +76,7 @@ impl ClusterEnvironment {
release: String,
cleanup: RunnerCleanup,
ports: &ClusterPorts,
port_forwards: Vec<std::process::Child>,
) -> Self {
Self {
client,
@ -86,6 +88,7 @@ impl ClusterEnvironment {
executor_api_ports: ports.executors.iter().map(|ports| ports.api).collect(),
executor_testing_ports: ports.executors.iter().map(|ports| ports.testing).collect(),
prometheus_port: ports.prometheus,
port_forwards,
}
}
@ -97,15 +100,17 @@ impl ClusterEnvironment {
"k8s stack failure; collecting diagnostics"
);
dump_namespace_logs(&self.client, &self.namespace).await;
kill_port_forwards(&mut self.port_forwards);
if let Some(guard) = self.cleanup.take() {
Box::new(guard).cleanup();
}
}
fn into_cleanup(mut self) -> RunnerCleanup {
self.cleanup
.take()
.expect("cleanup guard should be available")
fn into_cleanup(self) -> (RunnerCleanup, Vec<std::process::Child>) {
(
self.cleanup.expect("cleanup guard should be available"),
self.port_forwards,
)
}
}
@ -264,12 +269,15 @@ impl Deployer for K8sRunner {
return Err(err);
}
};
let cleanup = cluster
let (cleanup, port_forwards) = cluster
.take()
.expect("cluster should still be available")
.into_cleanup();
let cleanup_guard: Box<dyn CleanupGuard> =
Box::new(K8sCleanupGuard::new(cleanup, block_feed_guard));
let cleanup_guard: Box<dyn CleanupGuard> = Box::new(K8sCleanupGuard::new(
cleanup,
block_feed_guard,
port_forwards,
));
let context = RunContext::new(
descriptors,
None,
@ -301,6 +309,14 @@ fn ensure_supported_topology(descriptors: &GeneratedTopology) -> Result<(), K8sR
Ok(())
}
fn kill_port_forwards(handles: &mut Vec<std::process::Child>) {
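    // Best-effort teardown: kill each forward and reap it; errors from
    // already-exited children are deliberately ignored.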
for handle in handles.iter_mut() {
let _ = handle.kill();
let _ = handle.wait();
}
handles.clear();
}
fn collect_port_specs(descriptors: &GeneratedTopology) -> PortSpecs {
let validators = descriptors
.validators()
@ -386,11 +402,11 @@ async fn setup_cluster(
let mut cleanup_guard =
Some(install_stack(client, &assets, &namespace, &release, validators, executors).await?);
let cluster_ports =
let cluster_ready =
wait_for_ports_or_cleanup(client, &namespace, &release, specs, &mut cleanup_guard).await?;
info!(
prometheus_port = cluster_ports.prometheus,
prometheus_port = cluster_ready.ports.prometheus,
"discovered prometheus endpoint"
);
@ -401,7 +417,8 @@ async fn setup_cluster(
cleanup_guard
.take()
.expect("cleanup guard must exist after successful cluster startup"),
&cluster_ports,
&cluster_ready.ports,
cluster_ready.port_forwards,
);
if readiness_checks {
@ -448,7 +465,7 @@ async fn wait_for_ports_or_cleanup(
release: &str,
specs: &PortSpecs,
cleanup_guard: &mut Option<RunnerCleanup>,
) -> Result<ClusterPorts, K8sRunnerError> {
) -> Result<ClusterReady, K8sRunnerError> {
match wait_for_cluster_ready(
client,
namespace,
@ -498,13 +515,19 @@ async fn ensure_cluster_readiness(
struct K8sCleanupGuard {
cleanup: RunnerCleanup,
block_feed: Option<BlockFeedTask>,
port_forwards: Vec<std::process::Child>,
}
impl K8sCleanupGuard {
const fn new(cleanup: RunnerCleanup, block_feed: BlockFeedTask) -> Self {
const fn new(
cleanup: RunnerCleanup,
block_feed: BlockFeedTask,
port_forwards: Vec<std::process::Child>,
) -> Self {
Self {
cleanup,
block_feed: Some(block_feed),
port_forwards,
}
}
}
@ -514,6 +537,7 @@ impl CleanupGuard for K8sCleanupGuard {
if let Some(block_feed) = self.block_feed.take() {
CleanupGuard::cleanup(Box::new(block_feed));
}
kill_port_forwards(&mut self.port_forwards);
CleanupGuard::cleanup(Box::new(self.cleanup));
}
}

View File

@ -1,4 +1,9 @@
use std::time::Duration;
use std::{
net::{Ipv4Addr, TcpListener, TcpStream},
process::{Command as StdCommand, Stdio},
thread,
time::Duration,
};
use k8s_openapi::api::{apps::v1::Deployment, core::v1::Service};
use kube::{Api, Client, Error as KubeError};
@ -9,7 +14,12 @@ use tokio::time::sleep;
use crate::host::node_host;
const DEPLOYMENT_TIMEOUT: Duration = Duration::from_secs(180);
const NODE_HTTP_TIMEOUT: Duration = Duration::from_secs(240);
const NODE_HTTP_PROBE_TIMEOUT: Duration = Duration::from_secs(30);
const HTTP_POLL_INTERVAL: Duration = Duration::from_secs(1);
const PROMETHEUS_HTTP_PORT: u16 = 9090;
const PROMETHEUS_HTTP_TIMEOUT: Duration = Duration::from_secs(240);
const PROMETHEUS_HTTP_PROBE_TIMEOUT: Duration = Duration::from_secs(30);
const PROMETHEUS_SERVICE_NAME: &str = "prometheus";
#[derive(Clone, Copy)]
@ -30,6 +40,11 @@ pub struct ClusterPorts {
pub prometheus: u16,
}
pub struct ClusterReady {
pub ports: ClusterPorts,
pub port_forwards: Vec<std::process::Child>,
}
#[derive(Debug, Error)]
pub enum ClusterWaitError {
#[error("deployment {name} in namespace {namespace} did not become ready within {timeout:?}")]
@ -62,6 +77,13 @@ pub enum ClusterWaitError {
},
#[error("timeout waiting for prometheus readiness on NodePort {port}")]
PrometheusTimeout { port: u16 },
#[error("failed to start port-forward for service {service} port {port}: {source}")]
PortForward {
service: String,
port: u16,
#[source]
source: anyhow::Error,
},
}
pub async fn wait_for_deployment_ready(
@ -159,7 +181,7 @@ pub async fn wait_for_cluster_ready(
release: &str,
validator_ports: &[NodeConfigPorts],
executor_ports: &[NodeConfigPorts],
) -> Result<ClusterPorts, ClusterWaitError> {
) -> Result<ClusterReady, ClusterWaitError> {
if validator_ports.is_empty() {
return Err(ClusterWaitError::MissingValidator);
}
@ -177,11 +199,40 @@ pub async fn wait_for_cluster_ready(
});
}
let mut port_forwards = Vec::new();
let validator_api_ports: Vec<u16> = validator_allocations
.iter()
.map(|ports| ports.api)
.collect();
wait_for_node_http(&validator_api_ports, NodeRole::Validator).await?;
if wait_for_node_http_nodeport(
&validator_api_ports,
NodeRole::Validator,
NODE_HTTP_PROBE_TIMEOUT,
)
.await
.is_err()
{
// Fall back to port-forwarding when NodePorts are unreachable from the host.
validator_allocations.clear();
port_forwards = port_forward_group(
namespace,
release,
"validator",
validator_ports,
&mut validator_allocations,
)?;
let validator_api_ports: Vec<u16> = validator_allocations
.iter()
.map(|ports| ports.api)
.collect();
if let Err(err) =
wait_for_node_http_port_forward(&validator_api_ports, NodeRole::Validator).await
{
kill_port_forwards(&mut port_forwards);
return Err(err);
}
}
let mut executor_allocations = Vec::with_capacity(executor_ports.len());
for (index, ports) in executor_ports.iter().enumerate() {
@ -195,39 +246,102 @@ pub async fn wait_for_cluster_ready(
});
}
if !executor_allocations.is_empty() {
let executor_api_ports: Vec<u16> = executor_allocations.iter().map(|ports| ports.api).collect();
if !executor_allocations.is_empty()
&& wait_for_node_http_nodeport(
&executor_api_ports,
NodeRole::Executor,
NODE_HTTP_PROBE_TIMEOUT,
)
.await
.is_err()
{
executor_allocations.clear();
match port_forward_group(
namespace,
release,
"executor",
executor_ports,
&mut executor_allocations,
) {
Ok(forwards) => port_forwards.extend(forwards),
Err(err) => {
kill_port_forwards(&mut port_forwards);
return Err(err);
}
}
let executor_api_ports: Vec<u16> =
executor_allocations.iter().map(|ports| ports.api).collect();
wait_for_node_http(&executor_api_ports, NodeRole::Executor).await?;
if let Err(err) =
wait_for_node_http_port_forward(&executor_api_ports, NodeRole::Executor).await
{
kill_port_forwards(&mut port_forwards);
return Err(err);
}
}
let prometheus_port = find_node_port(
let mut prometheus_port = find_node_port(
client,
namespace,
PROMETHEUS_SERVICE_NAME,
PROMETHEUS_HTTP_PORT,
)
.await?;
wait_for_prometheus_http(prometheus_port).await?;
if wait_for_prometheus_http_nodeport(prometheus_port, PROMETHEUS_HTTP_PROBE_TIMEOUT)
.await
.is_err()
{
let (local_port, forward) =
port_forward_service(namespace, PROMETHEUS_SERVICE_NAME, PROMETHEUS_HTTP_PORT)
.map_err(|err| {
kill_port_forwards(&mut port_forwards);
err
})?;
prometheus_port = local_port;
port_forwards.push(forward);
if let Err(err) =
wait_for_prometheus_http_port_forward(prometheus_port, PROMETHEUS_HTTP_TIMEOUT).await
{
kill_port_forwards(&mut port_forwards);
return Err(err);
}
}
Ok(ClusterPorts {
validators: validator_allocations,
executors: executor_allocations,
prometheus: prometheus_port,
Ok(ClusterReady {
ports: ClusterPorts {
validators: validator_allocations,
executors: executor_allocations,
prometheus: prometheus_port,
},
port_forwards,
})
}
async fn wait_for_node_http(ports: &[u16], role: NodeRole) -> Result<(), ClusterWaitError> {
async fn wait_for_node_http_nodeport(
ports: &[u16],
role: NodeRole,
timeout: Duration,
) -> Result<(), ClusterWaitError> {
let host = node_host();
http_probe::wait_for_http_ports_with_host(
ports,
role,
&host,
Duration::from_secs(240),
Duration::from_secs(1),
)
.await
.map_err(map_http_error)
wait_for_node_http_on_host(ports, role, &host, timeout).await
}
async fn wait_for_node_http_port_forward(
ports: &[u16],
role: NodeRole,
) -> Result<(), ClusterWaitError> {
wait_for_node_http_on_host(ports, role, "127.0.0.1", NODE_HTTP_TIMEOUT).await
}
async fn wait_for_node_http_on_host(
ports: &[u16],
role: NodeRole,
host: &str,
timeout: Duration,
) -> Result<(), ClusterWaitError> {
http_probe::wait_for_http_ports_with_host(ports, role, host, timeout, HTTP_POLL_INTERVAL)
.await
.map_err(map_http_error)
}
const fn map_http_error(error: HttpReadinessError) -> ClusterWaitError {
@ -238,11 +352,30 @@ const fn map_http_error(error: HttpReadinessError) -> ClusterWaitError {
}
}
pub async fn wait_for_prometheus_http(port: u16) -> Result<(), ClusterWaitError> {
let client = reqwest::Client::new();
let url = format!("http://{}:{port}/-/ready", node_host());
pub async fn wait_for_prometheus_http_nodeport(
port: u16,
timeout: Duration,
) -> Result<(), ClusterWaitError> {
let host = node_host();
wait_for_prometheus_http(&host, port, timeout).await
}
for _ in 0..240 {
pub async fn wait_for_prometheus_http_port_forward(
port: u16,
timeout: Duration,
) -> Result<(), ClusterWaitError> {
wait_for_prometheus_http("127.0.0.1", port, timeout).await
}
pub async fn wait_for_prometheus_http(
host: &str,
port: u16,
timeout: Duration,
) -> Result<(), ClusterWaitError> {
let client = reqwest::Client::new();
let url = format!("http://{host}:{port}/-/ready");
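    // Each iteration is treated as roughly one second of the timeout budget.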
for _ in 0..timeout.as_secs() {
if let Ok(resp) = client.get(&url).send().await
&& resp.status().is_success()
{
@ -253,3 +386,101 @@ pub async fn wait_for_prometheus_http(port: u16) -> Result<(), ClusterWaitError>
Err(ClusterWaitError::PrometheusTimeout { port })
}
fn port_forward_group(
namespace: &str,
release: &str,
kind: &str,
ports: &[NodeConfigPorts],
allocations: &mut Vec<NodePortAllocation>,
) -> Result<Vec<std::process::Child>, ClusterWaitError> {
let mut forwards = Vec::new();
for (index, ports) in ports.iter().enumerate() {
let service = format!("{release}-{kind}-{index}");
let (api_port, api_forward) = match port_forward_service(namespace, &service, ports.api) {
Ok(forward) => forward,
Err(err) => {
kill_port_forwards(&mut forwards);
return Err(err);
}
};
let (testing_port, testing_forward) =
match port_forward_service(namespace, &service, ports.testing) {
Ok(forward) => forward,
Err(err) => {
kill_port_forwards(&mut forwards);
return Err(err);
}
};
allocations.push(NodePortAllocation {
api: api_port,
testing: testing_port,
});
forwards.push(api_forward);
forwards.push(testing_forward);
}
Ok(forwards)
}
fn port_forward_service(
namespace: &str,
service: &str,
remote_port: u16,
) -> Result<(u16, std::process::Child), ClusterWaitError> {
let local_port = allocate_local_port().map_err(|source| ClusterWaitError::PortForward {
service: service.to_owned(),
port: remote_port,
source,
})?;
let mut child = StdCommand::new("kubectl")
.arg("port-forward")
.arg("-n")
.arg(namespace)
.arg(format!("svc/{service}"))
.arg(format!("{local_port}:{remote_port}"))
.stdout(Stdio::null())
.stderr(Stdio::null())
.spawn()
.map_err(|source| ClusterWaitError::PortForward {
service: service.to_owned(),
port: remote_port,
source: source.into(),
})?;
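    // Give the tunnel up to ~5 seconds (20 probes at 250 ms) to accept connections.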
for _ in 0..20 {
if let Ok(Some(status)) = child.try_wait() {
return Err(ClusterWaitError::PortForward {
service: service.to_owned(),
port: remote_port,
source: anyhow::anyhow!("kubectl exited with {status}"),
});
}
if TcpStream::connect((Ipv4Addr::LOCALHOST, local_port)).is_ok() {
return Ok((local_port, child));
}
thread::sleep(Duration::from_millis(250));
}
let _ = child.kill();
Err(ClusterWaitError::PortForward {
service: service.to_owned(),
port: remote_port,
source: anyhow::anyhow!("port-forward did not become ready"),
})
}
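// Bind an ephemeral port to discover a free local port, then release it for
// kubectl to claim. A small race exists between dropping the listener and
// spawning kubectl, which is acceptable for test orchestration.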
fn allocate_local_port() -> anyhow::Result<u16> {
let listener = TcpListener::bind((Ipv4Addr::LOCALHOST, 0))?;
let port = listener.local_addr()?.port();
drop(listener);
Ok(port)
}
fn kill_port_forwards(handles: &mut Vec<std::process::Child>) {
for handle in handles.iter_mut() {
let _ = handle.kill();
let _ = handle.wait();
}
handles.clear();
}

View File

@ -2,7 +2,8 @@
# check=skip=SecretsUsedInArgOrEnv
# Ignore warnings about sensitive information as this is test data.
ARG VERSION=v0.2.0
ARG VERSION=v0.3.1
ARG CIRCUITS_OVERRIDE
# ===========================
# BUILD IMAGE
@ -11,24 +12,61 @@ ARG VERSION=v0.2.0
FROM rust:1.91.0-slim-bookworm AS builder
ARG VERSION
ARG CIRCUITS_OVERRIDE
LABEL maintainer="augustinas@status.im" \
source="https://github.com/logos-co/nomos-node" \
description="Nomos testnet build image"
WORKDIR /nomos
WORKDIR /workspace
COPY . .
# Install dependencies needed for building RocksDB.
RUN apt-get update && apt-get install -yq \
git gcc g++ clang libssl-dev pkg-config ca-certificates curl
git gcc g++ clang make cmake m4 xz-utils libgmp-dev libssl-dev pkg-config ca-certificates curl wget
RUN chmod +x scripts/setup-nomos-circuits.sh && \
scripts/setup-nomos-circuits.sh "$VERSION" "/opt/circuits"
RUN mkdir -p /opt/circuits && \
select_circuits_source() { \
# Prefer an explicit override when it exists (file or directory). \
if [ -n "$CIRCUITS_OVERRIDE" ] && [ -e "/workspace/${CIRCUITS_OVERRIDE}" ]; then \
echo "/workspace/${CIRCUITS_OVERRIDE}"; \
return 0; \
fi; \
# Fall back to the workspace bundle shipped with the repo. \
if [ -e "/workspace/tests/kzgrs/kzgrs_test_params" ]; then \
echo "/workspace/tests/kzgrs/kzgrs_test_params"; \
return 0; \
fi; \
return 1; \
}; \
if CIRCUITS_PATH="$(select_circuits_source)"; then \
echo "Using prebuilt circuits bundle from ${CIRCUITS_PATH#/workspace/}"; \
if [ -d "$CIRCUITS_PATH" ]; then \
cp -R "${CIRCUITS_PATH}/." /opt/circuits; \
else \
cp "${CIRCUITS_PATH}" /opt/circuits/; \
fi; \
fi; \
if [ ! -f "/opt/circuits/pol/verification_key.json" ]; then \
echo "Local circuits missing pol artifacts; downloading ${VERSION} bundle and rebuilding"; \
chmod +x scripts/setup-nomos-circuits.sh && \
NOMOS_CIRCUITS_REBUILD_RAPIDSNARK=1 \
RAPIDSNARK_BUILD_GMP=1 \
scripts/setup-nomos-circuits.sh "$VERSION" "/opt/circuits"; \
fi
ENV NOMOS_CIRCUITS=/opt/circuits
ENV CARGO_TARGET_DIR=/workspace/target
RUN cargo build --release --all-features
# Fetch the nomos-node sources pinned in Cargo.lock and build the runtime binaries.
RUN git clone https://github.com/logos-co/nomos-node.git /workspace/nomos-node && \
cd /workspace/nomos-node && \
git fetch --depth 1 origin 2f60a0372c228968c3526c341ebc7e58bbd178dd && \
git checkout 2f60a0372c228968c3526c341ebc7e58bbd178dd && \
cargo build --release --all-features --bins
# Build cfgsync binaries from this workspace.
RUN cargo build --release --locked --manifest-path /workspace/testnet/cfgsync/Cargo.toml --bins
# ===========================
# NODE IMAGE
@ -50,11 +88,11 @@ RUN apt-get update && apt-get install -yq \
COPY --from=builder /opt/circuits /opt/circuits
COPY --from=builder /nomos/target/release/nomos-node /usr/bin/nomos-node
COPY --from=builder /nomos/target/release/nomos-executor /usr/bin/nomos-executor
COPY --from=builder /nomos/target/release/nomos-cli /usr/bin/nomos-cli
COPY --from=builder /nomos/target/release/cfgsync-server /usr/bin/cfgsync-server
COPY --from=builder /nomos/target/release/cfgsync-client /usr/bin/cfgsync-client
COPY --from=builder /workspace/target/release/nomos-node /usr/bin/nomos-node
COPY --from=builder /workspace/target/release/nomos-executor /usr/bin/nomos-executor
COPY --from=builder /workspace/target/release/nomos-cli /usr/bin/nomos-cli
COPY --from=builder /workspace/target/release/cfgsync-server /usr/bin/cfgsync-server
COPY --from=builder /workspace/target/release/cfgsync-client /usr/bin/cfgsync-client
ENV NOMOS_CIRCUITS=/opt/circuits

View File

@ -0,0 +1,38 @@
#!/bin/bash
set -euo pipefail
# Builds the testnet image with circuits. Prefers a local circuits bundle
# (tests/kzgrs/kzgrs_test_params) or a custom override; otherwise downloads
# from logos-co/nomos-circuits.
ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)"
IMAGE_TAG="${IMAGE_TAG:-nomos-testnet:local}"
VERSION="${VERSION:-v0.3.1}"
CIRCUITS_OVERRIDE="${CIRCUITS_OVERRIDE:-tests/kzgrs/kzgrs_test_params}"
echo "Workspace root: ${ROOT_DIR}"
echo "Image tag: ${IMAGE_TAG}"
echo "Circuits override: ${CIRCUITS_OVERRIDE:-<none>}"
echo "Circuits version (fallback download): ${VERSION}"
build_args=(
-f "${ROOT_DIR}/testnet/Dockerfile"
-t "${IMAGE_TAG}"
"${ROOT_DIR}"
)
# Pass override/version args to the Docker build.
if [ -n "${CIRCUITS_OVERRIDE}" ]; then
build_args+=(--build-arg "CIRCUITS_OVERRIDE=${CIRCUITS_OVERRIDE}")
fi
build_args+=(--build-arg "VERSION=${VERSION}")
echo "Running: docker build ${build_args[*]}"
docker build "${build_args[@]}"
cat <<EOF
Build complete.
- Use this image in k8s/compose by exporting NOMOS_TESTNET_IMAGE=${IMAGE_TAG}
- Circuits source: ${CIRCUITS_OVERRIDE:-download ${VERSION}}
EOF

View File

@ -14,5 +14,9 @@ export CFG_FILE_PATH="/config.yaml" \
# persist state.
mkdir -p /recovery
/usr/bin/cfgsync-client && \
exec /usr/bin/nomos-executor /config.yaml
/usr/bin/cfgsync-client
# Align bootstrap timing with validators to keep configs consistent.
sed -i "s/prolonged_bootstrap_period: .*/prolonged_bootstrap_period: '3.000000000'/" /config.yaml
exec /usr/bin/nomos-executor /config.yaml

View File

@ -14,5 +14,9 @@ export CFG_FILE_PATH="/config.yaml" \
# persist state.
mkdir -p /recovery
/usr/bin/cfgsync-client && \
exec /usr/bin/nomos-node /config.yaml
/usr/bin/cfgsync-client
# Align bootstrap timing with executors to keep configs consistent.
sed -i "s/prolonged_bootstrap_period: .*/prolonged_bootstrap_period: '3.000000000'/" /config.yaml
exec /usr/bin/nomos-node /config.yaml

View File

@ -0,0 +1,76 @@
#!/bin/bash
#
# Setup script for nomos-circuits
#
# Usage: ./setup-nomos-circuits.sh [VERSION] [INSTALL_DIR]
# VERSION - Optional. Version to install (default: v0.3.1)
# INSTALL_DIR - Optional. Installation directory (default: $HOME/.nomos-circuits)
#
# Examples:
# ./setup-nomos-circuits.sh # Install default version to default location
# ./setup-nomos-circuits.sh v0.2.0 # Install specific version to default location
# ./setup-nomos-circuits.sh v0.2.0 /opt/circuits # Install to custom location
set -euo pipefail
VERSION="${1:-v0.3.1}"
DEFAULT_INSTALL_DIR="$HOME/.nomos-circuits"
INSTALL_DIR="${2:-$DEFAULT_INSTALL_DIR}"
REPO="logos-co/nomos-circuits"
detect_platform() {
local os=""
local arch=""
case "$(uname -s)" in
Linux*) os="linux" ;;
Darwin*) os="macos" ;;
MINGW*|MSYS*|CYGWIN*) os="windows" ;;
*) echo "Unsupported operating system: $(uname -s)" >&2; exit 1 ;;
esac
case "$(uname -m)" in
x86_64) arch="x86_64" ;;
aarch64|arm64) arch="aarch64" ;;
*) echo "Unsupported architecture: $(uname -m)" >&2; exit 1 ;;
esac
echo "${os}-${arch}"
}
download_release() {
local platform="$1"
local artifact="nomos-circuits-${VERSION}-${platform}.tar.gz"
local url="https://github.com/${REPO}/releases/download/${VERSION}/${artifact}"
local temp_dir
temp_dir=$(mktemp -d)
echo "Downloading nomos-circuits ${VERSION} for ${platform}..."
if [ -n "${GITHUB_TOKEN:-}" ]; then
auth_header="Authorization: Bearer ${GITHUB_TOKEN}"
else
auth_header=""
fi
if ! curl -L ${auth_header:+-H "$auth_header"} -o "${temp_dir}/${artifact}" "${url}"; then
echo "Failed to download release artifact from ${url}" >&2
rm -rf "${temp_dir}"
exit 1
fi
echo "Extracting to ${INSTALL_DIR}..."
rm -rf "${INSTALL_DIR}"
mkdir -p "${INSTALL_DIR}"
if ! tar -xzf "${temp_dir}/${artifact}" -C "${INSTALL_DIR}" --strip-components=1; then
echo "Failed to extract ${artifact}" >&2
rm -rf "${temp_dir}"
exit 1
fi
rm -rf "${temp_dir}"
}
platform=$(detect_platform)
echo "Setting up nomos-circuits ${VERSION} for ${platform}"
echo "Installing to ${INSTALL_DIR}"
download_release "${platform}"
echo "Installation complete. Circuits installed at: ${INSTALL_DIR}"
echo "If using a custom directory, set NOMOS_CIRCUITS=${INSTALL_DIR}"