# Architecture Overview

The framework follows a clear flow: **Topology → Scenario → Deployer → Runner → Workloads → Expectations**.

## Core Flow

```mermaid
flowchart LR
    A(Topology<br/>shape cluster) --> B(Scenario<br/>plan)
    B --> C(Deployer<br/>provision & readiness)
    C --> D(Runner<br/>orchestrate execution)
    D --> E(Workloads<br/>drive traffic)
    E --> F(Expectations<br/>verify outcomes)
```

### Components

- **Topology** describes the cluster: how many nodes, their roles, and the high-level network and data-availability parameters they should follow.
- **Scenario** combines that topology with the activities to run and the checks to perform, forming a single plan.
- **Deployer** provisions infrastructure on the chosen backend (local processes, Docker Compose, or Kubernetes), waits for readiness, and returns a Runner.
- **Runner** orchestrates scenario execution: starts workloads, observes signals, evaluates expectations, and triggers cleanup.
- **Workloads** generate traffic and conditions that exercise the system.
- **Expectations** observe the run and judge success or failure once activity completes.

Each layer has a narrow responsibility so that cluster shape, deployment choice,
traffic generation, and health checks can evolve independently while fitting
together predictably.
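
The separation of layers can be sketched with a toy model. The types below (`Topology`, `Scenario`, `Runner`, `Deployer`, `FakeDeployer`) are hypothetical stand-ins written for this illustration and are not the framework's real definitions. They only show how a deployer can hand back a runner that then drives workloads before expectations.

```rust
// Illustrative model only: hypothetical stand-ins, not the framework's real types.

/// Cluster shape: how many nodes of each role.
#[derive(Debug, Clone, PartialEq)]
pub struct Topology {
    pub validators: usize,
    pub executors: usize,
}

/// A plan: topology plus named workloads and expectations.
#[derive(Debug, Clone)]
pub struct Scenario {
    pub topology: Topology,
    pub workloads: Vec<String>,
    pub expectations: Vec<String>,
}

/// Handle returned by a deployer once the cluster is ready.
pub struct Runner {
    ready_nodes: usize,
}

/// Backend-specific provisioning lives behind one trait.
pub trait Deployer {
    fn deploy(&self, scenario: &Scenario) -> Result<Runner, String>;
}

/// A deployer that provisions nothing and only counts nodes.
pub struct FakeDeployer;

impl Deployer for FakeDeployer {
    fn deploy(&self, scenario: &Scenario) -> Result<Runner, String> {
        let nodes = scenario.topology.validators + scenario.topology.executors;
        if nodes == 0 {
            return Err("topology has no nodes".to_string());
        }
        Ok(Runner { ready_nodes: nodes })
    }
}

impl Runner {
    /// Runs workloads first, then expectations, then cleanup,
    /// returning the ordered list of steps taken.
    pub fn run(&self, scenario: &Scenario) -> Vec<String> {
        let mut log = vec![format!("ready:{}", self.ready_nodes)];
        log.extend(scenario.workloads.iter().map(|w| format!("workload:{w}")));
        log.extend(scenario.expectations.iter().map(|e| format!("expect:{e}")));
        log.push("cleanup".to_string());
        log
    }
}
```

Because the scenario never names a backend, swapping `FakeDeployer` for another `Deployer` implementation leaves the plan untouched, which is the property the layering aims for.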

## Entry Points

The framework is consumed via **runnable example binaries** in `examples/src/bin/`:

- `local_runner.rs` — Spawns nodes as host processes
- `compose_runner.rs` — Deploys via Docker Compose (requires `NOMOS_TESTNET_IMAGE` built)
- `k8s_runner.rs` — Deploys via Kubernetes Helm (requires cluster + image)

**Recommended:** Use the convenience script:

```bash
scripts/run-examples.sh -t <duration> -v <validators> -e <executors> <mode>
# mode: host, compose, or k8s
```

The script handles circuit setup, binary building and bundling, image building, and execution.

**Alternative:** Direct cargo run (requires manual setup):

```bash
POL_PROOF_DEV_MODE=true cargo run -p runner-examples --bin <name>
```

**Important:** Set `POL_PROOF_DEV_MODE=true` for every runner. Without it, nodes generate real Groth16 proofs, which is expensive enough to cause timeouts.

These binaries use the framework API (`ScenarioBuilder`) to construct and execute scenarios.

## Builder API

Scenarios are defined using a fluent builder pattern:

```rust
use std::time::Duration;

use testing_framework_core::scenario::ScenarioBuilder;
use testing_framework_workflows::ScenarioBuilderExt;

pub fn scenario_plan() -> testing_framework_core::scenario::Scenario<()> {
    ScenarioBuilder::topology_with(|t| t.network_star().validators(3).executors(2))
        .wallets(50)
        .transactions_with(|txs| txs.rate(5).users(20))
        .da_with(|da| da.channel_rate(1).blob_rate(2))
        .expect_consensus_liveness()
        .with_run_duration(Duration::from_secs(90))
        .build()
}
```

**Key API Points:**
- Topology is configured through a closure: `ScenarioBuilder::topology_with(|t| t.validators(N).executors(M))`
- Workloads are configured via `_with` closures (`transactions_with`, `da_with`, `chaos_with`)
- Chaos workloads require `.enable_node_control()` and a compatible runner

## Deployers

The framework provides three deployer implementations:

| Deployer | Backend | Prerequisites | Node Control |
|----------|---------|---------------|--------------|
| `LocalDeployer` | Host processes | Binaries (built on demand or via bundle) | No |
| `ComposeDeployer` | Docker Compose | Image with embedded assets/binaries | Yes |
| `K8sDeployer` | Kubernetes Helm | Cluster + image loaded | Not yet |

**Compose-specific features:**
- Observability is external (set `NOMOS_METRICS_QUERY_URL` / `NOMOS_METRICS_OTLP_INGEST_URL` / `NOMOS_GRAFANA_URL` as needed)
- Optional OTLP trace/metrics endpoints (`NOMOS_OTLP_ENDPOINT`, `NOMOS_OTLP_METRICS_ENDPOINT`)
- Node control for chaos testing (restart validators/executors)
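
The node-control column matters when a scenario includes chaos workloads. As a rough sketch, the check could look like the following. The `Backend` enum and `check_chaos_supported` function are hypothetical names for this illustration, and the framework's actual validation may differ.

```rust
// Hypothetical model of the deployer table; not the framework's own types.

/// The three backends listed in the deployer table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Local,
    Compose,
    K8s,
}

impl Backend {
    /// Per the table, only Docker Compose currently offers node control.
    pub fn supports_node_control(self) -> bool {
        matches!(self, Backend::Compose)
    }
}

/// Fails early when a chaos workload is requested on a backend
/// that cannot restart nodes.
pub fn check_chaos_supported(backend: Backend, wants_chaos: bool) -> Result<(), String> {
    if wants_chaos && !backend.supports_node_control() {
        return Err(format!(
            "{backend:?} backend does not support node control; chaos workloads need it"
        ));
    }
    Ok(())
}
```

Failing before deployment keeps an unsupported chaos scenario from spending time provisioning a cluster it cannot use.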

## Assets and Images

### Docker Image
Built via `scripts/build_test_image.sh`:
- Embeds the node binaries and the KZG circuit parameters (taken from `testing-framework/assets/stack/kzgrs_test_params/kzgrs_test_params`)
- Includes runner scripts: `run_nomos_node.sh`, `run_nomos_executor.sh`
- Tagged as `NOMOS_TESTNET_IMAGE` (default: `logos-blockchain-testing:local`)
- **Recommended:** Use a prebuilt bundle via `scripts/build-bundle.sh --platform linux` and set `NOMOS_BINARIES_TAR` before building the image

### Circuit Assets
KZG parameters required for DA workloads:
- **Host path:** `testing-framework/assets/stack/kzgrs_test_params/kzgrs_test_params` (the name repeats because the directory `kzgrs_test_params` contains a file of the same name)
- **Container path:** `/kzgrs_test_params/kzgrs_test_params` (for compose/k8s)
- **Override:** `NOMOS_KZGRS_PARAMS_PATH=/custom/path/to/file` (must point to the file itself, not its directory)
- **Fetch via:** `scripts/setup-nomos-circuits.sh v0.3.1 /tmp/circuits` or use `scripts/run-examples.sh`
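
The lookup order described above (override first, default otherwise, and always a file) can be sketched as follows. The helper names `resolve_kzg_params` and `ensure_is_file` are hypothetical, and the framework's own resolution logic may differ. Only the path and the variable name come from this page.

```rust
use std::path::{Path, PathBuf};

/// Default host location from this page, relative to the repository root.
pub const DEFAULT_KZG_PARAMS: &str =
    "testing-framework/assets/stack/kzgrs_test_params/kzgrs_test_params";

/// Picks the override when present and non-empty, otherwise the default.
/// (Hypothetical helper for illustration.)
pub fn resolve_kzg_params(override_path: Option<&str>) -> PathBuf {
    match override_path {
        Some(p) if !p.trim().is_empty() => PathBuf::from(p),
        _ => PathBuf::from(DEFAULT_KZG_PARAMS),
    }
}

/// Rejects directories and missing paths: the value must name the file itself.
pub fn ensure_is_file(path: &Path) -> Result<(), String> {
    if path.is_file() {
        Ok(())
    } else {
        Err(format!("{} is not a file", path.display()))
    }
}
```

A caller would typically pass `std::env::var("NOMOS_KZGRS_PARAMS_PATH").ok().as_deref()` as the override and run `ensure_is_file` on the result before starting any DA workload.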

### Compose Stack
Templates and configs in `testing-framework/runners/compose/assets/`:
- `docker-compose.yml.tera` — Stack template (validators, executors)
- Cfgsync config: `testing-framework/assets/stack/cfgsync.yaml`
- Monitoring assets (not deployed by the framework): `testing-framework/assets/stack/monitoring/`

## Logging Architecture

**Two separate logging pipelines:**

| Component | Configuration | Output |
|-----------|--------------|--------|
| **Runner binaries** | `RUST_LOG` | Framework orchestration logs |
| **Node processes** | `NOMOS_LOG_LEVEL`, `NOMOS_LOG_FILTER` (+ `NOMOS_LOG_DIR` on host runner) | Consensus, DA, mempool logs |

**Node logging:**
- **Local runner:** Writes to temporary directories by default, which are cleaned up. Set `NOMOS_TESTS_TRACING=true` and `NOMOS_LOG_DIR` to keep persistent files.
- **Compose runner:** Logs to container stdout/stderr by default (`docker logs`). To write per-node files, set `tracing_settings.logger: !File` in `testing-framework/assets/stack/cfgsync.yaml` and mount a writable directory.
- **K8s runner:** Logs to pod stdout/stderr (`kubectl logs`). To write per-node files, set `tracing_settings.logger: !File` in `testing-framework/assets/stack/cfgsync.yaml` and mount a writable directory.

**File naming:** Per-node files use prefix `nomos-node-{index}` or `nomos-executor-{index}` (may include timestamps).
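
To make the two pipelines concrete, here is a minimal sketch that prepares a host-runner invocation with both sets of variables. The binary name `local_runner` is assumed from the file name in `examples/src/bin/`, the values `info` and `debug` are placeholders, and the helpers `runner_command` and `env_value` are hypothetical. Nothing is spawned; the sketch only builds the command.

```rust
use std::ffi::OsStr;
use std::process::Command;

/// Builds (but does not spawn) a `cargo run` invocation for an example binary,
/// with persistent node logs written under `log_dir`.
pub fn runner_command(bin: &str, log_dir: &str) -> Command {
    let mut cmd = Command::new("cargo");
    cmd.args(["run", "-p", "runner-examples", "--bin", bin])
        // Required by all runners to skip expensive proof generation.
        .env("POL_PROOF_DEV_MODE", "true")
        // Pipeline 1: framework orchestration logs (placeholder level).
        .env("RUST_LOG", "info")
        // Pipeline 2: node process logs (placeholder level).
        .env("NOMOS_LOG_LEVEL", "debug")
        // Keep per-node log files instead of temporary directories.
        .env("NOMOS_TESTS_TRACING", "true")
        .env("NOMOS_LOG_DIR", log_dir);
    cmd
}

/// Looks up a variable that was explicitly set on the command.
pub fn env_value<'a>(cmd: &'a Command, key: &str) -> Option<&'a OsStr> {
    cmd.get_envs()
        .find(|(k, _)| *k == OsStr::new(key))
        .and_then(|(_, v)| v)
}
```

Calling `runner_command("local_runner", "/tmp/nomos-logs").status()` would then run the example with both pipelines configured, assuming the manual setup for direct `cargo run` has been done.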

## Observability

**Prometheus-compatible metrics querying (optional):**
- The framework does **not** deploy Prometheus/Grafana.
- Provide a Prometheus-compatible base URL (PromQL API) via `NOMOS_METRICS_QUERY_URL`.
- Accessible in expectations when configured: `ctx.telemetry().prometheus().map(|p| p.base_url())`
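
Given the base URL, an expectation could issue an instant query against the standard Prometheus HTTP API path `/api/v1/query`. The sketch below only builds the request URL. The helper names `percent_encode` and `instant_query_url` are hypothetical, and the `job="nomos"` label in the usage note is an assumed example rather than a label the framework is known to emit.

```rust
/// Percent-encodes every byte except RFC 3986 unreserved characters.
pub fn percent_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            other => out.push_str(&format!("%{other:02X}")),
        }
    }
    out
}

/// Builds an instant-query URL from a Prometheus-compatible base URL,
/// tolerating a trailing slash on the base.
pub fn instant_query_url(base_url: &str, promql: &str) -> String {
    format!(
        "{}/api/v1/query?query={}",
        base_url.trim_end_matches('/'),
        percent_encode(promql)
    )
}
```

For example, `instant_query_url("http://localhost:9090/", "up{job=\"nomos\"}")` yields `http://localhost:9090/api/v1/query?query=up%7Bjob%3D%22nomos%22%7D`, which any HTTP client can then fetch.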

**Grafana dashboards (optional):**
- Dashboards live in `testing-framework/assets/stack/monitoring/grafana/dashboards/` and can be imported into your Grafana.
- If you set `NOMOS_GRAFANA_URL`, the deployer prints it in `TESTNET_ENDPOINTS`.

**Node APIs:**
- HTTP endpoints per node for consensus info, network status, DA membership
- Accessible in expectations: `ctx.node_clients().validator_clients().get(0)`

**OTLP (optional):**
- Trace endpoint: `NOMOS_OTLP_ENDPOINT=http://localhost:4317`
- Metrics endpoint: `NOMOS_OTLP_METRICS_ENDPOINT=http://localhost:4318`
- Disabled by default (no noise if unset)

For detailed logging configuration, see [Logging and Observability](operations.md#logging-and-observability).