logos-blockchain-testing/docs/operations.html
2025-12-20 09:51:51 +01:00

<!DOCTYPE HTML>
<html lang="en" class="light" dir="ltr">
<head>
<!-- Book generated using mdBook -->
<meta charset="UTF-8">
<title>Operations - Nomos Testing Book</title>
<!-- Custom HTML head -->
<meta name="description" content="">
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="theme-color" content="#ffffff">
<link rel="icon" href="favicon.svg">
<link rel="shortcut icon" href="favicon.png">
<link rel="stylesheet" href="css/variables.css">
<link rel="stylesheet" href="css/general.css">
<link rel="stylesheet" href="css/chrome.css">
<link rel="stylesheet" href="css/print.css" media="print">
<!-- Fonts -->
<link rel="stylesheet" href="FontAwesome/css/font-awesome.css">
<link rel="stylesheet" href="fonts/fonts.css">
<!-- Highlight.js Stylesheets -->
<link rel="stylesheet" href="highlight.css">
<link rel="stylesheet" href="tomorrow-night.css">
<link rel="stylesheet" href="ayu-highlight.css">
<!-- Custom theme stylesheets -->
</head>
<body class="sidebar-visible no-js">
<div id="body-container">
<!-- Provide site root to javascript -->
<script>
var path_to_root = "";
var default_theme = window.matchMedia("(prefers-color-scheme: dark)").matches ? "navy" : "light";
</script>
<!-- Work around some values being stored in localStorage wrapped in quotes -->
<script>
try {
var theme = localStorage.getItem('mdbook-theme');
var sidebar = localStorage.getItem('mdbook-sidebar');
if (theme.startsWith('"') && theme.endsWith('"')) {
localStorage.setItem('mdbook-theme', theme.slice(1, theme.length - 1));
}
if (sidebar.startsWith('"') && sidebar.endsWith('"')) {
localStorage.setItem('mdbook-sidebar', sidebar.slice(1, sidebar.length - 1));
}
} catch (e) { }
</script>
<!-- Set the theme before any content is loaded, prevents flash -->
<script>
var theme;
try { theme = localStorage.getItem('mdbook-theme'); } catch(e) { }
if (theme === null || theme === undefined) { theme = default_theme; }
var html = document.querySelector('html');
html.classList.remove('light')
html.classList.add(theme);
var body = document.querySelector('body');
body.classList.remove('no-js')
body.classList.add('js');
</script>
<input type="checkbox" id="sidebar-toggle-anchor" class="hidden">
<!-- Hide / unhide sidebar before it is displayed -->
<script>
var body = document.querySelector('body');
var sidebar = null;
var sidebar_toggle = document.getElementById("sidebar-toggle-anchor");
if (document.body.clientWidth >= 1080) {
try { sidebar = localStorage.getItem('mdbook-sidebar'); } catch(e) { }
sidebar = sidebar || 'visible';
} else {
sidebar = 'hidden';
}
sidebar_toggle.checked = sidebar === 'visible';
body.classList.remove('sidebar-visible');
body.classList.add("sidebar-" + sidebar);
</script>
<nav id="sidebar" class="sidebar" aria-label="Table of contents">
<div class="sidebar-scrollbox">
<ol class="chapter"><li class="chapter-item expanded "><a href="project-context-primer.html"><strong aria-hidden="true">1.</strong> Project Context Primer</a></li><li class="chapter-item expanded "><a href="what-you-will-learn.html"><strong aria-hidden="true">2.</strong> What You Will Learn</a></li><li class="chapter-item expanded "><a href="quickstart.html"><strong aria-hidden="true">3.</strong> Quickstart</a></li><li class="chapter-item expanded "><a href="api-docs.html"><strong aria-hidden="true">4.</strong> API Docs (rustdoc)</a></li><li class="chapter-item expanded "><a href="part-i.html"><strong aria-hidden="true">5.</strong> Part I — Foundations</a></li><li><ol class="section"><li class="chapter-item expanded "><a href="introduction.html"><strong aria-hidden="true">5.1.</strong> Introduction</a></li><li class="chapter-item expanded "><a href="architecture-overview.html"><strong aria-hidden="true">5.2.</strong> Architecture Overview</a></li><li class="chapter-item expanded "><a href="testing-philosophy.html"><strong aria-hidden="true">5.3.</strong> Testing Philosophy</a></li><li class="chapter-item expanded "><a href="scenario-lifecycle.html"><strong aria-hidden="true">5.4.</strong> Scenario Lifecycle</a></li><li class="chapter-item expanded "><a href="design-rationale.html"><strong aria-hidden="true">5.5.</strong> Design Rationale</a></li></ol></li><li class="chapter-item expanded "><a href="part-ii.html"><strong aria-hidden="true">6.</strong> Part II — User Guide</a></li><li><ol class="section"><li class="chapter-item expanded "><a href="workspace-layout.html"><strong aria-hidden="true">6.1.</strong> Workspace Layout</a></li><li class="chapter-item expanded "><a href="annotated-tree.html"><strong aria-hidden="true">6.2.</strong> Annotated Tree</a></li><li class="chapter-item expanded "><a href="authoring-scenarios.html"><strong aria-hidden="true">6.3.</strong> Authoring Scenarios</a></li><li class="chapter-item expanded "><a href="workloads.html"><strong 
aria-hidden="true">6.4.</strong> Core Content: Workloads & Expectations</a></li><li class="chapter-item expanded "><a href="scenario-builder-ext-patterns.html"><strong aria-hidden="true">6.5.</strong> Core Content: ScenarioBuilderExt Patterns</a></li><li class="chapter-item expanded "><a href="best-practices.html"><strong aria-hidden="true">6.6.</strong> Best Practices</a></li><li class="chapter-item expanded "><a href="usage-patterns.html"><strong aria-hidden="true">6.7.</strong> Usage Patterns</a></li><li class="chapter-item expanded "><a href="examples.html"><strong aria-hidden="true">6.8.</strong> Examples</a></li><li class="chapter-item expanded "><a href="examples-advanced.html"><strong aria-hidden="true">6.9.</strong> Advanced & Artificial Examples</a></li><li class="chapter-item expanded "><a href="running-scenarios.html"><strong aria-hidden="true">6.10.</strong> Running Scenarios</a></li><li class="chapter-item expanded "><a href="runners.html"><strong aria-hidden="true">6.11.</strong> Runners</a></li><li class="chapter-item expanded "><a href="node-control.html"><strong aria-hidden="true">6.12.</strong> Node Control & RunContext</a></li><li class="chapter-item expanded "><a href="chaos.html"><strong aria-hidden="true">6.13.</strong> Chaos Workloads</a></li><li class="chapter-item expanded "><a href="topology-chaos.html"><strong aria-hidden="true">6.14.</strong> Topology & Chaos Patterns</a></li><li class="chapter-item expanded "><a href="operations.html" class="active"><strong aria-hidden="true">6.15.</strong> Operations</a></li></ol></li><li class="chapter-item expanded "><a href="part-iii.html"><strong aria-hidden="true">7.</strong> Part III — Developer Reference</a></li><li><ol class="section"><li class="chapter-item expanded "><a href="scenario-model.html"><strong aria-hidden="true">7.1.</strong> Scenario Model (Developer Level)</a></li><li class="chapter-item expanded "><a href="api-levels.html"><strong aria-hidden="true">7.2.</strong> API Levels: 
Builder DSL vs. Direct</a></li><li class="chapter-item expanded "><a href="extending.html"><strong aria-hidden="true">7.3.</strong> Extending the Framework</a></li><li class="chapter-item expanded "><a href="custom-workload-example.html"><strong aria-hidden="true">7.4.</strong> Example: New Workload & Expectation (Rust)</a></li><li class="chapter-item expanded "><a href="internal-crate-reference.html"><strong aria-hidden="true">7.5.</strong> Internal Crate Reference</a></li></ol></li><li class="chapter-item expanded "><a href="part-iv.html"><strong aria-hidden="true">8.</strong> Part IV — Appendix</a></li><li><ol class="section"><li class="chapter-item expanded "><a href="dsl-cheat-sheet.html"><strong aria-hidden="true">8.1.</strong> Builder API Quick Reference</a></li><li class="chapter-item expanded "><a href="troubleshooting.html"><strong aria-hidden="true">8.2.</strong> Troubleshooting Scenarios</a></li><li class="chapter-item expanded "><a href="faq.html"><strong aria-hidden="true">8.3.</strong> FAQ</a></li><li class="chapter-item expanded "><a href="glossary.html"><strong aria-hidden="true">8.4.</strong> Glossary</a></li></ol></li></ol>
</div>
<div id="sidebar-resize-handle" class="sidebar-resize-handle">
<div class="sidebar-resize-indicator"></div>
</div>
</nav>
<!-- Track and set sidebar scroll position -->
<script>
var sidebarScrollbox = document.querySelector('#sidebar .sidebar-scrollbox');
sidebarScrollbox.addEventListener('click', function(e) {
if (e.target.tagName === 'A') {
sessionStorage.setItem('sidebar-scroll', sidebarScrollbox.scrollTop);
}
}, { passive: true });
var sidebarScrollTop = sessionStorage.getItem('sidebar-scroll');
sessionStorage.removeItem('sidebar-scroll');
if (sidebarScrollTop) {
// preserve sidebar scroll position when navigating via links within sidebar
sidebarScrollbox.scrollTop = sidebarScrollTop;
} else {
// scroll sidebar to current active section when navigating via "next/previous chapter" buttons
var activeSection = document.querySelector('#sidebar .active');
if (activeSection) {
activeSection.scrollIntoView({ block: 'center' });
}
}
</script>
<div id="page-wrapper" class="page-wrapper">
<div class="page">
<div id="menu-bar-hover-placeholder"></div>
<div id="menu-bar" class="menu-bar sticky">
<div class="left-buttons">
<label id="sidebar-toggle" class="icon-button" for="sidebar-toggle-anchor" title="Toggle Table of Contents" aria-label="Toggle Table of Contents" aria-controls="sidebar">
<i class="fa fa-bars"></i>
</label>
<button id="theme-toggle" class="icon-button" type="button" title="Change theme" aria-label="Change theme" aria-haspopup="true" aria-expanded="false" aria-controls="theme-list">
<i class="fa fa-paint-brush"></i>
</button>
<ul id="theme-list" class="theme-popup" aria-label="Themes" role="menu">
<li role="none"><button role="menuitem" class="theme" id="light">Light</button></li>
<li role="none"><button role="menuitem" class="theme" id="rust">Rust</button></li>
<li role="none"><button role="menuitem" class="theme" id="coal">Coal</button></li>
<li role="none"><button role="menuitem" class="theme" id="navy">Navy</button></li>
<li role="none"><button role="menuitem" class="theme" id="ayu">Ayu</button></li>
</ul>
<button id="search-toggle" class="icon-button" type="button" title="Search. (Shortkey: s)" aria-label="Toggle Searchbar" aria-expanded="false" aria-keyshortcuts="S" aria-controls="searchbar">
<i class="fa fa-search"></i>
</button>
</div>
<h1 class="menu-title">Nomos Testing Book</h1>
<div class="right-buttons">
<a href="print.html" title="Print this book" aria-label="Print this book">
<i id="print-button" class="fa fa-print"></i>
</a>
</div>
</div>
<div id="search-wrapper" class="hidden">
<form id="searchbar-outer" class="searchbar-outer">
<input type="search" id="searchbar" name="searchbar" placeholder="Search this book ..." aria-controls="searchresults-outer" aria-describedby="searchresults-header">
</form>
<div id="searchresults-outer" class="searchresults-outer hidden">
<div id="searchresults-header" class="searchresults-header"></div>
<ul id="searchresults">
</ul>
</div>
</div>
<!-- Apply ARIA attributes after the sidebar and the sidebar toggle button are added to the DOM -->
<script>
document.getElementById('sidebar-toggle').setAttribute('aria-expanded', sidebar === 'visible');
document.getElementById('sidebar').setAttribute('aria-hidden', sidebar !== 'visible');
Array.from(document.querySelectorAll('#sidebar a')).forEach(function(link) {
link.setAttribute('tabIndex', sidebar === 'visible' ? 0 : -1);
});
</script>
<div id="content" class="content">
<main>
<h1 id="operations"><a class="header" href="#operations">Operations</a></h1>
<p>Operational readiness focuses on prerequisites, environment fit, and clear
signals:</p>
<ul>
<li><strong>Prerequisites</strong>:
<ul>
<li><strong><code>versions.env</code> file</strong> at repository root (required by helper scripts; defines <code>VERSION</code>, <code>NOMOS_NODE_REV</code>, and <code>NOMOS_BUNDLE_VERSION</code>)</li>
<li>Keep a sibling <code>nomos-node</code> checkout available, or use <code>scripts/run-examples.sh</code> which clones/builds on demand</li>
<li>Ensure the chosen runner's platform needs are met (Docker for compose, cluster access for k8s)</li>
<li>CI uses prebuilt binary artifacts from the <code>build-binaries</code> workflow</li>
</ul>
</li>
<li><strong>Artifacts</strong>: DA scenarios require KZG parameters (circuit assets) in the
<code>testing-framework/assets/stack/kzgrs_test_params</code> directory. Fetch them via
<code>scripts/setup-nomos-circuits.sh</code> or point <code>NOMOS_KZGRS_PARAMS_PATH</code> at a parameters file elsewhere.</li>
<li><strong>Environment flags</strong>: <code>POL_PROOF_DEV_MODE=true</code> is <strong>required for all runners</strong>
(local, compose, k8s). Without it, nodes run the expensive Groth16 proof generation,
which causes tests to time out. Configure logging via <code>NOMOS_LOG_DIR</code>, <code>NOMOS_LOG_LEVEL</code>,
and <code>NOMOS_LOG_FILTER</code> (see <a href="#logging-and-observability">Logging and Observability</a>
for details). Note that nodes ignore <code>RUST_LOG</code> and only respond to <code>NOMOS_*</code> variables.</li>
<li><strong>Readiness checks</strong>: verify runners report node readiness before starting
workloads; this avoids false negatives from starting too early.</li>
<li><strong>Failure triage</strong>: map failures to missing prerequisites (wallet seeding,
node control availability), runner platform issues, or unmet expectations.
Start with liveness signals, then dive into workload-specific assertions.</li>
</ul>
<p>Treat operational hygiene—assets present, prerequisites satisfied, observability
reachable—as the first step to reliable scenario outcomes.</p>
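<p>The checks above can be sketched as a small preflight function. This is only an illustration: the function name <code>preflight</code> is hypothetical and not part of the repository, and the paths are the defaults documented on this page.</p>
<pre><code class="language-bash"># Preflight sketch: report missing prerequisites before running a scenario.
# "preflight" is a hypothetical helper; paths are the defaults from this page.
preflight() {
  root="${1:-.}"
  missing=0
  if [ ! -f "$root/versions.env" ]; then
    echo "missing: versions.env"
    missing=1
  fi
  # Honor the documented override, else fall back to the default location.
  kzg="${NOMOS_KZGRS_PARAMS_PATH:-$root/testing-framework/assets/stack/kzgrs_test_params/kzgrs_test_params}"
  if [ ! -e "$kzg" ]; then
    echo "missing: KZG parameters ($kzg); try scripts/setup-nomos-circuits.sh"
    missing=1
  fi
  # The variable reference on this page accepts "true" or "1".
  case "${POL_PROOF_DEV_MODE:-}" in
    true|1) ;;
    *) echo "missing: POL_PROOF_DEV_MODE=true"; missing=1 ;;
  esac
  return "$missing"
}
</code></pre>
<p>Calling <code>preflight .</code> from the repository root before a run prints one line per missing item and returns non-zero if anything is absent.</p>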
<h2 id="environment-variable-reference"><a class="header" href="#environment-variable-reference">Environment Variable Reference</a></h2>
<p>This section consolidates all environment variables used by the framework, helper scripts, and node processes.</p>
<h3 id="core-requirements"><a class="header" href="#core-requirements">Core Requirements</a></h3>
<div class="table-wrapper"><table><thead><tr><th>Variable</th><th>Default</th><th>Effect</th><th>Used By</th><th>Required?</th></tr></thead><tbody>
<tr><td><code>POL_PROOF_DEV_MODE</code></td><td>(none)</td><td>When set to <code>true</code> or <code>1</code>, disables expensive Groth16 proof generation. <strong>Critical for testing</strong> to avoid timeouts.</td><td>All runners</td><td><strong>Yes</strong></td></tr>
<tr><td><code>VERSION</code></td><td>(none)</td><td>Framework version identifier, read from <code>versions.env</code></td><td>Helper scripts</td><td><strong>Yes</strong> (in <code>versions.env</code>)</td></tr>
<tr><td><code>NOMOS_NODE_REV</code></td><td>(none)</td><td>Git revision of nomos-node to build/checkout</td><td>Helper scripts</td><td><strong>Yes</strong> (in <code>versions.env</code>)</td></tr>
<tr><td><code>NOMOS_BUNDLE_VERSION</code></td><td>(none)</td><td>Version tag for prebuilt binary bundles</td><td>Helper scripts</td><td><strong>Yes</strong> (in <code>versions.env</code>)</td></tr>
</tbody></table>
</div>
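<p>A minimal sketch of loading <code>versions.env</code> and confirming the three required keys are present. It assumes the file consists of shell-compatible <code>KEY=VALUE</code> lines, which this page does not state explicitly; <code>load_versions</code> is a hypothetical helper name.</p>
<pre><code class="language-bash"># Load versions.env and confirm the required keys are set.
# Assumption: the file contains shell-compatible KEY=VALUE lines.
load_versions() {
  file="${1:-versions.env}"
  [ -f "$file" ] || return 1
  # "." searches PATH for names without a slash, so make the path explicit.
  case "$file" in
    */*) ;;
    *) file="./$file" ;;
  esac
  set -a   # export every variable the file defines
  . "$file"
  set +a
  for name in VERSION NOMOS_NODE_REV NOMOS_BUNDLE_VERSION; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "versions.env: $name is not set"
      return 1
    fi
  done
}
</code></pre>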
<h3 id="binary-and-asset-paths"><a class="header" href="#binary-and-asset-paths">Binary and Asset Paths</a></h3>
<div class="table-wrapper"><table><thead><tr><th>Variable</th><th>Default</th><th>Effect</th><th>Used By</th><th>Required?</th></tr></thead><tbody>
<tr><td><code>NOMOS_NODE_BIN</code></td><td><code>testing-framework/assets/stack/bin/nomos-node</code></td><td>Path to nomos-node binary</td><td>Local runner</td><td>No (uses default)</td></tr>
<tr><td><code>NOMOS_EXECUTOR_BIN</code></td><td><code>testing-framework/assets/stack/bin/nomos-executor</code></td><td>Path to nomos-executor binary</td><td>Local runner</td><td>No (uses default)</td></tr>
<tr><td><code>NOMOS_CLI_BIN</code></td><td><code>testing-framework/assets/stack/bin/nomos-cli</code></td><td>Path to nomos-cli binary</td><td>Helper scripts</td><td>No (uses default)</td></tr>
<tr><td><code>NOMOS_KZGRS_PARAMS_PATH</code></td><td><code>testing-framework/assets/stack/kzgrs_test_params/kzgrs_test_params</code></td><td>Path to the KZG circuit parameters file. The name appears twice because the <code>kzgrs_test_params</code> directory contains a file of the same name.</td><td>All runners (for DA workloads)</td><td>No (uses default)</td></tr>
<tr><td><code>NOMOS_NODE_PATH</code></td><td>(none)</td><td>Use local nomos-node checkout at this path (skips git clone/fetch)</td><td><code>build-bundle.sh</code></td><td>No</td></tr>
<tr><td><code>NOMOS_BINARIES_TAR</code></td><td><code>.tmp/nomos-binaries-{platform}-{version}.tar.gz</code></td><td>Path to prebuilt binaries tarball</td><td><code>run-examples.sh</code></td><td>No (builds if missing)</td></tr>
</tbody></table>
</div>
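<p>The "override, else default" behaviour in the table can be illustrated with a short helper. This is a sketch, not the framework's own resolution code; <code>resolve_bin</code> is a hypothetical name.</p>
<pre><code class="language-bash"># Resolve a binary path: use the override if set, else the documented default.
# "resolve_bin" is a hypothetical helper, not part of the framework.
resolve_bin() {
  override="$1"   # value of e.g. NOMOS_NODE_BIN (may be empty)
  default="$2"    # default path from the table above
  path="${override:-$default}"
  if [ -x "$path" ]; then
    echo "$path"
  else
    echo "not executable: $path"
    return 1
  fi
}

# Example:
#   resolve_bin "${NOMOS_NODE_BIN:-}" testing-framework/assets/stack/bin/nomos-node
</code></pre>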
<h3 id="docker-and-image-configuration"><a class="header" href="#docker-and-image-configuration">Docker and Image Configuration</a></h3>
<div class="table-wrapper"><table><thead><tr><th>Variable</th><th>Default</th><th>Effect</th><th>Used By</th><th>Required?</th></tr></thead><tbody>
<tr><td><code>NOMOS_TESTNET_IMAGE</code></td><td><code>logos-blockchain-testing:local</code></td><td>Docker image name for compose/k8s runners</td><td>Compose, K8s runners</td><td>No (uses default)</td></tr>
<tr><td><code>NOMOS_BUNDLE_DOCKER_PLATFORM</code></td><td><code>linux/amd64</code> (or <code>linux/arm64</code> on Apple Silicon Docker Desktop)</td><td>Docker platform for Linux binary builds when running on non-Linux host</td><td><code>build-bundle.sh</code>, <code>build_test_image.sh</code></td><td>No (auto-detected)</td></tr>
<tr><td><code>NOMOS_SKIP_IMAGE_BUILD</code></td><td><code>0</code></td><td>When set to <code>1</code>, skips Docker image build step</td><td><code>run-examples.sh</code> compose/k8s</td><td>No</td></tr>
</tbody></table>
</div>
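<p>The auto-detected default for <code>NOMOS_BUNDLE_DOCKER_PLATFORM</code> could work roughly as follows. The helper scripts' actual detection logic is not shown on this page, so treat this as an assumption; <code>docker_platform_for</code> is a hypothetical name.</p>
<pre><code class="language-bash"># Map a host architecture (as printed by `uname -m`) to a Docker platform.
# Hypothetical helper; the real scripts may detect the platform differently.
docker_platform_for() {
  case "$1" in
    arm64|aarch64) echo "linux/arm64" ;;
    *)             echo "linux/amd64" ;;
  esac
}

# An explicit NOMOS_BUNDLE_DOCKER_PLATFORM wins over detection.
platform="${NOMOS_BUNDLE_DOCKER_PLATFORM:-$(docker_platform_for "$(uname -m)")}"
echo "building for $platform"
</code></pre>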
<h3 id="node-logging-configuration"><a class="header" href="#node-logging-configuration">Node Logging Configuration</a></h3>
<p><strong>Important:</strong> Node processes ignore <code>RUST_LOG</code> and only respond to <code>NOMOS_*</code> variables.</p>
<div class="table-wrapper"><table><thead><tr><th>Variable</th><th>Default</th><th>Effect</th><th>Used By</th><th>Required?</th></tr></thead><tbody>
<tr><td><code>NOMOS_LOG_DIR</code></td><td>(varies by runner)</td><td>Directory for per-node log files. Local runner default: a temporary directory (cleaned up). Compose default: <code>/tmp/node-logs</code> in the container. Set this variable to persist logs.</td><td>All runners</td><td>No</td></tr>
<tr><td><code>NOMOS_LOG_LEVEL</code></td><td><code>info</code></td><td>Global log level for node processes (<code>error</code>, <code>warn</code>, <code>info</code>, <code>debug</code>, <code>trace</code>)</td><td>Node processes</td><td>No</td></tr>
<tr><td><code>NOMOS_LOG_FILTER</code></td><td>(none)</td><td>Fine-grained logging filter (e.g., <code>nomos_core=debug,consensus=trace</code>)</td><td>Node processes</td><td>No</td></tr>
<tr><td><code>NOMOS_TESTS_TRACING</code></td><td>(none)</td><td>When set to <code>true</code>, enables tracing output to console for local runner</td><td>Local runner</td><td>No</td></tr>
<tr><td><code>NOMOS_TESTS_KEEP_LOGS</code></td><td>(none)</td><td>When set to <code>1</code>, prevents cleanup of temporary log directories</td><td>Local runner</td><td>No</td></tr>
</tbody></table>
</div>
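<p>Once logs are persisted with <code>NOMOS_LOG_DIR</code>, a quick scan can point at the nodes worth reading first. The sketch below assumes error lines contain the literal text <code>ERROR</code>, which depends on the node's log format; <code>scan_logs</code> is a hypothetical helper.</p>
<pre><code class="language-bash"># List log files under a directory that contain a pattern (default: ERROR).
# Assumption: error lines contain the literal text "ERROR".
scan_logs() {
  dir="$1"
  pattern="${2:-ERROR}"
  if [ ! -d "$dir" ]; then
    echo "no log directory: $dir"
    return 1
  fi
  # -r: recurse into subdirectories, -l: print matching file names only
  grep -rl -- "$pattern" "$dir" || echo "no matches for '$pattern' in $dir"
}

# Example:
#   scan_logs /tmp/my-test-logs
</code></pre>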
<h3 id="framework-logging-configuration"><a class="header" href="#framework-logging-configuration">Framework Logging Configuration</a></h3>
<p><strong>Note:</strong> Runner binaries (not node processes) use <code>RUST_LOG</code> for framework orchestration logs.</p>
<div class="table-wrapper"><table><thead><tr><th>Variable</th><th>Default</th><th>Effect</th><th>Used By</th><th>Required?</th></tr></thead><tbody>
<tr><td><code>RUST_LOG</code></td><td>(none)</td><td>Controls logging for runner binaries and framework code (e.g., <code>testing_framework=debug</code>)</td><td>Runner binaries</td><td>No</td></tr>
</tbody></table>
</div>
<h3 id="observability-and-telemetry"><a class="header" href="#observability-and-telemetry">Observability and Telemetry</a></h3>
<div class="table-wrapper"><table><thead><tr><th>Variable</th><th>Default</th><th>Effect</th><th>Used By</th><th>Required?</th></tr></thead><tbody>
<tr><td><code>NOMOS_METRICS_QUERY_URL</code></td><td>(none)</td><td>Prometheus-compatible PromQL API base URL for querying metrics</td><td>Expectations (optional)</td><td>No</td></tr>
<tr><td><code>NOMOS_METRICS_OTLP_INGEST_URL</code></td><td>(none)</td><td>OTLP ingest endpoint for metrics (e.g., for external Tempo/Loki)</td><td>Compose runner</td><td>No</td></tr>
<tr><td><code>NOMOS_GRAFANA_URL</code></td><td>(none)</td><td>Grafana dashboard URL (printed in <code>TESTNET_ENDPOINTS</code> for convenience)</td><td>Compose runner</td><td>No</td></tr>
<tr><td><code>NOMOS_OTLP_ENDPOINT</code></td><td>(none)</td><td>OpenTelemetry Protocol endpoint for traces (e.g., <code>http://localhost:4317</code>)</td><td>Node processes</td><td>No</td></tr>
<tr><td><code>NOMOS_OTLP_METRICS_ENDPOINT</code></td><td>(none)</td><td>OpenTelemetry Protocol endpoint for metrics (e.g., <code>http://localhost:4318</code>)</td><td>Node processes</td><td>No</td></tr>
</tbody></table>
</div>
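<p>To check that <code>NOMOS_METRICS_QUERY_URL</code> is reachable, an instant query against the standard Prometheus HTTP API is usually enough. In this sketch the helper names are hypothetical, and <code>up</code> is a generic Prometheus series rather than a Nomos-specific metric.</p>
<pre><code class="language-bash"># Build the instant-query endpoint from a Prometheus-compatible base URL.
# /api/v1/query is the standard Prometheus HTTP API path.
promql_endpoint() {
  base="$1"
  echo "${base%/}/api/v1/query"
}

# Run an instant PromQL query against NOMOS_METRICS_QUERY_URL.
promql_query() {
  # -G sends the url-encoded data as a GET query string
  curl -fsS -G "$(promql_endpoint "$NOMOS_METRICS_QUERY_URL")" \
    --data-urlencode "query=$1"
}

# Example:
#   NOMOS_METRICS_QUERY_URL=http://localhost:9090 promql_query 'up'
</code></pre>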
<h3 id="scenario-configuration"><a class="header" href="#scenario-configuration">Scenario Configuration</a></h3>
<div class="table-wrapper"><table><thead><tr><th>Variable</th><th>Default</th><th>Effect</th><th>Used By</th><th>Required?</th></tr></thead><tbody>
<tr><td><code>NOMOS_DEMO_VALIDATORS</code></td><td>(none)</td><td>Number of validators (used by <code>run-examples.sh</code> <code>-v</code> flag)</td><td><code>run-examples.sh</code></td><td>No (use <code>-v</code> flag)</td></tr>
<tr><td><code>NOMOS_DEMO_EXECUTORS</code></td><td>(none)</td><td>Number of executors (used by <code>run-examples.sh</code> <code>-e</code> flag)</td><td><code>run-examples.sh</code></td><td>No (use <code>-e</code> flag)</td></tr>
<tr><td><code>NOMOS_DEMO_RUN_SECONDS</code></td><td>(none)</td><td>Run duration in seconds (used by <code>run-examples.sh</code> <code>-t</code> flag)</td><td><code>run-examples.sh</code></td><td>No (use <code>-t</code> flag)</td></tr>
</tbody></table>
</div>
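<p>The relationship between the flags and the <code>NOMOS_DEMO_*</code> variables can be pictured with a small <code>getopts</code> loop. This is a guess at the shape of the mapping, not the actual contents of <code>run-examples.sh</code>; <code>parse_demo_flags</code> is a hypothetical name.</p>
<pre><code class="language-bash"># Sketch: map -t/-v/-e flags onto the NOMOS_DEMO_* variables.
# Not the real run-examples.sh parser; shown for illustration only.
parse_demo_flags() {
  OPTIND=1
  while getopts "t:v:e:" opt; do
    case "$opt" in
      t) NOMOS_DEMO_RUN_SECONDS="$OPTARG" ;;
      v) NOMOS_DEMO_VALIDATORS="$OPTARG" ;;
      e) NOMOS_DEMO_EXECUTORS="$OPTARG" ;;
      *) return 2 ;;
    esac
  done
  export NOMOS_DEMO_RUN_SECONDS NOMOS_DEMO_VALIDATORS NOMOS_DEMO_EXECUTORS
}

# Example:
#   parse_demo_flags -t 60 -v 2 -e 1 compose
</code></pre>
<p>Option parsing stops at the first non-option argument, so the trailing mode name (<code>compose</code> in the example) is left untouched.</p>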
<h3 id="kubernetes-specific"><a class="header" href="#kubernetes-specific">Kubernetes-Specific</a></h3>
<div class="table-wrapper"><table><thead><tr><th>Variable</th><th>Default</th><th>Effect</th><th>Used By</th><th>Required?</th></tr></thead><tbody>
<tr><td><code>NOMOS_K8S_NAMESPACE</code></td><td><code>default</code></td><td>Kubernetes namespace for deployments</td><td>K8s runner</td><td>No</td></tr>
<tr><td><code>NOMOS_K8S_CONTEXT</code></td><td>(current context)</td><td>Kubernetes context to use</td><td>K8s runner</td><td>No</td></tr>
<tr><td><code>NOMOS_K8S_IMAGE_PULL_POLICY</code></td><td><code>IfNotPresent</code></td><td>Image pull policy (<code>Always</code>, <code>IfNotPresent</code>, <code>Never</code>)</td><td>K8s runner</td><td>No</td></tr>
</tbody></table>
</div>
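<p>When inspecting a k8s run by hand, it helps to pass <code>kubectl</code> the same namespace and context the runner uses. A minimal sketch, with <code>kubectl_args</code> as a hypothetical helper and the defaults taken from the table above:</p>
<pre><code class="language-bash"># Build kubectl arguments from the NOMOS_K8S_* variables.
# Hypothetical helper; defaults follow the table above.
kubectl_args() {
  args="--namespace ${NOMOS_K8S_NAMESPACE:-default}"
  # Only pass --context when one is set; otherwise kubectl uses the current context.
  if [ -n "${NOMOS_K8S_CONTEXT:-}" ]; then
    args="--context $NOMOS_K8S_CONTEXT $args"
  fi
  echo "$args"
}

# Example (word splitting of the output is intentional):
#   kubectl $(kubectl_args) get pods
</code></pre>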
<h3 id="quick-reference-common-scenarios"><a class="header" href="#quick-reference-common-scenarios">Quick Reference: Common Scenarios</a></h3>
<p><strong>Running local smoke test:</strong></p>
<pre><code class="language-bash">POL_PROOF_DEV_MODE=true cargo run -p runner-examples --bin local_runner
</code></pre>
<p><strong>Running with custom log directory:</strong></p>
<pre><code class="language-bash">POL_PROOF_DEV_MODE=true \
NOMOS_LOG_DIR=/tmp/my-test-logs \
NOMOS_TESTS_KEEP_LOGS=1 \
cargo run -p runner-examples --bin local_runner
</code></pre>
<p><strong>Running compose with observability:</strong></p>
<pre><code class="language-bash">POL_PROOF_DEV_MODE=true \
NOMOS_METRICS_QUERY_URL=http://localhost:9090 \
NOMOS_GRAFANA_URL=http://localhost:3000 \
scripts/run-examples.sh -t 60 -v 2 -e 1 compose
</code></pre>
<p><strong>Running with debug logging:</strong></p>
<pre><code class="language-bash">POL_PROOF_DEV_MODE=true \
NOMOS_LOG_LEVEL=debug \
NOMOS_LOG_FILTER="consensus=trace,da=debug" \
RUST_LOG=testing_framework=debug \
cargo run -p runner-examples --bin local_runner
</code></pre>
<p><strong>Building for Apple Silicon:</strong></p>
<pre><code class="language-bash">NOMOS_BUNDLE_DOCKER_PLATFORM=linux/arm64 scripts/build_test_image.sh
</code></pre>
<h2 id="ci-usage"><a class="header" href="#ci-usage">CI Usage</a></h2>
<p>Both <strong>LocalDeployer</strong> and <strong>ComposeDeployer</strong> work in CI environments:</p>
<p><strong>LocalDeployer in CI:</strong></p>
<ul>
<li>Faster (no Docker overhead)</li>
<li>Good for quick smoke tests</li>
<li><strong>Trade-off:</strong> Less isolation (processes share host)</li>
</ul>
<p><strong>ComposeDeployer in CI (recommended):</strong></p>
<ul>
<li>Better isolation (containerized)</li>
<li>Reproducible environment</li>
<li>Can integrate with external Prometheus/Grafana (optional)</li>
<li><strong>Trade-off:</strong> Slower startup (Docker image build)</li>
<li><strong>Trade-off:</strong> Requires Docker daemon</li>
</ul>
<h3 id="complete-ci-integration-example"><a class="header" href="#complete-ci-integration-example">Complete CI Integration Example</a></h3>
<p>Here's a production-ready GitHub Actions workflow demonstrating host and compose runners with proper artifact caching, log collection, and Cucumber test integration:</p>
<pre><code class="language-yaml">name: Integration Tests
on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]
env:
  CARGO_TERM_COLOR: always
  RUST_BACKTRACE: 1
jobs:
  # Host runner: Fast smoke tests
  test-host:
    name: Host Runner Tests (Smoke)
    runs-on: ubuntu-latest
    timeout-minutes: 15
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
      - name: Install Rust nightly
        uses: dtolnay/rust-toolchain@nightly
        with:
          components: rustfmt, clippy
      - name: Cache Rust dependencies
        uses: Swatinem/rust-cache@v2
        with:
          key: host-runner-${{ hashFiles('**/Cargo.lock') }}
      - name: Cache binary bundle
        id: cache-bundle
        uses: actions/cache@v4
        with:
          path: .tmp/nomos-binaries-host-*.tar.gz
          key: nomos-binaries-host-${{ hashFiles('versions.env') }}
      - name: Build binaries (if not cached)
        if: steps.cache-bundle.outputs.cache-hit != 'true'
        run: |
          # The output name must match the cached glob above, or the cache is never populated
          scripts/build-bundle.sh --platform host --output .tmp/nomos-binaries-host-bundle.tar.gz
          tar -xzf .tmp/nomos-binaries-host-bundle.tar.gz
          mkdir -p testing-framework/assets/stack/bin
          cp artifacts/nomos-node artifacts/nomos-executor testing-framework/assets/stack/bin/
      - name: Extract cached binaries
        if: steps.cache-bundle.outputs.cache-hit == 'true'
        run: |
          tar -xzf .tmp/nomos-binaries-host-*.tar.gz
          mkdir -p testing-framework/assets/stack/bin
          cp artifacts/nomos-node artifacts/nomos-executor testing-framework/assets/stack/bin/
      - name: Run host smoke tests
        env:
          POL_PROOF_DEV_MODE: true
          NOMOS_TESTS_KEEP_LOGS: 1
          NOMOS_LOG_DIR: ${{ github.workspace }}/.tmp/host-logs
        run: |
          cargo test -p runner-examples --bin local_runner -- --nocapture
      - name: Run Cucumber host tests
        env:
          POL_PROOF_DEV_MODE: true
        run: |
          cargo test -p runner-examples --bin cucumber_host -- --nocapture
      - name: Upload logs on failure
        if: failure()
        uses: actions/upload-artifact@v4
        with:
          name: host-runner-logs
          path: .tmp/host-logs/
          retention-days: 7
  # Compose runner: Full integration tests
  test-compose:
    name: Compose Runner Tests
    runs-on: ubuntu-latest
    timeout-minutes: 30
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Install Rust nightly
        uses: dtolnay/rust-toolchain@nightly
      - name: Cache Rust dependencies
        uses: Swatinem/rust-cache@v2
        with:
          key: compose-runner-${{ hashFiles('**/Cargo.lock') }}
      - name: Cache Linux binary bundle
        id: cache-linux-bundle
        uses: actions/cache@v4
        with:
          path: .tmp/nomos-binaries-linux-*.tar.gz
          key: nomos-binaries-linux-${{ hashFiles('versions.env') }}
      - name: Build Linux binaries (if not cached)
        if: steps.cache-linux-bundle.outputs.cache-hit != 'true'
        run: |
          # The output name must match the cached glob above, or the cache is never populated
          scripts/build-bundle.sh --platform linux --output .tmp/nomos-binaries-linux-bundle.tar.gz
      - name: Cache Docker image layers
        uses: actions/cache@v4
        with:
          path: /tmp/.buildx-cache
          key: docker-buildx-${{ hashFiles('versions.env', 'testing-framework/assets/stack/**') }}
          restore-keys: |
            docker-buildx-
      - name: Build Docker image
        env:
          # Concrete path: globs are not expanded inside env values
          NOMOS_BINARIES_TAR: .tmp/nomos-binaries-linux-bundle.tar.gz
        run: |
          scripts/build_test_image.sh
      - name: Run compose smoke tests
        env:
          POL_PROOF_DEV_MODE: true
        run: |
          cargo test -p runner-examples --bin compose_runner -- --nocapture
      - name: Run Cucumber compose tests
        env:
          POL_PROOF_DEV_MODE: true
        run: |
          cargo test -p runner-examples --bin cucumber_compose -- --nocapture
      - name: Collect container logs on failure
        if: failure()
        run: |
          mkdir -p .tmp/compose-logs
          for container in $(docker ps -a --filter "name=nomos-compose" --format "{{.Names}}"); do
            docker logs "$container" &gt; ".tmp/compose-logs/${container}.log" 2&gt;&amp;1 || true
          done
      - name: Upload container logs
        if: failure()
        uses: actions/upload-artifact@v4
        with:
          name: compose-runner-logs
          path: .tmp/compose-logs/
          retention-days: 7
      - name: Cleanup containers
        if: always()
        run: |
          docker compose down --volumes --remove-orphans || true
  # Combined status check (required for branch protection)
  integration-tests:
    name: Integration Tests Status
    runs-on: ubuntu-latest
    needs: [test-host, test-compose]
    if: always()
    steps:
      - name: Check all jobs succeeded
        run: |
          if [ "${{ needs.test-host.result }}" != "success" ] || \
             [ "${{ needs.test-compose.result }}" != "success" ]; then
            echo "One or more integration test jobs failed"
            exit 1
          fi
</code></pre>
<p><strong>Key Features:</strong></p>
<ol>
<li><strong>Separate Jobs:</strong> Host and compose runners run as independent jobs (no build matrix is used)</li>
<li><strong>Artifact Caching:</strong>
<ul>
<li>Rust dependencies cached per <code>Cargo.lock</code></li>
<li>Binary bundles cached per <code>versions.env</code></li>
<li>Docker layers cached for faster image builds</li>
</ul>
</li>
<li><strong>Log Collection:</strong> Uploads logs only on failure, with 7-day retention</li>
<li><strong>Cucumber Integration:</strong> Runs both programmatic and BDD tests</li>
<li><strong>Proper Cleanup:</strong> Ensures containers are removed even on failure</li>
<li><strong>Status Check:</strong> Combined job for branch protection rules</li>
</ol>
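<p>The combined status check compares each job result against <code>success</code>. If more jobs are added later, the same logic can be written as a loop. A sketch, with <code>all_succeeded</code> as a hypothetical helper name:</p>
<pre><code class="language-bash"># Return success only if every argument equals "success".
# Hypothetical helper generalizing the status-check step to any number of jobs.
all_succeeded() {
  for result in "$@"; do
    if [ "$result" != "success" ]; then
      echo "job result was: $result"
      return 1
    fi
  done
}

# Example inside a workflow step:
#   all_succeeded "${{ needs.test-host.result }}" "${{ needs.test-compose.result }}"
</code></pre>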
<p><strong>Customization:</strong></p>
<pre><code class="language-yaml"># Add matrix for multiple topologies:
strategy:
  matrix:
    topology:
      - { validators: 1, executors: 1, duration: 60 }
      - { validators: 2, executors: 1, duration: 90 }
      - { validators: 3, executors: 2, duration: 120 }
# Or add scheduled nightly stress tests:
on:
  schedule:
    - cron: '0 2 * * *' # 2 AM daily
</code></pre>
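<p>The same three topologies can be exercised locally before wiring them into CI. The loop below is a sketch that reuses the <code>run-examples.sh</code> flags documented on this page; <code>run_topologies</code>, <code>MODE</code>, and <code>DRY_RUN</code> are hypothetical names introduced here.</p>
<pre><code class="language-bash"># Run the three topologies from the matrix above, one after another.
# MODE selects the runner (default: compose). DRY_RUN=1 prints the commands only.
run_topologies() {
  for topo in "1 1 60" "2 1 90" "3 2 120"; do
    set -- $topo   # split into: validators executors duration
    cmd="scripts/run-examples.sh -t $3 -v $1 -e $2 ${MODE:-compose}"
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "POL_PROOF_DEV_MODE=true $cmd"
    else
      POL_PROOF_DEV_MODE=true $cmd || return 1
    fi
  done
}

# Example:
#   DRY_RUN=1 MODE=host run_topologies
</code></pre>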
<p><strong>See also:</strong> <code>.github/workflows/lint.yml</code> for current CI examples.</p>
<h2 id="running-examples"><a class="header" href="#running-examples">Running Examples</a></h2>
<p>The framework provides three runner modes: <strong>host</strong> (local processes), <strong>compose</strong> (Docker Compose), and <strong>k8s</strong> (Kubernetes).</p>
<p><strong>Recommended:</strong> Use <code>scripts/run-examples.sh</code> for all modes:</p>
<pre><code class="language-bash"># Host mode (local processes)
scripts/run-examples.sh -t 60 -v 1 -e 1 host
# Compose mode (Docker Compose)
scripts/run-examples.sh -t 60 -v 1 -e 1 compose
# K8s mode (Kubernetes)
scripts/run-examples.sh -t 60 -v 1 -e 1 k8s
</code></pre>
<p>This script handles circuit setup, binary building/bundling, (local) image building, and execution.</p>
<p>Note: for <code>k8s</code> runs against non-local clusters (e.g. EKS), the cluster pulls images from a registry,
so a local <code>docker build</code> is not used. In that case, build and push your image separately (see
<code>scripts/build_test_image.sh</code>) and set <code>NOMOS_TESTNET_IMAGE</code> to the pushed reference.</p>
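<p>One way to do the push step is with plain <code>docker tag</code> and <code>docker push</code>. The sketch below assumes the image was already built locally under the default name; the registry <code>registry.example.com/team</code> and both helper names are placeholders, not values from this project.</p>
<pre><code class="language-bash"># Compose the remote image reference for a registry and tag.
# Hypothetical helper; the repository name mirrors the default local image.
remote_image_ref() {
  echo "$1/logos-blockchain-testing:${2:-latest}"
}

# Tag the locally built image and push it to the registry.
push_testnet_image() {
  local_image="${NOMOS_TESTNET_IMAGE:-logos-blockchain-testing:local}"
  remote_image="$(remote_image_ref "$1" "${2:-latest}")"
  docker tag "$local_image" "$remote_image" || return 1
  docker push "$remote_image" || return 1
}

# Example (registry name is a placeholder):
#   push_testnet_image registry.example.com/team v1
#   NOMOS_TESTNET_IMAGE="$(remote_image_ref registry.example.com/team v1)" \
#     scripts/run-examples.sh -t 60 -v 1 -e 1 k8s
</code></pre>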
<h3 id="quick-smoke-matrix-hostcomposek8s"><a class="header" href="#quick-smoke-matrix-hostcomposek8s">Quick Smoke Matrix (Host/Compose/K8s)</a></h3>
<p>For a small “does everything still run?” matrix (including <code>--no-image-build</code> variants where relevant), use:</p>
<pre><code class="language-bash">scripts/run-test-matrix.sh -t 120 -v 1 -e 1
</code></pre>
<p>This is useful after making runner/image/script changes, and it forwards <code>--metrics-*</code> options through to <code>scripts/run-examples.sh</code>.</p>
<p><strong>Environment overrides:</strong></p>
<ul>
<li><code>VERSION=v0.3.1</code> — Circuit version</li>
<li><code>NOMOS_NODE_REV=&lt;commit&gt;</code> — nomos-node git revision</li>
<li><code>NOMOS_BINARIES_TAR=path/to/bundle.tar.gz</code> — Use prebuilt bundle</li>
<li><code>NOMOS_SKIP_IMAGE_BUILD=1</code> — Skip image rebuild (compose/k8s)</li>
<li><code>NOMOS_BUNDLE_DOCKER_PLATFORM=linux/arm64|linux/amd64</code> — Docker platform used when building a Linux bundle on non-Linux hosts (macOS/Windows)</li>
<li><code>COMPOSE_CIRCUITS_PLATFORM=linux-aarch64|linux-x86_64</code> — Circuits platform used when building the compose/k8s image (defaults based on host arch)</li>
<li><code>SLOW_TEST_ENV=true</code> — Doubles built-in readiness timeouts (useful in slower CI / constrained laptops)</li>
<li><code>TESTNET_PRINT_ENDPOINTS=1</code> — Print <code>TESTNET_ENDPOINTS</code> / <code>TESTNET_PPROF</code> lines during deploy (set automatically by <code>scripts/run-examples.sh</code>)</li>
<li><code>COMPOSE_RUNNER_HTTP_TIMEOUT_SECS=&lt;secs&gt;</code> — Override compose node HTTP readiness timeout</li>
<li><code>K8S_RUNNER_DEPLOYMENT_TIMEOUT_SECS=&lt;secs&gt;</code> — Override k8s deployment readiness timeout</li>
<li><code>K8S_RUNNER_HTTP_TIMEOUT_SECS=&lt;secs&gt;</code> — Override k8s HTTP readiness timeout for port-forwards</li>
<li><code>K8S_RUNNER_HTTP_PROBE_TIMEOUT_SECS=&lt;secs&gt;</code> — Override k8s HTTP readiness timeout for NodePort probes</li>
<li><code>K8S_RUNNER_PROMETHEUS_HTTP_TIMEOUT_SECS=&lt;secs&gt;</code> — Override k8s Prometheus readiness timeout</li>
<li><code>K8S_RUNNER_PROMETHEUS_HTTP_PROBE_TIMEOUT_SECS=&lt;secs&gt;</code> — Override k8s Prometheus NodePort probe timeout</li>
</ul>
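<p>The effect of <code>SLOW_TEST_ENV=true</code> on a built-in timeout is simple arithmetic. The sketch below only illustrates the doubling described above; the function name <code>effective_timeout</code> is hypothetical, and it makes no claim about how the framework implements it or whether explicit <code>*_TIMEOUT_SECS</code> overrides are doubled as well.</p>

```shell
# Hypothetical helper: print the timeout (in seconds) that results from a
# built-in base value, doubled when SLOW_TEST_ENV is set to "true".
effective_timeout() (
  base=$1
  if [ "${SLOW_TEST_ENV:-}" = "true" ]; then
    echo $((base * 2))
  else
    echo "$base"
  fi
)
```

<p>With <code>SLOW_TEST_ENV=true</code>, a 60-second base timeout becomes 120 seconds; otherwise it stays at 60.</p>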
<h3 id="updating-nomos-node-revision-dev-workflow"><a class="header" href="#updating-nomos-node-revision-dev-workflow">Updating <code>nomos-node</code> Revision (Dev Workflow)</a></h3>
<p>The repo pins a <code>nomos-node</code> revision in <code>versions.env</code> for reproducible builds. To update it (or point to a local checkout), use the helper script:</p>
<pre><code class="language-bash"># Pin to a new git revision (updates versions.env + Cargo.toml git revs)
scripts/update-nomos-rev.sh --rev &lt;git_sha&gt;
# Use a local nomos-node checkout instead (for development)
scripts/update-nomos-rev.sh --path /path/to/nomos-node
# If Cargo.toml was marked skip-worktree, clear it
scripts/update-nomos-rev.sh --unskip-worktree
</code></pre>
<p>Notes:</p>
<ul>
<li>Don't commit absolute <code>NOMOS_NODE_PATH</code> values; prefer <code>--rev</code> for shared history/CI.</li>
<li>After changing rev/path, expect <code>Cargo.lock</code> to update on the next <code>cargo build</code>/<code>cargo test</code>.</li>
</ul>
<h3 id="cleanup-helper"><a class="header" href="#cleanup-helper">Cleanup Helper</a></h3>
<p>If you hit Docker build failures, mysterious I/O errors, or are running out of disk space:</p>
<pre><code class="language-bash">scripts/clean.sh
</code></pre>
<p>For extra Docker cache cleanup:</p>
<pre><code class="language-bash">scripts/clean.sh --docker
</code></pre>
<h3 id="host-runner-direct-cargo-run"><a class="header" href="#host-runner-direct-cargo-run">Host Runner (Direct Cargo Run)</a></h3>
<p>For manual control, you can run the <code>local_runner</code> binary directly:</p>
<pre><code class="language-bash">POL_PROOF_DEV_MODE=true \
NOMOS_NODE_BIN=/path/to/nomos-node \
NOMOS_EXECUTOR_BIN=/path/to/nomos-executor \
cargo run -p runner-examples --bin local_runner
</code></pre>
<p><strong>Environment variables:</strong></p>
<ul>
<li><code>NOMOS_DEMO_VALIDATORS=3</code> — Number of validators (default: 1, or use legacy <code>LOCAL_DEMO_VALIDATORS</code>)</li>
<li><code>NOMOS_DEMO_EXECUTORS=2</code> — Number of executors (default: 1, or use legacy <code>LOCAL_DEMO_EXECUTORS</code>)</li>
<li><code>NOMOS_DEMO_RUN_SECS=120</code> — Run duration in seconds (default: 60, or use legacy <code>LOCAL_DEMO_RUN_SECS</code>)</li>
<li><code>NOMOS_NODE_BIN</code> / <code>NOMOS_EXECUTOR_BIN</code> — Paths to binaries (required for direct run)</li>
<li><code>NOMOS_LOG_DIR=/tmp/logs</code> — Directory for per-node log files (host runner). For compose/k8s, set <code>tracing_settings.logger: !File</code> in <code>testing-framework/assets/stack/cfgsync.yaml</code>.</li>
<li><code>NOMOS_TESTS_KEEP_LOGS=1</code> — Keep per-run temporary directories (useful for debugging/CI artifacts)</li>
<li><code>NOMOS_TESTS_TRACING=true</code> — Enable the debug tracing preset (optional; combine with <code>NOMOS_LOG_DIR</code> unless you have external tracing backends configured)</li>
<li><code>NOMOS_LOG_LEVEL=debug</code> — Set log level (default: info)</li>
<li><code>NOMOS_LOG_FILTER="cryptarchia=trace,nomos_da_sampling=debug"</code> — Fine-grained module filtering</li>
</ul>
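<p>The topology variables above each have a default and a legacy name. A wrapper script might resolve them as sketched below. This is an illustration only: the function name <code>resolve_setting</code> is hypothetical, and the precedence (new name first, then legacy name, then default) is an assumption, since the list above does not state which name wins when both are set.</p>

```shell
# Hypothetical helper: resolve_setting NEW_NAME LEGACY_NAME DEFAULT
# Prints the value of NEW_NAME if set, else LEGACY_NAME, else DEFAULT.
# The precedence order is an assumption made for this sketch.
resolve_setting() (
  eval "new_value=\${$1:-}"      # indirect lookup of the new variable name
  eval "legacy_value=\${$2:-}"   # indirect lookup of the legacy variable name
  if [ -n "$new_value" ]; then
    echo "$new_value"
  elif [ -n "$legacy_value" ]; then
    echo "$legacy_value"
  else
    echo "$3"
  fi
)
```

<p>For example, <code>resolve_setting NOMOS_DEMO_VALIDATORS LOCAL_DEMO_VALIDATORS 1</code> prints <code>1</code> when neither variable is set.</p>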
<p><strong>Note:</strong> Requires circuit assets and host binaries. Use <code>scripts/run-examples.sh host</code> to handle setup automatically.</p>
<h3 id="compose-runner-direct-cargo-run"><a class="header" href="#compose-runner-direct-cargo-run">Compose Runner (Direct Cargo Run)</a></h3>
<p>For manual control, you can run the <code>compose_runner</code> binary directly. Compose requires a Docker image with embedded assets.</p>
<p><strong>Recommended setup:</strong> Use a prebuilt bundle:</p>
<pre><code class="language-bash"># Build a Linux bundle (includes binaries + circuits)
scripts/build-bundle.sh --platform linux
# Creates .tmp/nomos-binaries-linux-v0.3.1.tar.gz
# Build image (embeds bundle assets)
export NOMOS_BINARIES_TAR=.tmp/nomos-binaries-linux-v0.3.1.tar.gz
scripts/build_test_image.sh
# Run
NOMOS_TESTNET_IMAGE=logos-blockchain-testing:local \
POL_PROOF_DEV_MODE=true \
cargo run -p runner-examples --bin compose_runner
</code></pre>
<p><strong>Platform note (macOS / Apple silicon):</strong></p>
<ul>
<li>Docker Desktop runs a <code>linux/arm64</code> engine. If Linux bundle builds are slow/unstable when producing <code>.tmp/nomos-binaries-linux-*.tar.gz</code>, prefer <code>NOMOS_BUNDLE_DOCKER_PLATFORM=linux/arm64</code> for local compose/k8s runs.</li>
<li>If you need amd64 images/binaries specifically (e.g., deploying to amd64-only environments), set <code>NOMOS_BUNDLE_DOCKER_PLATFORM=linux/amd64</code> and expect slower builds via emulation.</li>
</ul>
<p><strong>Alternative:</strong> Manual circuit/image setup (rebuilds during image build):</p>
<pre><code class="language-bash"># Fetch and copy circuits
scripts/setup-nomos-circuits.sh v0.3.1 /tmp/nomos-circuits
cp -r /tmp/nomos-circuits/* testing-framework/assets/stack/kzgrs_test_params/
# Build image
scripts/build_test_image.sh
# Run
NOMOS_TESTNET_IMAGE=logos-blockchain-testing:local \
POL_PROOF_DEV_MODE=true \
cargo run -p runner-examples --bin compose_runner
</code></pre>
<p><strong>Environment variables:</strong></p>
<ul>
<li><code>NOMOS_TESTNET_IMAGE=logos-blockchain-testing:local</code> — Image tag (required, must match built image)</li>
<li><code>POL_PROOF_DEV_MODE=true</code>: <strong>Required</strong> for all runners</li>
<li><code>NOMOS_DEMO_VALIDATORS=3</code> / <code>NOMOS_DEMO_EXECUTORS=2</code> / <code>NOMOS_DEMO_RUN_SECS=120</code> — Topology overrides</li>
<li><code>COMPOSE_NODE_PAIRS=1x1</code> — Alternative topology format: "validators×executors"</li>
<li><code>NOMOS_METRICS_QUERY_URL</code> — Prometheus-compatible base URL for the runner process to query (optional)</li>
<li><code>NOMOS_METRICS_OTLP_INGEST_URL</code> — Full OTLP HTTP ingest URL for node metrics export (optional)</li>
<li><code>NOMOS_GRAFANA_URL</code> — Grafana base URL for printing/logging (optional)</li>
<li><code>COMPOSE_RUNNER_HOST=127.0.0.1</code> — Host address for port mappings</li>
<li><code>COMPOSE_RUNNER_PRESERVE=1</code> — Keep containers running after test</li>
<li><code>NOMOS_LOG_LEVEL=debug</code> / <code>NOMOS_LOG_FILTER=...</code> — Control node log verbosity (stdout/stderr)</li>
<li><code>testing-framework/assets/stack/cfgsync.yaml</code> (<code>tracing_settings.logger</code>) — Switch node logs between stdout/stderr and file output</li>
</ul>
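<p>The <code>COMPOSE_NODE_PAIRS</code> value encodes a validator count and an executor count separated by <code>x</code>. The sketch below shows one way to validate and split a single pair such as <code>2x1</code>. The function name <code>parse_node_pair</code> is hypothetical, and the sketch assumes exactly one pair with an ASCII <code>x</code> separator; whether the runner accepts other forms is not covered here.</p>

```shell
# Hypothetical helper: split a single "validators x executors" pair.
# Prints "validators=N executors=M"; returns 1 on malformed input.
parse_node_pair() (
  pair=$1
  case $pair in
    *[!0-9x]* | x* | *x | *x*x* | "") return 1 ;;  # stray characters or missing counts
    *x*) ;;                                         # exactly one separator with digits on both sides
    *) return 1 ;;                                  # no separator at all
  esac
  echo "validators=${pair%x*} executors=${pair#*x}"
)
```

<p>For example, <code>parse_node_pair 2x1</code> prints <code>validators=2 executors=1</code>, while <code>parse_node_pair 3</code> fails.</p>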
<p><strong>Compose-specific features:</strong></p>
<ul>
<li><strong>Node control support</strong>: The only runner that supports chaos testing (<code>.enable_node_control()</code> + chaos workloads)</li>
<li><strong>Observability is external</strong>: Set <code>NOMOS_METRICS_*</code> / <code>NOMOS_GRAFANA_URL</code> to enable telemetry links and querying
<ul>
<li>Quickstart: <code>scripts/setup-observability.sh compose up</code> then <code>scripts/setup-observability.sh compose env</code></li>
</ul>
</li>
</ul>
<p><strong>Important:</strong></p>
<ul>
<li>Containers expect KZG parameters at <code>/kzgrs_test_params/kzgrs_test_params</code> (note the repeated filename)</li>
<li>Use <code>scripts/run-examples.sh compose</code> to handle all setup automatically</li>
</ul>
<h3 id="k8s-runner-direct-cargo-run"><a class="header" href="#k8s-runner-direct-cargo-run">K8s Runner (Direct Cargo Run)</a></h3>
<p>For manual control, you can run the <code>k8s_runner</code> binary directly. K8s requires the same image setup as Compose.</p>
<p><strong>Prerequisites:</strong></p>
<ol>
<li><strong>Kubernetes cluster</strong> with <code>kubectl</code> configured</li>
<li><strong>Test image built</strong> (same as Compose, preferably with prebuilt bundle)</li>
<li><strong>Image available in cluster</strong> (loaded or pushed to registry)</li>
</ol>
<p><strong>Build and load image:</strong></p>
<pre><code class="language-bash"># Build image with bundle (recommended)
scripts/build-bundle.sh --platform linux
export NOMOS_BINARIES_TAR=.tmp/nomos-binaries-linux-v0.3.1.tar.gz
scripts/build_test_image.sh
# Load into cluster
export NOMOS_TESTNET_IMAGE=logos-blockchain-testing:local
kind load docker-image logos-blockchain-testing:local # For kind
# OR: minikube image load logos-blockchain-testing:local # For minikube
# OR: docker push your-registry/logos-blockchain-testing:local # For remote
</code></pre>
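<p>Which load step applies depends on the kind of cluster. The sketch below maps a cluster type to the matching command from the block above and prints it without executing anything. The function name <code>image_load_command</code> and the cluster type labels (<code>kind</code>, <code>minikube</code>, <code>registry</code>) are assumptions made for illustration.</p>

```shell
# Hypothetical helper: print the command that makes an image available
# to a cluster of the given type. Nothing is executed here.
image_load_command() (
  cluster=$1
  image=$2
  case $cluster in
    kind)     echo "kind load docker-image $image" ;;
    minikube) echo "minikube image load $image" ;;
    registry) echo "docker push $image" ;;   # image must already carry the registry prefix
    *)        echo "unknown cluster type: $cluster" >&2; return 1 ;;
  esac
)
```

<p>For example, <code>image_load_command kind logos-blockchain-testing:local</code> prints the <code>kind load docker-image</code> line shown above.</p>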
<p><strong>Run the example:</strong></p>
<pre><code class="language-bash">export NOMOS_TESTNET_IMAGE=logos-blockchain-testing:local
export POL_PROOF_DEV_MODE=true
cargo run -p runner-examples --bin k8s_runner
</code></pre>
<p><strong>Environment variables:</strong></p>
<ul>
<li><code>NOMOS_TESTNET_IMAGE</code> — Image tag (required)</li>
<li><code>POL_PROOF_DEV_MODE=true</code>: <strong>Required</strong> for all runners</li>
<li><code>NOMOS_DEMO_VALIDATORS</code> / <code>NOMOS_DEMO_EXECUTORS</code> / <code>NOMOS_DEMO_RUN_SECS</code> — Topology overrides</li>
<li><code>NOMOS_METRICS_QUERY_URL</code> — Prometheus-compatible base URL for the runner process to query (PromQL)</li>
<li><code>NOMOS_METRICS_OTLP_INGEST_URL</code> — Full OTLP HTTP ingest URL for node metrics export (optional)</li>
<li><code>NOMOS_GRAFANA_URL</code> — Grafana base URL for printing/logging (optional)</li>
</ul>
<p><strong>Metrics + Grafana (optional):</strong></p>
<pre><code class="language-bash">export NOMOS_METRICS_QUERY_URL=http://your-prometheus:9090
# Prometheus OTLP receiver example:
export NOMOS_METRICS_OTLP_INGEST_URL=http://your-prometheus:9090/api/v1/otlp/v1/metrics
# Optional: print a Grafana link in TESTNET_ENDPOINTS
export NOMOS_GRAFANA_URL=http://your-grafana:3000
cargo run -p runner-examples --bin k8s_runner
</code></pre>
<p>Notes:</p>
<ul>
<li><code>NOMOS_METRICS_QUERY_URL</code> must be reachable from the runner process (often via <code>kubectl port-forward</code>).</li>
<li><code>NOMOS_METRICS_OTLP_INGEST_URL</code> must be reachable from nodes (pods/containers) and is backend-specific (Prometheus vs VictoriaMetrics paths differ).
<ul>
<li>Quickstart installer: <code>scripts/setup-observability.sh k8s install</code> then <code>scripts/setup-observability.sh k8s env</code> (optional dashboards: <code>scripts/setup-observability.sh k8s dashboards</code>)</li>
</ul>
</li>
</ul>
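<p>Because the query URL and the ingest URL are reached from different places (the runner process versus the pods), they often need different hosts. The sketch below prints both exports from two base URLs. The function name <code>metrics_env</code> is hypothetical, and it assumes a Prometheus backend with the OTLP receiver path used in the example above; other backends use different paths.</p>

```shell
# Hypothetical helper: metrics_env QUERY_BASE INGEST_BASE
# QUERY_BASE  - Prometheus base URL as seen from the runner process
# INGEST_BASE - Prometheus base URL as seen from the nodes (pods/containers)
# Assumes the Prometheus OTLP receiver path; other backends differ.
metrics_env() (
  query_base=${1%/}    # drop one trailing slash, if present
  ingest_base=${2%/}
  echo "export NOMOS_METRICS_QUERY_URL=$query_base"
  echo "export NOMOS_METRICS_OTLP_INGEST_URL=$ingest_base/api/v1/otlp/v1/metrics"
)
```

<p>The output can be reviewed and then applied with <code>eval "$(metrics_env http://localhost:9090 http://your-prometheus:9090)"</code>.</p>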
<p><strong>Via <code>scripts/run-examples.sh</code> (optional):</strong></p>
<pre><code class="language-bash">scripts/run-examples.sh -t 60 -v 1 -e 1 k8s \
--metrics-query-url http://your-prometheus:9090 \
--metrics-otlp-ingest-url http://your-prometheus:9090/api/v1/otlp/v1/metrics
</code></pre>
<p><strong>In code (optional):</strong></p>
<pre><pre class="playground"><code class="language-rust"><span class="boring">#![allow(unused)]
</span><span class="boring">fn main() {
</span>use testing_framework_core::scenario::ScenarioBuilder;
use testing_framework_workflows::ObservabilityBuilderExt as _;
let plan = ScenarioBuilder::with_node_counts(1, 1)
    .with_metrics_query_url_str("http://your-prometheus:9090")
    .with_metrics_otlp_ingest_url_str("http://your-prometheus:9090/api/v1/otlp/v1/metrics")
    .build();
<span class="boring">}</span></code></pre></pre>
<p><strong>Important:</strong></p>
<ul>
<li>The K8s runner mounts <code>testing-framework/assets/stack/kzgrs_test_params</code> as a hostPath volume; inside pods, the parameter file is available at <code>/kzgrs_test_params/kzgrs_test_params</code></li>
<li><strong>No node control support yet</strong>: Chaos workloads (<code>.enable_node_control()</code>) will fail</li>
<li>Use <code>scripts/run-examples.sh k8s</code> to handle all setup automatically</li>
</ul>
<h2 id="circuit-assets-kzg-parameters"><a class="header" href="#circuit-assets-kzg-parameters">Circuit Assets (KZG Parameters)</a></h2>
<p>DA workloads require KZG cryptographic parameters for polynomial commitment schemes.</p>
<h3 id="asset-location"><a class="header" href="#asset-location">Asset Location</a></h3>
<p><strong>Default path:</strong> <code>testing-framework/assets/stack/kzgrs_test_params/kzgrs_test_params</code></p>
<p>Note the repeated filename: the directory <code>kzgrs_test_params/</code> contains a file named <code>kzgrs_test_params</code>. This file holds the actual KZG parameters.</p>
<p><strong>Container path</strong> (compose/k8s): <code>/kzgrs_test_params/kzgrs_test_params</code></p>
<p><strong>Override:</strong> Set <code>NOMOS_KZGRS_PARAMS_PATH</code> to use a custom location (must point to the file):</p>
<pre><code class="language-bash">NOMOS_KZGRS_PARAMS_PATH=/path/to/custom/params cargo run -p runner-examples --bin local_runner
</code></pre>
<h3 id="directory-vs-file-kzg"><a class="header" href="#directory-vs-file-kzg">Directory vs File (KZG)</a></h3>
<p>The system uses KZG assets in two distinct ways:</p>
<div class="table-wrapper"><table><thead><tr><th>Concept</th><th>Used by</th><th>Meaning</th></tr></thead><tbody>
<tr><td><strong>KZG directory</strong></td><td>deployers/scripts</td><td>A directory that contains the KZG file (and related artifacts). Defaults to <code>testing-framework/assets/stack/kzgrs_test_params</code> and is controlled by <code>NOMOS_KZG_DIR_REL</code> (relative to the workspace root).</td></tr>
<tr><td><strong>KZG file path</strong></td><td>node processes</td><td>A single file path passed to nodes via <code>NOMOS_KZGRS_PARAMS_PATH</code> (inside containers/pods this is typically <code>/kzgrs_test_params/kzgrs_test_params</code>).</td></tr>
</tbody></table>
</div>
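<p>The relationship between the two concepts can be sketched as a small preflight check: derive the file path from the KZG directory, then verify that the file exists. The function names <code>kzg_file_path</code> and <code>check_kzg_file</code> are hypothetical, and the sketch assumes the file is always named <code>kzgrs_test_params</code> inside whichever directory <code>NOMOS_KZG_DIR_REL</code> selects.</p>

```shell
# Hypothetical helper: print the expected KZG file path for a workspace root.
# Assumes the file is named kzgrs_test_params inside the KZG directory.
kzg_file_path() (
  workspace_root=$1
  kzg_dir_rel=${NOMOS_KZG_DIR_REL:-testing-framework/assets/stack/kzgrs_test_params}
  echo "$workspace_root/$kzg_dir_rel/kzgrs_test_params"
)

# Hypothetical preflight check: report whether the file is present.
check_kzg_file() (
  path=$(kzg_file_path "$1")
  if [ -f "$path" ]; then
    echo "found KZG parameters at $path"
  else
    echo "missing KZG parameters at $path"
    return 1
  fi
)
```

<p>Running <code>check_kzg_file "$PWD"</code> from the repository root before a host run gives an early signal that the assets still need to be fetched.</p>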
<h3 id="getting-circuit-assets"><a class="header" href="#getting-circuit-assets">Getting Circuit Assets</a></h3>
<p><strong>Option 1: Use helper script</strong> (recommended):</p>
<pre><code class="language-bash"># From the repository root
chmod +x scripts/setup-nomos-circuits.sh
scripts/setup-nomos-circuits.sh v0.3.1 /tmp/nomos-circuits
# Copy to default location
cp -r /tmp/nomos-circuits/* testing-framework/assets/stack/kzgrs_test_params/
</code></pre>
<p><strong>Option 2: Build locally</strong> (advanced):</p>
<pre><code class="language-bash"># This repository does not provide a `make kzgrs_test_params` target.
# If you need to regenerate KZG params from source, follow upstream tooling
# instructions (unspecified here) or use the helper scripts above to fetch a
# known-good bundle.
</code></pre>
<h3 id="ci-workflow"><a class="header" href="#ci-workflow">CI Workflow</a></h3>
<p>The CI automatically fetches and places assets:</p>
<pre><code class="language-yaml">- name: Install circuits for host build
  run: |
    scripts/setup-nomos-circuits.sh v0.3.1 "$TMPDIR/nomos-circuits"
    cp -a "$TMPDIR/nomos-circuits"/. testing-framework/assets/stack/kzgrs_test_params/
</code></pre>
<h3 id="when-are-assets-needed"><a class="header" href="#when-are-assets-needed">When Are Assets Needed?</a></h3>
<div class="table-wrapper"><table><thead><tr><th>Runner</th><th>When Required</th></tr></thead><tbody>
<tr><td><strong>Local</strong></td><td>Always (for DA workloads)</td></tr>
<tr><td><strong>Compose</strong></td><td>During image build (baked into <code>NOMOS_TESTNET_IMAGE</code>)</td></tr>
<tr><td><strong>K8s</strong></td><td>During image build + deployed to cluster via hostPath volume</td></tr>
</tbody></table>
</div>
<p><strong>Error without assets:</strong></p>
<pre><code>Error: missing KZG parameters at testing-framework/assets/stack/kzgrs_test_params/kzgrs_test_params
</code></pre>
<p>If you see this error, the file <code>kzgrs_test_params</code> is missing from the directory. Use <code>scripts/run-examples.sh</code> or <code>scripts/setup-nomos-circuits.sh</code> to fetch it.</p>
<h2 id="logging-and-observability"><a class="header" href="#logging-and-observability">Logging and Observability</a></h2>
<h3 id="node-logging-vs-framework-logging"><a class="header" href="#node-logging-vs-framework-logging">Node Logging vs Framework Logging</a></h3>
<p><strong>Critical distinction:</strong> Node logs and framework logs use different configuration mechanisms.</p>
<div class="table-wrapper"><table><thead><tr><th>Component</th><th>Controlled By</th><th>Purpose</th></tr></thead><tbody>
<tr><td><strong>Framework binaries</strong> (<code>cargo run -p runner-examples --bin local_runner</code>)</td><td><code>RUST_LOG</code></td><td>Runner orchestration, deployment logs</td></tr>
<tr><td><strong>Node processes</strong> (validators, executors spawned by runner)</td><td><code>NOMOS_LOG_LEVEL</code>, <code>NOMOS_LOG_FILTER</code> (+ <code>NOMOS_LOG_DIR</code> on host runner)</td><td>Consensus, DA, mempool, network logs</td></tr>
</tbody></table>
</div>
<p><strong>Common mistake:</strong> Setting <code>RUST_LOG=debug</code> only increases verbosity of the runner binary itself. Node logs remain at their default level unless you also set <code>NOMOS_LOG_LEVEL=debug</code>.</p>
<p><strong>Example:</strong></p>
<pre><code class="language-bash"># This only makes the RUNNER verbose, not the nodes:
RUST_LOG=debug cargo run -p runner-examples --bin local_runner
# This makes the NODES verbose:
NOMOS_LOG_LEVEL=debug cargo run -p runner-examples --bin local_runner
# Both verbose (typically not needed):
RUST_LOG=debug NOMOS_LOG_LEVEL=debug cargo run -p runner-examples --bin local_runner
</code></pre>
<h3 id="logging-environment-variables"><a class="header" href="#logging-environment-variables">Logging Environment Variables</a></h3>
<div class="table-wrapper"><table><thead><tr><th>Variable</th><th>Default</th><th>Effect</th></tr></thead><tbody>
<tr><td><code>NOMOS_LOG_DIR</code></td><td>None (console only)</td><td>Host runner: directory for per-node log files. Compose/k8s: use <code>testing-framework/assets/stack/cfgsync.yaml</code> (<code>tracing_settings.logger: !File</code>) and mount a writable directory.</td></tr>
<tr><td><code>NOMOS_LOG_LEVEL</code></td><td><code>info</code></td><td>Global log level: <code>error</code>, <code>warn</code>, <code>info</code>, <code>debug</code>, <code>trace</code></td></tr>
<tr><td><code>NOMOS_LOG_FILTER</code></td><td>None</td><td>Fine-grained target filtering (e.g., <code>cryptarchia=trace,nomos_da_sampling=debug</code>)</td></tr>
<tr><td><code>NOMOS_TESTS_TRACING</code></td><td><code>false</code></td><td>Enable the debug tracing preset (optional; combine with <code>NOMOS_LOG_DIR</code> unless you have external tracing backends configured)</td></tr>
<tr><td><code>NOMOS_OTLP_ENDPOINT</code></td><td>None</td><td>OTLP trace endpoint (optional; leave unset to disable trace export and avoid OTLP error noise)</td></tr>
<tr><td><code>NOMOS_OTLP_METRICS_ENDPOINT</code></td><td>None</td><td>OTLP metrics endpoint (optional)</td></tr>
</tbody></table>
</div>
<p><strong>Example:</strong> Full debug logging to files:</p>
<pre><code class="language-bash">NOMOS_TESTS_TRACING=true \
NOMOS_LOG_DIR=/tmp/test-logs \
NOMOS_LOG_LEVEL=debug \
NOMOS_LOG_FILTER="cryptarchia=trace,nomos_da_sampling=debug,nomos_da_dispersal=debug,nomos_da_verifier=debug,nomos_blend=debug,chain_service=info,chain_network=info,chain_leader=info" \
POL_PROOF_DEV_MODE=true \
cargo run -p runner-examples --bin local_runner
</code></pre>
<h3 id="per-node-log-files"><a class="header" href="#per-node-log-files">Per-Node Log Files</a></h3>
<p>When <code>NOMOS_LOG_DIR</code> is set, each node writes logs to separate files:</p>
<p><strong>File naming pattern:</strong></p>
<ul>
<li><strong>Validators</strong>: Prefix <code>nomos-node-0</code>, <code>nomos-node-1</code>, etc. (may include timestamp suffix)</li>
<li><strong>Executors</strong>: Prefix <code>nomos-executor-0</code>, <code>nomos-executor-1</code>, etc. (may include timestamp suffix)</li>
</ul>
<p><strong>Local runner note:</strong> The local runner uses per-run temporary directories under the current working directory and removes them after the run unless <code>NOMOS_TESTS_KEEP_LOGS=1</code>. Use <code>NOMOS_LOG_DIR=/path/to/logs</code> to write per-node log files to a stable location.</p>
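<p>Since file names start with a per-node prefix and may carry a suffix, a plain glob such as <code>nomos-node-1*</code> also matches <code>nomos-node-10</code>. The sketch below lists the files for one node while skipping longer indices. The function name <code>node_log_files</code> is hypothetical, and it assumes any suffix begins with a non-digit character, which the naming pattern above does not guarantee.</p>

```shell
# Hypothetical helper: node_log_files LOG_DIR ROLE INDEX
# ROLE is "validator" or "executor". Prints matching log file paths.
# Assumes any suffix after the prefix starts with a non-digit character.
node_log_files() (
  log_dir=$1
  role=$2
  index=$3
  case $role in
    validator) prefix="nomos-node-$index" ;;
    executor)  prefix="nomos-executor-$index" ;;
    *) return 1 ;;
  esac
  for file in "$log_dir/$prefix"*; do
    [ -e "$file" ] || continue            # glob matched nothing
    rest=${file#"$log_dir/$prefix"}
    case $rest in
      [0-9]*) continue ;;                 # belongs to a longer index such as 10
    esac
    echo "$file"
  done
)
```

<p>For example, <code>node_log_files /tmp/test-logs validator 1</code> lists only the files for validator 1.</p>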
<h3 id="filter-target-names"><a class="header" href="#filter-target-names">Filter Target Names</a></h3>
<p>Common target prefixes for <code>NOMOS_LOG_FILTER</code>:</p>
<div class="table-wrapper"><table><thead><tr><th>Target Prefix</th><th>Subsystem</th></tr></thead><tbody>
<tr><td><code>cryptarchia</code></td><td>Consensus (Cryptarchia)</td></tr>
<tr><td><code>nomos_da_sampling</code></td><td>DA sampling service</td></tr>
<tr><td><code>nomos_da_dispersal</code></td><td>DA dispersal service</td></tr>
<tr><td><code>nomos_da_verifier</code></td><td>DA verification</td></tr>
<tr><td><code>nomos_blend</code></td><td>Mix network/privacy layer</td></tr>
<tr><td><code>chain_service</code></td><td>Chain service (node APIs/state)</td></tr>
<tr><td><code>chain_network</code></td><td>P2P networking</td></tr>
<tr><td><code>chain_leader</code></td><td>Leader election</td></tr>
</tbody></table>
</div>
<p><strong>Example filter:</strong></p>
<pre><code class="language-bash">NOMOS_LOG_FILTER="cryptarchia=trace,nomos_da_sampling=debug,chain_service=info,chain_network=info,chain_leader=info"
</code></pre>
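<p>Long filter strings are easy to mistype. The sketch below assembles a filter from separate <code>target=level</code> arguments and rejects levels outside the five listed for <code>NOMOS_LOG_LEVEL</code>. The function name <code>build_log_filter</code> is hypothetical, and the validation is an assumption of this sketch: the nodes may accept other directive forms that it rejects.</p>

```shell
# Hypothetical helper: join target=level directives into one filter string.
# Returns 1 if a directive is not of the form target=level with a known level.
build_log_filter() (
  filter=""
  for directive in "$@"; do
    case $directive in
      *=*) ;;
      *) echo "expected target=level, got '$directive'" >&2; return 1 ;;
    esac
    level=${directive#*=}
    case $level in
      error|warn|info|debug|trace) ;;
      *) echo "invalid level in '$directive'" >&2; return 1 ;;
    esac
    filter="${filter:+$filter,}$directive"   # add a comma only between entries
  done
  echo "$filter"
)
```

<p>Usage: <code>export NOMOS_LOG_FILTER="$(build_log_filter cryptarchia=trace nomos_da_sampling=debug)"</code>.</p>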
<h3 id="accessing-logs-per-runner"><a class="header" href="#accessing-logs-per-runner">Accessing Logs Per Runner</a></h3>
<h4 id="local-runner"><a class="header" href="#local-runner">Local Runner</a></h4>
<p><strong>Default (temporary directories, auto-cleanup):</strong></p>
<pre><code class="language-bash">POL_PROOF_DEV_MODE=true cargo run -p runner-examples --bin local_runner
# Logs written to temporary directories in working directory
# Automatically cleaned up after test completes
</code></pre>
<p><strong>Persistent file output:</strong></p>
<pre><code class="language-bash">NOMOS_LOG_DIR=/tmp/local-logs \
POL_PROOF_DEV_MODE=true \
cargo run -p runner-examples --bin local_runner
# After test completes:
ls /tmp/local-logs/
# Files with prefix: nomos-node-0*, nomos-node-1*, nomos-executor-0*
# May include timestamps in filename
</code></pre>
<p><strong>Tip:</strong> Use <code>NOMOS_LOG_DIR</code> for persistent per-node log files, and <code>NOMOS_TESTS_KEEP_LOGS=1</code> if you want to keep the per-run temporary directories (configs/state) for post-mortem inspection.</p>
<h4 id="compose-runner"><a class="header" href="#compose-runner">Compose Runner</a></h4>
<p><strong>Via Docker logs (default, recommended):</strong></p>
<pre><code class="language-bash"># List containers (note the UUID prefix in names)
docker ps --filter "name=nomos-compose-"
# Stream logs from specific container
docker logs -f &lt;container-id-or-name&gt;
# Or use name pattern matching:
docker logs -f $(docker ps --filter "name=nomos-compose-.*-validator-0" -q | head -1)
</code></pre>
<p><strong>Via file collection (advanced):</strong></p>
<p>To write per-node log files inside containers, set <code>tracing_settings.logger: !File</code> in <code>testing-framework/assets/stack/cfgsync.yaml</code> (and ensure the directory is writable). To access them, you must either:</p>
<ol>
<li><strong>Copy files out after the run:</strong></li>
</ol>
<pre><code class="language-bash"># Ensure `testing-framework/assets/stack/cfgsync.yaml` is configured to log to `/logs`
# via `tracing_settings.logger: !File`.
NOMOS_TESTNET_IMAGE=logos-blockchain-testing:local \
POL_PROOF_DEV_MODE=true \
cargo run -p runner-examples --bin compose_runner
# After test, copy files from containers:
docker ps --filter "name=nomos-compose-"
# docker cp does not expand wildcards, so copy the whole /logs directory:
docker cp &lt;container-id&gt;:/logs /tmp/container-logs
</code></pre>
<ol start="2">
<li><strong>Mount a host volume</strong> (requires modifying compose template):</li>
</ol>
<pre><code class="language-yaml">volumes:
- /tmp/host-logs:/logs # Add to docker-compose.yml.tera
</code></pre>
<p><strong>Recommendation:</strong> Use <code>docker logs</code> by default. File collection inside containers is complex and rarely needed.</p>
<p><strong>Keep containers for debugging:</strong></p>
<pre><code class="language-bash">COMPOSE_RUNNER_PRESERVE=1 \
NOMOS_TESTNET_IMAGE=logos-blockchain-testing:local \
cargo run -p runner-examples --bin compose_runner
# Containers remain running after test—inspect with docker logs or docker exec
</code></pre>
<p><strong>Compose networking/debug knobs:</strong></p>
<ul>
<li><code>COMPOSE_RUNNER_HOST=127.0.0.1</code> — host used for readiness probes (override for remote Docker daemons / VM networking)</li>
<li><code>COMPOSE_RUNNER_HOST_GATEWAY=host.docker.internal:host-gateway</code> — controls the <code>extra_hosts</code> entry injected into compose (set to <code>disable</code> to omit)</li>
<li><code>TESTNET_RUNNER_PRESERVE=1</code> — alias for <code>COMPOSE_RUNNER_PRESERVE=1</code></li>
<li><code>COMPOSE_RUNNER_HTTP_TIMEOUT_SECS=&lt;secs&gt;</code> — override compose node HTTP readiness timeout</li>
</ul>
<p><strong>Note:</strong> Container names follow pattern <code>nomos-compose-{uuid}-validator-{index}-1</code> where <code>{uuid}</code> changes per run.</p>
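<p>When scripting around <code>docker ps</code> output, the role and index can be recovered from a container name that follows this pattern. The sketch below is an illustration: the function name <code>parse_compose_name</code> is hypothetical, and it assumes executors follow the same pattern with <code>executor</code> in place of <code>validator</code>, which the note above does not state.</p>

```shell
# Hypothetical helper: extract "role index" from a compose container name.
# Assumes executors are named like validators, with "executor" as the role.
parse_compose_name() (
  name=$1
  parsed=$(printf '%s\n' "$name" |
    sed -n -E 's/^nomos-compose-(.+)-(validator|executor)-([0-9]+)-1$/\2 \3/p')
  [ -n "$parsed" ] || return 1   # name does not follow the pattern
  echo "$parsed"
)
```

<p>For example, <code>parse_compose_name nomos-compose-1b9d6bcd-bbfd-4b2d-9b5d-ab8dfbbd4bed-validator-0-1</code> prints <code>validator 0</code>.</p>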
<h4 id="k8s-runner"><a class="header" href="#k8s-runner">K8s Runner</a></h4>
<p><strong>Via kubectl logs (use label selectors):</strong></p>
<pre><code class="language-bash"># List pods
kubectl get pods
# Stream logs using label selectors (recommended)
# Helm chart labels:
# - nomos/logical-role=validator|executor
# - nomos/validator-index / nomos/executor-index
kubectl logs -l nomos/logical-role=validator -f
kubectl logs -l nomos/logical-role=executor -f
# Stream logs from specific pod
kubectl logs -f nomos-validator-0
# Previous logs from crashed pods
kubectl logs --previous -l nomos/logical-role=validator
</code></pre>
<p><strong>Download logs for offline analysis:</strong></p>
<pre><code class="language-bash"># Using label selectors
kubectl logs -l nomos/logical-role=validator --tail=1000 &gt; all-validators.log
kubectl logs -l nomos/logical-role=executor --tail=1000 &gt; all-executors.log
# Specific pods
kubectl logs nomos-validator-0 &gt; validator-0.log
kubectl logs nomos-executor-1 &gt; executor-1.log
</code></pre>
<p><strong>K8s environment notes:</strong></p>
<ul>
<li>The k8s runner is optimized for local clusters (Docker Desktop Kubernetes / minikube / kind):
<ul>
<li>The default image <code>logos-blockchain-testing:local</code> must be available on the cluster's nodes (Docker Desktop shares the local daemon; kind/minikube often requires an explicit image load step).</li>
<li>The Helm chart mounts KZG params via a <code>hostPath</code> to your workspace path; this typically won't work on remote/managed clusters without replacing it with a PV/CSI volume or baking the params into an image.</li>
</ul>
</li>
<li>Debug helpers:
<ul>
<li><code>K8S_RUNNER_DEBUG=1</code> — logs Helm stdout/stderr for install commands.</li>
<li><code>K8S_RUNNER_PRESERVE=1</code> — keep the namespace/release after the run.</li>
</ul>
</li>
<li><code>K8S_RUNNER_NODE_HOST=&lt;ip|hostname&gt;</code> — override NodePort host resolution for non-local clusters.</li>
<li><code>K8S_RUNNER_NAMESPACE=&lt;name&gt;</code> / <code>K8S_RUNNER_RELEASE=&lt;name&gt;</code> — pin namespace/release instead of random IDs (useful for debugging)</li>
</ul>
<p><strong>Specify namespace (if not using default):</strong></p>
<pre><code class="language-bash">kubectl logs -n my-namespace -l nomos/logical-role=validator -f
</code></pre>
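<p>The label selectors used in the commands above can be composed by a small helper. The sketch below builds a selector from a role and an optional index. The function name <code>k8s_log_selector</code> is hypothetical, and it assumes the index labels hold plain integers; check the labels on your pods with <code>kubectl get pods --show-labels</code> before relying on it.</p>

```shell
# Hypothetical helper: k8s_log_selector ROLE [INDEX]
# Builds a label selector from the Helm chart labels listed above.
# Assumes index label values are plain integers.
k8s_log_selector() (
  role=$1
  index=${2:-}
  case $role in
    validator|executor) ;;
    *) return 1 ;;
  esac
  selector="nomos/logical-role=$role"
  if [ -n "$index" ]; then
    selector="$selector,nomos/$role-index=$index"
  fi
  echo "$selector"
)
```

<p>Usage: <code>kubectl logs -n my-namespace -l "$(k8s_log_selector validator 0)" -f</code>.</p>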
<h3 id="otlp-and-telemetry"><a class="header" href="#otlp-and-telemetry">OTLP and Telemetry</a></h3>
<p><strong>OTLP exporters are optional.</strong> If you see errors about unreachable OTLP endpoints, it's safe to ignore them unless you're actively collecting traces/metrics.</p>
<p><strong>To enable OTLP:</strong></p>
<pre><code class="language-bash">NOMOS_OTLP_ENDPOINT=http://localhost:4317 \
NOMOS_OTLP_METRICS_ENDPOINT=http://localhost:4318 \
cargo run -p runner-examples --bin local_runner
</code></pre>
<p><strong>To silence OTLP errors:</strong> Simply leave these variables unset (the default).</p>
<h3 id="observability-prometheus-and-node-apis"><a class="header" href="#observability-prometheus-and-node-apis">Observability: Prometheus and Node APIs</a></h3>
<p>Runners expose node HTTP endpoints, and can query an external Prometheus-compatible backend, for expectation code and debugging:</p>
<p><strong>Prometheus-compatible metrics querying (optional):</strong></p>
<ul>
<li>Runners do <strong>not</strong> provision Prometheus automatically.</li>
<li>For a ready-to-run stack, use <code>scripts/setup-observability.sh</code>:
<ul>
<li>Compose: <code>scripts/setup-observability.sh compose up</code> then <code>scripts/setup-observability.sh compose env</code></li>
<li>K8s: <code>scripts/setup-observability.sh k8s install</code> then <code>scripts/setup-observability.sh k8s env</code></li>
</ul>
</li>
<li>Provide <code>NOMOS_METRICS_QUERY_URL</code> (PromQL base URL) to enable <code>ctx.telemetry()</code> queries.</li>
<li>Access from expectations when configured: <code>ctx.telemetry().prometheus().map(|p| p.base_url())</code></li>
</ul>
<p><strong>Grafana (optional):</strong></p>
<ul>
<li>Runners do <strong>not</strong> provision Grafana automatically (but <code>scripts/setup-observability.sh</code> can).</li>
<li>If you set <code>NOMOS_GRAFANA_URL</code>, the deployer prints it in <code>TESTNET_ENDPOINTS</code>.</li>
<li>Dashboards live in <code>testing-framework/assets/stack/monitoring/grafana/dashboards/</code> for import into your Grafana.</li>
</ul>
<p><strong>Node APIs:</strong></p>
<ul>
<li>Access from expectations: <code>ctx.node_clients().validator_clients().get(0)</code></li>
<li>Endpoints: consensus info, network info, DA membership, etc.</li>
<li>See <code>testing-framework/core/src/nodes/api_client.rs</code> for available methods</li>
</ul>
<pre><code class="language-mermaid">flowchart TD
Expose[Runner exposes endpoints/ports] --&gt; Collect[Runtime collects block/health signals]
Collect --&gt; Consume[Expectations consume signals&lt;br/&gt;decide pass/fail]
Consume --&gt; Inspect[Operators inspect logs/metrics&lt;br/&gt;when failures arise]
</code></pre>
</main>
<nav class="nav-wrapper" aria-label="Page navigation">
<!-- Mobile navigation buttons -->
<a rel="prev" href="topology-chaos.html" class="mobile-nav-chapters previous" title="Previous chapter" aria-label="Previous chapter" aria-keyshortcuts="Left">
<i class="fa fa-angle-left"></i>
</a>
<a rel="next prefetch" href="part-iii.html" class="mobile-nav-chapters next" title="Next chapter" aria-label="Next chapter" aria-keyshortcuts="Right">
<i class="fa fa-angle-right"></i>
</a>
<div style="clear: both"></div>
</nav>
</div>
</div>
<nav class="nav-wide-wrapper" aria-label="Page navigation">
<a rel="prev" href="topology-chaos.html" class="nav-chapters previous" title="Previous chapter" aria-label="Previous chapter" aria-keyshortcuts="Left">
<i class="fa fa-angle-left"></i>
</a>
<a rel="next prefetch" href="part-iii.html" class="nav-chapters next" title="Next chapter" aria-label="Next chapter" aria-keyshortcuts="Right">
<i class="fa fa-angle-right"></i>
</a>
</nav>
</div>
<script>
window.playground_copyable = true;
</script>
<script src="elasticlunr.min.js"></script>
<script src="mark.min.js"></script>
<script src="searcher.js"></script>
<script src="clipboard.min.js"></script>
<script src="highlight.js"></script>
<script src="book.js"></script>
<!-- Custom JS scripts -->
<script src="theme/mermaid-init.js"></script>
</div>
</body>
</html>