From 15ad227ecc34a06659020769c2c6b5f1c2ea4e8a Mon Sep 17 00:00:00 2001
From: Roman <zajic@zajic.net>
Date: Mon, 20 Apr 2026 12:45:35 +0800
Subject: [PATCH] chore: input for further project development

---
 colocated_vs_separated.md          | 108 ++++++++++++++++++++++
 current_vs_alternative_approach.md | 144 +++++++++++++++++++++++++++++
 2 files changed, 252 insertions(+)
 create mode 100644 colocated_vs_separated.md
 create mode 100644 current_vs_alternative_approach.md

diff --git a/colocated_vs_separated.md b/colocated_vs_separated.md
new file mode 100644
index 0000000..4425b0d
--- /dev/null
+++ b/colocated_vs_separated.md
@@ -0,0 +1,108 @@
+## Co-located vs. Separate Repository for LEZ Fuzzing
+
+### The Core Problem with the Current Setup
+
+[`docs/fuzzing.md:275`](docs/fuzzing.md:275) explicitly acknowledges the critical risk:
+
+> "There is no submodule pin — `lez-fuzzing` reads `../logos-execution-zone` as checked out."
+
+This means the two repositories can silently diverge. A LEZ API change will break `fuzz/Cargo.toml`'s path dependencies (`path = "../../logos-execution-zone/nssa"`) without any automated guard. A developer with a stale LEZ checkout will fuzz the wrong code version.
+
+---
+
+### What Co-location Would Look Like
+
+The standard `cargo fuzz` convention places the fuzz workspace **inside** the target repo:
+
+```
+logos-execution-zone/
+├── nssa/
+├── common/
+├── fuzz_props/          ← moved in as optional workspace member
+│   └── src/
+├── fuzz/                ← standard cargo fuzz location
+│   ├── Cargo.toml       ← [workspace] breaks out of parent workspace
+│   ├── rust-toolchain.toml   ← pins nightly for this sub-workspace only
+│   └── fuzz_targets/
+└── Cargo.toml           ← parent workspace (stable toolchain)
+```
+
+The `[workspace]` declaration in [`fuzz/Cargo.toml`](fuzz/Cargo.toml:11) already does exactly this break-out — the only structural change is moving the directory into LEZ.
+
+---
+
+### Detailed Trade-off Analysis
+
+#### Benefits of Co-location (moving into `logos-execution-zone/`)
+
+| Benefit | Detail |
+|---|---|
+| **Zero version drift** | Fuzz targets and production code are in the same commit graph — they are always in sync by construction |
+| **Atomic API changes** | A PR that renames a LEZ method updates the fuzz target in the same diff; currently a LEZ PR can silently break `lez-fuzzing` |
+| **Single clone onboarding** | Currently requires cloning two repos in an exact directory layout ([`docs/fuzzing.md:29-37`](docs/fuzzing.md:29)); co-location needs one |
+| **Standard convention** | `cargo fuzz init` places `fuzz/` inside the target repo; this is the Rust ecosystem standard (tokio, rustls, serde all do this) |
+| **Feature-gate access** | [`docs/fuzzing.md:272`](docs/fuzzing.md:272) notes that `cfg(any(test, feature = "fuzzing"))` guards on `V03State` are needed to expose internal APIs for fuzzing; these work naturally within the same workspace but require cross-repo feature flag coordination when separated |
+| **LEZ CI enforces fuzz compilation** | `cargo fuzz build` can run on every LEZ PR; currently a breaking LEZ change is only discovered when someone runs `lez-fuzzing` separately |
+| **Simpler path dependencies** | `path = "../../logos-execution-zone/nssa"` becomes `path = "../nssa"` — no sibling-directory assumption |
+
+#### Costs/Risks of Co-location
+
+| Cost | Severity | Mitigation |
+|---|---|---|
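+| **Nightly toolchain in LEZ CI** | Medium | Add `fuzz/rust-toolchain.toml` specifying nightly. rustup picks the toolchain file by searching upward from the current directory, so this file takes effect only for commands run from within `fuzz/`; when invoking from the repo root, select nightly explicitly with `cargo +nightly fuzz …`. The stable toolchain stays unchanged for all production code |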
+| **Corpus files in LEZ history** | Low | The ~150 corpus files are small binary blobs (~30–1600 bytes each); negligible `.git` impact. Alternatively, store the corpus on a separate branch or restore it from a CI artifact cache before running `cargo fuzz` |
+| **Fuzzing noise in main repo PRs** | Low | Fuzz targets live in `fuzz/fuzz_targets/` which is outside `src/` — reviewers can ignore them |
+| **Security audit scope creep** | Low | Auditors can exclude `fuzz/` and `fuzz_props/` explicitly; fuzzing code is dev-only |
+
+#### Benefits of Staying Separate (current approach)
+
+| Benefit | Applicability |
+|---|---|
+| **Independent release cadence** | Valid during initial development; becomes less important as LEZ stabilises |
+| **Clean LEZ commit history** | Corpus additions and fuzzing experiments don't appear in LEZ history |
+| **Separate CI billing** | Fuzzing CI minutes billed to `lez-fuzzing` repo, not LEZ repo |
+
+#### Costs of Staying Separate
+
+| Cost | Current evidence |
+|---|---|
+| **Version drift** | Explicitly flagged as a known limitation in [`docs/fuzzing.md:275`](docs/fuzzing.md:275) with no automated enforcement |
+| **Two-repo onboarding friction** | Requires exact sibling directory layout; documented but error-prone |
+| **Broken fuzz builds go undetected** | A LEZ refactor that breaks fuzz targets compiles fine in LEZ CI and is only caught when `lez-fuzzing` is run separately |
+| **Cross-repo `cfg` feature coordination** | [`docs/fuzzing.md:272`](docs/fuzzing.md:272) requires adding `cfg(any(test, feature = "fuzzing"))` guards in LEZ — coupling that has no enforcement mechanism across repos |
+
+---
+
+### Architectural Options
+
+```mermaid
+graph TD
+    A[Current: separate lez-fuzzing repo] -->|Version drift risk| B[Option 1: fuzz/ inside LEZ]
+    A -->|Keep independence| C[Option 2: Git submodule]
+    A -->|Minimal change| D[Option 3: CI cross-repo trigger]
+
+    B -->|Standard cargo fuzz convention| B1["logos-execution-zone/fuzz/\n+ logos-execution-zone/fuzz_props/"]
+    C -->|LEZ embeds lez-fuzzing| C1["logos-execution-zone/ contains\nlez-fuzzing/ as submodule"]
+    D -->|LEZ PR triggers lez-fuzzing CI| D1["lez-fuzzing CI runs on\nevery LEZ commit via workflow_call"]
+```
+
+**Option 1 — Co-location (recommended)**: Move `fuzz/` and `fuzz_props/` into `logos-execution-zone/`. Standard convention, eliminates version drift, simplest long-term maintenance. Nightly toolchain scoped to `fuzz/rust-toolchain.toml`.
+
+**Option 2 — Git submodule**: `logos-execution-zone` embeds `lez-fuzzing` as a submodule. Preserves repo separation and independent history but adds submodule complexity (detached HEAD states, `git submodule update` friction). Not recommended — submodules are widely considered operationally painful.
+
+**Option 3 — CI cross-repo trigger**: Keep repos separate but add a GitHub Actions `workflow_call` or `repository_dispatch` that runs `lez-fuzzing` CI on every `logos-execution-zone` push. This catches compilation breakage early without merging histories. Lower migration cost than Option 1, but does not solve the onboarding problem or the `cfg` feature-gate coordination problem.
+
+---
+
+### Recommendation
+
+**Co-locate (Option 1)** for a project at this stage. The version drift problem is real and already documented; the `fuzz/Cargo.toml` sub-workspace pattern already handles nightly toolchain isolation; and the `fuzz_props` crate with its `ProtocolInvariant` framework belongs logically with the protocol it tests.
+
+The migration is low-effort:
+1. Move `fuzz/` and `fuzz_props/` into `logos-execution-zone/`.
+2. Update path dependencies from `path = "../../logos-execution-zone/nssa"` to `path = "../nssa"`.
+3. Add `fuzz/rust-toolchain.toml` pinning nightly (it applies to commands run from within `fuzz/`; from the repo root, use `cargo +nightly fuzz …`).
+4. Add `cargo fuzz build` smoke step to LEZ's CI workflow.
+5. Archive or redirect `lez-fuzzing` with a pointer to the new location.
+
+The only scenario where staying separate remains preferable is if the LEZ team explicitly wants fuzzing CI costs billed separately and is disciplined about running `just update-lez` and rebuilding before every fuzzing session — which the current documentation already requires but provides no enforcement for.
\ No newline at end of file
diff --git a/current_vs_alternative_approach.md b/current_vs_alternative_approach.md
new file mode 100644
index 0000000..c711099
--- /dev/null
+++ b/current_vs_alternative_approach.md
@@ -0,0 +1,144 @@
+# Alternative Approaches vs. Current Implementation
+
+## What the Current Project Does
+
+The `lez-fuzzing` repository is a **coverage-guided, structured mutation fuzzing system** built on **cargo-fuzz / libFuzzer**, operating as a standalone companion to the Logos Execution Zone (LEZ) codebase. Its key design pillars:
+
+| Pillar | How it is realised |
+|---|---|
+| Coverage guidance | LLVM SanitizerCoverage instrumentation is compiled into each target; libFuzzer steers mutations toward uncovered code |
+| Structured inputs | [`fuzz_props::arbitrary_types`](fuzz_props/src/arbitrary_types.rs) wraps all LEZ transaction types with the `Arbitrary` trait |
+| Rich generators | [`fuzz_props::generators`](fuzz_props/src/generators.rs) adds `proptest` strategies for pathological sequences, phantom-account attacks, overflow amounts, replay sequences |
+| Protocol invariants | [`fuzz_props::invariants`](fuzz_props/src/invariants.rs) expresses zero-mutation-on-rejection and replay-rejection as reusable `ProtocolInvariant` objects |
+| ZK-awareness | `RISC0_DEV_MODE=1` stubs out `risc0-zkvm` proofs, enabling ~5 000–200 000 exec/sec depending on target |
+| 9 dedicated targets | Covers encoding, signature verification, stateless checks, state transitions, state diffs, replay prevention, validate/execute consistency, block verification |
+| CI integration | GitHub Actions smoke, regression, and performance-baseline jobs run on every PR |
+| Pre-seeded corpus | Hundreds of minimised seed files in [`fuzz/corpus/`](fuzz/corpus/) ensure regressions are caught instantly |
+
+---
+
+## Alternative Approaches
+
+### 1. AFL++ (American Fuzzy Lop++)
+
+**What it is**: A fork of the original AFL with structured binary mutation, QEMU/Unicorn modes, and custom mutators. Corpus-compatible with libFuzzer.
+
+| Dimension | AFL++ | Current (libFuzzer) |
+|---|---|---|
+| Mutation engine | Multiple (havoc, splice, custom) | Single (libFuzzer) |
+| Structured mutators | Custom mutators loadable via `AFL_CUSTOM_MUTATOR_LIBRARY` | `arbitrary` trait |
+| Parallel scaling | One `-M` main plus `-S` secondary instances sharing a sync directory (also usable across machines); `afl-whatsup` summarises their status | `-jobs=N -workers=N` flags |
+| Corpus sharing | Reuses the existing libFuzzer corpus files unchanged, so **zero migration cost** | Source of the shared corpus |
+| CI ergonomics | Requires AFL++ binary in CI image | `cargo install cargo-fuzz` plus a nightly toolchain |
+| Rust integration | `cargo-afl` | `cargo-fuzz` |
+
+**Decision-maker view**: AFL++ and libFuzzer often find *different* bugs because they use different mutation heuristics. Running both on the same corpus is a common "belt and suspenders" approach. [`docs/fuzzing.md`](docs/fuzzing.md:273) already lists `just fuzz-afl` as planned future work. **Incremental cost is low**: the same [`fuzz_props`](fuzz_props/src/lib.rs) crate and seed corpus work unchanged.
+
+---
+
+### 2. Honggfuzz
+
+**What it is**: Google's fuzzer, available in Rust through the `honggfuzz` crate (`cargo hfuzz`). It can use hardware performance counters for coverage feedback in addition to software instrumentation.
+
+| Dimension | Honggfuzz | Current (libFuzzer) |
+|---|---|---|
+| Coverage model | HW perf counters + SW instrumentation | SW instrumentation only |
+| Crash deduplication | Built-in | None built in; `cargo fuzz tmin` only minimises a single crashing input |
+| macOS support | Partial (no HW counters on Apple Silicon) | Full |
+| Parallel | Native thread-based | `-jobs` flag |
+
+**Decision-maker view**: On x86-64 Linux CI runners, Honggfuzz's hardware-based feedback can surface loops and conditional jumps that software instrumentation alone may miss. On macOS (this project's primary dev platform), it falls back to software-only feedback, which offers little over libFuzzer. **Medium implementation cost**, moderate incremental benefit on Linux CI.
+
+---
+
+### 3. Property-Based Testing Only (proptest / quickcheck — no libFuzzer)
+
+**What it is**: Pure property testing without coverage guidance. The project already uses `proptest` strategies inside [`fuzz_props::generators`](fuzz_props/src/generators.rs); the question is whether this alone is sufficient.
+
+| Dimension | proptest-only | Current (libFuzzer + proptest) |
+|---|---|---|
+| Coverage guidance | ❌ None | ✅ LLVM-driven |
+| Input shrinking | ✅ Automatic, human-readable | ❌ Manual `cargo fuzz tmin` |
+| Determinism | ✅ Seed-reproducible | ⚠️ Campaigns differ between runs unless `-seed` is fixed; saved crash inputs replay deterministically |
+| CI integration | ✅ Standard `cargo test` | Needs separate `cargo fuzz` step |
+| Depth of exploration | Shallow (combinatorial) | Deep (mutation chains) |
+
+**Decision-maker view**: proptest is already present and valuable for human-readable regression tests. It **cannot replace** libFuzzer for deep protocol bugs: coverage guidance is what lets libFuzzer work its way through deeply nested conditionals in Borsh decoding. The two are **complementary, not substitutes**. Dropping libFuzzer and keeping only proptest would likely reduce bug-finding noticeably on the encoding and state-transition targets, although no measurement backs a specific figure.
+
+---
+
+### 4. Differential Fuzzing (Sequencer vs. Replayer)
+
+**What it is**: Feed identical inputs to two independent implementations of the same interface and assert identical outputs. Already **partially implemented** in [`fuzz_validate_execute_consistency.rs`](fuzz/fuzz_targets/fuzz_validate_execute_consistency.rs) — it compares [`validate_on_state`](fuzz/fuzz_targets/fuzz_validate_execute_consistency.rs:35) vs. [`execute_check_on_state`](fuzz/fuzz_targets/fuzz_validate_execute_consistency.rs:39).
+
+The extension noted in [`docs/fuzzing.md`](docs/fuzzing.md:274) is:
+
+> Feed the same block to `SequencerCore` and `indexer_core` and assert identical state roots.
+
+| Dimension | Differential target | Single-oracle target |
+|---|---|---|
+| Bug class | Implementation divergence | Crash / invariant violation |
+| Requires two implementations | ✅ | ❌ |
+| Implementation cost | High (replayer in scope) | Low |
+| Value for protocol correctness | Very high | High |
+
+**Decision-maker view**: This is the **highest-value extension** to the current project. The `fuzz_validate_execute_consistency` target proves the pattern works. A sequencer-vs-replayer target would catch consensus-breaking state root divergence — a class of bug no single-oracle target can detect. Estimated cost: 1–2 engineer-weeks.
+
+---
+
+### 5. Formal Verification (TLA+, Coq, Isabelle/HOL)
+
+**What it is**: Mathematical proof (Coq, Isabelle/HOL) or exhaustive model checking (TLA+) that the protocol model satisfies its stated invariants for *all* inputs the model admits, not just sampled ones.
+
+| Dimension | Formal verification | Current fuzzing |
+|---|---|---|
+| Coverage | Exhaustive over the model (not the compiled implementation) | Probabilistic |
+| Implementation cost | Very high (months–years) | ✅ Already built |
+| Maintenance cost | Very high (proofs break on refactors) | Low (re-run fuzzer) |
+| ZK circuit coverage | Can cover RISC0 guest formally | Not applicable (mocked out) |
+
+**Decision-maker view**: Formal verification and fuzzing are **not substitutes** for a blockchain protocol — they address different threat models. Fuzzing finds concrete exploitable bugs quickly; formal methods prove absence of entire bug classes. The current codebase complexity (ZK proofs, Borsh encoding, state machine) makes formal verification very expensive. **Recommended only for core invariants** (balance conservation, replay prevention) as a long-term supplement, not a replacement.
+
+---
+
+### 6. Mutation Testing (cargo-mutants)
+
+**What it is**: Systematically modifies the production source code and checks whether existing tests kill the mutant. A surviving mutant indicates a coverage gap in the assertions.
+
+| Dimension | Mutation testing | Current fuzzing |
+|---|---|---|
+| What it measures | Quality of *existing tests* | Finds *new bugs* |
+| Execution time | Slow (recompile per mutation) | Continuous |
+| Output | Surviving mutants = assertion gaps | Crash artifacts |
+
+**Decision-maker view**: `cargo-mutants` would **audit the invariant assertions themselves** — revealing if [`assert_invariants()`](fuzz_props/src/invariants.rs:72) has gaps (and it currently does, as [`StateIsolationOnFailure`](fuzz_props/src/invariants.rs:38) and [`ReplayRejection`](fuzz_props/src/invariants.rs:59) are stubs). This is a **complementary quality gate**, not a fuzzing replacement. Low cost (~1 day), highly useful before an external security audit.
+
+---
+
+## Summary Comparison Matrix
+
+| Approach | Bug-finding depth | CI cost | Impl. cost | Complements current? | Recommended action |
+|---|---|---|---|---|---|
+| **Current (cargo-fuzz/libFuzzer)** | High | Medium | ✅ Done | — | Maintain & expand |
+| AFL++ | High (different bugs) | Medium | Low | ✅ Yes | Add `just fuzz-afl` (already planned) |
+| Honggfuzz | High on Linux | Medium | Medium | ✅ Yes | Add for Linux CI only |
+| proptest-only | Low–medium | Low | ✅ Done | Already present | Keep as unit-test layer |
+| Differential (sequencer/replayer) | Very high (new bug class) | Medium | Medium–high | ✅ Yes | **Priority extension** |
+| Formal verification | Exhaustive (selected invariants) | Very high | Very high | ✅ Yes | Long-term supplement |
+| Mutation testing (`cargo-mutants`) | Measures assertion quality | High | Low | ✅ Yes | Pre-audit quality gate |
+
+---
+
+## Decision-maker Recommendations
+
+The current implementation is **well-architected and production-ready** for a protocol at this stage. Its [`fuzz_props`](fuzz_props/src/lib.rs) crate, typed `Arbitrary` wrappers, and `ProtocolInvariant` framework provide the right abstractions to add new targets and invariants incrementally.
+
+**Highest-ROI next steps, in priority order:**
+
+1. **Complete the stub invariants** in [`fuzz_props/src/invariants.rs`](fuzz_props/src/invariants.rs:41) — [`StateIsolationOnFailure`](fuzz_props/src/invariants.rs:38) and [`ReplayRejection`](fuzz_props/src/invariants.rs:59) are currently no-ops. This costs less than one day and immediately hardens all existing targets.
+
+2. **Add the sequencer-vs-replayer differential target** — highest new bug-finding value, unique to this protocol's architecture, already identified in [`docs/fuzzing.md`](docs/fuzzing.md:274).
+
+3. **Add AFL++ as a parallel fuzzing lane** (`just fuzz-afl`) — zero corpus migration cost, discovers different mutation paths through the same targets as libFuzzer.
+
+4. **Add `cargo-mutants`** before any external security audit — proves the invariant assertions in [`fuzz_props/src/invariants.rs`](fuzz_props/src/invariants.rs) are actually capable of catching the bugs they claim to detect.
\ No newline at end of file