lez-fuzzing/docs/mutants-not-fuzzable.md
Roman cfc415d214
fix: workflow files update
- polish documentation
2026-06-12 11:47:43 +08:00


# Mutants Not Coverable by Fuzzing
This document catalogues the source mutations (from `just mutants-protocol`, the
"Plane B" corpus-replay mutation run over the `lee` / `common` crates) that the
**fuzzing corpus is not the right tool to catch**, together with where each one is
actually covered.
It exists to keep a clean separation between two questions that the tooling can
otherwise blur together:
- **"Does a test catch this mutant?"** — answered by the `lee` crate's own unit
tests via `cargo test` (call this **Plane A**).
- **"Does the committed fuzz corpus catch this mutant?"** — answered by
`just mutants-protocol`, which replaces `cargo test` with a fuzz-corpus replay
(`cargo fuzz run … -runs=0`) as the oracle (call this **Plane B**).

The mutants listed here are **expected Plane-B misses**. A future
`mutants-protocol` run that reports them as surviving is *not* a regression — it
is the documented, intended state.
This file is the complete registry, in **two groups**:
1. **Structurally unreachable by fuzzing** (Group 1) — mutants behind code that a
fuzzer cannot reach from raw bytes (they need a valid executing transaction or a
deliberately-misbehaving program). These were always unit-test territory.
2. **Migrated input-independent targets** (Group 2) — mutants that *were* caught by
input-independent fuzz targets (`fuzz_common_invariants`,
`fuzz_genesis_invariants`, `fuzz_system_account_protection`). Because an
input-independent target is a unit test in disguise, those targets were removed
and their invariants ported to LEZ unit tests; the mutants therefore now survive
Plane B by design.

Reconcile new `mutants-protocol` runs against this registry; only a surviving
mutant on **neither** list warrants a new corpus input.

---
## 🧭 Why fuzzing is the wrong tool for these
Fuzzing earns its keep by exploring a large, *unknown* input space to find inputs
a human wouldn't think of — malformed transactions, adversarial byte sequences,
surprising state-transition orderings. The corpus-replay oracle then re-runs those
discovered inputs cheaply as a regression net.
The mutations below live behind code that is only reachable by a **specific,
valid, semantically rich object** that random bytes essentially never synthesise:
1. **A fully-valid, executing transaction.** Reaching the post-execution
validation logic (authorization checks, claim checks, cycle limit) requires a
transaction whose signature matches its signer, whose nonce matches the
on-chain nonce, and whose program is deployed. A fuzzer mutating raw bytes
almost always breaks one of these and is rejected at the stateless/nonce gate
*before* any program runs — so the code never executes. Constructing such a
transaction is a deterministic "this exact scenario must hold" property, which
is the domain of **unit tests**, not input exploration.
2. **A deliberately-misbehaving program.** Some validator checks only fire when a
program returns malformed output (claims an account it shouldn't, mutates a
default account without claiming it, etc.). The only such programs are the
test fixtures behind `V03State::with_test_programs()` (`program_owner_changer`,
`extra_output_program`, …). They are **never deployed** in genesis or
production, so they are unreachable through the public transaction API that the
fuzzer drives — by construction, no fuzz input can exercise them.
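
The gating argument in item 1 can be modelled in a few lines. This is a toy sketch, not code from `lee`: `ToyTx`, `Rejection`, and `admit` are hypothetical names, and the order of the checks is an assumption. It only illustrates that the post-execution checks are reachable solely when all three preconditions hold at once, a combination that random byte mutation almost never produces. For example, a transaction with a valid signature and a deployed program but a nonce of 6 against an on-chain nonce of 7 is still rejected with `NonceMismatch`.

```rust
/// Reasons a transaction is turned away before any program runs.
/// (Hypothetical names; the real error types in `lee` may differ.)
#[derive(Debug, PartialEq, Eq)]
pub enum Rejection {
    BadSignature,
    NonceMismatch,
    ProgramNotDeployed,
}

/// Toy stand-in for a decoded transaction, reduced to the three
/// preconditions named in item 1.
pub struct ToyTx {
    pub signature_matches_signer: bool,
    pub nonce: u64,
    pub program_deployed: bool,
}

/// Returns `Ok(())` only when execution (and therefore the post-execution
/// validation) would be reached. The check order is an assumption.
pub fn admit(tx: &ToyTx, on_chain_nonce: u64) -> Result<(), Rejection> {
    if !tx.signature_matches_signer {
        return Err(Rejection::BadSignature);
    }
    if tx.nonce != on_chain_nonce {
        return Err(Rejection::NonceMismatch);
    }
    if !tx.program_deployed {
        return Err(Rejection::ProgramNotDeployed);
    }
    Ok(())
}
```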
In both cases the behaviour is pinned by deterministic unit tests in the `lee` /
`common` crates. Encoding such scenarios as **input-independent** fuzz targets
(targets that ignore their input and run a fixed battery) is an anti-pattern — it
duplicates the unit-test role, adds heavyweight zkVM work to every corpus replay,
and risks silent corpus rot, all to satisfy a metric (Plane B) better served by
documenting the boundary. `lez-fuzzing` therefore keeps **no** input-independent
targets: the public/privacy execution targets (which duplicated existing `lee`
tests) and the three genesis/common/system targets (whose invariants were ported
to new unit tests — see the companion doc) were all removed.

---
## 📋 Catalogue (Group 1 — structurally unreachable by fuzzing)
The nine mutations reported as MISSED by the `mutants-protocol` run for which
fuzzing is structurally the wrong tool, with their true coverage. Verified by
applying each mutation to the `logos-execution-zone` working tree and running the
cited tests (`RISC0_DEV_MODE=1 cargo test -p lee --lib`). (Group 2 — the migrated
input-independent-target mutants — is summarised further down.)

| # | Location | Mutation | Category | Covered by |
|---|----------|----------|----------|------------|
| 1 | `lee/state_machine/src/program.rs:21:51` | `*` → `/` (cycle limit `32`) | Valid-tx unit test | transfer-execution tests |
| 2 | `lee/state_machine/src/program.rs:21:51` | `*` → `+` (cycle limit `33 792`) | Valid-tx unit test | transfer-execution tests |
| 3 | `lee/state_machine/src/program.rs:21:58` | `*` → `/` (cycle limit `32 768`) | Valid-tx unit test | transfer-execution tests |
| 4 | `lee/state_machine/src/program.rs:21:58` | `*` → `+` (cycle limit `1 048 608`) | **Near-equivalent — genuine gap** | nothing (see below) |
| 5 | `lee/state_machine/src/validated_state_diff.rs:155:21` | `\|\|` → `&&` | Valid-tx unit test | transfer-execution tests |
| 6 | `lee/state_machine/src/validated_state_diff.rs:311:34` | `!=` → `==` | Misbehaving-program unit test | `public_changer_claimer_*` |
| 7 | `lee/state_machine/src/validated_state_diff.rs:314:20` | `==` → `!=` | Misbehaving-program unit test | `public_changer_claimer_*` + validity-window tests |
| 8 | `lee/state_machine/src/privacy_preserving_transaction/circuit.rs:88:32` | `>=` → `<` | Valid-PP-tx unit test | PP transition tests |
| 9 | `lee/state_machine/src/state.rs:335:16` | delete `!` | Valid-PP-tx unit test | PP transition tests |
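
The cycle-limit values quoted in rows 1 to 4 follow from ordinary operator precedence applied to `1024 * 1024 * 32`. The sketch below recomputes them. It assumes a `u64` constant (the real type in `program.rs` is not stated here), and `mutated_cycle_limits` is a hypothetical helper for illustration, not part of `lee`.

```rust
/// The unmutated limit as quoted in this document. The `u64` type is an
/// assumption; the real constant may use another integer type.
pub const MAX_NUM_CYCLES_PUBLIC_EXECUTION: u64 = 1024 * 1024 * 32;

/// (mutant number, resulting limit) for rows 1-4 of the table.
/// Hypothetical helper for illustration only.
pub fn mutated_cycle_limits() -> [(u8, u64); 4] {
    [
        (1, 1024 / 1024 * 32), // first `*` replaced by `/`
        (2, 1024 + 1024 * 32), // first `*` replaced by `+`
        (3, 1024 * 1024 / 32), // second `*` replaced by `/`
        (4, 1024 * 1024 + 32), // second `*` replaced by `+`
    ]
}
```

Calling `mutated_cycle_limits()` yields 32, 33 792, 32 768 and 1 048 608 for mutants 1 to 4, against an unmutated limit of 33 554 432.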
### Category A — Covered by `lee` unit tests, requires a valid *executing* transaction (1-3, 5, 8, 9)
These fire only after a fully-valid transaction reaches real program execution.
A fuzzer's random bytes are rejected at the nonce/signature gate first, so the
corpus never reaches them; the `lee` crate pins each with a deterministic test.
- **1-3 (public cycle limit, the catchable variants).**
`MAX_NUM_CYCLES_PUBLIC_EXECUTION = 1024 * 1024 * 32` (= 33 554 432). A real
`authenticated_transfer` execution consumes **between 33 792 and 1 048 608**
RISC-V cycles, so any mutation that lowers the limit to 33 792 or below aborts
execution with *"Session limit exceeded"*.
Covered by `state::tests::transition_from_authenticated_transfer_program_invocation_*`
(and the ~66 other public-execution tests that run a transfer). Verified: limit
`33 792` → 66 tests fail.
- **5 (`||` → `&&` in `is_authorized`,
`validated_state_diff.rs:155`).** With `&&`, the transaction signer is no longer
treated as authorized, so a valid transfer fails with
`InvalidAccountAuthorization`. Covered by the same transfer-execution tests.
Verified: 3 of 7 `transition_from*` tests fail.
- **8 (`>=` → `<` in `execute_and_prove`,
`circuit.rs:88`).** With `<`, the chained-call guard fires on the first
iteration (`0 < MAX`) and proving aborts immediately with
`MaxChainedCallsDepthExceeded`. Covered by
`state::tests::transition_from_privacy_preserving_transaction_{shielded,private,deshielded}`.
Verified: 3 PP tests fail.
- **9 (delete `!` in `check_nullifiers_are_valid`,
`state.rs:335`).** Removing the `!` inverts the digest check so a *recognised*
commitment-set digest is rejected, breaking every valid privacy-preserving
transfer that spends a private input. Covered by the same PP transition tests.
Verified: 3 PP tests fail.
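
Mutant 8 is the easiest of these to picture in isolation. The sketch below is a toy model of a chained-call depth guard, not the real `execute_and_prove`: `MAX_CHAINED_CALLS`, `GuardError`, and `run_chain` are hypothetical names, and the bound of 10 is an arbitrary assumption. The `mutated` flag switches the comparison from `>=` to `<`.

```rust
/// Hypothetical depth bound; the real constant lives in `lee` and may differ.
pub const MAX_CHAINED_CALLS: usize = 10;

#[derive(Debug, PartialEq, Eq)]
pub enum GuardError {
    MaxChainedCallsDepthExceeded,
}

/// Walks `calls` chained calls, checking the depth guard before each one.
/// `mutated = true` models mutant 8 (`>=` replaced by `<`).
pub fn run_chain(calls: usize, mutated: bool) -> Result<usize, GuardError> {
    for depth in 0..calls {
        let guard_fires = if mutated {
            depth < MAX_CHAINED_CALLS // mutant: true at depth 0
        } else {
            depth >= MAX_CHAINED_CALLS // original: true only past the bound
        };
        if guard_fires {
            return Err(GuardError::MaxChainedCallsDepthExceeded);
        }
    }
    Ok(calls)
}
```

Under the mutant, even a single call fails at depth 0, which is why any test that proves one valid privacy-preserving transaction is enough to kill it.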
### Category B — Covered by `lee` unit tests, requires a *misbehaving* program (6, 7)
These guard against a program returning malformed output (modifying or claiming a
default account incorrectly). Only the test-only fixtures behind
`V03State::with_test_programs()` misbehave this way; they are never deployed, so no
fuzz input can reach this code. The `lee` crate exercises them directly.
- **6 (`!=` → `==`, `validated_state_diff.rs:311`)** — the
"only inspect uninitialised accounts" filter. Verified: 1 test fails under the
full `lee` suite.
- **7 (`==` → `!=`, `validated_state_diff.rs:314`)** — the
"skip unmodified accounts" guard. Verified: 16 tests fail, including
`state::tests::public_changer_claimer_data_change_no_claim_fails` and
`public_changer_claimer_no_data_change_no_claim_succeeds`.
> [!NOTE]
> An earlier analysis guessed 6 and 7 were *equivalent mutants*. They are
> not — they are caught by Plane A, just not reachable by Plane B. They appear
> "equivalent" only if you restrict yourself to the deployed `authenticated_transfer`
> program, which is exactly the restriction fuzzing operates under.
### Category C — The single genuine gap: near-equivalent weak mutant (4)
- **4 (`*` → `+` at `program.rs:21:58`, cycle limit `1 048 608`).**
Catching this would require a *single* public program execution that consumes
**more than 1 048 608 RISC-V cycles**. The `authenticated_transfer` instruction
uses fewer than that (it trips only limits ≤ 33 792; see Category A), and
no deployed program's single instruction reaches ~1M cycles. The difference
between the mutated limit (1.05M) and the real limit (33.5M) is therefore
**unobservable for any realistic workload**, making this a practically
equivalent / weak mutant. Verified: survives the full `lee` suite (211/211 pass).
It is not worth chasing in either plane. If a future deployed program legitimately
performs a >1M-cycle public execution, a normal execution test for that program
would catch this mutation incidentally.
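
The observability argument for mutant 4 can be stated as a predicate: a workload distinguishes a mutated limit from the real one only if it fits under the real limit yet exceeds the mutated one. The sketch below encodes that. `kills_mutant` is a hypothetical helper, and it assumes execution aborts when the cycle count is strictly greater than the limit. Under that assumption, a workload of 100 000 cycles (an illustrative figure inside the range quoted in Category A, not a measurement) kills the limits 32, 32 768 and 33 792 but not 1 048 608.

```rust
/// Real limit quoted in this document (`u64` is an assumption).
pub const REAL_LIMIT: u64 = 1024 * 1024 * 32;

/// True when a workload of `workload_cycles` passes under the real limit
/// but aborts under `mutated_limit`, i.e. when a test running that workload
/// would kill the mutant. Assumes an abort when cycles > limit.
pub fn kills_mutant(workload_cycles: u64, mutated_limit: u64) -> bool {
    workload_cycles <= REAL_LIMIT && workload_cycles > mutated_limit
}
```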
---
## 🔁 Group 2 — migrated input-independent targets
These mutants used to be caught by Plane B via input-independent fuzz targets.
Those targets were removed and their invariants ported to LEZ unit tests, so the
mutants now survive Plane B by design. They are **not** structurally unreachable
like Group 1 — a fuzzer could "catch" them, but only by running a fixed scenario
that ignores its input, which is a unit test, not fuzzing.
Each port below was verified to kill its mutant (apply the mutation → run the named
test → observe a failure). Where a mutant had **no** prior unit-test coverage, the
port *added* coverage rather than merely relocating it; those are marked **(new)**.

**From `fuzz_common_invariants`:**

| Mutant | New unit test |
|---|---|
| `HashType::as_ref` → `Vec::leak(Vec::new())` / `vec![0]` / `vec![1]` | `common::tests::as_ref_returns_exact_inner_bytes` (`common/src/lib.rs`) **(new)** |
| `BasicAuth` `FromStr` delete `!` in `.filter(\|p\| !p.is_empty())` | `common::config::tests::parse_empty_password_is_none` (+ `parse_preserves_non_empty_password`) **(new)** |
| `Program::elf` → empty / `vec![0]` / `vec![1]` | `program::tests::elf_returns_the_program_bytecode_constant` (was already caught incidentally) |
| `Proof::into_inner` / `from_inner` → `vec![]` / `vec![0]` / `vec![1]` | `…::circuit::tests::proof_inner_roundtrip` **(new)** |
| `Message::into_bytecode` → `vec![]` / `vec![0]` / `vec![1]` | `program_deployment_transaction::message::tests::bytecode_roundtrip` **(new)** |

**From `fuzz_genesis_invariants`** (all in `lee/state_machine/src/state.rs`):

| Mutant | New unit test |
|---|---|
| `system_faucet_account` → `Default` / delete `balance` / delete `program_owner` | `state::tests::genesis_system_accounts_have_expected_contents` **(new)** |
| `system_bridge_account` → `Default` / delete `program_owner` | `genesis_system_accounts_have_expected_contents` **(new)** |
| `commitment_set_digest` → `Default` | `state::tests::genesis_commitment_set_digest_differs_from_empty_state` **(new)** |
| `add_pinata_token_program` delete `program_owner` / `data` | `state::tests::add_pinata_token_program_sets_non_default_owner_and_data` **(new)** |
| `system_faucet_account_id` / `system_bridge_account_id` → `Default` | `genesis_system_accounts_have_expected_contents` + `system_account_ids_are_distinct_and_non_default` (was already caught) |

**From `fuzz_system_account_protection`:**

| Mutant | New unit test |
|---|---|
| `validate_doesnt_modify_account` `!=` → `==` (`common/src/transaction.rs`) | `common::transaction::tests::validate_on_state_rejects_modifying_a_system_account` **(new)** |
| `public_diff` → `HashMap::new()` (`lee/.../validated_state_diff.rs`) | `validated_state_diff::tests::public_diff_reflects_a_successful_transfer` (+ the `validate_on_state_rejects…` test) **(new)** |
| `system_*_account_id` non-default / distinct | `common::transaction::tests::system_account_ids_are_distinct_and_non_default` (was already caught) |
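
To make one of these ports concrete, the sketch below models the `BasicAuth` mutant (deleting `!` in `.filter(|p| !p.is_empty())`) with a toy parser. Everything here is an assumption for illustration: the `user:password` input format, the function name `parse_basic_auth`, and the `mutated` flag are hypothetical and do not reproduce the real `FromStr` implementation. The point is that two fixed inputs, with no fuzz input involved, are enough to kill the mutant: `"alice:"` and `"alice:secret"` each give a different result under the original and the mutated filter.

```rust
/// Toy stand-in for `BasicAuth` parsing. Assumes a `user:password` format
/// in which an empty password is treated as absent.
/// `mutated = true` models the mutant that deletes the `!`.
pub fn parse_basic_auth(s: &str, mutated: bool) -> (String, Option<String>) {
    let mut parts = s.splitn(2, ':');
    let user = parts.next().unwrap_or("").to_string();
    let password = parts
        .next()
        .filter(|p| if mutated { p.is_empty() } else { !p.is_empty() })
        .map(|p| p.to_string());
    (user, password)
}
```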
---
## ✅ Re-verifying
From `logos-execution-zone/` with the fuzzing repo checked out as a sibling:
```bash
export RISC0_DEV_MODE=1
# Pick a mutation from a table above, apply it to the cited line, then run the
# owning crate's tests (Plane A). A real failure ⇒ unit tests cover it.
cargo test -p lee --lib # lee-owned mutants
cargo test -p common # common-owned mutants (Group 2)
git checkout -- <mutated-file> # always revert
```
A mutation that makes `cargo test` fail is covered by Plane A and belongs in this
registry; a mutation that the corpus replay (`just mutants-protocol`) catches
belongs in the corpus instead. Across both groups, mutation #4 (the near-equivalent
cycle-limit weak mutant) is the only one caught by **neither** plane.
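
The reconciliation rule can be sketched as a small lookup: a Plane-B survivor that appears in either group is an expected miss, and anything else needs a new corpus input. The names `Triage` and `triage`, and the use of plain strings as mutant keys, are hypothetical; how a real `mutants-protocol` report identifies mutants is not specified here.

```rust
use std::collections::HashSet;

/// Outcome of checking one Plane-B survivor against this registry.
#[derive(Debug, PartialEq, Eq)]
pub enum Triage {
    /// Listed in Group 1 or Group 2: documented, intended state.
    ExpectedMiss,
    /// On neither list: warrants a new corpus input.
    NeedsCorpusInput,
}

/// Classifies a surviving mutant. Keys are free-form strings here; the
/// real report format is an assumption left open.
pub fn triage(survivor: &str, group1: &HashSet<&str>, group2: &HashSet<&str>) -> Triage {
    if group1.contains(survivor) || group2.contains(survivor) {
        Triage::ExpectedMiss
    } else {
        Triage::NeedsCorpusInput
    }
}
```

Mutation #4 is still an `ExpectedMiss` under this rule: it is registered in Group 1 even though neither plane catches it.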
> [!TIP]
> When reverting, prefer reverse-editing only the mutated line rather than
> `git checkout -- <file>` if you have uncommitted unit tests in the same file —
> a whole-file checkout would discard them too.