From c3daa9897d75e20615c5e2eca437e705ccba794c Mon Sep 17 00:00:00 2001
From: moudyellaz
Date: Tue, 19 May 2026 18:54:11 +0200
Subject: [PATCH] docs(e2e_bench): drop machine table and stale benchmark numbers

---
 docs/benchmarks/e2e_bench.md | 79 ++++--------------------------------
 1 file changed, 7 insertions(+), 72 deletions(-)

diff --git a/docs/benchmarks/e2e_bench.md b/docs/benchmarks/e2e_bench.md
index 2f2a0a7a..278dab9b 100644
--- a/docs/benchmarks/e2e_bench.md
+++ b/docs/benchmarks/e2e_bench.md
@@ -2,16 +2,7 @@
 
 End-to-end LEZ scenarios driven through the wallet against an in-process sequencer + indexer wired to an external Bedrock node. Times each step and records borsh sizes per block, split by tx variant.
 
-## Machine
-
-| Field | Value |
-|---|---|
-| Chip | Apple M2 Pro (8P+4E) |
-| RAM | 16 GB |
-| OS | macOS 15.5 |
-| Rust | 1.94.0 |
-| Risc0 zkVM | 3.0.5 |
-| Profile | release |
+No numeric tables here yet. Absolute wall time and block sizes depend heavily on the bedrock config (block cadence and confirmation depth) and on dev-mode vs real proving; re-run the bench locally to get numbers for your own setup. Canonical numbers will be added once the bench runs against the standard configuration.
 
 ## Scenarios
 
@@ -25,7 +16,7 @@ End-to-end LEZ scenarios driven through the wallet against an in-process sequenc
 
 ## Dev-mode vs real-proving
 
-`RISC0_DEV_MODE=1` makes the prover emit stub receipts instead of running the recursive STARK pipeline. The table compares each quantity in **dev mode vs real proving** for the two classes of scenarios:
+`RISC0_DEV_MODE=1` makes the prover emit stub receipts instead of running the recursive STARK pipeline. The table compares each quantity in dev mode vs real proving for the two classes of scenarios:
 
 | Quantity | Public-only scenarios (dev → real) | PPE-bearing scenarios (dev → real) |
 |---|---|---|
@@ -33,71 +24,14 @@ End-to-end LEZ scenarios driven through the wallet against an in-process sequenc
 | `public_tx_bytes` | same in both modes | same in both modes |
 | `ppe_tx_bytes` | n/a | dev ≈ 2 KB stub → real ≈ 225 KB (matches `S_agg` from cycle_bench) |
 | `block_bytes` | same in both modes | real adds ~225 KB per PPE tx in the block |
-| `bedrock_finality_ms` | same in both modes | same in both modes (L1 cadence, not LEZ prover) |
+| `bedrock_finality_s` | same in both modes | same in both modes (L1 cadence, not LEZ prover) |
 | Blocks captured | similar in both modes | real captures more empty clock-only ticks that fill prove wall-time |
 
-Tables below report dev-mode for all five scenarios. Real-proving numbers are included for `amm_swap_flow` (representative all-public) and `private_chained_flow` (representative chained-private flow); the public-only scenarios converge between modes within run-to-run jitter, so a full real-proving sweep is not run here.
+Numbers are intentionally omitted in this document until the canonical run lands. Public-only scenarios converge between modes within run-to-run jitter; the qualitative differences are captured by the table above.
 
-## Step latencies — dev mode (`RISC0_DEV_MODE=1`)
+## Methodology
 
-Per-scenario wall time and Bedrock L1-finality latency for the closing tip.
-
-| Scenario | total_ms | total_s | bedrock_finality_ms | bedrock_finality_s |
-|---|---:|---:|---:|---:|
-| token_onboarding | 60,808 | 60.81 | 24,593 | 24.59 |
-| amm_swap_flow | 162,058 | 162.06 | 19,210 | 19.21 |
-| multi_recipient_fanout | 222,206 | 222.21 | 16,020 | 16.02 |
-| private_chained_flow | 80,700 | 80.70 | 23,963 | 23.96 |
-| parallel_fanout | 244,387 | 244.39 | 23,770 | 23.77 |
-
-Total dev-mode wall time across all five: 912.9 s.
-
-## Step latencies — real proving (selected scenarios)
-
-| Scenario | total_ms | total_s | bedrock_finality_ms | bedrock_finality_s | Δ vs dev |
-|---|---:|---:|---:|---:|---:|
-| amm_swap_flow | 162,437 | 162.44 | ~19,210 | ~19.21 | ~0 (all-public) |
-| private_chained_flow | 354,843 | 354.84 | 23,778 | 23.78 | +274.14 s (≈ 91 s per PPE step × 3) |
-
-Per-step breakdown for `private_chained_flow` in real proving:
-
-| Step | submit_ms | inclusion_ms | total_ms | total_s |
-|---|---:|---:|---:|---:|
-| token_new_fungible (public) | 1.1 | 20,276.0 | 20,291.2 | 20.29 |
-| shielded_transfer (PPE) | 111,683.3 | 1.0 | 111,730.4 | 111.73 |
-| deshielded_transfer (PPE) | 111,454.7 | 1.1 | 111,511.2 | 111.51 |
-| private_to_private (PPE) | 111,237.0 | 1.1 | 111,293.0 | 111.29 |
-
-PPE steps move the cost from `inclusion_ms` (waiting for the next sealed block) to `submit_ms` (the wallet itself proving the PPE circuit before sending). Each PPE prove is ≈ 111 s on this CPU.
-
-## Block + tx sizes (borsh) — dev mode
-
-Per scenario, every produced block is fetched via `getBlock(BlockId)` and serialized with `borsh::to_vec(&Block)`. Each transaction is serialized individually and counted by variant. The empty clock-only ticks at `min` give the per-block fixed-cost baseline (≈ 334 bytes across all scenarios).
-
-| Scenario | blocks | block_bytes (mean) | block_bytes (min..max) | public_tx (mean / n) | ppe_tx (mean / n) |
-|---|---:|---:|---|---:|---:|
-| token_onboarding | 6 | 881 | 334..2,890 | 206 / 8 | 2,556 / 1 |
-| amm_swap_flow | 16 | 553 | 334..1,011 | 248 / 24 | n/a |
-| multi_recipient_fanout | 22 | 513 | 334..707 | 221 / 33 | n/a |
-| private_chained_flow | 8 | 1,399 | 334..3,565 | 177 / 9 | 2,715 / 3 |
-| parallel_fanout | 24 | 646 | 334..3,904 | 248 / 45 | n/a |
-
-## Block + tx sizes (borsh) — real proving
-
-| Scenario | blocks | block_bytes (mean) | block_bytes (min..max) | public_tx (mean / n) | ppe_tx (mean / n) |
-|---|---:|---:|---|---:|---:|
-| amm_swap_flow | 16 | 553 | 334..1,011 | 248 / 24 | n/a |
-| private_chained_flow | 35 | 19,692 | 334..226,578 | 159 / 36 | 225,728 / 3 |
-
-`amm_swap_flow` is byte-identical between dev and real (no proof payload). `private_chained_flow`'s `ppe_tx_bytes` matches the cycle_bench `S_agg` measurement (≈ 225 KB borsh InnerReceipt). The `block_bytes` max (226,578) is the block containing the largest PPE transaction.
-
-## Findings
-
-- Public-only scenarios converge between dev mode and real proving in both latency and byte counts. Either mode is suitable to characterize them.
-- PPE transactions are ≈ 225 KB on the wire in real proving, dominated by the outer succinct proof. Dev mode emits a ≈ 2 KB stub that does not represent the L1 payload — fee-model storage gas inputs must come from a real-proving run.
-- Per-PPE-step prove cost on M2 Pro CPU is ≈ 110-120 s, paid on the wallet side at submit time (not on the sequencer). For a single-program chained flow the cost stacks linearly.
-- Empty clock-only ticks set the per-block fixed-cost baseline at ≈ 334 bytes across all scenarios and both modes.
-- Bedrock L1 finality stays around 20 s regardless of proving mode, because finality is paced by L1 cadence, not the LEZ prover.
+Per scenario, every produced block is fetched via `getBlock(BlockId)` and serialized with `borsh::to_vec(&Block)`. Each transaction is serialized individually and counted by variant. Empty clock-only ticks give the per-block fixed-cost baseline. Wall time is captured per step (submit + inclusion + wallet sync) and per scenario (setup + steps + closing bedrock finality wait).
 
 ## Reproduce
 
@@ -122,4 +56,5 @@ JSON output: `target/e2e_bench_dev.json` / `target/e2e_bench_prove.json` (suffix
 - Dev-mode `ppe_tx_bytes` and PPE-step latencies are not representative of production; use real-proving numbers for any fee-model input that touches the storage or prover-cost components.
 - Single-host run, no GPU acceleration. Real-proving on production prover hardware will move per-step latencies by orders of magnitude; byte counts will not change.
 - Bedrock running locally; no real network latency between sequencer and Bedrock.
+- Bedrock L1 finality (`bedrock_finality_s`) is set by the bedrock config in `LEZ_BEDROCK_CONFIG_DIR` (block cadence × confirmation depth). Different configs will shift `bedrock_finality_s` materially.
 - Some scenarios share account state via the same wallet; this is intentional (mirrors `integration_tests::TestContext`) and not a realistic multi-wallet workload.
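
Note on the new Methodology paragraph: the size accounting it describes (per-block byte counts, per-variant transaction byte counts, and the smallest block as the fixed-cost baseline) can be sketched as below. This is an illustration only, not the bench's actual code. `TxVariant`, `BlockSample`, `SizeSummary`, and `summarize` are hypothetical names, and the sketch assumes the byte counts were already obtained elsewhere (for example from `borsh::to_vec`), so it needs no external crates.

```rust
/// Transaction variant a measured size belongs to (hypothetical names).
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum TxVariant {
    Public,
    Ppe,
}

/// One measured block: its serialized size and the size of each tx in it.
/// Sizes are assumed to have been measured already (e.g. via borsh).
pub struct BlockSample {
    pub block_bytes: usize,
    pub txs: Vec<(TxVariant, usize)>,
}

/// Aggregate view over all blocks of one scenario.
#[derive(Debug, PartialEq)]
pub struct SizeSummary {
    pub blocks: usize,
    /// Smallest block; with empty clock-only ticks present this is the
    /// per-block fixed-cost baseline.
    pub block_bytes_min: usize,
    pub block_bytes_max: usize,
    pub block_bytes_mean: f64,
    /// (mean bytes, count) per variant; `None` when the variant is absent.
    pub public_tx: Option<(f64, usize)>,
    pub ppe_tx: Option<(f64, usize)>,
}

/// Mean size and count of all transactions of one variant.
fn mean_and_count(samples: &[BlockSample], variant: TxVariant) -> Option<(f64, usize)> {
    let sizes: Vec<usize> = samples
        .iter()
        .flat_map(|b| b.txs.iter())
        .filter(|(v, _)| *v == variant)
        .map(|(_, n)| *n)
        .collect();
    if sizes.is_empty() {
        None
    } else {
        let total: usize = sizes.iter().sum();
        Some((total as f64 / sizes.len() as f64, sizes.len()))
    }
}

/// Summarize a scenario; returns `None` when no blocks were captured.
pub fn summarize(samples: &[BlockSample]) -> Option<SizeSummary> {
    let min = samples.iter().map(|b| b.block_bytes).min()?;
    let max = samples.iter().map(|b| b.block_bytes).max()?;
    let total: usize = samples.iter().map(|b| b.block_bytes).sum();
    Some(SizeSummary {
        blocks: samples.len(),
        block_bytes_min: min,
        block_bytes_max: max,
        block_bytes_mean: total as f64 / samples.len() as f64,
        public_tx: mean_and_count(samples, TxVariant::Public),
        ppe_tx: mean_and_count(samples, TxVariant::Ppe),
    })
}
```

Returning `Option` for the per-variant fields mirrors the "n/a" case for scenarios without PPE transactions, instead of reporting a misleading mean of zero.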