r4bbit 9d5eea2b41 test(twap): cover RecordTick end-to-end and add zkVM cycle benchmark
Add the first end-to-end coverage of the oracle's RecordTick path, which
previously existed only as native unit tests:

- amm_twap_observations_accumulate_across_swaps_and_yield_time_weighted_average:
  drives swaps + RecordTick across simulated time, then checks the cumulative
  accumulator and the consulted time-weighted average.
- amm_twap_record_tick_sampling_guard_skips_calls_below_min_interval: exercises
  the min-interval sampling guard through the real instruction path.

Running RecordTick through the zkVM surfaced that committing the oracle-owned
~100 KiB observations account costs ~50.9M cycles — over the 2^25 (~33.5M)
public-execution limit — so the instruction aborted on chain. Reduce
OBSERVATIONS_CAPACITY 6396 -> 2048 (~16.8M cycles, ~half the limit); window
coverage is unchanged, only sampling resolution.

Add programs/benchmark, a standalone crate (excluded from the workspace so CI
and the Makefile skip it) that runs the guest ELF through the RISC Zero
executor and reports the per-instruction cycle split, reproducing the on-chain
pass/fail at the limit. Its cost-vs-capacity sweep still spans to 6396, guarding
against bumping capacity back into the over-budget range.
2026-06-23 16:30:18 +02:00

TWAP Oracle — RecordTick and the zkVM Cycle Budget

Status: fixed. OBSERVATIONS_CAPACITY was reduced from 6396 to 2048, bringing RecordTick to ~16.8M cycles (~half the limit). The rest of this document is the diagnosis that led there; the numbers labelled "cap 6396" are the pre-fix measurements that motivated the change.

At capacity 6396, RecordTick could not run on chain: a single call over a full-size observations buffer cost ~50.9M zkVM cycles, over the ~33.5M (2²⁵) public-execution limit, and the runtime aborted it. This document explains what a cycle is, why the limit exists, what the measurements actually show (the cause is not what it first looks like), and the fix.

It is backed by a runnable benchmark — programs/benchmark/tests/twap_cycle_bench.rs — and by two end-to-end tests in tests/amm.rs (amm_twap_observations_accumulate_*, amm_twap_record_tick_sampling_guard_*).


1. What a "cycle" is

LEZ programs run inside the RISC Zero zkVM, a RISC-V (rv32im) virtual machine whose execution is proven. A cycle is one step of that virtual CPU: essentially one executed RISC-V instruction (most cost 1 cycle, a few cost more, plus some fixed overhead). "50M cycles" therefore means roughly 50 million instruction steps, not a unit of wall-clock time.

The prover turns execution into an arithmetic trace with one row per cycle, and proving cost — time and memory — scales roughly linearly with cycle count. So the cap is fundamentally an economic/latency bound on how large a computation the network will prove for one public call.

The executor reports cycles in a few buckets (visible in the benchmark output):

  • user cycles: the RISC-V instructions the guest logic actually runs.
  • paging cycles: the zkVM's memory is a Merkle-committed image of ~1 KiB pages, and touching a page hashes it in or out. Large buffers touch many pages. (In practice paging is ~34% of the total here.)
  • reserved cycles: padding up to the proof system's power-of-two boundaries.

2. The limit

nssa sets the executor's session_limit to MAX_NUM_CYCLES_PUBLIC_EXECUTION = 1024 * 1024 * 32 = 33_554_432 = 2²⁵ cycles (nssa/src/program.rs). Every program invocation — public or chained — runs under this cap via the single Program::execute path; when a run reaches it, the executor aborts with Session limit exceeded: 33554432 >= 33554432, which nssa surfaces as ProgramExecutionFailed.
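As a rough illustration of the arithmetic, the sketch below restates the limit and checks two of the RecordTick totals quoted in this document against it. The helper `fits_public_budget` is hypothetical and is not nssa's code; treating "reaching the cap" as a failure is an assumption read off the `>=` in the abort message.

```rust
/// Illustrative copy of the value quoted above; the real constant lives in nssa/src/program.rs.
const MAX_NUM_CYCLES_PUBLIC_EXECUTION: u64 = 1024 * 1024 * 32;

/// Hypothetical helper: a run survives only if it stays strictly below the cap,
/// mirroring the `33554432 >= 33554432` wording of the abort message (assumption).
fn fits_public_budget(total_cycles: u64) -> bool {
    total_cycles < MAX_NUM_CYCLES_PUBLIC_EXECUTION
}

fn main() {
    // 1024 * 1024 * 32 is exactly 2^25.
    assert_eq!(MAX_NUM_CYCLES_PUBLIC_EXECUTION, 1u64 << 25);

    // RecordTick totals quoted in this document (owned account).
    let samples = [("cap 6396", 50_855_936u64), ("cap 2048", 16_777_216u64)];
    for (label, cycles) in samples.iter() {
        println!("RecordTick {}: {} cycles, fits = {}", label, cycles, fits_public_budget(*cycles));
    }
}
```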

3. What the measurements show

All numbers below are total cycles from the RISC Zero executor running the real twap_oracle guest ELF, reproducing Program::execute's exact input encoding (see the benchmark). They were cross-checked against accounts extracted from a live V03State, which match to the cycle.

                                     total cycles     vs 2²⁵ limit
CreatePriceObservations (cap 6396)     17_301_504      ok
RecordTick, owned account  (cap 6396)  50_855_936      OVER  ← aborts on chain
RecordTick, UNOWNED account (cap 6396) 17_825_792      ok

The first surprise: RecordTick's Borsh work is not the problem. With the observations account left uninitialized (program_owner = 0) the same instruction over the same 102,388-byte buffer costs only 17.8M — comfortably under budget. The full deserialize + mutate + reserialize round trip is the cheap part.

The actual driver is account ownership. The benchmark holds everything else fixed and flips only the observations account's program_owner:

obs_owner = 0       → 17_825_792
obs_owner = oracle  → 50_855_936   (+33.0M, 2.9×)

(The current_tick account's owner makes no difference — it's only 12 bytes.)

Why ownership costs ~33M cycles

When a program touches an initialized (owned) account, the runtime cryptographically binds that account's state into the proof on both sides: the pre-state it read and the post-state it wrote. That work is proportional to the account's serialized size. For a max-size ~100 KiB account it is ~16-17M cycles per side.

That single fact explains the whole picture:

instruction              owned ~100 KiB account is…              owned commits       total
CreatePriceObservations  written only (input is uninitialized)   1× (write)          17.3M
RecordTick               read and written                        2× (read + write)   50.9M

CreatePriceObservations pays the commit once and fits; RecordTick pays it twice and blows the budget. Everything else (Borsh, paging, the tick arithmetic) is secondary.
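The commit-count picture can be checked with a toy model: total ≈ floor + commits × per-side cost. The sketch below uses only the cap 6396 totals measured above. The even split of the owned/unowned delta between the read side and the write side is an assumption, not something the benchmark measures directly, and the function names are illustrative.

```rust
const LIMIT: u64 = 1 << 25; // 33_554_432

// Measured totals at cap 6396, copied from the tables above.
const RECORD_TICK_UNOWNED: u64 = 17_825_792;
const RECORD_TICK_OWNED: u64 = 50_855_936;
const CREATE_OWNED: u64 = 17_301_504;

/// Per-side commit cost, assuming the owned/unowned delta splits evenly
/// between the read side and the write side (assumption).
fn commit_per_side() -> u64 {
    (RECORD_TICK_OWNED - RECORD_TICK_UNOWNED) / 2
}

/// Toy model: total = floor + commits * per-side cost.
fn modelled_total(floor: u64, commits: u64) -> u64 {
    floor + commits * commit_per_side()
}

fn main() {
    let side = commit_per_side();
    println!("commit per side ~ {} cycles", side);

    // RecordTick commits the account twice (read + write).
    let record_tick = modelled_total(RECORD_TICK_UNOWNED, 2);
    println!("RecordTick model = {} (over limit: {})", record_tick, record_tick > LIMIT);

    // CreatePriceObservations commits once; what remains is its own floor.
    println!("CreatePriceObservations floor ~ {} cycles", CREATE_OWNED - side);
}
```

Under this model one commit is about 16.5M cycles, which is why a single-commit instruction fits and a double-commit one does not.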

Cost scales linearly with capacity

OBSERVATIONS_CAPACITY = 6396 exists to fill the 100 KiB account ceiling. Because both the commit cost and the Borsh floor scale with account size, RecordTick's cost scales ~linearly with capacity:

RecordTick (owned)   total cycles     vs 2²⁵
cap   512              4_259_840       ok
cap  1024              8_388_608       ok
cap  2048             16_777_216       ok
cap  4096             32_538_624       ok  (only ~3% headroom)
cap  6396             50_855_936       OVER

The largest power-of-two capacity that fits is 4096, but only barely. 2048 (≈16.8M, ~2× headroom) or smaller is a safe target.
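To make the headroom figures concrete, here is a minimal sketch that recomputes them from the sweep above. The numbers are the measured totals from the table; `headroom_pct` is an illustrative helper, not part of the benchmark.

```rust
const LIMIT: u64 = 1 << 25; // 33_554_432

/// Measured RecordTick (owned) totals from the sweep above: (capacity, total cycles).
const SWEEP: [(u32, u64); 5] = [
    (512, 4_259_840),
    (1024, 8_388_608),
    (2048, 16_777_216),
    (4096, 32_538_624),
    (6396, 50_855_936),
];

/// Headroom as a percentage of the limit; a negative value means over budget.
fn headroom_pct(total: u64) -> f64 {
    (LIMIT as f64 - total as f64) / LIMIT as f64 * 100.0
}

fn main() {
    for (cap, total) in SWEEP.iter() {
        println!(
            "cap {:>5}: {:>11} cycles, headroom {:+.1}%",
            cap,
            total,
            headroom_pct(*total)
        );
    }
}
```

Capacity 2048 lands at exactly half the limit, 4096 at about 3% headroom, and 6396 comes out negative.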

Empty buffers cost exactly the same

Fill level is irrelevant. An effectively-empty buffer (just created, write_index = 1, one used entry) and an all-non-zero buffer of the same capacity cost identical cycles:

empty buffer  total = 50_855_936
filled buffer total = 50_855_936   (cap 6396, both 102,388 bytes)

The account is allocated at full size the moment CreatePriceObservations runs (it writes all OBSERVATIONS_CAPACITY entries up front), so a brand-new, "empty" feed already pays the full cost. The cost tracks allocated size, not how much meaningful data is stored. Reducing capacity is therefore the only lever — you can't dodge it by keeping the buffer sparse.
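The allocated size can be reconstructed from the byte counts reported in this document (564 bytes at capacity 32, 102,388 bytes at capacity 6396), which are consistent with a fixed header plus a fixed-size entry. The 52-byte header and 16-byte entry below are inferred from those figures, not taken from the oracle's source, so treat the sketch as an illustration of why 6396 is the largest capacity that fits under 100 KiB.

```rust
/// The 100 KiB account ceiling, in bytes.
const DATA_MAX_LENGTH: usize = 100 * 1024; // 102_400

// Inferred from the byte counts reported in this document; not read from the oracle's code.
const HEADER_BYTES: usize = 52;
const ENTRY_BYTES: usize = 16;

/// Serialized account size for a given capacity, under the inferred layout.
fn account_bytes(capacity: usize) -> usize {
    HEADER_BYTES + ENTRY_BYTES * capacity
}

/// Largest capacity whose account still fits under the ceiling.
fn max_capacity() -> usize {
    (DATA_MAX_LENGTH - HEADER_BYTES) / ENTRY_BYTES
}

fn main() {
    for cap in [32usize, 2048, 6396].iter() {
        println!("cap {:>5} -> {:>7} bytes", cap, account_bytes(*cap));
    }
    println!("largest capacity under the ceiling: {}", max_capacity());
}
```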

The overhead is per-byte, not a flat per-account tax

Measuring owned vs. uninitialized across account sizes, the owned-account overhead is linear in size at ~320 cycles per byte (for the read + write the account gets in RecordTick; ~160/byte per side):

  cap    bytes |       owned     unowned       delta   cyc/byte
   32      564 |      524_288      262_144      262_144      465*
  256    4_148 |    2_359_296    1_048_576    1_310_720      316
 1024   16_436 |    8_388_608    3_145_728    5_242_880      319
 2048   32_820 |   16_777_216    5_767_168   11_010_048      335
 4096   65_588 |   32_538_624   11_534_336   21_004_288      320
 6396  102_388 |   50_855_936   17_825_792   33_030_144      323
(* tiny accounts round up to the executor's 2^18-cycle segment quantum)
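The cyc/byte column is simply delta divided by account size. A short sketch that recomputes it from the rows above (all values copied from the table; the function name is illustrative):

```rust
/// (capacity, bytes, owned total, unowned total), copied from the table above.
const ROWS: [(u32, u64, u64, u64); 6] = [
    (32, 564, 524_288, 262_144),
    (256, 4_148, 2_359_296, 1_048_576),
    (1024, 16_436, 8_388_608, 3_145_728),
    (2048, 32_820, 16_777_216, 5_767_168),
    (4096, 65_588, 32_538_624, 11_534_336),
    (6396, 102_388, 50_855_936, 17_825_792),
];

/// Owned-account overhead per byte: (owned - unowned) / bytes.
fn cycles_per_byte(bytes: u64, owned: u64, unowned: u64) -> f64 {
    (owned - unowned) as f64 / bytes as f64
}

fn main() {
    for (cap, bytes, owned, unowned) in ROWS.iter() {
        let cpb = cycles_per_byte(*bytes, *owned, *unowned);
        println!("cap {:>5}: delta {:>10}, {:.0} cyc/byte", cap, owned - unowned, cpb);
    }
}
```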

So you do not lose a fixed slice of the 2²⁵ budget for every owned account — you lose ~320 cycles per byte of owned account you read-modify-write (~160/byte if you only read, or only write). For ordinary accounts this is noise: a ~50-byte token holding costs ~16 K cycles (<0.05% of budget). It only becomes significant in the tens-of-KB range.

This is a general size/cycle tension, not a TWAP quirk

The practical consequence: with ~320 cyc/byte for a read-modify-write plus the Borsh/IO floor, the largest owned account a single instruction can read-modify-write within budget is ~65 KB (the cap 4096 row above: 65,588 bytes at 32.5M cycles), and at that size there is essentially no budget left for real work. Any program attempting to read-modify-write a near-max-size (100 KiB) owned account hits the same wall TWAP did, needing ~50M cycles against a ~33.5M cap.
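A back-of-the-envelope version of that ceiling, assuming total cost stays proportional to account size (a simplification; the measurements are only roughly linear): divide the limit by the all-in cycles per byte observed at cap 6396. The names below are illustrative.

```rust
const LIMIT: u64 = 1 << 25; // 33_554_432

// Measured at cap 6396: owned RecordTick total and the account's size in bytes.
const OWNED_TOTAL: u64 = 50_855_936;
const ACCOUNT_BYTES: u64 = 102_388;

/// All-in cycles per byte for an owned read-modify-write (commit plus Borsh/IO floor).
fn all_in_cycles_per_byte() -> f64 {
    OWNED_TOTAL as f64 / ACCOUNT_BYTES as f64
}

/// Account size at which a read-modify-write would just reach the limit,
/// assuming cost is proportional to size (simplification).
fn break_even_bytes() -> u64 {
    LIMIT * ACCOUNT_BYTES / OWNED_TOTAL
}

fn main() {
    println!("all-in cost ~ {:.0} cycles/byte", all_in_cycles_per_byte());
    println!("break-even account size ~ {} bytes", break_even_bytes());
}
```

The estimate lands in the mid-to-high 60s of kilobytes, the same ballpark as the ~65 KB figure, and leaves nothing for the instruction's own logic.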

In other words, DATA_MAX_LENGTH = 100 KiB and MAX_NUM_CYCLES_PUBLIC_EXECUTION = 2²⁵ are not jointly satisfiable for full-size read-modify-write. The runtime permits 100 KiB accounts, but the cycle budget can't commit one on both the read and write side. The implicit design rule is: large accounts must be read-only, write-only, or paged per instruction — never fully rewritten in place. That's a normal ZK-rollup constraint, but it's currently unstated.

This is worth raising with the LEZ runtime team as a protocol-parameter question. Three levers, none free:

  • Lower DATA_MAX_LENGTH to a size that is committable and leaves room to compute (e.g. so a full read-modify-write fits well under budget). Safest guarantee, but caps every program's account size — and read-only consumers of large accounts don't need it lowered.
  • Raise MAX_NUM_CYCLES_PUBLIC_EXECUTION so a max-size account is affordable. Directly inflates proving time/cost for every program, including ones that never touch big accounts.
  • Leave both, document the rule that large accounts are not full-rewrite-able in one call, and provide a paging pattern. Lowest blast radius; pushes complexity to programs that need big state.

The TWAP fix below (a smaller buffer) sidesteps the tension for this program regardless of which lever the protocol ultimately picks.

4. The fix

The budget-breaking cost is committing the owned ~100 KiB account, which is intrinsic to the account's size. The fix must shrink what gets committed.

Reduce OBSERVATIONS_CAPACITY — the simple, effective fix

Cutting capacity reduces the committed account size, and cost falls ~linearly. Critically, window coverage is unaffected: the sampling guard derives min_interval = window_duration / capacity, so coverage = capacity × min_interval = window_duration regardless of capacity — only resolution (samples per window) drops. At capacity 2048 a 24 h window still samples every ~42 s; a 7 d window every ~5 min. That is ample for a TWAP.
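The coverage argument can be written out directly. This is a minimal sketch of the relation min_interval = window_duration / capacity described above; it uses floating-point seconds for clarity, whereas the real guard presumably works in integer time units, and the function names are illustrative.

```rust
/// Sampling interval implied by the guard: window_duration / capacity.
/// Floating-point seconds are a simplification for illustration.
fn min_interval_secs(window_secs: f64, capacity: u32) -> f64 {
    window_secs / capacity as f64
}

/// Time span the full buffer covers: capacity * min_interval.
fn coverage_secs(window_secs: f64, capacity: u32) -> f64 {
    capacity as f64 * min_interval_secs(window_secs, capacity)
}

fn main() {
    let day = 24.0 * 3600.0;
    let week = 7.0 * day;
    for cap in [2048u32, 6396].iter() {
        println!(
            "cap {:>4}: 24 h -> every {:.1} s, 7 d -> every {:.1} s, coverage {:.0} s / {:.0} s",
            cap,
            min_interval_secs(day, *cap),
            min_interval_secs(week, *cap),
            coverage_secs(day, *cap),
            coverage_secs(week, *cap)
        );
    }
}
```

Whatever the capacity, coverage comes back as the window duration; only the interval between samples changes.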

Applied: OBSERVATIONS_CAPACITY = 2048 (≈16.8M cycles, ~2× headroom). 4096 fits but leaves no margin for runtime variation; 2048 keeps RecordTick at roughly half the limit.

Linked observation pages — if full resolution is required

Already sketched in twap-oracle-observation-capacity.md: keep a small fixed-size "head" account plus older page accounts. RecordTick then commits only the small head, so per-call cost is bounded regardless of total history. More moving parts (page PDAs, chain-walking readers); reserve it for when a single reduced-capacity account genuinely can't hold enough resolution.

Byte-patching RecordTick to skip the Borsh round-trip — does NOT fix this

An earlier hypothesis was that the cost was the full-buffer Borsh deserialize/reserialize, and that patching the serialized bytes in place would make it O(1). The measurements refute this. The Borsh round trip is only part of the 17.8M unowned floor; the ~33M that breaks the budget is the owned-account commitment, which the runtime performs regardless of how the guest computes the new bytes. Byte-patching would shave a little off the floor and leave RecordTick at ~33M+ — still at or over the edge. Avoid this as the primary fix; it addresses the wrong cost.
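A quick bound shows why. Suppose byte-patching removed some fraction of the 17.8M unowned floor and left the commitment cost untouched; the fraction is a hypothetical parameter, since the document does not measure how much of the floor is Borsh work.

```rust
const LIMIT: u64 = 1 << 25; // 33_554_432

// Measured RecordTick totals at cap 6396.
const OWNED: u64 = 50_855_936;
const UNOWNED: u64 = 17_825_792;

/// Estimated total if byte-patching saved `floor_saved_fraction` (0.0..=1.0) of the
/// unowned floor. The owned-account commitment is unaffected by how the guest
/// computes the new bytes, so it is carried over unchanged.
fn patched_total(floor_saved_fraction: f64) -> u64 {
    let commit = OWNED - UNOWNED;
    let floor = (UNOWNED as f64 * (1.0 - floor_saved_fraction)) as u64;
    commit + floor
}

fn main() {
    for saved in [0.0f64, 0.5, 1.0].iter() {
        let total = patched_total(*saved);
        println!(
            "floor saved {:>3.0}% -> {:>10} cycles ({:.1}% of limit)",
            saved * 100.0,
            total,
            total as f64 / LIMIT as f64 * 100.0
        );
    }
}
```

Even the unreachable case of eliminating the entire floor leaves the commitment alone at over 98% of the limit.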

Raising MAX_NUM_CYCLES_PUBLIC_EXECUTION — not a real fix

It's a platform-wide nssa constant; raising it inflates proving cost/time for every program and only defers the wall, since the cost still scales with account size.

5. Reproducing

# Faithful cycle benchmark (synthetic inputs; reproduces the on-chain pass/fail at 2²⁵).
# `programs/benchmark` is a standalone crate, excluded from the workspace, so run it by manifest:
cargo test --manifest-path programs/benchmark/Cargo.toml -- --ignored --nocapture

# The end-to-end TWAP tests through the real zkVM path:
RISC0_DEV_MODE=1 cargo test -p integration_tests --test amm twap

The benchmark uses risc0-zkvm with the prove feature to run the guest with the session limit lifted and read the user/paging/reserved/total cycle split. It only executes the guest — it never proves. It lives in the workspace-excluded programs/benchmark crate so normal builds/tests/CI never compile it.

6. Acceptance — met

With OBSERVATIONS_CAPACITY = 2048:

  • twap_cycle_bench reports RecordTick (owned, cap 2048) at ~16.8M, under the 2²⁵ limit (and its sweep still shows cap 6396 aborting, guarding against bumping capacity back up).
  • The two tests/amm.rs TWAP tests pass through the real zkVM path (no longer #[ignore]d).