Add specs for stack handling (#1381)

* Add specs for stack handling * Apply comments
2026-02-18 21:03:39 +00:00 · 2023-11-22 18:55:53 -05:00 · 2023-11-22 18:55:53 -05:00 · d4b05f3730
commit d4b05f3730
parent a736458617
3 changed files with 73 additions and 34 deletions
--- a/evm/spec/cpulogic.tex
+++ b/evm/spec/cpulogic.tex
@ -126,7 +126,78 @@ ecAdd, ecMul and ecPairing precompiles.

 \subsection{Stack handling}

-TODO
+\subsubsection{Top of the stack}
+
+The majority of memory operations involve the stack. The stack is a segment in memory, and stack operations (popping or pushing) use the memory channels.
+Every CPU instruction performs between 0 and 4 pops, and may push at most once. However, for efficiency purposes, we hold the top of the stack in
+the first memory channel \texttt{current\_row.mem\_channels[0]}, only writing it in memory if necessary.
+
+\paragraph*{Motivation:}
+
+See \href{https://github.com/0xPolygonZero/plonky2/issues/1149}{this issue}.
+
+\paragraph*{Top reading and writing:}
+
+When a CPU instruction modifies the stack, it must update the top of the stack accordingly. There are three cases.
+
+\begin{itemize}
+  \item \textbf{The instruction pops and pushes:} The new top of the stack is stored in \texttt{next\_row.mem\_channels[0]}; it may be computed by the instruction,
+or it could be read from memory. In either case, the instruction is responsible for setting \texttt{next\_row.mem\_channels[0]}'s flags and address columns correctly.
+After use, the previous top of the stack is discarded and doesn't need to be written in memory.
+  \item \textbf{The instruction pushes, but doesn't pop:} The new top of the stack is stored in \texttt{next\_row.mem\_channels[0]}; it may be computed by the instruction,
+or it could be read from memory. In either case, the instruction is responsible for setting \texttt{next\_row.mem\_channels[0]}'s flags and address columns correctly.
+If the stack wasn't empty (\texttt{current\_row.stack\_len > 0}), the instruction performs a memory read in \texttt{current\_row.partial\_ channel}. \texttt{current\_row.partial\_channel}
+shares its values with \texttt{current\_ row.mem\_channels[0]} (which holds the current top of the stack). If the stack was empty, \texttt{current\_row.partial\_channel}
+is disabled.
+  \item \textbf{The instruction pops, but doesn't push:} After use, the current top of the stack is discarded and doesn't need to be written in memory.
+If the stack isn't empty now (\texttt{current\_row.stack\_len > num\_pops}), the new top of the stack is set in \texttt{next\_row.mem\_channels[0]}
+with a memory read from the stack segment. If the stack is now empty, \texttt{next\_row.mem\_channels[0]} is disabled.
+\end{itemize}
+
+In the last two cases, there is an edge case if \texttt{current\_row.stack\_len} is equal to a \texttt{special\_len}. For a strictly pushing instruction,
+this happens if the stack is empty, and \texttt{special\_len = 0}. For a strictly popping instruction, this happens if the next stack is empty, i.e. if
+all remaining elements are popped, and \texttt{special\_len = num\_pops}. Note that we do not need to check for values below \texttt{num\_pops}, since this
+would be a stack underflow exception which is handled separately.
+The edge case is detected with the compound flag
+$$\texttt{1 - not\_special\_len * stack\_inv\_aux,}$$
+where $$\texttt{not\_special\_len = current\_row - special\_len}$$
+
+
+and \texttt{stack\_inv\_aux} is constrained to be the modular inverse of \texttt{not\_special\_ len} if it's non-zero, or 0 otherwise. The flag is 1
+if \texttt{stack\_len} is equal to \texttt{special\_len}, and 0 otherwise.
+
+This logic can be found in code in the \texttt{eval\_packed\_one} function of \href{https://github.com/0xPolygonZero/plonky2/blob/main/evm/src/cpu/stack.rs}{stack.rs}.
+The function multiplies all of the stack constraints with the degree 1 filter associated with the current instruction.
+
+\paragraph*{Operation flag merging:}
+
+To reduce the total number of columns, many operation flags are merged together (e.g. \texttt{DUP} and \texttt{SWAP}) and are distinguished with the binary decomposition of their opcodes.
+The filter for a merged operation is now of degree 2: for example, \texttt{is\_swap = dup\_swap * opcode\_bits[4]} since the 4th bit is set to 1 for a \texttt{SWAP} and 0 for a \texttt{DUP}.
+If the two instructions have different stack behaviors, this can be a problem: \texttt{eval\_packed\_one}'s constraints are already of degree 3 and it can't support degree 2 filters.
+
+When this happens, stack constraints are defined manually in the operation's dedicated file (e.g. \texttt{dup\_swap.rs}). Implementation details vary case-by-case and can be found in the files.
+
+\subsubsection{Stack length checking}
+
+The CPU must make sure that the stack length never goes below zero and, in user mode, never grows beyond the maximum stack size. When this happens, an honest prover should trigger the
+corresponding exception. If a malicious prover doesn't trigger the exception, constraints must fail the proof.
+
+\paragraph*{Stack underflow:}
+There is no explicit constraint checking for stack underflow. An underflow happens when the CPU tries to pop the empty stack, which would perform a memory read at virtual address \texttt{-1}.
+Such a read cannot succeed: in Memory, the range-check argument requires the gap between two consecutive addresses to be lower than the length of the Memory trace. Since the prime of the Plonky2 field is 64-bit long,
+this would require a Memory trace longer than $2^{32}$.
+
+\paragraph*{Stack overflow:}
+An instruction can only push at most once, meaning that an overflow occurs whenever the stack length is exactly one more than the maximum stack size ($1024+1$) in user mode.
+To constrain this, the column \texttt{stack\_len\_bounds\_aux} contains:
+
+\begin{itemize}
+  \item[--] the modular inverse of \texttt{stack\_len - 1025} if we're in user mode and \texttt{stack\_len $\neq$ 1025},
+  \item[--] 0 if \texttt{stack\_len = 1025} or if we're in kernel mode.
+\end{itemize}
+Then overflow can be checked with the flag
+$$\texttt{(1 - is\_kernel\_mode) - stack\_len * stack\_len\_bounds\_aux}.$$
+The flag is 1 if \texttt{stack\_len = 1025} and we're in user mode, and 0 otherwise.

 \subsection{Gas handling}

@ -193,7 +264,7 @@ unverifiable.
 When this happens, the EVM jumps to \texttt{exc\_stack\_overflow}. The kernel then checks that the current stack length is smaller than the minimum 
 stack length required by the faulty opcode.
 If the exception is not raised, the popping memory operation's address offset would underflow, and the Memory range check would require the Memory trace to be too
-large to be provable ($>2^{60}$).
+large ($>2^{32}$).
  \item[\textbf{Invalid JUMP destination:}] Raised when the program counter jumps to an invalid location (i.e. not a JUMPDEST). When this happens, the EVM jumps to
 \texttt{exc\_invalid\_jump\_destination}. The kernel then checks that the opcode is a JUMP, and that the destination is not a JUMPDEST by checking the
 JUMPDEST segment.
--- a/evm/spec/zkevm.pdf
+++ b/evm/spec/zkevm.pdf
--- a/evm/src/cpu/docs/memory-handling.md
+++ b/evm/src/cpu/docs/memory-handling.md
@ -1,32 +0,0 @@
-# Memory structure
-The CPU interacts with the EVM memory via memory channels. At each CPU row, a memory channel can execute a write, a read, or be disabled. A full memory channel is composed of:
- 1 `used` flag. If it's set to `true`, a memory operation is executed in this channel at this row. If it's set to `false`, no operation is done but its columns might be reused for other purposes.
- 1 `is_read` flag. It indicates if a memory operation is a read or a write.
- 3 address columns. A memory address is made of three parts: `context`, `segment` and `virtual`.
- 8 value columns. EVM words are 256 bits long, and they are broken down in 8 32-bit limbs.
-The last memory channel is `current_row.partial_channel`: it doesn't have its own column values and shares them with the first memory channel `current_row.mem_channels[0]`. This allows use to save eight columns.
-
-# Top of the stack
-The majority of memory operations involve the stack. The stack is a segment in memory, and stack operations (popping or pushing) use the memory channels. Every CPU instruction performs between 0 and 4 pops, and may push at most once. However, for efficiency purposes, we hold the top of the stack in `current_row.mem_channels[0]`, only writing it in memory if necessary.
-
-## Motivation
-See https://github.com/0xPolygonZero/plonky2/issues/1149.
-
-## Top reading and writing
-When a CPU instruction modifies the stack, it must update the top of the stack accordingly. There are three cases.
-
- **The instruction pops and pushes:** The new top of the stack is stored in `next_row.mem_channels[0]`; it may be computed by the instruction, or it could be read from memory. In either case, the instruction is responsible for setting `next_row.mem_channels[0]`'s flags and address columns correctly. After use, the previous top of the stack is discarded and doesn't need to be written in memory.
-
- **The instruction pushes, but doesn't pop:** The new top of the stack is stored in `next_row.mem_channels[0]`; it may be computed by the instruction, or it could be read from memory. In either case, the instruction is responsible for setting `next_row.mem_channels[0]`'s flags and address columns correctly. If the stack wasn't empty (`current_row.stack_len > 0`), the current top of the stack is stored with a memory read in `current_row.partial_channel`, which shares its values with `current_row.mem_channels[0]` (which holds the current top of the stack). If the stack was empty, `current_row.partial_channel` is disabled.
-
- **The instruction pops, but doesn't push:** After use, the current top of the stack is discarded and doesn't need to be written in memory. If the stack isn't now empty (`current_row.stack_len > num_pops`), the new top of the stack is set in `next_row.mem_channels[0]` with a memory read from the stack segment. If the stack is now empty, `next_row.mem_channels[0]` is disabled.
-
-In the last two cases, there is an edge case if `current_row.stack_len` is equal to a `special_len`. For a strictly pushing instruction, this happens if the stack is empty, and `special_len = 0`. For a strictly popping instruction, this happens if the next stack is empty, i.e. that all remaining elements are popped, and `special_len = num_pops`. Note that we do not need to check for values below `num_pops`, since this would be a stack underflow exception which is handled separately.
-The edge case is detected with the compound flag `1 - not_special_len * stack_inv_aux`, where `not_special_len = current_row - special_len` and `stack_inv_aux` is constrained to be the modular inverse of `is_not_special_len` if it's non-zero, or `0` otherwise. The flag is `1` if `stack_len` is equal to `special_len`, and `0` otherwise.
-
-This logic can be found in code in the `eval_packed_one` function of `stack.rs`, which multiplies all of the constraints with a degree 1 filter passed as argument.
-
-## Operation flag merging
-To reduce the total number of columns, many operation flags are merged together (e.g. DUP and SWAP) and are distinguished with the binary decomposition of their opcodes. The filter for a merged operation is now of degree 2: for example, `is_swap = current_row.op.dup_swap * current_row.opcode_bits[4]` since the 4th bit is set to 1 for a SWAP and 0 for a DUP. If the two instructions have different stack behaviors, this can be a problem: `eval_packed_one`'s degrees are already of degree 3 and it can't support degree 2 filters.
-
-When this happens, stack constraints are defined manually in the operation's dedicated file (e.g. `dup_swap.rs`). Implementation details vary case-by-case and can be found in the files.