plonky2/evm/spec/cpulogic.tex
Linda Guiga 06933b1da4
Starting the specs for the CPU logic (#1377)
* Add CPU logic section to specs

* Add kernel specs

* Apply comments

---------

Co-authored-by: Hamy Ratoanina <hamy.ratoanina@toposware.com>
2023-11-21 17:08:15 -05:00

171 lines
15 KiB
TeX

\section{CPU logic}
\label{cpulogic}
The CPU is in charge of coordinating the different STARKs, proving the correct execution of the instructions it reads and guaranteeing
that the final state of the EVM corresponds to the starting state after executing the input transaction. All design choices were made
to make sure these properties can be adequately translated into constraints of degree at most 3 while minimizing the size of the different
table traces (number of columns and number of rows).
In this section, we will detail some of these choices.
\subsection{Kernel}
The kernel is in charge of the proving logic. This section aims at providing a high level overview of this logic. For details about any specific part of the logic, one can consult the various ``asm'' files in the \href{https://github.com/0xPolygonZero/plonky2/tree/main/evm/src/cpu/kernel}{``kernel'' folder}.
We prove one transaction at a time. These proofs can later be aggregated recursively to prove a block. Proof aggregation is however not in the scope of this section. Here, we assume that we have an initial state of the EVM, and we wish to prove that a single transaction was correctly executed, leading to a correct update of the state.
Since we process one transaction at a time, a few intermediary values need to be provided by the prover. Indeed, to prove that the registers in the EVM state are correctly updated, we need to have access to their initial values. When aggregating proofs, we can also constrain those values to match from one transaction to the next. Let us consider the example of the transaction number. Let $n$ be the number of transactions executed so far in the current block. If the current proof is not a dummy one (we are indeed executing a transaction), then the transaction number should be updated: $n := n+1$. Otherwise, the number remains unchanged. We can easily constrain this update. When aggregating the previous transaction proof ($lhs$) with the current one ($rhs$), we also need to check that the output transaction number of $lhs$ is the same as the input transaction number of $rhs$.
Those prover provided values are stored in memory prior to entering the kernel, and are used in the kernel to assert correct updates. The list of prover provided values necessary to the kernel is the following:
\begin{enumerate}
\item the previous transaction number: $t_n$,
\item the gas used before executing the current transaction: $g\_u_0$,
\item the gas used after executing the current transaction: $g\_u_1$,
\item the block bloom filter before executing the current transaction: $b\_f_0$
\item the block bloom filter after executing the current transaction: $b\_f_1$,
\item the state, transaction and receipts MPTs before executing the current transaction: $\texttt{tries}_0$,
\item the hash of all MPTs before executing the current transaction: $\texttt{digests}_0$,
\item the hash of all MPTs after executing the current transaction: $\texttt{digests}_1$,
\item the RLP encoding of the transaction.
\end{enumerate}
\paragraph*{Initialization:} The first step consists in initializing:
\begin{itemize}
\item The shift table: it maps the number of bit shifts $s$ with its shifted value $1 << s$. Note that $0 \leq s \leq 255$.
\item The block bloom filter: the current block bloom filter is initialized with $b\_f_0$.
\item The initial MPTs: the initial state, transaction and receipt tries $\texttt{tries}_0$ are loaded from memory and hashed. The hashes are then compared to $\texttt{digests}\_0$.
\item We load the transaction number $t\_n$ and the current gas used $g\_u_0$ from memory.
\end{itemize}
If no transaction is provided, we can halt after this initialization. Otherwise, we start processing the transaction. The transaction is provided as its RLP encoding. We can deduce the various transaction fields (such as its type or the transfer value) from its encoding. Based on this, the kernel updates the state trie by executing the transaction. Processing the transaction also includes updating the transactions MPT with the transaction at hand.
The processing of the transaction returns a boolean ``success'' that indicates whether the transaction was executed successfully, along with the leftover gas.
The following step is then to update the receipts MPT. Here, we update the transaction's bloom filter and the block bloom filter. We store ``success'', the leftover gas, the transaction bloom filter and the logs in memory. We also store some additional information that facilitates the RLP encoding of the receipts later.
If there are any withdrawals, they are performed at this stage.
Finally, once the three MPTs have been updated, we need to carry out final checks:
\begin{itemize}
\item the gas used after the execution is equal to $g\_u_1$,
\item the new transaction number is $n+1$ if there was a transaction,
\item the updated block bloom filter is equal to $b\_f_1$,
\item the three MPTs are hashed and checked against $\texttt{digests}_1$.
\end{itemize}
Once those final checks are performed, the program halts.
\paragraph{MPT hashing:}
MPTs are a complex structure in the kernel, and we will not delve into all of its aspects. Here, we only explain how the hashing works, since it is part of the initialization and final checks.
The data required for the MPTs are stored in the ``TrieData'' segment in memory. Whenever we need to hash an MPT, we recover the information from the ``TrieData'' segment and write it in the correct format in the ``RlpRaw'' segment. We start by getting the node type. If the node is a hash node, we simply return its value. Otherwise, we RLP encode the node recursively:
\begin{itemize}
\item If it is an empty node, the encoding is $\texttt{0x80}$.
\item If it is a branch node, we encode the node's value and append it to the RLP tape. Then, we encode each of the children and append the encodings to the RLP tape.
\item If it is an extension node, we RLP encode its child and hex prefix it.
\item If it is a leaf, we RLP encode it depending on the type of trie, and hex prefix the encoding. Note that for a receipt leaf, the encoding is RLP($\texttt{type} || \texttt{RLP(receipt)})$. In the case of a transaction, their RLP encoding is already provided by the input, so we simply load it from memory.
\end{itemize}
Finally, we hash the output of the RLP encoding, stored in ``RlpRaw'' -- unless it is already a hash.
\subsection{Privileged instructions}
To ease and speed-up proving time, the zkEVM supports custom, privileged instructions that can only be executed by the kernel.
Any appearance of those privileged instructions in a contract bytecode for instance would result in an unprovable state.
In what follows, we denote by $p_{BN}$ the characteristic of the BN254 curve base field, curve for which Ethereum supports the
ecAdd, ecMul and ecPairing precompiles.
\begin{enumerate}[align=left]
\item[0x0C.] \texttt{ADDFP254}. Pops 2 elements from the stack interpreted as BN254 base field elements, and pushes their addition modulo $p_{BN}$ onto the stack.
\item[0x0D.] \texttt{MULFP254}. Pops 2 elements from the stack interpreted as BN254 base field elements, and pushes their product modulo $p_{BN}$ onto the stack.
\item[0x0E.] \texttt{SUBFP254}. Pops 2 elements from the stack interpreted as BN254 base field elements, and pushes their difference modulo $p_{BN}$ onto the stack.
This instruction behaves similarly to the SUB (0x03) opcode, in that we subtract the second element of the stack from the initial (top) one.
\item[0x0F.] \texttt{SUBMOD}. Pops 3 elements from the stack, and pushes the modular difference of the first two elements of the stack by the third one.
It is similar to the SUB instruction, with an extra pop for the custom modulus.
\item[0x21.] \texttt{KECCAK\_GENERAL}. Pops 4 elements (successively the context, segment, and offset portions of a Memory address, followed by a length $\ell$)
and pushes the hash of the memory portion starting at the constructed address and of length $\ell$. It is similar to KECCAK256 (0x20) instruction, but can be applied to
any memory section (i.e. even privileged ones).
\item[0x49.] \texttt{PROVER\_INPUT}. Pushes a single prover input onto the stack.
\item[0xC0-0xDF.] \texttt{MSTORE\_32BYTES}. Pops 4 elements from the stack (successively the context, segment, and offset portions of a Memory address, and then a value), and pushes
a new offset' onto the stack. The value is being decomposed into bytes and written to memory, starting from the reconstructed address. The new offset being pushed is computed as the
initial address offset + the length of the byte sequence being written to memory. Note that similarly to PUSH (0x60-0x7F) instructions there are 31 MSTORE\_32BYTES instructions, each
corresponding to a target byte length (length 0 is ignored, for the same reasons as MLOAD\_32BYTES, see below). Writing to memory an integer fitting in $n$ bytes with a length $\ell < n$ will
result in the integer being truncated. On the other hand, specifying a length $\ell$ greater than the byte size of the value being written will result in padding with zeroes. This
process is heavily used when resetting memory sections (by calling MSTORE\_32BYTES\_32 with the value 0).
\item[0xF6.] \texttt{GET\_CONTEXT}. Pushes the current context onto the stack. The kernel always has context 0.
\item[0xF7.] \texttt{SET\_CONTEXT}. Pops the top element of the stack and updates the current context to this value. It is usually used when calling another contract or precompile,
to distinguish the caller from the callee.
\item[0xF8.] \texttt{MLOAD\_32BYTES}. Pops 4 elements from the stack (successively the context, segment, and offset portions of a Memory address, and then a length $\ell$), and pushes
a value onto the stack. The pushed value corresponds to the U256 integer read from the big-endian sequence of length $\ell$ from the memory address being reconstructed. Note that an
empty length is not valid, nor is a length greater than 32 (as a U256 consists in at most 32 bytes). Missing these conditions will result in an unverifiable proof.
\item[0xF9.] \texttt{EXIT\_KERNEL}. Pops 1 element from the stack. This instruction is used at the end of a syscall, before proceeding to the rest of the execution logic.
The popped element, \textit{kexit\_info}, contains several informations like the current program counter, current gas used, and if we are in kernel (i.e. privileged) mode.
\item[0xFB.] \texttt{MLOAD\_GENERAL}. Pops 3 elements (successively the context, segment, and offset portions of a Memory address), and pushes the value stored at this memory
address onto the stack. It can read any memory location, general (similarly to MLOAD (0x51) instruction) or privileged.
\item[0xFC.] \texttt{MSTORE\_GENERAL}. Pops 4 elements (successively a value, then the context, segment, and offset portions of a Memory address), and writes the popped value from
the stack at the reconstructed address. It can write to any memory location, general (similarly to MSTORE (0x52) / MSTORE8 (0x53) instructions) or privileged.
\end{enumerate}
\subsection{Stack handling}
TODO
\subsection{Gas handling}
TODO
\subsection{Exceptions}
Sometimes, when executing user code (i.e. contract or transaction code), the EVM halts exceptionally (i.e. outside of a STOP, a RETURN or a REVERT).
When this happens, the CPU table invokes a special instruction with a dedicated operation flag \texttt{exception}.
Exceptions can only happen in user mode; triggering an exception in kernel mode would make the proof unverifiable.
No matter the exception, the handling is the same:
-- The opcode which would trigger the exception is not executed. The operation flag set is \texttt{exception} instead of the opcode's flag.
-- We push a value to the stack which contains: the current program counter (to retrieve the faulty opcode), and the current value of \texttt{gas\_used}.
The program counter is then set to the corresponding exception handler in the kernel (e.g. \texttt{exc\_out\_of\_gas}).
-- The exception handler verifies that the given exception would indeed be triggered by the faulty opcode. If this is not the case (if the exception has already happened or if it doesn't happen after executing
the faulty opcode), then the kernel panics: there was an issue during witness generation.
-- The kernel consumes the remaining gas and returns from the current context with \texttt{success} set to 0 to indicate an execution failure.
Here is the list of the possible exceptions:
\begin{enumerate}[align=left]
\item[\textbf{Out of gas:}] Raised when a native instruction (i.e. not a syscall) in user mode pushes the amount of gas used over the current gas limit.
When this happens, the EVM jumps to \texttt{exc\_out\_of\_gas}. The kernel then checks that the consumed gas is currently below the gas limit,
and that adding the gas cost of the faulty instruction pushes it over it.
If the exception is not raised, the prover will panic when returning from the execution: the remaining gas is checked to be positive after STOP, RETURN or REVERT.
\item[\textbf{Invalid opcode:}] Raised when the read opcode is invalid. It means either that it doesn't exist, or that it's a privileged instruction and
thus not available in user mode. When this happens, the EVM jumps to \texttt{exc\_invalid\_opcode}. The kernel then checks that the given opcode is indeed invalid.
If the exception is not raised, decoding constraints ensure no operation flag is set to 1, which would make it a padding row. Halting constraints would then make the proof
unverifiable.
\item[\textbf{Stack underflow:}] Raised when an instruction which pops from the stack is called when the stack doesn't have enough elements.
When this happens, the EVM jumps to \texttt{exc\_stack\_overflow}. The kernel then checks that the current stack length is smaller than the minimum
stack length required by the faulty opcode.
If the exception is not raised, the popping memory operation's address offset would underflow, and the Memory range check would require the Memory trace to be too
large to be provable ($>2^{60}$).
\item[\textbf{Invalid JUMP destination:}] Raised when the program counter jumps to an invalid location (i.e. not a JUMPDEST). When this happens, the EVM jumps to
\texttt{exc\_invalid\_jump\_destination}. The kernel then checks that the opcode is a JUMP, and that the destination is not a JUMPDEST by checking the
JUMPDEST segment.
If the exception is not raised, jumping constraints will fail the proof.
\item[\textbf{Invalid JUMPI destination:}] Same as the above, for JUMPI.
\item[\textbf{Stack overflow:}] Raised when a pushing instruction in user mode pushes the stack over 1024. When this happens, the EVM jumps
to \texttt{exc\_stack\_overflow}. The kernel then checks that the current stack length is exactly equal to 1024 (since an instruction can only
push once at most), and that the faulty instruction is pushing.
If the exception is not raised, stack constraints ensure that a stack length of 1025 in user mode will fail the proof.
\end{enumerate}