Add specs for BytePackingStark (#1373)

* Start
* Finish documenting BytePackingStark
* Apply comments
* Apply comment

---------

Co-authored-by: Robin Salen <salenrobin@gmail.com>
2023-11-22 15:59:21 -05:00
commit 8d473168d6
parent fe311c7f90
2 changed files with 51 additions and 1 deletions
--- a/evm/spec/tables/byte-packing.tex
+++ b/evm/spec/tables/byte-packing.tex
@@ -1,4 +1,54 @@
 \subsection{Byte Packing}
 \label{byte-packing}

-TODO
+The BytePacking STARK module is used for reading and writing non-empty byte sequences of length at most 32 to memory.
+The term ``packing'' highlights that reading a sequence from memory packs its bytes into an EVM word (i.e. U256), while
+the ``unpacking'' operation consists of breaking down an EVM word into its byte sequence and writing it to memory.
+
+This allows faster memory copies between two memory locations, as well as faster memory resets
+(see the \href{https://github.com/0xPolygonZero/plonky2/blob/main/evm/src/cpu/kernel/asm/memory/memcpy.asm}{memcpy.asm} and
+\href{https://github.com/0xPolygonZero/plonky2/blob/main/evm/src/cpu/kernel/asm/memory/memset.asm}{memset.asm} modules).
+
+The `BytePackingStark' table has $\ell$ rows per packing/unpacking operation, where $0 < \ell \leq 32$ is the length of the sequence being processed.
+
+Each row contains the following columns:
+\begin{enumerate}
+    \item 5 columns describing the operation and the initial memory address from which the sequence starts
+    (namely a flag differentiating read and write operations, the address context, segment and offset values, and the timestamp),
+    \item $f_{\texttt{end}}$, a flag indicating the end of a sequence,
+    \item 32 columns $b_i$ indicating the index of the byte being read/written at a given row,
+    \item 32 columns $v_i$ indicating the values of the bytes that have been read or written during a sequence,
+    \item 2 columns $r_i$ needed for range-checking the byte values.
+\end{enumerate}
+
+\paragraph{Notes on column generation:}
+Whenever a byte unpacking operation is called, the value $\texttt{val}$ is read from the stack. Because the EVM and the STARKs use different endianness, we first convert $\texttt{val}$ to a little-endian byte sequence. Only then do we resize it to the appropriate length, pruning extra zeros or higher-order bytes in the process. Finally, we reverse the byte order and write this new sequence into the $v_i$ columns of the table.
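+
+As an illustrative sketch of these steps (the value and length below are hypothetical, chosen only for this example), consider unpacking $\texttt{val} = \texttt{0x0a0b0c}$ with length $\ell = 2$:
+\[
+\begin{array}{ll}
+\mbox{little-endian bytes of } \texttt{val}: & (\texttt{0x0c}, \texttt{0x0b}, \texttt{0x0a}, \texttt{0x00}, \dots, \texttt{0x00}) \\
+\mbox{resized to } \ell = 2: & (\texttt{0x0c}, \texttt{0x0b}) \\
+\mbox{reversed}: & (\texttt{0x0b}, \texttt{0x0c})
+\end{array}
+\]
+Here the higher-order byte $\texttt{0x0a}$ and the 29 padding zeros are pruned. Assuming the usual big-endian layout of EVM memory, $\texttt{0x0b}$ and $\texttt{0x0c}$ would then land at the first and second address of the target range, respectively.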
+
+For a byte packing operation, the bytes are read one by one from memory and stored in the $v_i$ columns of the `BytePackingStark' table.
+
+Note that because of the different endianness on the memory and EVM sides, we generate rows starting with the final virtual address value (and the associated byte). We decrement the address at each row.
+
+Each $b_i$ column holds a boolean value: $b_i = 1$ whenever we are currently reading or writing the $i$-th element of the byte sequence, and $b_i = 0$ otherwise.
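+
+As a minimal sketch of the resulting layout (assuming a hypothetical sequence of length $\ell = 3$, rows numbered $j = 0, 1, 2$ in generation order, and the index $i$ counting bytes in that same order), the flag columns would read:
+\[
+\begin{array}{c|cccc|c}
+j & b_0 & b_1 & b_2 & b_3, \dots, b_{31} & f_{\texttt{end}} \\
+\hline
+0 & 1 & 0 & 0 & 0 & 0 \\
+1 & 0 & 1 & 0 & 0 & 0 \\
+2 & 0 & 0 & 1 & 0 & 1
+\end{array}
+\]
+Under these assumptions, exactly one $b_i$ is set per row, and $f_{\texttt{end}}$ is set only on the last of the $\ell$ rows.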
+
+\paragraph{Cross-table lookups:}
+The read or written bytes need to be checked against both the CPU and the memory tables. Whenever we call $\texttt{MSTORE\_32BYTES}$, $\texttt{MLOAD\_32BYTES}$ or $\texttt{PUSH}$ on the CPU side, we use `BytePackingStark' to make sure we are carrying out the correct operation on the correct values. To this end, we check that the following values correspond:
+\begin{enumerate}
+    \item the address (comprising the context, the segment, and the virtual address),
+    \item the length of the byte sequence,
+    \item the timestamp,
+    \item the value (either written to or read from the stack).
+\end{enumerate}
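+
+The length and the value in the list above are not stored in dedicated columns, so they have to be derived from the $b_i$ and $v_i$ columns. One possible way to express them, given here only as a sketch, is on the row where $f_{\texttt{end}} = 1$:
+\[
+\ell = \sum_{i=0}^{31} (i + 1) \cdot b_i,
+\qquad
+\texttt{val} = \sum_{i=0}^{31} 256^i \cdot v_i.
+\]
+Both identities rest on assumptions that are not stated in this section: that $b_{\ell - 1}$ is the flag set on the final row of a sequence, that $v_i$ holds the $i$-th processed byte (least significant first) with all unused $v_i$ equal to 0, and that the second sum is read as an integer identity. Since a U256 does not fit in a single field element, an actual lookup would have to split this sum into smaller limbs.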
+
+On the other hand, we need to make sure that the read and write operations correspond to the values read or stored on the memory side. We therefore need a CTL for each byte, checking that the following values are identical in `MemoryStark' and `BytePackingStark':
+\begin{enumerate}
+    \item a flag indicating whether the operation is a read or a write,
+    \item the address (context, segment and virtual address),
+    \item the byte (followed by 0s to make sure the memory address contains a byte and not a U256 word),
+    \item the timestamp.
+\end{enumerate}
+
+\paragraph*{Note on range-check:} Range-checking is necessary whenever we do a memory unpacking operation that will
+write values to memory. These values are constrained by the range-check to be 8-bit values, i.e. between 0 and 255 inclusive.
+Range-checking values read from memory is not necessary. However, because we use the same $\texttt{byte\_values}$ columns for both read
+and write operations, this extra condition is enforced throughout the whole trace regardless of the operation type.
+
--- a/evm/spec/zkevm.pdf
+++ b/evm/spec/zkevm.pdf