* Use static `KERNEL` in tests
* Print opcode count
* Update criterion
* Combine all syscalls into one flag (#802)
* Combine all syscalls into one flag
* Minor: typo
* Daniel PR comments
* Check that `le_sum` won't overflow
* security notes
* Test reverse_index_bits
Thanks to Least Authority for this
* clippy
* EVM shift left/right operations (#801)
* First parts of shift implementation.
* Disable range check errors.
* Tidy up ASM.
* Update comments; fix some .sum() expressions.
* First full draft of shift left/right.
* Missed a +1.
* Clippy.
* Address Jacqui's comments.
* Add comment.
* Fix missing filter.
* Address second round of comments from Jacqui.
* Remove signed operation placeholders from arithmetic table. (#812)
Co-authored-by: wborgeaud <williamborgeaud@gmail.com>
Co-authored-by: Daniel Lubarov <daniel@lubarov.com>
Co-authored-by: Jacqueline Nabaglo <jakub@mirprotocol.org>
Co-authored-by: Hamish Ivey-Law <426294+unzvfu@users.noreply.github.com>
Again borrowing syntax from NASM. Example from the test:
%macro spin
%%start:
PUSH %%start
JUMP
%endmacro
One thing this lets us do is create "wrapper" macros which call a function, then return to the code immediately following the macro call, such as
%macro decode_rlp_scalar
%stack (pos) -> (pos, %%after)
%jump(decode_rlp_scalar)
%%after:
%endmacro
I used this to clean up `type_0.asm`.
However, since such macros need to insert `%%after` beneath any arguments in the stack, using them will be suboptimal in some cases. I wouldn't worry about it generally, but we might want to avoid them in performance-critical code, or functions with many arguments like `memcpy`.
Uses a variant of Dijkstra's algorithm, with a few pruning heuristics, to find a path of instructions between the two stack states. We don't explicitly store the graph, though; neighboring states are generated on the fly.
The Dijkstra implementation is somewhat inspired by the `pathfinding` crate. That crate doesn't quite fit our needs though.
If we need to make it faster later, there are a lot of allocations and clones that we could probably eliminate.
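To illustrate the idea, here is a minimal sketch of Dijkstra's algorithm over an implicit graph of stack states. It is not the code from this change: the `StackOp` set, the `op_cost` values (borrowed from EVM gas costs for POP/DUP/SWAP), the `max_len` pruning bound, and all function names are assumptions made for the example.

```rust
use std::cmp::Reverse;
use std::collections::{BinaryHeap, HashMap};

/// Hypothetical op set; the real assembler may support more (e.g. PUSH).
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum StackOp {
    Pop,
    Dup(usize),  // Dup(1) copies the top item
    Swap(usize), // Swap(1) exchanges the top two items
}

/// A stack state; the top of the stack is the last element.
pub type Stack = Vec<u8>;

/// Assumed cost model: POP = 2, DUP/SWAP = 3.
fn op_cost(op: StackOp) -> u32 {
    match op {
        StackOp::Pop => 2,
        StackOp::Dup(_) | StackOp::Swap(_) => 3,
    }
}

/// Applies `op` to `s`, or returns `None` if the stack is too shallow.
pub fn apply(op: StackOp, s: &Stack) -> Option<Stack> {
    let mut t = s.clone();
    let n = t.len();
    match op {
        StackOp::Pop => {
            t.pop()?;
        }
        StackOp::Dup(k) => {
            if k == 0 || k > n {
                return None;
            }
            let item = t[n - k];
            t.push(item);
        }
        StackOp::Swap(k) => {
            if k == 0 || k >= n {
                return None;
            }
            t.swap(n - 1, n - 1 - k);
        }
    }
    Some(t)
}

/// Finds a cheapest op sequence from `src` to `dst`. Edges are generated on
/// demand; states with more than `max_len` items are pruned.
pub fn shortest_path(src: &Stack, dst: &Stack, max_len: usize) -> Option<Vec<StackOp>> {
    let mut best: HashMap<Stack, u32> = HashMap::new();
    let mut prev: HashMap<Stack, (Stack, StackOp)> = HashMap::new();
    let mut queue = BinaryHeap::new();
    best.insert(src.clone(), 0);
    queue.push(Reverse((0u32, src.clone())));

    while let Some(Reverse((cost, state))) = queue.pop() {
        if &state == dst {
            // Walk the predecessor links back to `src`.
            let mut path = Vec::new();
            let mut cur = state;
            while let Some((parent, op)) = prev.get(&cur) {
                path.push(*op);
                cur = parent.clone();
            }
            path.reverse();
            return Some(path);
        }
        if cost > best[&state] {
            continue; // stale queue entry
        }
        // Enumerate neighbors; DUP/SWAP depths are capped at 16 as in the EVM.
        let n = state.len();
        let mut ops = vec![StackOp::Pop];
        ops.extend((1..=n.min(16)).map(StackOp::Dup));
        ops.extend((1..n.min(17)).map(StackOp::Swap));
        for op in ops {
            if let Some(next) = apply(op, &state) {
                if next.len() > max_len {
                    continue; // pruning: never grow past the bound
                }
                let next_cost = cost + op_cost(op);
                if best.get(&next).map_or(true, |&c| next_cost < c) {
                    best.insert(next.clone(), next_cost);
                    prev.insert(next.clone(), (state.clone(), op));
                    queue.push(Reverse((next_cost, next)));
                }
            }
        }
    }
    None
}
```

The `state.clone()` calls in the inner loop are the kind of allocation that could be eliminated if this ever became a bottleneck.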
The kernel is hashed using a Keccak-based sponge for now. We could switch to Poseidon later if our kernel grows too large.
Note that we use simple zero-padding (pad0*) instead of the standard pad10* rule. It's simpler, and we don't care that the prover can add extra 0s at the end of the code. The program counter can never reach those bytes, and even if it could, they'd be 0 anyway given the EVM's zero-initialization rule.
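As a rough sketch of what pad0* means here (not the code from this change), the function below appends the minimum number of zero bytes needed to reach a multiple of the 136-byte rate. The names `KECCAK_RATE_BYTES` and `pad0` are made up for the example, and leaving an input that is already a multiple of the rate unchanged is an assumption.

```rust
/// Keccak rate in bytes, as mentioned above (136 bytes absorbed per hash).
pub const KECCAK_RATE_BYTES: usize = 136;

/// Hypothetical helper: zero-pads `code` up to the next multiple of the rate.
/// Assumption: an input that is already block-aligned gets no extra padding.
pub fn pad0(code: &[u8]) -> Vec<u8> {
    let rem = code.len() % KECCAK_RATE_BYTES;
    let pad_len = if rem == 0 { 0 } else { KECCAK_RATE_BYTES - rem };
    let mut padded = code.to_vec();
    padded.resize(code.len() + pad_len, 0);
    padded
}
```

Unlike pad10*, this padding is not injective: `pad0(&[1])` and `pad0(&[1, 0])` produce the same block. That is exactly the ambiguity described above, where the prover can append extra zeros without changing anything the program counter can observe.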
In one CPU row, we can do a whole Keccak hash (via the CTL), absorbing 136 bytes. But we can't actually bootstrap that many bytes of kernel code in one row, because we're also limited by memory bandwidth. Currently we can write 4 bytes of the kernel to memory in one row.
So we treat the `keccak_input_limbs` columns as a buffer. We gradually fill up this buffer, 4 bytes (one `u32` word) at a time. Every `136 / 4 = 34` rows, the buffer will be full, so at that point we activate the Keccak CTL to absorb the buffer.
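The schedule can be sketched as a small simulation. This is only an illustration of the row arithmetic, not the actual STARK constraints; the function `absorb_schedule` and the constant names are hypothetical, and the input is assumed to be already zero-padded to a multiple of 136 bytes.

```rust
pub const KECCAK_RATE_BYTES: usize = 136;
pub const BYTES_PER_ROW: usize = 4; // one u32 word written to memory per row
pub const ROWS_PER_ABSORB: usize = KECCAK_RATE_BYTES / BYTES_PER_ROW; // 34

/// Hypothetical simulation: walks the padded kernel code one row at a time,
/// filling a 136-byte buffer. Returns each row index at which the buffer is
/// full, together with the block that would be absorbed on that row.
pub fn absorb_schedule(padded_code: &[u8]) -> Vec<(usize, [u8; KECCAK_RATE_BYTES])> {
    assert_eq!(padded_code.len() % KECCAK_RATE_BYTES, 0, "input must be padded");
    let mut buffer = [0u8; KECCAK_RATE_BYTES];
    let mut absorbed = Vec::new();
    for (row, chunk) in padded_code.chunks(BYTES_PER_ROW).enumerate() {
        // Position of this row's 4 bytes within the current block.
        let offset = (row % ROWS_PER_ABSORB) * BYTES_PER_ROW;
        buffer[offset..offset + BYTES_PER_ROW].copy_from_slice(chunk);
        // On the last row of each group of 34, the buffer is full.
        if row % ROWS_PER_ABSORB == ROWS_PER_ABSORB - 1 {
            absorbed.push((row, buffer));
        }
    }
    absorbed
}
```

For a 272-byte padded kernel, this yields absorptions on rows 33 and 67 (zero-indexed), i.e. once every 34 rows.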
For now it doesn't log filenames, but we can compare against the list of filenames in `combined_kernel`.
Current output:
```
[DEBUG plonky2_evm::cpu::kernel::assembler] Assembled file size: 0 bytes
[DEBUG plonky2_evm::cpu::kernel::assembler] Assembled file size: 49 bytes
[DEBUG plonky2_evm::cpu::kernel::assembler] Assembled file size: 387 bytes
[DEBUG plonky2_evm::cpu::kernel::assembler] Assembled file size: 27365 bytes
[DEBUG plonky2_evm::cpu::kernel::assembler] Assembled file size: 0 bytes
[DEBUG plonky2_evm::cpu::kernel::assembler] Assembled file size: 11 bytes
[DEBUG plonky2_evm::cpu::kernel::assembler] Assembled file size: 7 bytes
[DEBUG plonky2_evm::cpu::kernel::aggregator::tests] Total kernel size: 27819 bytes
```
This shows that most of our kernel code is from `curve_add.asm`, which makes sense since it involves a couple of uses of the large `inverse` macro. Thankfully that will be replaced at some point.