Ultimately these values are encoded as `[F; 8]`s in the table, but I don't anticipate any use case where we need to store more than 256 bits. We might as well store `U256`s until we actually build the table, since they're more compact.
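As a rough illustration of the two representations, here is a minimal sketch of splitting a 256-bit value into eight 32-bit limbs and back. The byte layout (little-endian, least significant limb first) and the function names are assumptions for illustration only; plain `u32`s stand in for the field elements `F`, and a `[u8; 32]` stands in for `U256`.

```rust
/// Split a 256-bit value (32 little-endian bytes) into eight 32-bit limbs,
/// least significant limb first. Each limb would become one field element.
fn u256_to_limbs(bytes_le: [u8; 32]) -> [u32; 8] {
    let mut limbs = [0u32; 8];
    for (i, chunk) in bytes_le.chunks_exact(4).enumerate() {
        limbs[i] = u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
    }
    limbs
}

/// Inverse of `u256_to_limbs`: reassemble the 32 little-endian bytes.
fn limbs_to_u256(limbs: [u32; 8]) -> [u8; 32] {
    let mut bytes = [0u8; 32];
    for (i, limb) in limbs.iter().enumerate() {
        bytes[4 * i..4 * i + 4].copy_from_slice(&limb.to_le_bytes());
    }
    bytes
}
```

The compact form is a single 32-byte value, whereas the limb form needs one field element per limb once it is written into the table.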
It's a bit more type-safe (can't mix up segment with context or virtual addr), and this way uniqueness of ordinals is enforced, partially addressing a concern raised in #591.
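A minimal sketch of the idea, with hypothetical variant names and a hypothetical `MemoryAddress` struct (not the actual definitions): because the segment is an enum rather than a bare integer, it can't be passed where a context or virtual address is expected, and Rust rejects an enum in which two variants share the same discriminant.

```rust
/// Hypothetical segments; the real list of variants is not shown here.
/// Giving two variants the same discriminant is a compile error,
/// so the ordinals are unique by construction.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
enum Segment {
    Code = 0,
    Stack = 1,
    MainMemory = 2,
}

/// Hypothetical address type: the segment has its own type, so it cannot be
/// swapped with `context` or `virt` by accident.
struct MemoryAddress {
    context: usize,
    segment: Segment,
    virt: usize,
}

impl MemoryAddress {
    /// Flatten to plain integers only at the point where the table is built.
    fn key(&self) -> (usize, usize, usize) {
        (self.context, self.segment as usize, self.virt)
    }
}
```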
To avoid making `Segment` public (which I don't think would be appropriate), I had to make some other visibility changes, and had to move `generate_random_memory_ops` into the test module.
The kernel is hashed using a Keccak-based sponge for now. We could switch to Poseidon later if our kernel grows too large.
Note that we use simple zero-padding (pad0*) instead of the standard pad10* rule. It's simpler, and we don't care that the prover can add extra 0s at the end of the code. The program counter can never reach those bytes, and even if it could, they'd be 0 anyway given the EVM's zero-initialization rule.
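For reference, zero-padding amounts to something like the sketch below. The function name is hypothetical, and the 136-byte block size is assumed to be the rate of the sponge; this is an illustration of pad0*, not the actual kernel-loading code.

```rust
/// Assumed sponge rate in bytes (one absorbed block).
const RATE_BYTES: usize = 136;

/// pad0*: append zero bytes until the length is a multiple of the rate.
/// Unlike pad10*, inputs that differ only in trailing zeros pad to the same
/// message, which is acceptable here for the reasons given above.
fn pad_zeros(code: &[u8]) -> Vec<u8> {
    let blocks = (code.len() + RATE_BYTES - 1) / RATE_BYTES;
    let mut padded = code.to_vec();
    padded.resize(blocks * RATE_BYTES, 0);
    padded
}
```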
In one CPU row, we can do a whole Keccak hash (via the CTL), absorbing 136 bytes. But we can't actually bootstrap that many bytes of kernel code in one row, because we're also limited by memory bandwidth. Currently we can write 4 bytes of the kernel to memory in one row.
So we treat the `keccak_input_limbs` columns as a buffer. We gradually fill up this buffer, 4 bytes (one `u32` word) at a time. Every `136 / 4 = 34` rows, the buffer will be full, so at that point we activate the Keccak CTL to absorb the buffer.
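The buffering schedule can be sketched roughly as follows. This only simulates which rows would trigger an absorb and what the buffer holds at that point; the function name and the return shape are made up for illustration, and the input is assumed to be already padded to a multiple of 136 bytes.

```rust
const BYTES_PER_ROW: usize = 4; // one u32 word of kernel code per CPU row
const RATE_BYTES: usize = 136; // bytes absorbed by one Keccak call
const ROWS_PER_ABSORB: usize = RATE_BYTES / BYTES_PER_ROW; // 34

/// Simulate bootstrapping: fill the buffer 4 bytes per row, and record
/// (row index, buffer contents) each time the buffer becomes full.
/// In the real design, those are the rows where the Keccak CTL is activated.
fn absorb_schedule(padded_kernel: &[u8]) -> Vec<(usize, Vec<u8>)> {
    assert_eq!(padded_kernel.len() % RATE_BYTES, 0, "input must be padded");
    let mut buffer = [0u8; RATE_BYTES];
    let mut absorbed = Vec::new();
    for (row, chunk) in padded_kernel.chunks_exact(BYTES_PER_ROW).enumerate() {
        let slot = row % ROWS_PER_ABSORB;
        let offset = slot * BYTES_PER_ROW;
        buffer[offset..offset + BYTES_PER_ROW].copy_from_slice(chunk);
        if slot == ROWS_PER_ABSORB - 1 {
            // Buffer is full: this row would absorb it, then it gets reused.
            absorbed.push((row, buffer.to_vec()));
        }
    }
    absorbed
}
```

With a 272-byte padded kernel, the buffer fills on rows 33 and 67, i.e. once every 34 rows.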
This should improve cache locality: since we generally access several values at a time in a given row, we want them to be close together in memory.
There are a few steps that make more sense column-wise, though, such as generating the `COUNTER` column. I put those after the transpose.
For now it doesn't log filenames, but we can compare against the list of filenames in `combined_kernel`.
Current output:
```
[DEBUG plonky2_evm::cpu::kernel::assembler] Assembled file size: 0 bytes
[DEBUG plonky2_evm::cpu::kernel::assembler] Assembled file size: 49 bytes
[DEBUG plonky2_evm::cpu::kernel::assembler] Assembled file size: 387 bytes
[DEBUG plonky2_evm::cpu::kernel::assembler] Assembled file size: 27365 bytes
[DEBUG plonky2_evm::cpu::kernel::assembler] Assembled file size: 0 bytes
[DEBUG plonky2_evm::cpu::kernel::assembler] Assembled file size: 11 bytes
[DEBUG plonky2_evm::cpu::kernel::assembler] Assembled file size: 7 bytes
[DEBUG plonky2_evm::cpu::kernel::aggregator::tests] Total kernel size: 27819 bytes
```
This shows that most of our kernel code is from `curve_add.asm`, which makes sense since it involves a couple of uses of the large `inverse` macro. Thankfully that will be replaced at some point.
This adds padding rows which satisfy the ordering checks. To ensure that they also satisfy the value consistency checks, I just copied the address and value from the last operation.
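Roughly, the padding looks like the sketch below. The `MemoryOp` struct here is a simplified stand-in (a single `address` field instead of the full address), and two details are assumptions on my part for the sake of the example: padding rows are marked as reads, and each one bumps the timestamp by one so the ordering checks still hold.

```rust
/// Simplified stand-in for a memory operation.
#[derive(Clone, Debug, PartialEq, Eq)]
struct MemoryOp {
    timestamp: usize,
    is_read: bool,
    address: usize,
    value: u64,
}

/// Pad `ops` (assumed already sorted) up to `target_len` rows by repeating
/// the address and value of the last real operation.
fn pad_memory_ops(ops: &mut Vec<MemoryOp>, target_len: usize) {
    let last = match ops.last() {
        Some(op) => op.clone(),
        None => return, // nothing to copy from
    };
    while ops.len() < target_len {
        let prev_timestamp = ops.last().unwrap().timestamp;
        ops.push(MemoryOp {
            timestamp: prev_timestamp + 1, // assumption: keeps timestamps increasing
            is_read: true,                 // assumption: padding rows are reads
            address: last.address,         // same address: ordering check holds
            value: last.value,             // same value: consistency check holds
        });
    }
}
```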
I think this method of padding feels more natural, though it is a bit more code since we need to calculate the max range check in a different way. But on the plus side, the constraints are a bit smaller and simpler.
Also added a few constraints that I think we need for soundness:
- Each `is_channel` flag is bool.
- Sum of `is_channel` flags is bool.
- Dummy operations must be reads (otherwise the prover could put writes in the memory table which aren't in the CPU table).
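To make the three constraints concrete, here is a sketch written over plain integers rather than field elements. The function names are hypothetical, and the way the last constraint is expressed (a row is treated as a dummy when no `is_channel` flag is set) is my assumption about the encoding, not necessarily how the STARK writes it.

```rust
/// Zero exactly when x is 0 or 1.
fn bool_constraint(x: i64) -> i64 {
    x * (x - 1)
}

/// Evaluate all constraints for one row; the row is valid iff every entry is 0.
fn channel_constraints(is_channel: &[i64], is_read: i64) -> Vec<i64> {
    let sum: i64 = is_channel.iter().sum();
    // 1. Each is_channel flag is boolean.
    let mut out: Vec<i64> = is_channel.iter().map(|&f| bool_constraint(f)).collect();
    // 2. The sum of the flags is boolean, so at most one channel is active.
    out.push(bool_constraint(sum));
    // 3. If no channel is active (a dummy row), the operation must be a read.
    out.push((1 - sum) * (1 - is_read));
    out
}

fn satisfied(is_channel: &[i64], is_read: i64) -> bool {
    channel_constraints(is_channel, is_read).iter().all(|&c| c == 0)
}
```

Note that the first two checks are both needed: flags of `[2, -1, 0, 0]` sum to 1, so only the per-flag check rejects them.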
We no longer store unsorted operations, since they are effectively stored in the CPU table already.
I ran into some issues with sorting, since the existing sort method didn't include `is_channel` columns. Rather than update the existing method, I removed it and added a sort on the `MemoryOp`s, which I think seems cleaner.
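Sorting the operations directly might look like the sketch below. The struct fields and the sort key (address components first, then timestamp) are assumptions for illustration; the point is just that sorting whole `MemoryOp`s keeps every associated column, including any channel flags, attached to its operation.

```rust
/// Simplified stand-in for a memory operation.
#[derive(Clone, Debug, PartialEq, Eq)]
struct MemoryOp {
    context: usize,
    segment: usize,
    virt: usize,
    timestamp: usize,
    is_read: bool,
    value: u64,
}

/// Sort by address, then by timestamp within each address (assumed ordering).
/// Moving whole structs means no column can fall out of sync with the others.
fn sort_memory_ops(ops: &mut [MemoryOp]) {
    ops.sort_by_key(|op| (op.context, op.segment, op.virt, op.timestamp));
}
```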