Lots of little bugs!
- The Keccak sponge table's padding logic was wrong, it was mixing up the number of rows with the number of hashes.
- The Keccak sponge table's Keccak-looking data was wrong - input to Keccak-f should be after xor'ing in the block.
- The Keccak sponge table's logic-looking filter was wrong. We do 5 logic CTLs for any final-block row, even if some of the xors are with 0s from Keccak padding.
- The CPU was using the wrong/outdated output memory channel for its Keccak sponge and logic CTLs.
- The Keccak table just didn't have a way to filter out padding rows. I added a filter column for this.
- The Keccak table wasn't remembering the original preimage of a permutation; lookers were seeing the preimage of the final step. I added columns for the original preimage.
- `ctl_data_logic` was using the wrong memory channel
- Kernel bootloading generation was using the wrong length for its Keccak sponge CTL, and its `keccak_sponge_log` was seeing the wrong clock since it was called after adding the final bootloading row.
* Make jumps, logic, and syscalls read from/write to memory columns
* Change CTL convention (outputs precede inputs)
* Change convention so outputs follow inputs in memory channel order
I was thinking we could have two sets of shared columns:
- First, a set of "core" columns which would contain instruction decoding registers during an execution cycle, or some counter data during a kernel bootloading cycle.
- Second, a set of "general" columns which would be more general-purpose. For now it could contain "looking" columns for most CTLs (Keccak, arithmetic and logic; NOT memory since memory can be used simultaneously with the others). It could potentially be reused for other things too, such as the registers used for `EQ` and `IS_ZERO` (but I know it's nontrivial to share those since we would need to use lower-degree constraints, so I wouldn't bother for now).
This PR implements just the latter. If it looks good I'll proceed with the former afterward.
It's a bit more type-safe (can't mix up segment with context or virtual addr), and this way uniqueness of ordinals is enforced, partially addressing a concern raised in #591.
To avoid making `Segment` public (which I don't think would be appropriate), I had to make some other visibility changes, and had to move `generate_random_memory_ops` into the test module.