I think `%mload_kernel_code_u32` makes sense when we need random access, but since the indices here are constant, let's just hardcode them like this.
This reduces the assembled size of `compression.asm` from 1827 to 1454 bytes. I think there's still a lot more we could do to shrink it, though it's not that important.