* First draft for division.
* `eval_division` work
* Division
* Minor: outdated fixme
* Tests and better column names
* Minor lints
* Remove redundant constraint
* Make division proof more formal
* Minor proof and comments
Co-authored-by: Hamish Ivey-Law <hamish@ivey-law.name>
* Halo2 style lookup arguments in System Zero
It's a really nice and simple protocol, particularly for the verifier since the constraints are trivial (aside from the underlying batched permutation checks, which we already support). See the [Halo2 book](https://zcash.github.io/halo2/design/proving-system/lookup.html) and this [talk](https://www.youtube.com/watch?v=YlTt12s7vGE&t=5237s) by @daira.
Previously we generated the whole trace in row-wise form, but it's much more efficient to generate these "permuted" columns column-wise. So I changed our STARK framework to accept the trace in column-wise form. STARK impls now have the flexibility to do some generation row-wise and some column-wise (without extra cost; there's still a single transpose, as before).
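The single transpose mentioned above can be sketched in plain Rust (illustrative only; this is not the actual plonky2 API, just a minimal example of going from row-wise to column-wise form):

```rust
// A minimal sketch (not the actual plonky2 API) of the single transpose that
// turns a row-wise-generated trace into column-wise form.
fn transpose(rows: &[Vec<u64>]) -> Vec<Vec<u64>> {
    let width = rows[0].len();
    let mut cols = vec![Vec::with_capacity(rows.len()); width];
    for row in rows {
        for (c, &v) in row.iter().enumerate() {
            cols[c].push(v);
        }
    }
    cols
}

fn main() {
    // Two rows of a three-column trace, generated row-wise...
    let rows = vec![vec![1, 2, 3], vec![4, 5, 6]];
    // ...become three columns of two entries each after the transpose.
    let cols = transpose(&rows);
    assert_eq!(cols, vec![vec![1u64, 4], vec![2, 5], vec![3, 6]]);
}
```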
* sorting
* fixes
* PR feedback
* into_iter
* timing
* Initial implementation of quintic extensions.
* Update to/from_biguint() methods.
* Draft of fast multiplication on quintic extensions over 64-bit base.
* cargo fmt
* Typo.
* Document functions (a bit).
* Refactor reduction step.
* Change multiplication call so that LLVM generates better assembly.
* Use one main accumulator instead of two minor ones; faster reduce.
* Use one main accumulator in square too; clean up redundant code.
* Call faster routines from Mul and Square impls.
* Fix reduction function.
* Fix square calculation.
* Slightly faster reduction.
* Clean up names and types.
* cargo fmt
* Move extension field mul/sqr specialisations to their own file.
* Rename functions to have unique prefix.
* Add faster quadratic multiplication/squaring.
* Faster quartic multiplication and squaring.
* cargo fmt
* clippy
* Alternative reduce160 function.
* Typo.
* Remove alternative reduction function.
* Remove delayed reduction implementation of squaring.
* Enforce assumptions about extension generators.
* Make the accumulation variable a u32 instead of u64.
* Add test to trigger carry branch in reduce160.
* cargo fmt
* Some documentation.
* Clippy; improved comments.
* cargo fmt
* Remove redundant Square specialisations.
* Fix reduce*() visibility.
* Faster reduce160 from Jakub.
* Change mul-by-const functions to operate on 160 bits instead of 128.
* Move code for extensions of GoldilocksField to its own file.
* Implement a mul-add circuit in the ALU
The inputs are assumed to be `u32`s, while the output is encoded as four `u16` limbs. Each output limb is range-checked.
So, our basic mul-add constraint looks like
`out_0 + 2^16 * out_1 + 2^32 * out_2 + 2^48 * out_3 = in_1 * in_2 + in_3`
The right-hand side will never overflow, since `u32::MAX * u32::MAX + u32::MAX < |F|`. However, the left-hand side can still wrap around `|F|`, even though each limb is known to be less than `2^16`.
For example, an operation like `0 * 0 + 0` has two limb encodings that satisfy the constraint above: the limbs of 0 and the limbs of `|F|` (which reduces to 0 in the field). To prevent these non-canonical outputs, we need a comparison to enforce that `out < |F|`.
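As a sanity check, the arithmetic above can be verified numerically in plain Rust (this is an illustration only, not the ALU code; `P` is the Goldilocks order `2^64 - 2^32 + 1`):

```rust
// Plain-Rust illustration of the mul-add constraint; not the ALU code itself.
// P is the Goldilocks field order, 2^64 - 2^32 + 1.
const P: u128 = 0xFFFF_FFFF_0000_0001;

fn main() {
    // The right-hand side never overflows the field:
    let max_rhs = (u32::MAX as u128) * (u32::MAX as u128) + (u32::MAX as u128);
    assert!(max_rhs < P);

    // Decompose an example result into four u16 limbs and recombine, checking
    // out_0 + 2^16 * out_1 + 2^32 * out_2 + 2^48 * out_3 = in_1 * in_2 + in_3.
    let (in_1, in_2, in_3) = (0xDEAD_BEEF_u128, 0x1234_5678_u128, 42_u128);
    let out = in_1 * in_2 + in_3;
    let limbs: Vec<u128> = (0..4).map(|i| (out >> (16 * i)) & 0xFFFF).collect();
    let lhs: u128 = limbs.iter().enumerate().map(|(i, &l)| l << (16 * i)).sum();
    assert_eq!(lhs, out);
}
```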
Thankfully, `F::MAX` has all zeros in its low 32 bits, so `x <= F::MAX` is equivalent to `x_lo == 0 || x_hi != u32::MAX`. `x_hi != u32::MAX` can be checked by showing that `u32::MAX - x_hi` has an inverse. If `x_hi != u32::MAX`, the prover provides this (purported) inverse in an advice column.
See @bobbinth's [post](https://hackmd.io/NC-yRmmtRQSvToTHb96e8Q#Checking-element-validity) for details. That post calls the purported inverse column `m`; I named it `canonical_inv` in this code.
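The two-branch canonicity check can be sketched in plain Rust (illustrative only, not the circuit code; `F_MAX` here is `|F| - 1` for Goldilocks):

```rust
// Illustrative sketch of the canonicity check, not the circuit code.
// For Goldilocks, F::MAX = |F| - 1 = 0xFFFF_FFFF_0000_0000: all zeros in the
// low 32 bits, so x <= F::MAX  <=>  x_lo == 0 || x_hi != u32::MAX.
const F_MAX: u64 = 0xFFFF_FFFF_0000_0000;

fn is_canonical(x: u64) -> bool {
    let x_lo = x as u32;
    let x_hi = (x >> 32) as u32;
    // In the circuit, `x_hi != u32::MAX` is shown by exhibiting an inverse of
    // `u32::MAX - x_hi` (the `canonical_inv` advice column).
    x_lo == 0 || x_hi != u32::MAX
}

fn main() {
    assert!(is_canonical(0));
    assert!(is_canonical(F_MAX));
    assert!(!is_canonical(F_MAX + 1)); // F_MAX + 1 = |F|, a non-canonical 0
    assert!(!is_canonical(u64::MAX));
}
```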
* fix
* PR feedback
* naming
* Batch multiple perm args into one Z and compute Z columnwise
It's slightly complex because we batch `constraint_degree - 1` permutation arguments into a single `Z` polynomial. This is a slight generalization of the [technique](https://zcash.github.io/halo2/design/proving-system/lookup.html) described in the Halo2 book.
Without this batching, we would simply have `num_challenges` sets of random challenges (betas and gammas). With this batching, however, each permutation argument within a batch needs its own randomness, so we end up generating `batch_size * num_challenges` challenges across all permutation arguments.
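The batching can be illustrated with a toy running product over a small prime field. This is a sketch only: it uses single-column arguments with one challenge each, whereas the real code works over a much larger field and uses `num_challenges` sets of betas and gammas.

```rust
// Toy illustration of batching two permutation arguments into one running
// product Z, over the small prime field p = 101. Names and sizes are
// illustrative, not taken from the actual code.
const P: u64 = 101;

fn pow_mod(mut b: u64, mut e: u64) -> u64 {
    let mut acc = 1;
    while e > 0 {
        if e & 1 == 1 {
            acc = acc * b % P;
        }
        b = b * b % P;
        e >>= 1;
    }
    acc
}

// Fermat inverse: x^(p-2) = x^-1 mod p for x != 0.
fn inv(x: u64) -> u64 {
    pow_mod(x % P, P - 2)
}

fn main() {
    // Two permutation arguments in one batch: each rhs_j is a permutation of
    // the corresponding lhs_j.
    let lhs = [[3u64, 7, 9], [4, 5, 6]];
    let rhs = [[9u64, 3, 7], [6, 4, 5]];
    // Each argument in the batch gets its own challenge, hence
    // batch_size * num_challenges challenges overall.
    let betas = [11u64, 13];

    // Z is a single running product for the whole batch; it ends at 1 iff
    // every argument in the batch holds.
    let mut z = 1u64;
    for i in 0..3 {
        for j in 0..2 {
            z = z * ((lhs[j][i] + betas[j]) % P) % P;
            z = z * inv(rhs[j][i] + betas[j]) % P;
        }
    }
    assert_eq!(z, 1);
}
```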
* Feedback + updates for recursion code