Just passing the "combined constraints" buffer into `eval_filtered_recursively`, so that we can combine a mul by the filter with an add into the buffer. Saves 56 wires.
* Specialize `InterpolationGate`
To cosets of subgroups of roots of unity. This way
- `InterpolationGate` needs fewer routed wires, bringing our minimum routed wires down from 28 to 25.
- The recursive `compute_evaluation` avoids some multiplications, saving 100~200 gates depending on `num_routed_wires`.
* Update test
* feedback
PoseidonGate's recursive evaluations were using a lot of gates, and the MDS layer was the main culprit.
The other issue is that `constant_layer_recursive` creates a bunch of `ArithmeticGate`s with unique constants. We could either change `ArithmeticGate` to support different constants per operation, or wire in constants from `ConstantGate`, and change `ConstantGate` to support several constants per gate.
This won't really help anything near term since we're still between 2^12 and 2^13, but could have some benefits later, depending on what recursion arities and security settings we end up using.
`PoseidonMdsGate` needs `2 * D * WIDTH = 48` routed wires, and the combination of adding a gate and increasing routed wires slows down the prover a bit. So for now, I kept it at 28 wires, and the old code path is still used.
* Batched eval_vanishing_poly_base
* Reduce the number of allocations
* Lints
* Delete unused things
* Minor: fix a debug_assert
* Daniel PR comments
* Lints
* Daniel PR comments
For examlpe, if I change a test to use `ConstantArityBits(4, 5)`, I get
To efficiently perform FRI checks with an arity of 16, at least 152 wires are needed. Consider reducing arity.
* Split up `PartitionWitness` data
This addresses two minor inefficiencies:
- Some preprocessed forest data was being cloned during proving.
- Some of the `ForestNode` data (like node sizes) is only needed in preprocessing, not proving. It was taking up cache space during proving because it was interleaved with data that is used during proving (parents, values).
Now `Forest` contains the disjoint-set forest. `PartitionWitness` is now mainly a Vec of target values; it also holds a reference to the (preprocessed) representative map.
On my laptop, this speeds up witness generation ~12%, resulting in an overall ~0.5% speedup.
* Feedback
* No size data (#278)
* No size data
* feedback