- At least on my Linux machine, a signiciant amount of time (> 50%) was spent inside
OsRng.
- Likely due to blocking behaviour of the rng devices on Linux.
- thread_rng should not block, but at the same time should provide good
enough rng.
Closes#10. This combines Lagrange interpolation with FFTs as mentioned there.
I was previously thinking that all our polynomial encodings might as well just use power-of-two length vectors, so they'll be "FFT-ready", with no need to trim/pad. This sort of breaks that assumption though, as e.g. I think we'll want to compute interpolants with three coefficients in the batch opening argument.
I think we can still skip trimming/padding in most cases, since it the majority of our polynomials will have power-of-two-minus-1 degrees with high probability. But we'll now have one or two uses where that's not the case.
... by switching to Rescue Prime (which has a smaller security margin), and precomputing an addition chain for the exponent used in the cubic root calculation. Also adds a benchmark.
... and other minor refactoring.
`bench_recursion` will be the default bin run by `cargo run`; the otheres can be selected with the `--bin` flag.
We could probably delete some of the other binaries later. E.g. `field_search` might not be useful any more. `bench_fft` should maybe be converted to a benchmark (although there are some pros and cons, e.g. the bench framework has a minimum number of runs, and isn't helpful in testing multi-core performance).
When we had a large field, we could just pick random shifts, and get disjoint cosets with high probability. With a 64-bit field, I think the probability of a collision is non-negligible (something like 1 in a million), so we should probably verify that the cosets are disjoint.
If there are any concerns with this method (or if it's just confusing), I think it would also be reasonable to use the brute force approach of explicitly computing the cosets and checking that they're disjoint. I coded that as well, and it took like 80ms, so not really a big deal since it's a one-time preprocessing cost.
Also fixes some overflow bugs in the inversion code.