* Use built-in `reverse_bits`; remove duplicate `reverse_index_bits`.
* Reduce precomputation time/space complexity from quadratic to linear.
* Several working cache-friendly FFTs.
* Fix to allow FFT of constant polynomial.
* Simplify FFT strategy choice.
* Add PrimeField and CHARACTERISTIC properties to Fields.
* Add faster method for inverse of 2^m.
* Pre-compute some of the roots; tidy up loop iteration.
* Precomputation for both FFT variants.
* Refactor precomputation; add optional parameters; rename some things.
* Unrolled version with zero tail.
* Iterative version of Unrolled precomputation.
* Test zero tail algo.
* Restore default degree.
* Address comments from @dlubarov and @wborgeaud.
... and other minor refactoring.
`bench_recursion` will be the default bin run by `cargo run`; the otheres can be selected with the `--bin` flag.
We could probably delete some of the other binaries later. E.g. `field_search` might not be useful any more. `bench_fft` should maybe be converted to a benchmark (although there are some pros and cons, e.g. the bench framework has a minimum number of runs, and isn't helpful in testing multi-core performance).