* Add MULX/ADOX/ADCX assembly for squaring 4 limbs
* Add squarings for 6 limbs
* Use the new square assembly where relevant
* Fix 32-bit register name and calling convention
* typo
* Disable MontRed ASM for 2 limbs or less
* consistent naming for dbl-width
* Isolate double-width Fp2 mul
* Implement double-width complex multiplication
* Lay out Fp4 double-width mul
* Off by p in square Fp4 as well :/
* fewer copies and less stack space in addition chains
* Address https://github.com/mratsim/constantine/issues/154 partly
* Fix #154, faster Fp4 square: fewer non-residue multiplications, no Mul, only squarings (a few more ops in total)
* Fix typo
* better assembly scheduling for add/sub
* Double-width -> Double-precision
* Unred -> Unr
* double-precision modular addition
* Replace canUseNoCarryMontyMul and canUseNoCarryMontySquare by getSpareBits
* Complete the double-precision implementation
* Use double-precision path for Fp4 squaring and mul
* remove mixin annotations
* Lazy reduction in Fp4 prod
* Fix assembly for sum2xMod
* Assembly for double-precision negation
* reduce white spaces in pairing benchmarks
* ADX implies BMI2
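
For readers unfamiliar with the "lazy reduction" and "complex multiplication" items above, here is a minimal sketch of the idea in Python, not the library's actual code. It assumes Fp2 = Fp[i]/(i² + 1) over a toy prime; the names `P`, `fp2_mul_lazy` and `fp2_mul_schoolbook` are hypothetical. The real implementation works on fixed-size limbs in Montgomery form and has to keep intermediate double-precision values in range, which Python's big integers hide.

```python
# Toy prime with P % 4 == 3, so -1 is a quadratic non-residue
# and Fp2 = Fp[i]/(i^2 + 1) is a field. Illustrative value only.
P = 103

def fp2_mul_schoolbook(a, b, p=P):
    """Reference: reduce after every product (4 mul, eager reduction)."""
    a0, a1 = a
    b0, b1 = b
    c0 = ((a0 * b0) % p - (a1 * b1) % p) % p
    c1 = ((a0 * b1) % p + (a1 * b0) % p) % p
    return (c0, c1)

def fp2_mul_lazy(a, b, p=P):
    """Karatsuba-style complex multiplication with lazy reduction.

    The three products are kept unreduced ("double-precision"),
    combined with plain additions/subtractions, and each output
    coefficient is reduced exactly once.
    """
    a0, a1 = a
    b0, b1 = b
    v0 = a0 * b0                   # unreduced
    v1 = a1 * b1                   # unreduced
    v2 = (a0 + a1) * (b0 + b1)     # unreduced
    c0 = (v0 - v1) % p             # real part: a0*b0 - a1*b1
    c1 = (v2 - v0 - v1) % p        # imaginary part: a0*b1 + a1*b0
    return (c0, c1)
```

Both functions agree on every input; the lazy variant trades one multiplication and several reductions for a few wide additions.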
The running time of the test suite has increased significantly with:
- new tests (for example scalar mul implementations)
- new tests that stress the whole stack/tower
- x3 randomizers for fuzzing
- new CI and platforms: Total 16x runs per commit
This would let all tests take less than 10 min on CI, even non-parallelized ones like on Windows.
* Add test case for #30 - Euler's criterion doesn't return 1 for a square
* Detect #42 in the test suite
* Detect #43 in the test suite
* comment in sqrt tests
* Add #67 to the anti-regression suite
* Add #61 to the anti-regression suite
* Add #62 to anti-regression suite
* Add #60 to the anti-regression suite
* Add #64 to the test suite
* Add #65 - case 1
* Add #65 case 2
* Add #65 case 3
* Add debug check to isSquare/Euler's Criterion/Legendre Symbol
* Make sure our primitives are correct
* For now deactivate montySquare CIOS, fix #61 #62
* Narrow down #42 and #43 to powinv on 32-bit
* Detect #42, #43 at the fast squaring level
* More #42, #43 tests, Use multiplication instead of squaring as a temporary workaround, see https://github.com/mratsim/constantine/issues/68
* Prevent regression of #67 now that squaring is "fixed"
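
As background for the Euler's criterion entries above, the property the regression tests check can be sketched as follows. This is an illustrative Python version over a toy prime, not the library's code; `is_square` here is a hypothetical helper, and treating 0 as a square is an assumed convention.

```python
def is_square(a, p):
    """Euler's criterion for an odd prime p.

    A nonzero a is a square mod p exactly when a^((p-1)/2) == 1 (mod p).
    Zero is treated as a square here (0 = 0^2); that is a convention
    chosen for this sketch.
    """
    a %= p
    if a == 0:
        return True
    return pow(a, (p - 1) // 2, p) == 1

def check_all_squares(p):
    """Anti-regression style check: x*x must always be reported as a square."""
    return all(is_square(x * x, p) for x in range(1, p))
```

A check of this shape catches the class of bug named in #30, where the criterion failed to return 1 for a value that was constructed as a square.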