Mamy Ratsimbazafy 5806cc4638
Double-Precision towering (#155)
* consistent naming for dbl-width

* Isolate double-width Fp2 mul

* Implement double-width complex multiplication

* Lay out Fp4 double-width mul

* Off by p in square Fp4 as well :/

* less copies and stack space in addition chains

* Address https://github.com/mratsim/constantine/issues/154 partly

* Fix #154, faster Fp4 square: less non-residue, no Mul, only square (bit more ops total)

* Fix typo

* better assembly scheduling for add/sub

* Double-width -> Double-precision

* Unred -> Unr

* double-precision modular addition

* Replace canUseNoCarryMontyMul and canUseNoCarryMontySquare by getSpareBits

* Complete the double-precision implementation

* Use double-precision path for Fp4 squaring and mul

* remove mixin annotations

* Lazy reduction in Fp4 prod

* Fix assembly for sum2xMod

* Assembly for double-precision negation

* reduce white spaces in pairing benchmarks

* ADX implies BMI2
2021-02-09 22:57:45 +01:00
..
2020-10-02 00:01:09 +02:00
2020-10-02 00:01:09 +02:00
2020-09-27 13:13:45 +02:00
2020-09-27 13:13:45 +02:00
2020-09-27 09:15:14 +02:00
2021-01-21 21:25:42 +01:00
2020-10-11 21:33:59 +02:00