Commit Graph

62 Commits

Author SHA1 Message Date
Mamy Ratsimbazafy 9ac9862401
Optimize Miller Loop and prepare Multi-pairing (#159)
* Pairing with affine: align API to BLST and Gurvy and common use-case.

* Implement multi-pairing / aggregate verif for BLS12-381 (+2% pairing perf)

* Generalize the optimized miller loop for single pairing

* Immplement the miller loop addchain for BLS12-377

* Miller addition chain for BN254-Nogami

* no Miller adchain for BN254-Snarks

* Update the line test with new tower https://github.com/mratsim/constantine/pull/153

* Somewhat sparse for Fp2 M-Twist

* Implement line by line multiplication for Fp12 D-Twist

* Somewhat sparse Mul for Fp12 D-Twist

* Finish the sparse and somewhat sparse multiplications
2021-02-14 13:06:57 +01:00
Mamy Ratsimbazafy 5806cc4638
Double-Precision towering (#155)
* consistent naming for dbl-width

* Isolate double-width Fp2 mul

* Implement double-width complex multiplication

* Lay out Fp4 double-width mul

* Off by p in square Fp4 as well :/

* less copies and stack space in addition chains

* Address https://github.com/mratsim/constantine/issues/154 partly

* Fix #154, faster Fp4 square: less non-residue, no Mul, only square (bit more ops total)

* Fix typo

* better assembly scheduling for add/sub

* Double-width -> Double-precision

* Unred -> Unr

* double-precision modular addition

* Replace canUseNoCarryMontyMul and canUseNoCarryMontySquare by getSpareBits

* Complete the double-precision implementation

* Use double-precision path for Fp4 squaring and mul

* remove mixin annotations

* Lazy reduction in Fp4 prod

* Fix assembly for sum2xMod

* Assembly for double-precision negation

* reduce white spaces in pairing benchmarks

* ADX implies BMI2
2021-02-09 22:57:45 +01:00
Mamy André-Ratsimbazafy 2c5e12d5f8
Workaround aliasing in Fp12[BLS12-377] inversion, fix #147 2021-02-02 12:53:36 +01:00
Mamy Ratsimbazafy 83dcd988b3
FpDbl revisited (#144) - 7% perf improvement everywhere, up to 30% in double-width primitives
* reorg mul -> limbs_double_width, ConstantineASM CttASM

* Implement squaring specialized scalar path (22% faster than mul)

* Implement "portable" assembly for squaring

* stash part of the changes

* Reorg montgomery reduction - prepare to introduce Comba optimization

* Implement comba Montgomery reduce (but it's slower!)

* rename t -> a

* 30% performance improvement by avoiding toOpenArray!

* variable renaming

* Fix 32-bit imports

* slightly better assembly for sub2x

* There is an annoying bottleneck

* use out-of-place Fp assembly instead of in-place

* diffAlias is unneeded now

* cosmetic

* speedup fpDbl sub by 20%

* Fix Fp2 -> Fp6 -> Fp12 towering. It seems 5% faster

* Stash ADCX/ADOX squaring
2021-02-01 03:52:27 +01:00
Mamy Ratsimbazafy d12d5faf21
Implement Jacobian mixed addition (#142) 2021-01-30 14:21:55 +01:00
Mamy Ratsimbazafy 7e97cd4ac5
Fuzz fix - non-unique modular representation after Assembly negate (#137)
* Fix #114 - Negating 0 left the prime modulus, which is working most of the time for everything except for comparison. (also somehow triggers and workaround weird compiler bug where exceptions tracking is activated in macros and all the curve enums were stringified as their ordinal value)

* https://github.com/mratsim/constantine/issues/136 was also fixed, add to anti-regression

* add comment in test

* Fix the pure Nim fallback as well
2021-01-24 12:35:27 +01:00
Mamy Ratsimbazafy 82819b1b10
Square Root & Inversion addition chains - 20% perf increase (#132)
* Addition chain for sqrt BLS12-381: 20% perf improvement

* sqrt addchain for BN254_Snarks - 20% perf improvement as well

* Fix operation count [skip ci]

* BLS12-377 sqrt - 10% perf improvement

* sqrt addition chain for BW6-761 - 6% speedup

* BN254_Nogami inversion addchain

* sqrt addchain for BN254_Nogami

* Inversion addchain for BLS12-377

* inversion ddition chain for BW6-761
2021-01-23 20:55:40 +01:00
Mamy Ratsimbazafy 638cb71e16
Fr: Finite Field parametrized by the curve order (#115)
* Introduce Fr type: finite field over curve order. Need workaround for https://github.com/nim-lang/Nim/issues/16774

* Split curve properties into core and derived

* Attach field properties to an instantiated field instead of the curve enum

* Workaround https://github.com/nim-lang/Nim/issues/14021, yet another "working with types in macros" is difficult https://github.com/nim-lang/RFCs/issues/44

* Implement finite field over prime order of a curve subgroup

* skip OpenSSL tests on windows
2021-01-22 00:09:52 +01:00
Mamy Ratsimbazafy ac6300555a
Fix test suite (#116)
* Pin nim-serialization. Workaround #113 and https://github.com/status-im/nim-serialization/issues/33

* Need to workaround nimble installing dependency multiple times

* non-interactive

* UB sanitizer missing on mingw

* Fix OpenSSL benchmark on non-Linux platforms

* Accelerate CI:
- Skip 32-bit on 64-bit tests
- Only test leaf functionality.

* Don't define -fstack-protector-all with MinGW

* skip line functions and cyclotomic tests (already tested in pairing) + only compile the benches don't run them.
2021-01-21 21:25:42 +01:00
Mamy Ratsimbazafy 023e690efc
Fix #111 2021-01-11 08:25:02 +01:00
Mamy André-Ratsimbazafy e89429e822
SHA256 Hash function 2020-12-15 19:18:36 +01:00
mratsim 45ef3a65e0 Skip 32-bit tests on 64-bit machines (too long) 2020-10-31 14:51:17 +01:00
mratsim 1383aae105 Remove outdated TODOs [skip ci]
- noinline consts: https://github.com/nim-lang/RFCs/issues/257
2020-10-11 21:33:59 +02:00
Mamy Ratsimbazafy 6530596032
Endomorphism acceleration for BN254-Nogami (#102) 2020-10-10 18:53:48 +02:00
Mamy Ratsimbazafy a2f46f77b7
Sage constants & tests codegen (#101)
* Implement a Sage codegenerator for frobenius constants

* Sage codegen for pairings

* Autogen of endomorphism acceleration constants

* The autogen fixed a copy-paste bug in lattice decomposition. We can use conditional negation now and save an add+dbl in scalar mul

* small fixes

* sage code for square root bls12-377 is not old

* readme updates

* Provide test suggestions for derive_frobenius

* indentation + add equation form to sage

* Sage test vector generator

* Use the json vectors
- includes type system workaround: generic sandwich https://github.com/nim-lang/Nim/issues/11225
- converting NimNode to typedesc: https://github.com/nim-lang/Nim/issues/6785

* Delete old sage code

* Install nim-serialization and nim-json-serialization in CI

* CI nimble install force yes
2020-10-10 16:19:23 +02:00
Mamy Ratsimbazafy 71bb4c799a
BW6-761 part 1 (#100)
* Add Fp, Fp2, Fp6 support for BW6-761

* Add G1 for BW6-761

* Prepare to support G2 twists on the same field as G1

* Remove a useless dependent type for lines

* Implement G2 for BW6-761

* Fix Line leftover
2020-10-09 07:51:47 +02:00
Mamy Ratsimbazafy fc1c3472ce
Fused projective line eval (#96)
* Reorg line functions to allow for Jacobian eval

* 2x faster Miller loop!!! with fused line eval double

* Support Line Double Fusion for D-Twists

* Implement fused line addition
2020-10-04 09:39:02 +02:00
Mamy Ratsimbazafy 986245b5c1
Jacobian coordinates (#95)
* Add projective-> affine bench

* Add conditional copy and div2 benches

* Fp4 benchmarks

* Constant-time Jacobian addition

* Jacobian doubling

* Use a simpler Add+Dbl complete formula

* Update tests

* Fix conditional negate

* Rollaback complete addition, we were only handling curve coef a == 0
2020-10-02 00:01:09 +02:00
Mamy André-Ratsimbazafy d04ccdd578
Move the cubic root to GLV files 2020-09-27 16:01:31 +02:00
Mamy Ratsimbazafy 0e4dbfe400
BLS12-377 (#91)
* add Sage for constant time tonelli shanks

* Fused sqrt and invsqrt via Tonelli Shanks

* isolate sqrt in their own folder

* Implement constant-time Tonelli Shanks for any prime

* Implement Fp2 sqrt for any non-residue

* Add tests for BLS12_377

* Lattice decomposition script for BLS12_377 G1

* BLS12-377 G1 GLV ok, G2 GLV issue

* Proper endomorphism acceleration support for BLS12-377

* Add naive pairing support for BLS12-377

* Activate more bench for BLS12-377

* Fix MSB computation

* Optimize final exponentiation + add benches
2020-09-27 09:15:14 +02:00
Mamy Ratsimbazafy 6ecbedbd09
Mixed addition (#90)
* ptrettier comments

* Implement mixed addition on G1

* Test for mixed addition in G2 and use it for Miller Loop
2020-09-26 09:16:29 +02:00
Mamy Ratsimbazafy 03ecb31c57
Pairings for BN254-Nogami and BN254-Snarks (#86)
* Implement optimized final exponentiation for BN254-Nogami

* And BN254 Snarks support

* Optimize D-Twist sparse Fp12 x line multiplication

* Move quadruple/octuple and add to Github issues: https://github.com/mratsim/constantine/issues/88 [skip ci]
2020-09-25 21:58:20 +02:00
Mamy Ratsimbazafy f78ed23dad
Pairing optim (#85)
* Fix fp12 Frobenius map

* Implement cyclotomic subgroup acceleration

* make cyclotomic squaring in-place

* Add back out-place cycl squaring and add cyclotomic inverse

* Implement state-of-the-art BLS12-381 final exponentiation

* save a cyclotomic squaring

* Accelerate sparse line multiplication in Miller loop

* Add pairing bench

* fix comments
2020-09-24 17:18:23 +02:00
Mamy Ratsimbazafy d84edcd217
Naive pairings + Naive cofactor clearing (#82)
* Pairing - initial commit
- line functions
- sparse Fp12 functions

* Small fixes:
- Line parametrized by twist for generic algorithm
- Add a conjugate operator for quadratic extensions
- Have frobenius use it
- Create an Affine coordinate type for elliptic curve

* Implement (failing) pairing test

* Stash pairing debug session, temp switch Fp12 over Fp4

* Proper naive pairing on BLS12-381

* Frobenius map

* Implement naive pairing for BN curves

* Add pairing tests to CI + reduce time spent on lower-level tests

* Test without assembler in Github Actions + less base layers test iterations
2020-09-21 23:24:00 +02:00
Mamy Ratsimbazafy 4a308c2148
Frobenius endomorphism ψ = φ−1 πp φ (psi = untwist-Frobenius-Twist) (#78)
* Sage script for frobenius isogeny

* Implement ψ (Psi) - Untwist-Frobenius-Twist Endomorphism on G2

* Implement sparse mul for frpbenius endomorphism

* Implement optimized psi2
2020-08-31 23:18:48 +02:00
Mamy André-Ratsimbazafy 442c3f6cf6
Consolidate output folders of bench and testsuite 2020-08-22 23:00:05 +02:00
Mamy Ratsimbazafy d41c653c8a
Double-width tower extension part 1 (#72)
* Implement double-width field multiplication for double-width towering

* Fp2 mul acceleration via double-width lazy reduction (pure Nim)

* Inline assembly for basic add and sub

* Use 2 registers instead of 12+ for ASM conditional copy

* Prepare assembly for extended multiprecision multiplication support

* Add assembly for mul

* initial implementation of assembly reduction

* stash current progress of assembly reduction

* Fix clobbering issue, only P256 comparison remain buggy

* Fix asm montgomery reduction for NIST P256 as well

* MULX/ADCX/ADOX multi-precision multiplication

* MULX/ADCX/ADOX reduction v1

* Add (deactivated) assembly for double-width substraction + rework benches

* Add bench to nimble and deactivate double-width for now. slower than classic

* Fix x86-32 running out of registers for mul

* Clang needs to be at v9 to support flag output constraints (Xcode 11.4.2 / OSX Catalina)

* 32-bit doesn't have enough registers for ASM mul

* Fix again Travis Clang 9 issues

* LLVM 9 is not whitelisted in travis

* deactivated assembler with travis clang

* syntax error

* another

* ...

* missing space, yeah ...
2020-08-20 10:21:39 +02:00
Mamy Ratsimbazafy d97bc9b61c
Assembly backend (#69)
* Proof-of-Concept Assembly code generator

* Tag inline per procedure so we can easily track the tradeoff on tower fields

* Implement Assembly for modular addition (but very curious off-by-one)

* Fix off-by one for moduli with non msb set

* Stash (super fast) alternative but still off by carry

* Fix GCC optimizing ASM away

* Save 1 register to allow compiling for BLS12-381 (in the GMP test)

* The compiler cannot find enough registers if the ASM file is not compiled with -O3

* Add modsub

* Add field negation

* Implement no-carry Assembly optimized field multiplication

* Expose UseX86ASM to the EC benchmark

* omit frame pointer to save registers instead of hardcoding -O3. Also ensure early clobber constraints for Clang

* Prepare for assembly fallback

* Implement fallback for CPU that don't support ADX and BMI2

* Add CPU runtime detection

* Update README closes #66

* Remove commented out code
2020-07-24 22:02:30 +02:00
Mamy Ratsimbazafy ec76ac5ea6
Fuzzing campaign fixes (#58)
* Add test case for #30 - Euler's criterion doesn't return 1 for a square

* Detect #42 in the test suite

* Detect #43 in the test suite

* comment in sqrt tests

* Add #67 to the anti-regression suite

* Add #61 to the anti-regression suite

* Add #62 to anti-regression suite

* Add #60 to the anti-regression suite

* Add #64 to the test suite

* Add #65 - case 1

* Add #65 case 2

* Add #65 case 3

* Add debug check to isSquare/Euler's Criterion/Legendre Symbol

* Make sure our primitives are correct

* For now deactivate montySquare CIOS fix #61 #62

* Narrow down #42 and #43 to powinv on 32-bit

* Detect #42 #43 at the fast squaring level

* More #42, #43 tests, Use multiplication instead of squaring as a temporary workaround, see https://github.com/mratsim/constantine/issues/68

* Prevent regression of #67 now that squaring is "fixed"
2020-06-23 01:27:40 +02:00
Mamy Ratsimbazafy 0400187f05
Use GMP and GNU parallel in GIthub Actions (#63)
* Use GMP and GNU parallel in GIthub Actions

* try gmp:i386

* Don't do GMP tests on i386
2020-06-20 19:46:30 +02:00
Mamy Ratsimbazafy a2a2495351
Github Action CI (without GMP) (#29)
* Github Action CI (without GMP)

* Deactivate MacOS, spurious failures: https://github.com/actions/virtual-environments/issues/841

* force install with nimble

* Add badge

* Don"t include Nim 1.2.x https://github.com/mratsim/constantine/pull/20#issuecomment-646327952

* Action branch mistake

* Add back OSX? https://github.com/actions/virtual-environments/issues/841, https://github.com/actions/virtual-environments/issues/969

* fix MacOS target

* comment out RDTSC on i386

* Add initialization canaries

* Add more verbose output to debug windows failures

* spurious windows i386 test

* For now only activate Linux and mac

* missed include
2020-06-19 22:08:15 +02:00
Mamy André-Ratsimbazafy 43abf9dfc4
SHorter test names for github display 2020-06-15 23:15:01 +02:00
Mamy Ratsimbazafy d376f08d1b
G2 / Operations on the twisted curve E'(Fp2) (#51)
* Split elliptic curve tests to better use parallel testing

* Add support for printing points on G2

* Implement multiplication and division by optimal sextic non-residue (BLS12-381)

* Implement modular square root in 𝔽p2

* Support EC add and EC double on G2 (for BLS12-381)

* Support G2 divisive twists with non-unit sextic-non-residue like BN254 snarks

* Add EC G2 bench

* cleanup some unused warnings

* Reorg the tests for parallelization and to avoid instantiating huge files
2020-06-15 22:58:56 +02:00
Mamy Ratsimbazafy 2613356281
Endomorphism acceleration for Scalar Multiplication (#44)
* Add MultiScalar recoding from "Efficient and Secure Algorithms for GLV-Based Scalar Multiplication" by Faz et al

* precompute cube root of unity - Add VM precomputation of Fp - workaround upstream bug https://github.com/nim-lang/Nim/issues/14585

* Add the φ-accelerated lookup table builder

* Add a dedicated bithacks file

* cosmetic import consistency

* Build the φ precompute table with n-1 EC additions instead of 2^(n-1) additions

* remove binary

* Add the GLV precomputations to the sage scripts

* You can't avoid it, bigint multiplication is needed at one point

* Add bigint multiplication discarding some low words

* Implement the lattice decomposition in sage

* Proper decomposition for BN254

* Prepare the code for a new scalar mul

* We compile, and now debugging hunt

* More helpers to debug GLV scalar Mul

* Fix conditional negation

* Endomorphism accelerated scalar mul working for BN254 curve

* Implement endomorphism acceleration for BLS12-381 (needed cofactor clearing of the point)

* fix nimble test script after bench rename
2020-06-14 15:39:06 +02:00
Mamy Ratsimbazafy f8fb54faef
Build and run tests in parallel (#41)
* Build and run tests in parallel

* Workaround travis OSX Homebrew bug

* semicolons ...

* Don't auto-update before installing homebrew packages
2020-06-07 19:39:34 +02:00
Mamy Ratsimbazafy 82ceca6e3b
Scalar mul tests (#28)
* Add sage script for BN254

* Implement (failing) scalar multiplication tests

* Add a first test against sagemath

* Finish the tests against SAGE for BN254

* Add significant test coverage of scalar multiplication with reference checks for BN254_Snarks and BLS12_381
2020-06-04 20:37:29 +02:00
Mamy André-Ratsimbazafy f7818b566b
Fix ARM bench and ignore Windows 32-bit bench 2020-04-15 21:28:37 +02:00
Mamy André-Ratsimbazafy d7e170288f
Add benches to the test suite to prevent forgetting about updating them 2020-04-15 19:46:25 +02:00
Mamy André-Ratsimbazafy ce350d1dac
add EC bench to nimble tasks 2020-04-15 19:43:31 +02:00
Mamy Ratsimbazafy 2f839cb1bf
Initial support for Elliptic Curve (#24)
* Elliptic curve and Twisted curve templates - initial commit

* Support EC Add on G2 (Sextic Twisted curve for BN and BLS12 families)

* Refactor the config parser to prepare for elliptic coefficient support

* Add elliptic curve parameter for BN254 (Snarks), BLS12-381 and Zexe curve BLS12-377

* Add accessors to curve parameters

* Allow computing the right-hand-side of of Weierstrass equation "y² = x³ + a x + b"

* Randomized test infrastructure for elliptic curves

* Start a testing suite on ellptic curve addition (failing)

* detail projective addition

* Fix EC addition test (forgot initializing Z=1 and that there ar emultiple infinity points)

* Test with random Z coordinate + add elliptic curve test to test suite

* fix reference to the (deactivated) addchain inversion for BN curves [skip ci]

* .nims file leftover [skip ci]
2020-04-13 19:25:59 +02:00
Mamy Ratsimbazafy 42109d4f1c
Square roots (#22)
* Add modular square root for p ≡ 3 (mod 4)

* Exhaustive tests for sqrt with p ≡ 3 (mod 4)

* fix typo
2020-04-11 23:53:21 +02:00
Mamy André-Ratsimbazafy a6e4517be2
Implement 𝔽p12 inversion, enable 𝔽p12 tests and bench 2020-04-09 14:28:01 +02:00
Mamy André-Ratsimbazafy 2d5b173a39
Less magics, les macros, faster compile-times (or not, Fp6 starts to get really slow, like 5s) + some cleanups in curve families + test 𝔽p6 on 32-bit 2020-03-22 12:28:53 +01:00
Mamy André-Ratsimbazafy ff4a54daba
Add multiplication in 𝔽p6 = 𝔽p2[∛(1+𝑖)] 2020-03-21 19:03:57 +01:00
Mamy André-Ratsimbazafy 9e78cd5d6d
Benchmark template for 𝔽p, 𝔽p2, 𝔽p6 2020-03-21 02:31:31 +01:00
Mamy André-Ratsimbazafy bde619155b
30% faster constant-time inversion 2020-03-20 23:03:52 +01:00
Mamy Ratsimbazafy 6423be0dfb
Add optimized squaring (~15% speedup) (#18)
* Add optimized squaring (~15% speedup)

* avoid repetitions in tests
2020-03-17 22:04:37 +01:00
Mamy Ratsimbazafy 4ff0e3d90b
Internals refactor + renewed focus on perf (#17)
* Lay out the refactoring objectives and tradeoffs

* Refactor the 32 and 64-bit primitives [skip ci]

* BigInts and Modular BigInts compile

* Make the bigints test compile

* Fix modular reduction

* Fix reduction tests vs GMP

* Implement montegomery mul, pow, inverse, WIP finite field compilation

* Make FiniteField compile

* Fix exponentiation compilation

* Fix Montgomery magic constant computation  for 2^64 words

* Fix typo in non-optimized CIOS - passing finite fields IO tests

* Add limbs comparisons [skip ci]

* Fix on precomputation of the Montgomery magic constant

* Passing all tests including 𝔽p2

* modular addition, the test for mersenne prime was wrong

* update benches

* Fix "nimble test" + typo on out-of-place field addition

* bigint division, normalization is needed: https://travis-ci.com/github/mratsim/constantine/jobs/298359743

* missing conversion in subborrow non-x86 fallback - https://travis-ci.com/github/mratsim/constantine/jobs/298359744

* Fix little-endian serialization

* Constantine32 flag to run 32-bit constantine on 64-bit machines

* IO Field test, ensure that BaseType is used instead of uint64 when the prime can field in uint32

* Implement proper addcarry and subborrow fallback for the compile-time VM

* Fix export issue when the logical wordbitwidth == physical wordbitwidth - passes all tests (32-bit and 64-bit)

* Fix uint128 on ARM

* Fix C++ conditional copy and ARM addcarry/subborrow

* Add investigation for SIGFPE in Travis

* Fix debug display for unsafeDiv2n1n

* multiplexer typo

* moveMem bug in glibc of Ubuntu 16.04?

* Was probably missing an early clobbered register annotation on conditional mov

* Note on Montgomery-friendly moduli

* Strongly suspect a GCC before GCC 7 codegen bug (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87139)

* hex conversion was (for debugging) not taking requested order into account + inlining comment

* Use 32-bit limbs on ARM64, uint128 builtin __udivti4 bug?

* Revert "Use 32-bit limbs on ARM64, uint128 builtin __udivti4 bug?"

This reverts commit 087f9aa7fb40bbd058d05cbd8eec7fc082911f49.

* Fix subborrow fallback for non-x86 (need to maks the borrow)
2020-03-16 16:33:51 +01:00
Mamy André-Ratsimbazafy feacf2b2ea
Fix 64-bit limbs, passing all tests 2020-02-29 14:49:38 +01:00
Mamy Ratsimbazafy 80f822c227
Set up CI with Azure Pipelines (#13)
* Set up CI with Azure Pipelines

[skip ci]

* Add task for testing without GMP

* Add C++ testing + no GMP on windows

* Add the Nim wrapper for GMP to Azure build

* Add Azure badge

* Fix nimble test tasks

* Workaround windows path in Azure

* Fix nim binaries path and mingw on 32-bit

* add stew test dependency

* Fix nim/nimble path

* disable GMP tests on windows
2020-02-23 18:27:26 +01:00