Commit Graph

83 Commits

Author SHA1 Message Date
Mamy Ratsimbazafy 99c9730793
Self-contained bindings generation (#196)
* First draft at bindings generation

* finite field bindings PoC

* support openarray, export NimMain

* PoC extension fields and elliptic curve bindings

* Pasta

* expose more bindings, remove nimZeroMem, remove tracer when unused, codegen name_mangling`gensym issue

* workaround bad C gensym codegen with {.inline.} pragma in non-dirty template nested in generic proc instantiated by template
2022-08-06 19:05:54 +02:00
Mamy Ratsimbazafy e29e529f18
Add multipairing for BN curves (#194) 2022-05-08 19:01:23 +02:00
Mamy Ratsimbazafy 39a8a413de
Pasta curves (#191)
* Pasta curves field arithmetic

* implement elliptic curve arith for the Pasta curves
2022-04-27 00:58:48 +02:00
Mamy Ratsimbazafy e9e7a1809c
BN254 - Hash-to-Curve (SVDW method) (#190)
* Hash to BN254-Snarks

* Test SVDW code path with old v7 vectors for BLS12-381

* add benches
2022-04-26 21:24:07 +02:00
Mamy Ratsimbazafy 742cecce08
Poly1305 Message Authentication Code (#186)
* Groundwork for Poly1305 MAC

* Implement fast reduction for Poly1305

* don't import assembly files when compiling without assembly
2022-03-05 23:39:24 +01:00
Mamy Ratsimbazafy c2eb42b769
Add ChaCha20 stream cipher 2022-03-02 01:18:47 +01:00
Mamy Ratsimbazafy ffacf61e8a
Don't dump all in "backend" (#184)
* backend -> math

* towers -> extension fields

* move ISA and compiler specific code out of math/

* fix export
2022-02-27 01:49:08 +01:00
Mamy Ratsimbazafy 5bc6d1d426
BLS signatures for Ethereum (BLS sig on BLS12-381 G2 with SHA256) (#183)
* Finally add the (Ethereum) bls signatures (on BLS12-381 G2)

* fix test path and remove old low-level signature test
2022-02-26 21:22:34 +01:00
Mamy Ratsimbazafy fe500a6a79
Productionize: move protocols top-level vs backend (#179)
* Productionize: move protocols top-level vs backend

* fix path

* import fix

* the last one

* benches as well
2022-02-21 01:04:53 +01:00
Mamy Ratsimbazafy 81acfb1626
Nim 1.6 in CI (#170)
* try 1.6 CI

* Try CI with 1.6 and windows.

* Bend the knee

* have fun debugging CI

* have fun debugging CI

* more CI spam

* branch -> nim_version

* fight or flight

* properly detect windows

* Fix galore

* 🐍 🐍 snake:

* meh give up on parallelizing windows and dealing with windows PATH issues

* ¯\_ (ツ)_/¯
2022-02-20 23:44:00 +01:00
Mamy Ratsimbazafy dc73c71801
Pairings optimizations (#178)
* bench for cyclotomic square, exp and rename cyclotomic exp + multipairings for BLS12-377

* refactor/unify lines and cyclotomic functions

* Add Karabina's compressed squaring

* Use compressed squarings in final exponentiation

* Weighted addchain for bn254_snarks

* Add new towering options and cost functions

* Rearrange bench summaries

* fix BW6-761
2022-02-20 20:15:20 +01:00
Mamy Ratsimbazafy 53c4db7ead
Fast modular inversion (#172)
* split modular inversion in its own file

* Stash fast GCD inversion https://eprint.iacr.org/2020/972.pdf

* Stash Pornin's bingcd -> issue with inner modular reduction

* Implement Bernstein-Yang inversion

* Avoid Nim checks on signed integers (32-bit runtime issue)

* cleanup: remove old inversion impls

* cleanup: static moduli, move div2

* small comments (skip ci)

* comment cleanup (skip ci)

* fix total iterations on 32-bit

* Add batch conversion to affine coordinates using simultaneous inversion trick

* fix conditional setZero and batchAffine conversion

* cleanup unneeded branches following affine conversion unification

* Fix batchAffine with zero inputs and add fuzz failure to test suite
2022-02-10 14:05:07 +01:00
Mamy Ratsimbazafy 50717d8de6
Test GT-subgroup for BW6-761 (#171) 2022-01-08 17:30:26 +01:00
Mamy Ratsimbazafy f6c02fe075
Optimized subgroup checks and cofactor clearing (#169)
* Move cofactor clearing to dedicated per-curve subgroups file

* Add BLS12-381 fast subgroup checks

* Implement fast cofactor clearing for BN254_snarks

* Add fast subgroup check to BN254Snarks

* add BLS12_377 optimized cofactor and subgroup functions

* Add BN254_Nogami

* Add GT-subgroup tests

* Use the new subgroup checks for Eth1 EVM precompiles
2022-01-03 14:12:58 +01:00
Mamy Ratsimbazafy 53f9708c2b
Initial support for Twisted Edwards curves (#167)
* Point decoding: optimized sqrt for p ≡ 5 (mod 8) (Curve25519)

* Implement fused sqrt(u/v) for twisted edwards point deserialization

* Introduce twisted edwards affine

* Allow declaration of curve field elements (and fight against recursive dependencies

* Twisted edwards group law + tests

* Add support for jubjub and bandersnatch #162

* test twisted edwards scalar mul
2021-12-29 01:54:17 +01:00
Mamy Ratsimbazafy 1195e5e980
Eth1 evm precompiles (#166)
* Prepare support for Eth1 EVM

* Implement EIP 196 (Ethereum BN254 add/mul)

* Implement ETH1 pairing precompile

* Accelerate isOnCurve for G2 with precomputation
2021-12-15 00:02:11 +01:00
Mamy Ratsimbazafy f5c0b6245d
Multipairing (#165)
* Productionize multipairings for BLS12-381

* typo

* arg order + benchmark

* Introduce mul_3way_sparse_sparse

* cleanup MultiMiller loop

* fix init sparse optimization in multimiller loop [skip ci]
2021-08-16 22:22:51 +02:00
Mamy Ratsimbazafy 979d183657
Tests for the eth2 BLS signature protocol (BLS12-381, pubkeys G1, signatures G2) using low-level primitives (#164) 2021-08-15 11:41:46 +02:00
Mamy Ratsimbazafy 0bc228126a
hash-to-curve BLS12-381 perf (#163)
* fp square noasm split from non-4 non-6 limbs fallback (40% speedup)

* optimized cofactor clearing for BLS12-381 G2

* Support jacobian isogenies and point_add on isogenies

* fuse addition and isogeny map

* {.noInit.} and sparseMul

* poly_eval_horner init

* dedicated invsqrt + cleanup square root file

* hash to field: reduce copy overhead and don't return arrays

* h2c isogeny jacobian reuse pow 3 precomputed value

* Fix sqrt bench
2021-08-14 21:01:50 +02:00
Mamy Ratsimbazafy 499f9605b2
Hash to curve - BLS12-381 (#110)
* Hash to Curve: impl expand_message_xmd

* Try to precompute part of hash to curve at compile-time

* sha256 bench - use the new hashes module

* [WIP] smoke test hash to field

* Implement hash_to_field with expected output

* unoptimized hash-to-curve G2 for BLS12-381

* Don't run sanitizer on hash to field as it uses GC-ed strings
2021-08-13 22:07:26 +02:00
Mamy André-Ratsimbazafy 3e977488a9
add bench whole summary for curves 2021-02-14 14:24:48 +01:00
Mamy Ratsimbazafy 9ac9862401
Optimize Miller Loop and prepare Multi-pairing (#159)
* Pairing with affine: align API to BLST and Gurvy and common use-case.

* Implement multi-pairing / aggregate verif for BLS12-381 (+2% pairing perf)

* Generalize the optimized miller loop for single pairing

* Immplement the miller loop addchain for BLS12-377

* Miller addition chain for BN254-Nogami

* no Miller adchain for BN254-Snarks

* Update the line test with new tower https://github.com/mratsim/constantine/pull/153

* Somewhat sparse for Fp2 M-Twist

* Implement line by line multiplication for Fp12 D-Twist

* Somewhat sparse Mul for Fp12 D-Twist

* Finish the sparse and somewhat sparse multiplications
2021-02-14 13:06:57 +01:00
Mamy Ratsimbazafy 5806cc4638
Double-Precision towering (#155)
* consistent naming for dbl-width

* Isolate double-width Fp2 mul

* Implement double-width complex multiplication

* Lay out Fp4 double-width mul

* Off by p in square Fp4 as well :/

* less copies and stack space in addition chains

* Address https://github.com/mratsim/constantine/issues/154 partly

* Fix #154, faster Fp4 square: less non-residue, no Mul, only square (bit more ops total)

* Fix typo

* better assembly scheduling for add/sub

* Double-width -> Double-precision

* Unred -> Unr

* double-precision modular addition

* Replace canUseNoCarryMontyMul and canUseNoCarryMontySquare by getSpareBits

* Complete the double-precision implementation

* Use double-precision path for Fp4 squaring and mul

* remove mixin annotations

* Lazy reduction in Fp4 prod

* Fix assembly for sum2xMod

* Assembly for double-precision negation

* reduce white spaces in pairing benchmarks

* ADX implies BMI2
2021-02-09 22:57:45 +01:00
Mamy André-Ratsimbazafy 2c5e12d5f8
Workaround aliasing in Fp12[BLS12-377] inversion, fix #147 2021-02-02 12:53:36 +01:00
Mamy Ratsimbazafy 83dcd988b3
FpDbl revisited (#144) - 7% perf improvement everywhere, up to 30% in double-width primitives
* reorg mul -> limbs_double_width, ConstantineASM CttASM

* Implement squaring specialized scalar path (22% faster than mul)

* Implement "portable" assembly for squaring

* stash part of the changes

* Reorg montgomery reduction - prepare to introduce Comba optimization

* Implement comba Montgomery reduce (but it's slower!)

* rename t -> a

* 30% performance improvement by avoiding toOpenArray!

* variable renaming

* Fix 32-bit imports

* slightly better assembly for sub2x

* There is an annoying bottleneck

* use out-of-place Fp assembly instead of in-place

* diffAlias is unneeded now

* cosmetic

* speedup fpDbl sub by 20%

* Fix Fp2 -> Fp6 -> Fp12 towering. It seems 5% faster

* Stash ADCX/ADOX squaring
2021-02-01 03:52:27 +01:00
Mamy Ratsimbazafy d12d5faf21
Implement Jacobian mixed addition (#142) 2021-01-30 14:21:55 +01:00
Mamy Ratsimbazafy 7e97cd4ac5
Fuzz fix - non-unique modular representation after Assembly negate (#137)
* Fix #114 - Negating 0 left the prime modulus, which is working most of the time for everything except for comparison. (also somehow triggers and workaround weird compiler bug where exceptions tracking is activated in macros and all the curve enums were stringified as their ordinal value)

* https://github.com/mratsim/constantine/issues/136 was also fixed, add to anti-regression

* add comment in test

* Fix the pure Nim fallback as well
2021-01-24 12:35:27 +01:00
Mamy Ratsimbazafy 82819b1b10
Square Root & Inversion addition chains - 20% perf increase (#132)
* Addition chain for sqrt BLS12-381: 20% perf improvement

* sqrt addchain for BN254_Snarks - 20% perf improvement as well

* Fix operation count [skip ci]

* BLS12-377 sqrt - 10% perf improvement

* sqrt addition chain for BW6-761 - 6% speedup

* BN254_Nogami inversion addchain

* sqrt addchain for BN254_Nogami

* Inversion addchain for BLS12-377

* inversion ddition chain for BW6-761
2021-01-23 20:55:40 +01:00
Mamy Ratsimbazafy 638cb71e16
Fr: Finite Field parametrized by the curve order (#115)
* Introduce Fr type: finite field over curve order. Need workaround for https://github.com/nim-lang/Nim/issues/16774

* Split curve properties into core and derived

* Attach field properties to an instantiated field instead of the curve enum

* Workaround https://github.com/nim-lang/Nim/issues/14021, yet another "working with types in macros" is difficult https://github.com/nim-lang/RFCs/issues/44

* Implement finite field over prime order of a curve subgroup

* skip OpenSSL tests on windows
2021-01-22 00:09:52 +01:00
Mamy Ratsimbazafy ac6300555a
Fix test suite (#116)
* Pin nim-serialization. Workaround #113 and https://github.com/status-im/nim-serialization/issues/33

* Need to workaround nimble installing dependency multiple times

* non-interactive

* UB sanitizer missing on mingw

* Fix OpenSSL benchmark on non-Linux platforms

* Accelerate CI:
- Skip 32-bit on 64-bit tests
- Only test leaf functionality.

* Don't define -fstack-protector-all with MinGW

* skip line functions and cyclotomic tests (already tested in pairing) + only compile the benches don't run them.
2021-01-21 21:25:42 +01:00
Mamy Ratsimbazafy 023e690efc
Fix #111 2021-01-11 08:25:02 +01:00
Mamy André-Ratsimbazafy e89429e822
SHA256 Hash function 2020-12-15 19:18:36 +01:00
mratsim 45ef3a65e0 Skip 32-bit tests on 64-bit machines (too long) 2020-10-31 14:51:17 +01:00
mratsim 1383aae105 Remove outdated TODOs [skip ci]
- noinline consts: https://github.com/nim-lang/RFCs/issues/257
2020-10-11 21:33:59 +02:00
Mamy Ratsimbazafy 6530596032
Endomorphism acceleration for BN254-Nogami (#102) 2020-10-10 18:53:48 +02:00
Mamy Ratsimbazafy a2f46f77b7
Sage constants & tests codegen (#101)
* Implement a Sage codegenerator for frobenius constants

* Sage codegen for pairings

* Autogen of endomorphism acceleration constants

* The autogen fixed a copy-paste bug in lattice decomposition. We can use conditional negation now and save an add+dbl in scalar mul

* small fixes

* sage code for square root bls12-377 is not old

* readme updates

* Provide test suggestions for derive_frobenius

* indentation + add equation form to sage

* Sage test vector generator

* Use the json vectors
- includes type system workaround: generic sandwich https://github.com/nim-lang/Nim/issues/11225
- converting NimNode to typedesc: https://github.com/nim-lang/Nim/issues/6785

* Delete old sage code

* Install nim-serialization and nim-json-serialization in CI

* CI nimble install force yes
2020-10-10 16:19:23 +02:00
Mamy Ratsimbazafy 71bb4c799a
BW6-761 part 1 (#100)
* Add Fp, Fp2, Fp6 support for BW6-761

* Add G1 for BW6-761

* Prepare to support G2 twists on the same field as G1

* Remove a useless dependent type for lines

* Implement G2 for BW6-761

* Fix Line leftover
2020-10-09 07:51:47 +02:00
Mamy Ratsimbazafy fc1c3472ce
Fused projective line eval (#96)
* Reorg line functions to allow for Jacobian eval

* 2x faster Miller loop!!! with fused line eval double

* Support Line Double Fusion for D-Twists

* Implement fused line addition
2020-10-04 09:39:02 +02:00
Mamy Ratsimbazafy 986245b5c1
Jacobian coordinates (#95)
* Add projective-> affine bench

* Add conditional copy and div2 benches

* Fp4 benchmarks

* Constant-time Jacobian addition

* Jacobian doubling

* Use a simpler Add+Dbl complete formula

* Update tests

* Fix conditional negate

* Rollaback complete addition, we were only handling curve coef a == 0
2020-10-02 00:01:09 +02:00
Mamy André-Ratsimbazafy d04ccdd578
Move the cubic root to GLV files 2020-09-27 16:01:31 +02:00
Mamy Ratsimbazafy 0e4dbfe400
BLS12-377 (#91)
* add Sage for constant time tonelli shanks

* Fused sqrt and invsqrt via Tonelli Shanks

* isolate sqrt in their own folder

* Implement constant-time Tonelli Shanks for any prime

* Implement Fp2 sqrt for any non-residue

* Add tests for BLS12_377

* Lattice decomposition script for BLS12_377 G1

* BLS12-377 G1 GLV ok, G2 GLV issue

* Proper endomorphism acceleration support for BLS12-377

* Add naive pairing support for BLS12-377

* Activate more bench for BLS12-377

* Fix MSB computation

* Optimize final exponentiation + add benches
2020-09-27 09:15:14 +02:00
Mamy Ratsimbazafy 6ecbedbd09
Mixed addition (#90)
* ptrettier comments

* Implement mixed addition on G1

* Test for mixed addition in G2 and use it for Miller Loop
2020-09-26 09:16:29 +02:00
Mamy Ratsimbazafy 03ecb31c57
Pairings for BN254-Nogami and BN254-Snarks (#86)
* Implement optimized final exponentiation for BN254-Nogami

* And BN254 Snarks support

* Optimize D-Twist sparse Fp12 x line multiplication

* Move quadruple/octuple and add to Github issues: https://github.com/mratsim/constantine/issues/88 [skip ci]
2020-09-25 21:58:20 +02:00
Mamy Ratsimbazafy f78ed23dad
Pairing optim (#85)
* Fix fp12 Frobenius map

* Implement cyclotomic subgroup acceleration

* make cyclotomic squaring in-place

* Add back out-place cycl squaring and add cyclotomic inverse

* Implement state-of-the-art BLS12-381 final exponentiation

* save a cyclotomic squaring

* Accelerate sparse line multiplication in Miller loop

* Add pairing bench

* fix comments
2020-09-24 17:18:23 +02:00
Mamy Ratsimbazafy d84edcd217
Naive pairings + Naive cofactor clearing (#82)
* Pairing - initial commit
- line functions
- sparse Fp12 functions

* Small fixes:
- Line parametrized by twist for generic algorithm
- Add a conjugate operator for quadratic extensions
- Have frobenius use it
- Create an Affine coordinate type for elliptic curve

* Implement (failing) pairing test

* Stash pairing debug session, temp switch Fp12 over Fp4

* Proper naive pairing on BLS12-381

* Frobenius map

* Implement naive pairing for BN curves

* Add pairing tests to CI + reduce time spent on lower-level tests

* Test without assembler in Github Actions + less base layers test iterations
2020-09-21 23:24:00 +02:00
Mamy Ratsimbazafy 4a308c2148
Frobenius endomorphism ψ = φ−1 πp φ (psi = untwist-Frobenius-Twist) (#78)
* Sage script for frobenius isogeny

* Implement ψ (Psi) - Untwist-Frobenius-Twist Endomorphism on G2

* Implement sparse mul for frpbenius endomorphism

* Implement optimized psi2
2020-08-31 23:18:48 +02:00
Mamy André-Ratsimbazafy 442c3f6cf6
Consolidate output folders of bench and testsuite 2020-08-22 23:00:05 +02:00
Mamy Ratsimbazafy d41c653c8a
Double-width tower extension part 1 (#72)
* Implement double-width field multiplication for double-width towering

* Fp2 mul acceleration via double-width lazy reduction (pure Nim)

* Inline assembly for basic add and sub

* Use 2 registers instead of 12+ for ASM conditional copy

* Prepare assembly for extended multiprecision multiplication support

* Add assembly for mul

* initial implementation of assembly reduction

* stash current progress of assembly reduction

* Fix clobbering issue, only P256 comparison remain buggy

* Fix asm montgomery reduction for NIST P256 as well

* MULX/ADCX/ADOX multi-precision multiplication

* MULX/ADCX/ADOX reduction v1

* Add (deactivated) assembly for double-width substraction + rework benches

* Add bench to nimble and deactivate double-width for now. slower than classic

* Fix x86-32 running out of registers for mul

* Clang needs to be at v9 to support flag output constraints (Xcode 11.4.2 / OSX Catalina)

* 32-bit doesn't have enough registers for ASM mul

* Fix again Travis Clang 9 issues

* LLVM 9 is not whitelisted in travis

* deactivated assembler with travis clang

* syntax error

* another

* ...

* missing space, yeah ...
2020-08-20 10:21:39 +02:00
Mamy Ratsimbazafy d97bc9b61c
Assembly backend (#69)
* Proof-of-Concept Assembly code generator

* Tag inline per procedure so we can easily track the tradeoff on tower fields

* Implement Assembly for modular addition (but very curious off-by-one)

* Fix off-by one for moduli with non msb set

* Stash (super fast) alternative but still off by carry

* Fix GCC optimizing ASM away

* Save 1 register to allow compiling for BLS12-381 (in the GMP test)

* The compiler cannot find enough registers if the ASM file is not compiled with -O3

* Add modsub

* Add field negation

* Implement no-carry Assembly optimized field multiplication

* Expose UseX86ASM to the EC benchmark

* omit frame pointer to save registers instead of hardcoding -O3. Also ensure early clobber constraints for Clang

* Prepare for assembly fallback

* Implement fallback for CPU that don't support ADX and BMI2

* Add CPU runtime detection

* Update README closes #66

* Remove commented out code
2020-07-24 22:02:30 +02:00
Mamy Ratsimbazafy ec76ac5ea6
Fuzzing campaign fixes (#58)
* Add test case for #30 - Euler's criterion doesn't return 1 for a square

* Detect #42 in the test suite

* Detect #43 in the test suite

* comment in sqrt tests

* Add #67 to the anti-regression suite

* Add #61 to the anti-regression suite

* Add #62 to anti-regression suite

* Add #60 to the anti-regression suite

* Add #64 to the test suite

* Add #65 - case 1

* Add #65 case 2

* Add #65 case 3

* Add debug check to isSquare/Euler's Criterion/Legendre Symbol

* Make sure our primitives are correct

* For now deactivate montySquare CIOS fix #61 #62

* Narrow down #42 and #43 to powinv on 32-bit

* Detect #42 #43 at the fast squaring level

* More #42, #43 tests, Use multiplication instead of squaring as a temporary workaround, see https://github.com/mratsim/constantine/issues/68

* Prevent regression of #67 now that squaring is "fixed"
2020-06-23 01:27:40 +02:00