* [Threadpool] Fix syncAll releasing while a thread was attempting to steal + force no exception in tasks
* fix unguarded access on MacOS barriers
* parallel batchadd
* moved import
* sha256: separate message scheduling and state updates to help implement specific use-cases like #205; also implement SSSE3 acceleration (2006, Intel Core 2 Duo)
* sha256: simplify update flow, store less metadata in context
* sha256: Fix reworked update function
* Implement x86 hardware SHA acceleration
* typo
* First draft at bindings generation
* finite field bindings PoC
* support openarray, export NimMain
* PoC extension fields and elliptic curve bindings
* Pasta
* expose more bindings, remove nimZeroMem, remove tracer when unused, codegen name_mangling`gensym issue
* workaround bad C gensym codegen with {.inline.} pragma in non-dirty template nested in generic proc instantiated by template
* Skeleton of hash to curve for BLS12-381 G1
* Remove isodegree parameter
* Fix polynomial evaluation of hashToG1
* Optimize hash_to_curve and add bench for hash to G1
* slight optim of jacobian isomap + v7 test vectors
* Add specific fromMont conversion routine. Rename montyResidue to getMont
* missed test file
* Add x86_64 ASM for fromMont
* Add x86_64 MULX/ADCX/ADOX for fromMont
* rework Montgomery Multiplication with prefetch/latency hiding techniques
* Fix ADX autodetection, closes#174. Rollback faster mul_mont attempt, no improvement and debug pain.
* finalSub in fromMont & adx_bmi -> adx
* Some {.noInit.} to avoid Nim zeroMem (which should be optimized away but who knows)
* Uniformize name 'op+domain': mulmod - mulmont
* Fix asm codegen bug "0x0000555555565930 <+896>: sbb 0x20(%r8),%r8" with Clang in final substraction
* Prepare for skipping final substraction
* Don't forget to copy the result when we skip the final substraction
* Seems like we need to stash the idea of skipping the final substraction for now, needs bounds analysis https://eprint.iacr.org/2017/1057.pdf
* fix condition for ASM 32-bit
* optim modular addition when sparebit is available
* split modular inversion in its own file
* Stash fast GCD inversion https://eprint.iacr.org/2020/972.pdf
* Stash Pornin's bingcd -> issue with inner modular reduction
* Implement Bernstein-Yang inversion
* Avoid Nim checks on signed integers (32-bit runtime issue)
* cleanup: remove old inversion impls
* cleanup: static moduli, move div2
* small comments (skip ci)
* comment cleanup (skip ci)
* fix total iterations on 32-bit
* Add batch conversion to affine coordinates using simultaneous inversion trick
* fix conditional setZero and batchAffine conversion
* cleanup unneeded branches following affine conversion unification
* Fix batchAffine with zero inputs and add fuzz failure to test suite
* Move cofactor clearing to dedicated per-curve subgroups file
* Add BLS12-381 fast subgroup checks
* Implement fast cofactor clearing for BN254_snarks
* Add fast subgroup check to BN254Snarks
* add BLS12_377 optimized cofactor and subgroup functions
* Add BN254_Nogami
* Add GT-subgroup tests
* Use the new subgroup checks for Eth1 EVM precompiles
* add more Fp tests for Twisted Edwards curves
* add fused sqrt+division bench
* Significant fused sqrt+division improvement for any prime field over algorithm described in "High-Speed High-Security Signature", Bernstein et al, p15 "Fast decompression", https://ed25519.cr.yp.to/ed25519-20110705.pdf
* Activate secp256k1 field benches + spring renaming of field multiplication
* addition chains for inversion and sqrt of Curve25519
* Make isSquare use addition chains
* add double-prec mul/square bench for <256-bit prime fields.
* Hash to Curve: impl expand_message_xmd
* Try to precompute part of hash to curve at compile-time
* sha256 bench - use the new hashes module
* [WIP] smoke test hash to field
* Implement hash_to_field with expected output
* unoptimized hash-to-curve G2 for BLS12-381
* Don't run sanitizer on hash to field as it uses GC-ed strings
* Pairing with affine: align API to BLST and Gurvy and common use-case.
* Implement multi-pairing / aggregate verif for BLS12-381 (+2% pairing perf)
* Generalize the optimized miller loop for single pairing
* Immplement the miller loop addchain for BLS12-377
* Miller addition chain for BN254-Nogami
* no Miller adchain for BN254-Snarks
* Update the line test with new tower https://github.com/mratsim/constantine/pull/153
* Somewhat sparse for Fp2 M-Twist
* Implement line by line multiplication for Fp12 D-Twist
* Somewhat sparse Mul for Fp12 D-Twist
* Finish the sparse and somewhat sparse multiplications
* consistent naming for dbl-width
* Isolate double-width Fp2 mul
* Implement double-width complex multiplication
* Lay out Fp4 double-width mul
* Off by p in square Fp4 as well :/
* less copies and stack space in addition chains
* Address https://github.com/mratsim/constantine/issues/154 partly
* Fix#154, faster Fp4 square: less non-residue, no Mul, only square (bit more ops total)
* Fix typo
* better assembly scheduling for add/sub
* Double-width -> Double-precision
* Unred -> Unr
* double-precision modular addition
* Replace canUseNoCarryMontyMul and canUseNoCarryMontySquare by getSpareBits
* Complete the double-precision implementation
* Use double-precision path for Fp4 squaring and mul
* remove mixin annotations
* Lazy reduction in Fp4 prod
* Fix assembly for sum2xMod
* Assembly for double-precision negation
* reduce white spaces in pairing benchmarks
* ADX implies BMI2
* Pin nim-serialization. Workaround #113 and https://github.com/status-im/nim-serialization/issues/33
* Need to workaround nimble installing dependency multiple times
* non-interactive
* UB sanitizer missing on mingw
* Fix OpenSSL benchmark on non-Linux platforms
* Accelerate CI:
- Skip 32-bit on 64-bit tests
- Only test leaf functionality.
* Don't define -fstack-protector-all with MinGW
* skip line functions and cyclotomic tests (already tested in pairing) + only compile the benches don't run them.
* Implement a Sage codegenerator for frobenius constants
* Sage codegen for pairings
* Autogen of endomorphism acceleration constants
* The autogen fixed a copy-paste bug in lattice decomposition. We can use conditional negation now and save an add+dbl in scalar mul
* small fixes
* sage code for square root bls12-377 is not old
* readme updates
* Provide test suggestions for derive_frobenius
* indentation + add equation form to sage
* Sage test vector generator
* Use the json vectors
- includes type system workaround: generic sandwich https://github.com/nim-lang/Nim/issues/11225
- converting NimNode to typedesc: https://github.com/nim-lang/Nim/issues/6785
* Delete old sage code
* Install nim-serialization and nim-json-serialization in CI
* CI nimble install force yes
* Add Fp, Fp2, Fp6 support for BW6-761
* Add G1 for BW6-761
* Prepare to support G2 twists on the same field as G1
* Remove a useless dependent type for lines
* Implement G2 for BW6-761
* Fix Line leftover
* add Sage for constant time tonelli shanks
* Fused sqrt and invsqrt via Tonelli Shanks
* isolate sqrt in their own folder
* Implement constant-time Tonelli Shanks for any prime
* Implement Fp2 sqrt for any non-residue
* Add tests for BLS12_377
* Lattice decomposition script for BLS12_377 G1
* BLS12-377 G1 GLV ok, G2 GLV issue
* Proper endomorphism acceleration support for BLS12-377
* Add naive pairing support for BLS12-377
* Activate more bench for BLS12-377
* Fix MSB computation
* Optimize final exponentiation + add benches