* Clear cofactor in BN254 G2 testgen and frobenius
* Implement G2 endomorphism acceleration in Sage
* Somewhat working accelerated scalar mul on G2 (2.2x faster)
- OK for BN254_Snarks
- Some tests failing for BLS12-381
* Fix negative miniscalars by adding an extra bit of encoding
* Cleanup accel params
* Small recoding optimizations
* Fix 8x bigger than necessary encoding size of miniscalars in scalar mul
* initial windowed GLV-SAC implementation
* Simplify table encoding to match k0 without flipping bits
* Implement double-width field multiplication for double-width towering
* Fp2 mul acceleration via double-width lazy reduction (pure Nim)
* Inline assembly for basic add and sub
* Use 2 registers instead of 12+ for ASM conditional copy
* Prepare assembly for extended multiprecision multiplication support
* Add assembly for mul
* initial implementation of assembly reduction
* stash current progress of assembly reduction
* Fix clobbering issue, only P256 comparison remains buggy
* Fix asm montgomery reduction for NIST P256 as well
* MULX/ADCX/ADOX multi-precision multiplication
* MULX/ADCX/ADOX reduction v1
* Add (deactivated) assembly for double-width subtraction + rework benches
* Add bench to nimble and deactivate double-width for now. slower than classic
* Fix x86-32 running out of registers for mul
* Clang needs to be at v9 to support flag output constraints (Xcode 11.4.2 / OSX Catalina)
* 32-bit doesn't have enough registers for ASM mul
* Fix again Travis Clang 9 issues
* LLVM 9 is not whitelisted in travis
* deactivated assembler with travis clang
* syntax error
* another
* ...
* missing space, yeah ...
* Proof-of-Concept Assembly code generator
* Tag inline per procedure so we can easily track the tradeoff on tower fields
* Implement Assembly for modular addition (but very curious off-by-one)
* Fix off-by-one for moduli without the MSB set
* Stash (super fast) alternative but still off by carry
* Fix GCC optimizing ASM away
* Save 1 register to allow compiling for BLS12-381 (in the GMP test)
* The compiler cannot find enough registers if the ASM file is not compiled with -O3
* Add modsub
* Add field negation
* Implement no-carry Assembly optimized field multiplication
* Expose UseX86ASM to the EC benchmark
* omit frame pointer to save registers instead of hardcoding -O3. Also ensure early clobber constraints for Clang
* Prepare for assembly fallback
* Implement fallback for CPUs that don't support ADX and BMI2
* Add CPU runtime detection
* Update README closes #66
* Remove commented out code
* Add test case for #30 - Euler's criterion doesn't return 1 for a square
* Detect #42 in the test suite
* Detect #43 in the test suite
* comment in sqrt tests
* Add #67 to the anti-regression suite
* Add #61 to the anti-regression suite
* Add #62 to anti-regression suite
* Add #60 to the anti-regression suite
* Add #64 to the test suite
* Add #65 - case 1
* Add #65 case 2
* Add #65 case 3
* Add debug check to isSquare/Euler's Criterion/Legendre Symbol
* Make sure our primitives are correct
* For now deactivate montySquare CIOS fix #61 #62
* Narrow down #42 and #43 to powinv on 32-bit
* Detect #42 #43 at the fast squaring level
* More #42, #43 tests, Use multiplication instead of squaring as a temporary workaround, see https://github.com/mratsim/constantine/issues/68
* Prevent regression of #67 now that squaring is "fixed"
* Split elliptic curve tests to better use parallel testing
* Add support for printing points on G2
* Implement multiplication and division by optimal sextic non-residue (BLS12-381)
* Implement modular square root in 𝔽p2
* Support EC add and EC double on G2 (for BLS12-381)
* Support G2 divisive twists with non-unit sextic-non-residue like BN254 snarks
* Add EC G2 bench
* cleanup some unused warnings
* Reorg the tests for parallelization and to avoid instantiating huge files
* Add MultiScalar recoding from "Efficient and Secure Algorithms for GLV-Based Scalar Multiplication" by Faz et al.
* precompute cube root of unity - Add VM precomputation of Fp - workaround upstream bug https://github.com/nim-lang/Nim/issues/14585
* Add the φ-accelerated lookup table builder
* Add a dedicated bithacks file
* cosmetic import consistency
* Build the φ precompute table with n-1 EC additions instead of 2^(n-1) additions
* remove binary
* Add the GLV precomputations to the sage scripts
* You can't avoid it, bigint multiplication is needed at one point
* Add bigint multiplication discarding some low words
* Implement the lattice decomposition in sage
* Proper decomposition for BN254
* Prepare the code for a new scalar mul
* We compile, and now debugging hunt
* More helpers to debug GLV scalar Mul
* Fix conditional negation
* Endomorphism accelerated scalar mul working for BN254 curve
* Implement endomorphism acceleration for BLS12-381 (needed cofactor clearing of the point)
* fix nimble test script after bench rename
* Add sage script for BN254
* Implement (failing) scalar multiplication tests
* Add a first test against sagemath
* Finish the tests against SAGE for BN254
* Add significant test coverage of scalar multiplication with reference checks for BN254_Snarks and BLS12_381
* Mention that the inverse of 0 is 0 (TODO tests)
* Introduce "Higher-Kinded tower extensions"
* rename isCOmplexExtension -> fromComplexExtension
* update benchmarks with the new tower scheme
* Try to recover some speed on mul/squaring for an optimal tower (but this was not it)
* Elliptic curve and Twisted curve templates - initial commit
* Support EC Add on G2 (Sextic Twisted curve for BN and BLS12 families)
* Refactor the config parser to prepare for elliptic coefficient support
* Add elliptic curve parameter for BN254 (Snarks), BLS12-381 and Zexe curve BLS12-377
* Add accessors to curve parameters
* Allow computing the right-hand side of the Weierstrass equation "y² = x³ + a x + b"
* Randomized test infrastructure for elliptic curves
* Start a testing suite on elliptic curve addition (failing)
* detail projective addition
* Fix EC addition test (forgot initializing Z=1 and that there are multiple infinity points)
* Test with random Z coordinate + add elliptic curve test to test suite
* fix reference to the (deactivated) addchain inversion for BN curves [skip ci]
* .nims file leftover [skip ci]
* Allow tagging BarretoNaehrig family
* Refactor the constant generation and fix XDeclaredButNotUsed
* BN field inversion via addition chain (but slower than generic :/ so deactivated)
* Lay out the refactoring objectives and tradeoffs
* Refactor the 32 and 64-bit primitives [skip ci]
* BigInts and Modular BigInts compile
* Make the bigints test compile
* Fix modular reduction
* Fix reduction tests vs GMP
* Implement Montgomery mul, pow, inverse, WIP finite field compilation
* Make FiniteField compile
* Fix exponentiation compilation
* Fix Montgomery magic constant computation for 2^64 words
* Fix typo in non-optimized CIOS - passing finite fields IO tests
* Add limbs comparisons [skip ci]
* Fix on precomputation of the Montgomery magic constant
* Passing all tests including 𝔽p2
* modular addition, the test for mersenne prime was wrong
* update benches
* Fix "nimble test" + typo on out-of-place field addition
* bigint division, normalization is needed: https://travis-ci.com/github/mratsim/constantine/jobs/298359743
* missing conversion in subborrow non-x86 fallback - https://travis-ci.com/github/mratsim/constantine/jobs/298359744
* Fix little-endian serialization
* Constantine32 flag to run 32-bit constantine on 64-bit machines
* IO Field test, ensure that BaseType is used instead of uint64 when the prime can fit in uint32
* Implement proper addcarry and subborrow fallback for the compile-time VM
* Fix export issue when the logical wordbitwidth == physical wordbitwidth - passes all tests (32-bit and 64-bit)
* Fix uint128 on ARM
* Fix C++ conditional copy and ARM addcarry/subborrow
* Add investigation for SIGFPE in Travis
* Fix debug display for unsafeDiv2n1n
* multiplexer typo
* moveMem bug in glibc of Ubuntu 16.04?
* Was probably missing an early clobbered register annotation on conditional mov
* Note on Montgomery-friendly moduli
* Strongly suspect a codegen bug in GCC versions before GCC 7 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87139)
* hex conversion was (for debugging) not taking requested order into account + inlining comment
* Use 32-bit limbs on ARM64, uint128 builtin __udivti4 bug?
* Revert "Use 32-bit limbs on ARM64, uint128 builtin __udivti4 bug?"
This reverts commit 087f9aa7fb40bbd058d05cbd8eec7fc082911f49.
* Fix subborrow fallback for non-x86 (need to mask the borrow)