* Clear cofactor in BN254 G2 testgen and Frobenius
* Implement G2 endomorphism acceleration in Sage
* Somewhat working accelerated scalar mul on G2 (2.2x faster)
- OK for BN254_Snarks
- Some tests failing for BLS12-381
* Fix negative miniscalars by adding an extra bit of encoding
* Clean up accel params
* Small recoding optimizations
* Proof-of-Concept Assembly code generator
* Tag inline per procedure so we can easily track the tradeoff on tower fields
* Implement Assembly for modular addition (but with a very curious off-by-one)
* Fix off-by-one for moduli without the MSB set
* Stash (super fast) alternative but still off by carry
* Fix GCC optimizing ASM away
* Save 1 register to allow compiling for BLS12-381 (in the GMP test)
* The compiler cannot find enough registers if the ASM file is not compiled with -O3
* Add modsub
* Add field negation
* Implement no-carry Assembly optimized field multiplication
* Expose UseX86ASM to the EC benchmark
* Omit frame pointer to save registers instead of hardcoding -O3. Also ensure early clobber constraints for Clang
* Prepare for assembly fallback
* Implement fallback for CPUs that don't support ADX and BMI2
* Add CPU runtime detection
* Update README, closes #66
* Remove commented out code
* Split elliptic curve tests to better use parallel testing
* Add support for printing points on G2
* Implement multiplication and division by optimal sextic non-residue (BLS12-381)
* Implement modular square root in 𝔽p2
* Support EC add and EC double on G2 (for BLS12-381)
* Support G2 divisive twists with non-unit sextic non-residue, like BN254_Snarks
* Add EC G2 bench
* Clean up some unused warnings
* Reorg the tests for parallelization and to avoid instantiating huge files