constantine

Commit Graph

Author	SHA1	Message	Date
Mamy Ratsimbazafy	47b4f48dfb	fix overflow when truncating in submod2k, fix Guido fuzzing failure 8 (#251)	2023-07-11 09:06:46 +02:00
Mamy Ratsimbazafy	cb038bb515	fix bigint mul non-compilation after #231	2023-07-09 18:57:12 +02:00
Mamy Ratsimbazafy	d69c7bf8e9	Fuzz Fix - Hash-To-Curve - Isogeny EC add non-fully-reduced input (#250) * H2C: fix fuzz failure 2, non-fully reduced in isogeny EC addition * faster hashToG2 by using sparsity	2023-07-03 06:57:22 +02:00
Mamy Ratsimbazafy	d0f4ad8cda	Fix fuzz #1 failure: incorrect reduction of BigInt (#246)	2023-07-02 17:15:02 +02:00
Mamy Ratsimbazafy	72f36530ba	Fix Fuzz 5: off-by-1 in even modexp (#247)	2023-07-02 17:14:50 +02:00
Mamy Ratsimbazafy	0eba593951	Pasta / Halo2 MSM bench (#243) * Pasta bench * cleanup env variables * [MSM]: generate benchmark coef-points pairs in parallel * try to fix windows Ci * add diagnostic info * fix old test for new codecs/io primitives * Ensure the projective point at infinity is not all zeros, but (0, 1, 0)	2023-06-04 17:41:54 +02:00
Mamy Ratsimbazafy	b1ef2682d6	Modular exponentiation (arbitrary output) and EIP-198 (#242) * implement arbitrary precision modular exponentiation (prerequisite EIP-198) * [modexp] implement exponentiation modulo 2ᵏ * add inversion (mod 2ᵏ) * [modexp] High-level wrapper for powmod with odd modulus * [modexp] faster exponentiation (mod 2ᵏ) for even case and Euler's totient function odd case * [modexp] implement general fast modular exponentiation * Fix modular reduction with 64-bit modulus + fuzz powmod vs GMP * add benchmark * add EIP-198 support * fixups following self review * fix test paths	2023-06-01 23:38:41 +02:00
Mamy Ratsimbazafy	d996ccd5d8	Path reorgs (#240) * move tests * move threadpool to root path * fix hints and warnings, print nim versions for tests for debugging the new strange issue in CI * print nim version * mixup on branches * mixup on branches reloaded	2023-05-29 20:14:30 +02:00
Mamy Ratsimbazafy	33c3a2e8c4	[Research] x86 code generator (#234) * rename compilers -> intrinsics, math_gpu -> math_codegen * stash x86 codegen in research	2023-04-27 21:52:51 +02:00
Mamy Ratsimbazafy	c6d9a213f2	Rework assembly to be compatible with LTO (#231 ) * rework assembler register/mem and constraint declarations * Introduce constraint UnmutatedPointerToWriteMem * Create invidual memory cell operands * [Assembly] fully support indirect memory addressing * fix calling convention for exported procs * Prepare for switch to intel syntax to avoid clang constant propagation asm symbol name interfering OR pointer+offset addressing * use modifiers to prevent bad string mixin fo assembler to linker of propagated consts * Assembly: switch to intel syntax * with working memory operand - now works with LTO on both GCC and clang and constant folding * use memory operand in more places * remove some inline now that we have lto * cleanup compiler config and benches * tracer shouldn't force dependencies when unused * fix cc on linux * nimble fixes * update README [skip CI] * update MacOS CI with Homebrew Clang * oops nimble bindings disappeared * more nimble fixes * fix sha256 exported symbol * improve constraints on modular addition * Add extra constraint to force reloading of pointer in reg inputs * Fix LLVM gold linker running out of registers * workaround MinGW64 GCC 12.2 bad codegen in t_pairing_cyclotomic_subgroup with LTO	2023-04-26 06:58:31 +02:00
Mamy Ratsimbazafy	9a7137466e	C API for Ethereum BLS signatures (#228 ) * [testsuite] Rework parallel test runner to buffer beyond 65536 chars and properly wait for process exit * [testsuite] improve error reporting * rework openArray[byte/char] for BLS signature C API * Prepare for optimized library and bindings * properly link to constantine * Compiler fixes, global sanitizers, GCC bug with --opt:size * workaround/fix #229: don't inline field reduction in Fp2 * fix clang running out of registers with LTO * [C API] missed length parameters for ctt_eth_bls_fast_aggregate_verify * double-precision asm is too large for inlining, try to fix Linux and MacOS woes at https://github.com/mratsim/constantine/pull/228#issuecomment-1512773460 * Use FORTIFY_SOURCE for testing * Fix #230 - gcc miscompiles Fp6 mul with LTO * disable LTO for now, PR is too long	2023-04-18 22:02:23 +02:00
Mamy Ratsimbazafy	6c48975aee	Parallel Multi-Scalar-Multiplication (#226 ) * try parallel reduction in batch add, but alas it's slower than custom chunking. Except maybe on arch with performance/efficiency cores * initial impl of parallel MSM - scaling to debug, threads not woken fast enough * improve comment [skip ci] * skip top window when c divides the number of bits * for some reason parallel-for loops scale on 5+ threads while spawn only on 2x threads. Thread wakeup issue? * Add counters and timers to audit threadpool bottlenecks * metrics and profiling fixes, (slower) latency hiding, activate tests * fix thief thread trying to wake another before canceling its own sleep * easier to sort metrics and parallel endomorphism application * selective endomorphism acceleration * some tuning * spawn can handle compile-time literals, static and type parameters. Also introduce spawnAwaitable to await void procs * improve MSM overview [skip ci] * bench cleanup	2023-04-10 23:30:14 +02:00
Mamy Ratsimbazafy	4dc2610557	Bindings "filesystem" (#225) * bindings structure * missed some renaming * add back the headers * path fixes * need to sleep at night * windows path mystery is unfathomable	2023-03-01 12:59:06 +01:00
Mamy Ratsimbazafy	bf32c2d408	Parallel for (#222 ) * introduce reserve threads to minimize latency and maximize throughput when awaiting a future * introduce a ceilDiv proc * threadpool: implement parallel-for loops * 10x perf improvement by not waking reserveBackoff on syncAll * bench overhead: new reserve system might introduce too much wakeup latency, 2x slower, for fine-grained parallelism * add parallelForStrided * Threadpool: Implement parallel reductions * refactor parallel loop codegen: introduce descriptor, parsing and codegen stages * parallel strided, test transpose bench * tight loop is faster when backoff is not inline * no POSIX stuff on windows, larger types for histogram bench * fix tests * max RSS overflow? * missed an undefined var * exit histogram on 32-bit * forgot to return early dor 32-bit	2023-02-24 09:47:36 +01:00
Mamy Ratsimbazafy	8993789ddf	fix #221	2023-02-16 13:54:21 +01:00
Mamy Ratsimbazafy	e5612f5705	Multi-Scalar-Multiplication / Linear combination (#220 ) * unoptimized msm * MSM: reorder loops * add a signed windowed recoding technique * improve wNAF table access * use batchAffine * revamp EC tests * MSM signed digit support * refactor MSM: recode signed ahead of time * missing test vector * refactor allocs and Alloca sideeffect * add an endomorphism threshold * Add Jacobian extended coordinates * refactor recodings, prepare for parallelizable on-the-fly signed recoding * recoding changes, introduce proper NAF for pairings * more pairings refactoring, introduce miller accumulator for EVM * some optim to the addchain miller loop * start optimizing multi-pairing * finish multi-miller loop refactoring * minor tuning * MSM: signed encoding suitable for parallelism (no precompute) * cleanup signed window encoding * add prefetching * add metering * properly init result to infinity * comment on prefetching * introduce vartime inversion for batch additions * fix JacExt infinity conversion * add batchAffine for MSM, though slower than JacExtended at the moment * add a batch affine scheduler for MSM * Add Multi-Scalar-Multiplication endomorphism acceleration * some tuning * signed integer fixes + 32-bit + tuning * Some more tuning * common msm bench + don't use affine for c < 9 * nit	2023-02-16 12:45:05 +01:00
Mamy Ratsimbazafy	082cd1deb9	MSB-to-LSB minimum Hamming Weight Recoding (#219) * signed recoding * use recoding	2023-02-07 16:27:53 +01:00
Mamy Ratsimbazafy	7c5421ffdc	move staticFor to the inner repo, not helpers/ for unblocking nimble install (#216)	2023-02-07 13:11:44 +01:00
Mamy Ratsimbazafy	cbb454fff1	Codecs (#217) * create a codecs.nim file for hex/base64 and other encoding conversions * improve maintenance/readability of hex conversion * add skeleton of constant-time base64 decoding * use raw casts * use raw casts only for same size types	2023-02-07 13:10:17 +01:00
Mamy Ratsimbazafy	495ef4497b	Parallel batchadd (#215) * [Threadpool] Fix syncAll releasing while a thread was attempting to steal + force no exception in tasks * fix unguarded access on MacOS barriers * parallel batchadd * moved import	2023-01-29 01:06:37 +01:00
Mamy Ratsimbazafy	ff8c26c1fe	BLS Aggregate and Batch verify (#214) * pairing -> pairings, and use alloca arrays instead of static arrays * aggregate and batched BLS signature * DLL generation broken by path changes	2023-01-27 00:42:12 +01:00
Mamy Ratsimbazafy	188f3e710c	add fast_aggregate_verify	2023-01-23 01:54:40 +01:00
Mamy Ratsimbazafy	4be89d309f	chore: remove stew/byteutils dependencies and unneeded imports	2023-01-12 20:25:57 +01:00
Mamy Ratsimbazafy	4052a07611	chore: cleanup TODOs, unused constants	2023-01-12 01:27:23 +01:00
Mamy Ratsimbazafy	1f4bb174a3	[Backend] Add support for Nvidia GPUs (#210) * Add PoC of JIT exec on Nvidia GPUs [skip ci] * Split GPU bindings into low-level (ABI) and high-level [skip ci] * small typedef reorg [skip ci] * refine LLVM IR/Nvidia GPU hello worlds * [Nvidia GPU] PoC implementation of field addition [skip ci] * prod-ready field addition + tests on Nvidia GPUs via LLVM codegen	2023-01-12 01:01:57 +01:00
Mamy Ratsimbazafy	c0b30a08be	style: casing of WordBitWidth/WordBitwidth	2023-01-11 19:31:23 +01:00
Mamy Ratsimbazafy	928f515582	Batch additions (#207) * Batch elliptic curve addition * accelerate chained muls * jac mixed add handle doubling. jac additions handle aliasing when adding infinity * properly skip sanitizer on BLS signature test * properly skip sanitizer² on BLS signature test	2022-10-29 22:43:40 +02:00
Mamy Ratsimbazafy	351a3f6bd2	Sha256 refactor (#206) * sha256: separate message scheduling and state updates to help implement specific use-cases like #205; also implement SSSE3 acceleration (2006, Intel Core 2 Duo) * sha256: simplify update flow, store less metadata in context * sha256: Fix reworked update function * Implement x86 hardware SHA acceleration * typo	2022-09-19 02:02:57 +02:00
Mamy Ratsimbazafy	fb594c5938	OpenSSL upstream: no more SHA256 public function :/, skip in Windows CI	2022-09-19 01:22:01 +02:00
Mamy Ratsimbazafy	df048112c3	Example+Test C API vs GMP (#203) * Example+Test C API vs GMP * Create build directory for bindings test * --nimMainPrefix is 1.6 only * Add libdl for dynamic loading * absolute paths * add static link test * Fix man main, rename Nimmain to init_NimMain * Deal with MacOS annoying linker w.r.t. static libraries * use .exe extension to satisfy windows (?) * annoying GCC which doesn't create paths * Try skipping DLL test on windows * windows extensions ... * no lib prefix on windows	2022-09-15 17:11:57 +02:00
Mamy Ratsimbazafy	962e7ccf49	CI: enable GMP tests on Windows and Linux 32-bit and fix caching (#204 ) * Try to compile with GMP on windows and 32-bit linux * remove leftover msys shell * Don't use GMP Mersenne Twister, bad randomness and untested Nim wrapper * properly cache nim * fix path after cache * run pacman in msys2 env * rework msys2 ... again * shell compat for file clearing * shell compat try-again for file clearing * force bash for clearing parallel builds on windows * Use nimscript directly (why didn't it work last time?) * Avoid IO redirection to support any shell * Avoid IO redirection v2 to support any shell * add debug data * add debug again * Introduce pararun, a parallel test runner to remove need of GNU parallel * pararun: style	2022-09-15 09:33:34 +02:00
Mamy Ratsimbazafy	094445482b	Eip2333 (#202) * HMAC-SHA256 * EIP2333 * activate EIP2333 tests and faster random test case generation	2022-08-16 12:07:57 +02:00
Mamy Ratsimbazafy	9770b3108c	Fp12 over fp6 (#201) * introduce sumprod for direct fp6_mul * change curves -> constants * forgotten constants * Full pairing using Fp2->Fp6->Fp12 towering	2022-08-14 09:48:10 +02:00
Mamy Ratsimbazafy	99c9730793	Self-contained bindings generation (#196) * First draft at bindings generation * finite field bindings PoC * support openarray, export NimMain * PoC extension fields and elliptic curve bindings * Pasta * expose more bindings, remove nimZeroMem, remove tracer when unused, codegen name_mangling`gensym issue * workaround bad C gensym codegen with {.inline.} pragma in non-dirty template nested in generic proc instantiated by template	2022-08-06 19:05:54 +02:00
Mamy Ratsimbazafy	e29e529f18	Add multipairing for BN curves (#194)	2022-05-08 19:01:23 +02:00
Mamy Ratsimbazafy	39a8a413de	Pasta curves (#191) * Pasta curves field arithmetic * implement elliptic curve arith for the Pasta curves	2022-04-27 00:58:48 +02:00
Mamy Ratsimbazafy	e9e7a1809c	BN254 - Hash-to-Curve (SVDW method) (#190) * Hash to BN254-Snarks * Test SVDW code path with old v7 vectors for BLS12-381 * add benches	2022-04-26 21:24:07 +02:00
Mamy Ratsimbazafy	65eedd1cf7	Hash-to-Curve BLS12-381 G1 (#189) * Skeleton of hash to curve for BLS12-381 G1 * Remove isodegree parameter * Fix polynomial evaluation of hashToG1 * Optimize hash_to_curve and add bench for hash to G1 * slight optim of jacobian isomap + v7 test vectors	2022-04-11 00:57:16 +02:00
Mamy Ratsimbazafy	bde4f97b56	Line refactor (#188) * Align line evaluations to papers notations * Adjust line fusion op * precompute G2 b' for costly D-Twists	2022-04-04 10:10:36 +02:00
Mamy Ratsimbazafy	742cecce08	Poly1305 Message Authentication Code (#186) * Groundwork for Poly1305 MAC * Implement fast reduction for Poly1305 * don't import assembly files when compiling without assembly	2022-03-05 23:39:24 +01:00
Mamy Ratsimbazafy	c2eb42b769	Add ChaCha20 stream cipher	2022-03-02 01:18:47 +01:00
Mamy Ratsimbazafy	26954f905a	Constant time (#185) * Implement fully constant-time division closes #2 closes #9 * constant-time hex parsing * prevent cache timing attacks in toHex() conversion (which is only for test/debug purposes anyway)	2022-02-28 09:23:26 +01:00
Mamy Ratsimbazafy	ffacf61e8a	Don't dump all in "backend" (#184) * backend -> math * towers -> extension fields * move ISA and compiler specific code out of math/ * fix export	2022-02-27 01:49:08 +01:00
Mamy Ratsimbazafy	5bc6d1d426	BLS signatures for Ethereum (BLS sig on BLS12-381 G2 with SHA256) (#183) * Finally add the (Ethereum) bls signatures (on BLS12-381 G2) * fix test path and remove old low-level signature test	2022-02-26 21:22:34 +01:00
Mamy Ratsimbazafy	fe500a6a79	Productionize: move protocols top-level vs backend (#179) * Productionize: move protocols top-level vs backend * fix path * import fix * the last one * benches as well	2022-02-21 01:04:53 +01:00
Mamy Ratsimbazafy	81acfb1626	Nim 1.6 in CI (#170) * try 1.6 CI * Try CI with 1.6 and windows. * Bend the knee * have fun debugging CI * have fun debugging CI * more CI spam * branch -> nim_version * fight or flight * properly detect windows * Fix galore * 🐍 🐍 snake: * meh give up on parallelizing windows and dealing with windows PATH issues * ¯\_(ツ)_/¯	2022-02-20 23:44:00 +01:00
Mamy Ratsimbazafy	dc73c71801	Pairings optimizations (#178) * bench for cyclotomic square, exp and rename cyclotomic exp + multipairings for BLS12-377 * refactor/unify lines and cyclotomic functions * Add Karabina's compressed squaring * Use compressed squarings in final exponentiation * Weighted addchain for bn254_snarks * Add new towering options and cost functions * Rearrange bench summaries * fix BW6-761	2022-02-20 20:15:20 +01:00
Mamy Ratsimbazafy	14af7e8724	Low-level refactoring (#175 ) * Add specific fromMont conversion routine. Rename montyResidue to getMont * missed test file * Add x86_64 ASM for fromMont * Add x86_64 MULX/ADCX/ADOX for fromMont * rework Montgomery Multiplication with prefetch/latency hiding techniques * Fix ADX autodetection, closes #174. Rollback faster mul_mont attempt, no improvement and debug pain. * finalSub in fromMont & adx_bmi -> adx * Some {.noInit.} to avoid Nim zeroMem (which should be optimized away but who knows) * Uniformize name 'op+domain': mulmod - mulmont * Fix asm codegen bug "0x0000555555565930 <+896>: sbb 0x20(%r8),%r8" with Clang in final substraction * Prepare for skipping final substraction * Don't forget to copy the result when we skip the final substraction * Seems like we need to stash the idea of skipping the final substraction for now, needs bounds analysis https://eprint.iacr.org/2017/1057.pdf * fix condition for ASM 32-bit * optim modular addition when sparebit is available	2022-02-14 00:16:55 +01:00
Mamy Ratsimbazafy	53c4db7ead	Fast modular inversion (#172 ) * split modular inversion in its own file * Stash fast GCD inversion https://eprint.iacr.org/2020/972.pdf * Stash Pornin's bingcd -> issue with inner modular reduction * Implement Bernstein-Yang inversion * Avoid Nim checks on signed integers (32-bit runtime issue) * cleanup: remove old inversion impls * cleanup: static moduli, move div2 * small comments (skip ci) * comment cleanup (skip ci) * fix total iterations on 32-bit * Add batch conversion to affine coordinates using simultaneous inversion trick * fix conditional setZero and batchAffine conversion * cleanup unneeded branches following affine conversion unification * Fix batchAffine with zero inputs and add fuzz failure to test suite	2022-02-10 14:05:07 +01:00
Mamy Ratsimbazafy	404a966601	^k to ᵏ (skip ci)	2022-02-06 15:38:26 +01:00

218 Commits