Commit Graph

23 Commits

Author SHA1 Message Date
Mamy Ratsimbazafy 6c48975aee
Parallel Multi-Scalar-Multiplication (#226)
* try parallel reduction in batch add, but alas it's slower than custom chunking. Except maybe on arch with performance/efficiency cores

* initial impl of parallel MSM - scaling to debug, threads not woken fast enough

* improve comment [skip ci]

* skip top window when c divides the number of bits

* for some reason parallel-for loops scale on 5+ threads while spawn only on 2x threads. Thread wakeup issue?

* Add counters and timers to audit threadpool bottlenecks

* metrics and profiling fixes, (slower) latency hiding, activate tests

* fix thief thread trying to wake another before canceling its own sleep

* easier to sort metrics and parallel endomorphism application

* selective endomorphism acceleration

* some tuning

* spawn can handle compile-time literals, static and type parameters. Also introduce spawnAwaitable to await void procs

* improve MSM overview [skip ci]

* bench cleanup
2023-04-10 23:30:14 +02:00
Mamy Ratsimbazafy bf32c2d408
Parallel for (#222)
* introduce reserve threads to minimize latency and maximize throughput when awaiting a future

* introduce a ceilDiv proc

* threadpool: implement parallel-for loops

* 10x perf improvement by not waking reserveBackoff on syncAll

* bench overhead: new reserve system might introduce too much wakeup latency, 2x slower, for fine-grained parallelism

* add parallelForStrided

* Threadpool: Implement parallel reductions

* refactor parallel loop codegen: introduce descriptor, parsing and codegen stages

* parallel strided, test transpose bench

* tight loop is faster when backoff is not inline

* no POSIX stuff on windows, larger types for histogram bench

* fix tests

* max RSS overflow?

* missed an undefined var

* exit histogram on 32-bit

* forgot to return early dor 32-bit
2023-02-24 09:47:36 +01:00
Mamy Ratsimbazafy 8993789ddf
fix #221 2023-02-16 13:54:21 +01:00
Mamy Ratsimbazafy e5612f5705
Multi-Scalar-Multiplication / Linear combination (#220)
* unoptimized msm

* MSM: reorder loops

* add a signed windowed recoding technique

* improve wNAF table access

* use batchAffine

* revamp EC tests

* MSM signed digit support

* refactor MSM: recode signed ahead of time

* missing test vector

* refactor allocs and Alloca sideeffect

* add an endomorphism threshold

* Add Jacobian extended coordinates

* refactor recodings, prepare for parallelizable on-the-fly signed recoding

* recoding changes, introduce proper NAF for pairings

* more pairings refactoring, introduce miller accumulator for EVM

* some optim to the addchain miller loop

* start optimizing multi-pairing

* finish multi-miller loop refactoring

* minor tuning

* MSM: signed encoding suitable for parallelism (no precompute)

* cleanup signed window encoding

* add prefetching

* add metering

* properly init result to infinity

* comment on prefetching

* introduce vartime inversion for batch additions

* fix JacExt infinity conversion

* add batchAffine for MSM, though slower than JacExtended at the moment

* add a batch affine scheduler for MSM

* Add Multi-Scalar-Multiplication endomorphism acceleration

* some tuning

* signed integer fixes + 32-bit + tuning

* Some more tuning

* common msm bench + don't use affine for c < 9

* nit
2023-02-16 12:45:05 +01:00
Mamy Ratsimbazafy 082cd1deb9
MSB-to-LSB minimum Hamming Weight Recoding (#219)
* signed recoding

* use recoding
2023-02-07 16:27:53 +01:00
Mamy Ratsimbazafy 7c5421ffdc
move staticFor to the inner repo, not helpers/ for unblocking nimble install (#216) 2023-02-07 13:11:44 +01:00
Mamy Ratsimbazafy cbb454fff1
Codecs (#217)
* create a codecs.nim file for hex/base64 and other encoding conversions

* improve maintenance/readability of hex conversion

* add skeleton of constant-time base64 decoding

* use raw casts

* use raw casts only for same size types
2023-02-07 13:10:17 +01:00
Mamy Ratsimbazafy 495ef4497b
Parallel batchadd (#215)
* [Threadpool] Fix syncAll releasing while a thread was attempting to steal + force no exception in tasks

* fix unguarded access on MacOS barriers

* parallel batchadd

* moved import
2023-01-29 01:06:37 +01:00
Mamy Ratsimbazafy ff8c26c1fe
BLS Aggregate and Batch verify (#214)
* pairing -> pairings, and use alloca arrays instead of static arrays

* aggregate and batched BLS signature

* DLL generation broken by path changes
2023-01-27 00:42:12 +01:00
Mamy Ratsimbazafy 4be89d309f
chore: remove stew/byteutils dependencies and unneeded imports 2023-01-12 20:25:57 +01:00
Mamy Ratsimbazafy 4052a07611
chore: cleanup TODOs, unused constants 2023-01-12 01:27:23 +01:00
Mamy Ratsimbazafy c0b30a08be
style: casing of WordBitWidth/WordBitwidth 2023-01-11 19:31:23 +01:00
Mamy Ratsimbazafy 928f515582
Batch additions (#207)
* Batch elliptic curve addition

* accelerate chained muls

* jac mixed add handle doubling. jac additions handle aliasing when adding infinity

* properly skip sanitizer on BLS signature test

* properly skip sanitizer² on BLS signature test
2022-10-29 22:43:40 +02:00
Mamy Ratsimbazafy 962e7ccf49
CI: enable GMP tests on Windows and Linux 32-bit and fix caching (#204)
* Try to compile with GMP on windows and 32-bit linux

* remove leftover msys shell

* Don't use GMP Mersenne Twister, bad randomness and untested Nim wrapper

* properly cache nim

* fix path after cache

* run pacman in msys2 env

* rework msys2 ... again

* shell compat for file clearing

* shell compat try-again for file clearing

* force bash for clearing parallel builds on windows

* Use nimscript directly (why didn't it work last time?)

* Avoid IO redirection to support any shell

* Avoid IO redirection v2 to support any shell

* add debug data

* add debug again

* Introduce pararun, a parallel test runner to remove need of GNU parallel

* pararun: style
2022-09-15 09:33:34 +02:00
Mamy Ratsimbazafy 9770b3108c
Fp12 over fp6 (#201)
* introduce sumprod for direct fp6_mul

* change curves -> constants

* forgotten constants

* Full pairing using Fp2->Fp6->Fp12 towering
2022-08-14 09:48:10 +02:00
Mamy Ratsimbazafy 99c9730793
Self-contained bindings generation (#196)
* First draft at bindings generation

* finite field bindings PoC

* support openarray, export NimMain

* PoC extension fields and elliptic curve bindings

* Pasta

* expose more bindings, remove nimZeroMem, remove tracer when unused, codegen name_mangling`gensym issue

* workaround bad C gensym codegen with {.inline.} pragma in non-dirty template nested in generic proc instantiated by template
2022-08-06 19:05:54 +02:00
Mamy Ratsimbazafy e29e529f18
Add multipairing for BN curves (#194) 2022-05-08 19:01:23 +02:00
Mamy Ratsimbazafy 39a8a413de
Pasta curves (#191)
* Pasta curves field arithmetic

* implement elliptic curve arith for the Pasta curves
2022-04-27 00:58:48 +02:00
Mamy Ratsimbazafy e9e7a1809c
BN254 - Hash-to-Curve (SVDW method) (#190)
* Hash to BN254-Snarks

* Test SVDW code path with old v7 vectors for BLS12-381

* add benches
2022-04-26 21:24:07 +02:00
Mamy Ratsimbazafy 65eedd1cf7
Hash-to-Curve BLS12-381 G1 (#189)
* Skeleton of hash to curve for BLS12-381 G1

* Remove isodegree parameter

* Fix polynomial evaluation of hashToG1

* Optimize hash_to_curve and add bench for hash to G1

* slight optim of jacobian isomap + v7 test vectors
2022-04-11 00:57:16 +02:00
Mamy Ratsimbazafy bde4f97b56
Line refactor (#188)
* Align line evaluations to papers notations

* Adjust line fusion op

* precompute G2 b' for costly D-Twists
2022-04-04 10:10:36 +02:00
Mamy Ratsimbazafy 26954f905a
Constant time (#185)
* Implement fully constant-time division closes #2 closes #9

* constant-time hex parsing

* prevent cache timing attacks in toHex() conversion (which is only for test/debug purposes anyway)
2022-02-28 09:23:26 +01:00
Mamy Ratsimbazafy ffacf61e8a
Don't dump all in "backend" (#184)
* backend -> math

* towers -> extension fields

* move ISA and compiler specific code out of math/

* fix export
2022-02-27 01:49:08 +01:00