9 Commits

Author SHA1 Message Date
Mamy Ratsimbazafy
b1ef2682d6
Modular exponentiation (arbitrary output) and EIP-198 (#242)
* implement arbitrary precision modular exponentiation (prerequisite EIP-198)

* [modexp] implement exponentiation modulo 2ᵏ

* add inversion (mod 2ᵏ)

* [modexp] High-level wrapper for powmod with odd modulus

* [modexp] faster exponentiation (mod 2ᵏ) for even case and Euler's totient function odd case

* [modexp] implement general fast modular exponentiation

* Fix modular reduction with 64-bit modulus + fuzz powmod vs GMP

* add benchmark

* add EIP-198 support

* fixups following self review

* fix test paths
2023-06-01 23:38:41 +02:00
Mamy Ratsimbazafy
bf32c2d408
Parallel for (#222)
* introduce reserve threads to minimize latency and maximize throughput when awaiting a future

* introduce a ceilDiv proc

* threadpool: implement parallel-for loops

* 10x perf improvement by not waking reserveBackoff on syncAll

* bench overhead: new reserve system might introduce too much wakeup latency, 2x slower, for fine-grained parallelism

* add parallelForStrided

* Threadpool: Implement parallel reductions

* refactor parallel loop codegen: introduce descriptor, parsing and codegen stages

* parallel strided, test transpose bench

* tight loop is faster when backoff is not inline

* no POSIX stuff on windows, larger types for histogram bench

* fix tests

* max RSS overflow?

* missed an undefined var

* exit histogram on 32-bit

* forgot to return early dor 32-bit
2023-02-24 09:47:36 +01:00
Mamy Ratsimbazafy
2931913b67
Add a threadpool (#213)
* Implement a threadpool

* int and SomeUnsignedInt ...

* Type conversion for windows SynchronizationBarrier

* Use the latest MacOS 11, Big Sur API (jan 2021) for MacOS futexes, Github action offers MacOS 12 and can test them

* bench need posix timer not available on windows and darwin futex

* Windows: nimble exec empty line is an error, Mac: use defined(osx) instead of defined(macos)

* file rename

* okay, that's the last one hopefully

* deactivate stealHalf for now
2023-01-24 02:32:28 +01:00
Mamy Ratsimbazafy
4052a07611
chore: cleanup TODOs, unused constants 2023-01-12 01:27:23 +01:00
Mamy Ratsimbazafy
1f4bb174a3
[Backend] Add support for Nvidia GPUs (#210)
* Add PoC of JIT exec on Nvidia GPUs [skip ci]

* Split GPU bindings into low-level (ABI) and high-level [skip ci]

* small typedef reorg [skip ci]

* refine LLVM IR/Nvidia GPU hello worlds

* [Nvidia GPU] PoC implementation of field addition [skip ci]

* prod-ready field addition + tests on Nvidia GPUs via LLVM codegen
2023-01-12 01:01:57 +01:00
Mamy Ratsimbazafy
d4e202ead5
Don't use array[^1], it can throw and cannot be locally turn off 2022-09-17 18:52:52 +02:00
Mamy Ratsimbazafy
a17fb3b4c1
Fix compiler hints and warnings (unused import/variables, ...) 2022-08-06 19:55:35 +02:00
Mamy Ratsimbazafy
26954f905a
Constant time (#185)
* Implement fully constant-time division closes #2 closes #9

* constant-time hex parsing

* prevent cache timing attacks in toHex() conversion (which is only for test/debug purposes anyway)
2022-02-28 09:23:26 +01:00
Mamy Ratsimbazafy
ffacf61e8a
Don't dump all in "backend" (#184)
* backend -> math

* towers -> extension fields

* move ISA and compiler specific code out of math/

* fix export
2022-02-27 01:49:08 +01:00