GF(2^64-2^32+1)

Bulat-Ziganshin 2022-07-05 18:13:31 +03:00
parent 850d557b2e
commit ac44828aea
1 changed files with 14 additions and 1 deletions

ECC.md

@ -39,4 +39,17 @@ For GF(2^n), MUL is implemented via tables. For x86 SIMD, it uses PSHUFB which i
This means that we need 4x more time for GF(2^32) MUL than for GF(2^16) MUL, making computations in GF(2^32) about 2x slower (and much slower for division, since we can't keep a table of reciprocals).
GPUs are better considered as 32-bit processors.
### GF(2^64-2^32+1)
Code: https://github.com/pornin/ecgfp5
It's much faster than the fields implemented in FastECC, but only on CPUs supporting `64*64=128` multiplication.
For x86 SIMD, GPUs and 32-bit CPUs, this multiplication will be implemented via four `32*32=64` multiplications, exactly like `a*b mod p` for 32-bit values. But since each operation processes 64 bits of data, it will still be 2x faster in terms of GB/s processed.
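
The decomposition into four `32*32=64` products can be sketched as follows. This is only an illustration in Python (whose integers are unbounded, so every intermediate is masked by hand); the name `mul64_via_32` is hypothetical and is not taken from FastECC or ecgfp5.

```python
MASK32 = 0xFFFFFFFF

def mul64_via_32(a, b):
    """Return (hi, lo), the 64-bit halves of the 128-bit product a*b,
    using only four 32x32->64 multiplications plus shifts and adds."""
    a_lo, a_hi = a & MASK32, a >> 32
    b_lo, b_hi = b & MASK32, b >> 32

    # The four partial products, each of which fits in 64 bits.
    ll = a_lo * b_lo
    lh = a_lo * b_hi
    hl = a_hi * b_lo
    hh = a_hi * b_hi

    # Bits 32..63 of the result: high half of ll plus the low halves of
    # the two cross products. The sum can exceed 32 bits; the excess is
    # the carry into the upper 64 bits.
    mid = (ll >> 32) + (lh & MASK32) + (hl & MASK32)

    lo = (ll & MASK32) | ((mid & MASK32) << 32)
    hi = hh + (lh >> 32) + (hl >> 32) + (mid >> 32)
    return hi, lo
```

For example, `mul64_via_32(a, b)` should equal `divmod(a * b, 2**64)` for any two 64-bit inputs.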
On x64, a scalar implementation would probably be faster than any SIMD version short of AVX-512, allowing much simpler code.
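
To illustrate why the scalar code can be simple: for `p = 2^64-2^32+1` we have `2^64 ≡ 2^32-1` and `2^96 ≡ -1 (mod p)`, so a 128-bit product can be reduced with a few shifts, adds and subtracts. Below is a minimal sketch of that idea in Python; the names `reduce128` and `gf_mul` are hypothetical, and this is not the code from ecgfp5.

```python
P = 2**64 - 2**32 + 1
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def reduce128(x):
    """Reduce x (0 <= x < 2^128) modulo P without a division."""
    lo = x & MASK64              # bits 0..63
    hi_lo = (x >> 64) & MASK32   # bits 64..95
    hi_hi = x >> 96              # bits 96..127

    # x = lo + 2^64*hi_lo + 2^96*hi_hi
    #   ≡ lo + (2^32 - 1)*hi_lo - hi_hi  (mod P)
    # In C, (2^32 - 1)*hi_lo would be computed as (hi_lo << 32) - hi_lo.
    t = lo + ((hi_lo << 32) - hi_lo) - hi_hi

    # t lies in (-2^32, 2P), so one conditional correction is enough.
    if t < 0:
        t += P
    elif t >= P:
        t -= P
    return t

def gf_mul(a, b):
    """Multiply two field elements a, b in [0, P)."""
    return reduce128(a * b)
```

As a quick check, `gf_mul(P - 1, P - 1)` returns `1`, since `P - 1` represents `-1` in the field.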
Moreover, the GF(2^64-2^32+1) field is denser than GF(0xFFF00001), i.e. we can recode about a gigabyte of data into values of this field plus a single bit (compared to only 4 KB for my field).