GF(2^64-2^32+1)

2022-07-05 18:13:31 +03:00 · ac44828aea
parent 850d557b2e
commit ac44828aea
1 changed files with 14 additions and 1 deletions
--- a/ECC.md
+++ b/ECC.md
@@ -39,4 +39,17 @@ For GF(2^n), MUL is implemented via tables. For x86 SIMD, it uses PSHUFB which i

 This means that we need 4x more time for GF(2^32) MUL than for GF(2^16) MUL, making computations in GF(2^32) about 2x slower (and much slower for division since we can't keep table of reciprocals).

-GPUs are better to consider as 32-bit processors.
+GPUs are better to consider as 32-bit processors.
+
+
+### GF(2^64-2^32+1)
+
+Code: https://github.com/pornin/ecgfp5
+
+It's much faster than the fields implemented in FastECC, but only on CPUs supporting `64*64=128` multiplication.
+
+For x86 SIMD, GPUs and 32-bit CPUs, this multiplication has to be implemented via four `32*32=64` multiplications, exactly like `a*b mod p` for 32-bit values. But since each operation processes 64 bits of data, it will still be 2x faster in terms of GB/s processed.
+
+On x64, a scalar implementation would probably be faster than any SIMD version below AVX-512, allowing much simpler code.
+
+Moreover, the GF(2^64-2^32+1) field is denser than GF(0xFFF00001), i.e. we can recode about a gigabyte of data into values of this field plus a single bit (compared to only 4 KB for my field).
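
The decomposition into four `32*32=64` multiplications described in the added text can be sketched as follows. This is only an illustrative Python model, not code from ecgfp5 or FastECC. The names `mul64_via_32`, `reduce128` and `mul_mod` are hypothetical, and the final `% P` stands in for the one or two conditional corrections a fixed-width implementation would use. The reduction relies on the identities 2^64 ≡ 2^32 - 1 and 2^96 ≡ -1 (mod 2^64-2^32+1).

```python
P = 2**64 - 2**32 + 1          # field modulus from the section heading
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def mul64_via_32(a, b):
    """Return the 128-bit product a*b as (hi, lo), built from four 32x32->64 products."""
    a_lo, a_hi = a & MASK32, a >> 32
    b_lo, b_hi = b & MASK32, b >> 32
    ll = a_lo * b_lo           # the four partial products, each below 2^64
    lh = a_lo * b_hi
    hl = a_hi * b_lo
    hh = a_hi * b_hi
    # Middle column: carry out of ll plus the low halves of the cross terms.
    mid = (ll >> 32) + (lh & MASK32) + (hl & MASK32)
    lo = (ll & MASK32) | ((mid & MASK32) << 32)
    hi = hh + (lh >> 32) + (hl >> 32) + (mid >> 32)
    return hi, lo


def reduce128(hi, lo):
    """Reduce hi*2^64 + lo modulo P without a general division."""
    hi_lo, hi_hi = hi & MASK32, hi >> 32
    # hi*2^64 = hi_lo*2^64 + hi_hi*2^96 == hi_lo*(2^32 - 1) - hi_hi  (mod P)
    r = lo + hi_lo * MASK32 - hi_hi
    # r lies in (-2^32, 2^65), so fixed-width code needs only a couple of
    # conditional add/subtract steps here; % P is used for brevity.
    return r % P


def mul_mod(a, b):
    """Multiply two field elements (0 <= a, b < P) modulo P."""
    return reduce128(*mul64_via_32(a, b))
```

For example, `mul_mod(P - 1, P - 1)` gives `1`, since `P - 1` is `-1` in the field. On a CPU with a native `64*64=128` multiply, `mul64_via_32` collapses to a single instruction, which is where the speed advantage mentioned above would come from.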