Goldilocks field emulation in `circom` -------------------------------------- Experimenting with Goldilocks field emulation (the native field being the BN254 scalar field). The "Goldilocks" prime is `P = 2^64 - 2^32 + 1`, and the corresponding prime field is used for example by the [Plonky2](https://github.com/0xPolygonZero/plonky2/) proof system, which is our primary motivation here. We want to do computations in the Goldilocks field in a `circom` circuit instantiated over say the BN254 curve's scalar field (a 254 bit prime field). ### Basic approach We are in a relatively simple situation, as the field we want to emulate is much smaller than the native field we are working inside. In particular, even the product of 3 Goldilocks elements fit inside a BN254 element without any danger of overflowing. The critical part is thus that we need range checks for the range `[0..P-1]`. To start with, we can do a simple bit-decomposition based range check for the range `[0..2^64-1]`. Then for example we could do another such range check for `x + 2^32 - 1` instead of `x`, and that would be sufficient; however, this doubles the number of range checks. Instead, we can exploit the special form of the prime `P`: Observe, that in binary decomposition, `P - 1` looks like this: P-1 = 2^64 - 2^32 = 0x FFFF FFFF 0000 0000 This means, that if we already know that `x` fits into 64 bits, then it's enough to check that IF the top 32 bits are all 1, THEN the bottom 32 bits are all 0 (if the top bits are anything else, then obviously we are already in the desired range). This is relatively easy to do in `circom`. ### Type safety in `circom` In recent versions of `circom`, we can use so called "buses" and "tags" to enforce invariants. For example, we can define a Goldilocks "bus" (read: type); a safe casting function from arbitray signals into Goldilocks signal, which checks that the input is in the range `[0..P-1]`; and then write all operations such that they expect their inputs have Goldilocks type (that is, already checked), and they themselves enforce that their output is also reduced modulo P (and has the right type to show it). This approach (which is very standard in normal statically typed programming languages) makes using these sub-circuits _much_ safer. ### Soundness testing We have the [`r1cs-solver`](https://github.com/faulhornlabs/r1cs-solver) soundness testing tool at our disposal, which can, for any given concrete inputs, try to solve the equations specialized to those inputs. In general this has a good chance of succeeding for small circuits, so we can actually _prove_ that the circuit behaves as expected, at least for _particular inputs_. Unfortunately, this tool currently only supports very small prime fields (for example, `P' = 65537`). The reason for this is mainly performance: Both performance of "normal" operations like addition and multiplication, and probably more importantly, we also need efficient square roots (as the tool solves systems of quadratic equations). With such a small field, we can simply store a table of square roots in memory. With a ~256 bit field, this would be impossible, and a square root algorithm would be most probably way too slow. What we can still do though, is to make the operations we want to test generic over [Solinas primes](https://en.wikipedia.org/wiki/Solinas_prime) of the form `P = 2^a - 2^b + 1` with `a > b > 0`. For example `2^8 - 2^4 + 1 = 241` is a prime. We can probably do _exhaustive_ testing over this small field, checking the _soundness_ over every single possible inputs for the basic operations, which would give us a very high confidence that there are no bugs in the implementation. Furthermore, only inversion and division needs an actual prime modulus; the rest of the operations can be tested with any modulus.