The Auditors Handbook (#1280) - first pass

* initial auditor handbook commit

* Move in publishable state

* otu of-line comment to avoid mdbook bug

* Fix links

* Add threat model

* Update: move rocksdb and protobuf, placeholder for areas of focus
This commit is contained in:
Mamy Ratsimbazafy 2020-07-07 10:44:21 +02:00 committed by GitHub
parent e5eee67426
commit 0f55ab13e8
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
32 changed files with 1287 additions and 0 deletions

View File

@ -199,4 +199,21 @@ publish-book: | book
git worktree remove -f tmp-book && \
rm -rf tmp-book
auditors-book:
cd docs/the_auditors_handbook && \
mdbook build
publish-auditors-book: | auditors-book
git worktree add tmp-book gh-pages && \
mkdir -p tmp-book/auditors-book && \
rm -rf tmp-book/auditors-book/* && \
cp -a docs/the_auditors_handbook/book/* tmp-book/auditors-book/ && \
cd tmp-book && \
git add . && { \
git commit -m "make publish-auditors-book" && \
git push origin gh-pages || true; } && \
cd .. && \
git worktree remove -f tmp-book && \
rm -rf tmp-book
endif # "variables.mk" was not included

1
docs/the_auditors_handbook/.gitignore vendored Normal file
View File

@ -0,0 +1 @@
book

View File

@ -0,0 +1,6 @@
[book]
authors = ["Mamy André-Ratsimbazafy"]
language = "en"
multilingual = false
src = "src"
title = "The Auditors Handbook to Nimbus Beacon Chain"

View File

@ -0,0 +1,14 @@
# Introduction
The Nimbus Nim-Beacon-Chain (NBC) project is an implementation of the [Ethereum 2 Beacon Chain specification](https://github.com/ethereum/eth2.0-specs) in the [Nim programming language](https://nim-lang.org/).
The Auditors' Handbook aims to be provide a comprehensive introduction to:
- The Nim programming language, as used in the project.
- The NBC project.
- The dependencies of the project.
A particular focus will be given to features related to safety, correctness, error handling, testing, fuzzing, or inspecting Nim code.
One of the major highlights of Nim is that it compiles to C or C++ before compiling to native code. All techniques available to audit C code can be used to audit Nim.
The dependencies NBC rely on are detailed per audit phase in the [build system and dependencies](03.2_build_system_and_dependencies.md) section.

View File

@ -0,0 +1,53 @@
# Operators and bit manipulation
## Operators
A set of symbol and keywords can be used as infix operators
Nim supports operator overloading.
Those symbols are:
```
= + - * / < >
@ $ ~ & % |
! ? ^ . : \
```
The keyword operators are
```
and or not xor shl shr div mod in notin is isnot of as
```
In-particular:
- bitwise and, or, flip, xor are denoted `and`, `or`, `not`, `xor` instead of using a symbol
- shift left and right are `shl` and `shr`
- division and modulo are `div` and `mod`
### Implementation-defined behaviors
`mod` is defined as the mathematical remainder, like C. With signed integers `a mod b` has the same sign as `a`
`shr` of a signed integer will not preserve the sign bit.
`ashr` can be used for arithmetic right shift.
This distinction was introduced recently and may lead to subtle bugs,
also `ashr` relies on the C compiler actually lowering `>>` to SAR for signed integer.
This is specified for GCC and Clang ([https://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html#Integers-implementation](https://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html#Integers-implementation))
but implementation defined in general.
### Operator precedence
Operator precedence is specified described in the manual:
[https://nim-lang.org/docs/manual.html#syntax-precedence](https://nim-lang.org/docs/manual.html#syntax-precedence)
### Additional system operators
Nim system exports additional operators with a `%` like `+%` and `-%`.
Those cast the signed integer operand to unsigned and cast the result back to signed.
This is intended to use with the ByteAddress type for pointer arithmetic
## Bit manipulation
[https://github.com/status-im/nim-stew/blob/master/stew/bitops2.nim](https://github.com/status-im/nim-stew/blob/master/stew/bitops2.nim)

View File

@ -0,0 +1,3 @@
# Pointer manipulation
https://github.com/status-im/nim-stew/blob/master/stew/ptrops.nim

View File

@ -0,0 +1,3 @@
# Emitting raw C or Assembly code
TODO

View File

@ -0,0 +1,24 @@
# Closures and closures iterators
TODO
## At a low-level
Closures and closures iterators are implemented via
a pointer + an environment that stores the captured state necessary
to execute the function.
The Nim compiler has a limited form of borrow checking and prevents
capturing mutable variable or openarray (pointer+length pair).
It otherwise copies the capture variables in case of objects with value semantics
or increment the reference count in case of ref object.
The Chronos library needs to generate a closure iterator for all async proc instantiated
which may lead to long-lived ref objects and unreclaimable memory.
This may also extend to other resources like socket connections or file descriptors and
a significant effort is underway to track memory usage and detect such scenarios:
- [https://github.com/status-im/nim-libp2p/issues/145](https://github.com/status-im/nim-libp2p/issues/145)
- [https://github.com/status-im/nim-libp2p/issues/207](https://github.com/status-im/nim-libp2p/issues/207)

View File

@ -0,0 +1 @@
# Nim FAQ

View File

@ -0,0 +1,378 @@
# Nim Routines
Nim offers several kinds of "routines" that:
- do computation
- produce side-effect
- generate code
Those are:
- `proc` and `func`
- `method`
- `converter`
- `iterator`
- `template`
- `macro`
## `proc` and `func`
`proc` and `func` are the most basic routines.
At the moment, Nim requires forward declaration of proc and func.
Also it prevents circular dependencies, this means that a procedure is
always coming from one of the imports.
Additionally, all dependencies are submodules and a proc can be found by greping
`procname*`, the `*` being the export marker.
The only exception being the standard library. Procedures from the standard library are all listed in "[The Index](https://nim-lang.org/docs/theindex.html)".
### Function call syntax
Nim provides flexible call syntax, the following are possible:
```Nim
prof foo(a: int) =
discard
foo(a)
foo a
a.foo()
a.foo
```
Additionally this is also possible for strings:
```Nim
let a = fromHex"0x12345" # Without spaces
```
Nim doesn't enforce namespacing by default but is an option
```Nim
let a = byteutils.fromhex"0x12345"
```
### Parameters
Mutable parameters must be tagged with `var`
TODO
### Symbol resolution
If 2 procedures are visible in the same module (a module is a file) and
have the same name the compiler will infer which to call from the arguments
signatures. In case both are applicable, the compiler will throw an "ambiguous call" compile-time error.
Note that a procedure specialized to a concrete type has priority over a generic procedure, for example a procedure for int64 over a procedure for all number types.
### `func` and side effect analysis
In Nim a proc is considered to have side-effect if it accesses a global variable.
Mutating a declared function parameter is not considered a side-effect as there is no access to a global variable.
Printing to the standard output or reading the standard input is considered a sideeffect.
`func` are syntactic sugar for `proc` without sideeffects. In particular this means that `func` behaviors are fully determined by their input parameters.
In the codebase, logging at the `trace` level are not considered a sideeffect.
Additionally some logging statements and metrics statement may be in an explicit `{.noSideEffect.}:` code-block.
### Returning values
There are 3 syntaxes to return a value from a procedure:
1. The return statement
2. The implicit `result` variable
3. The "last statement as expression"
```Nim
proc add1(x: int): int =
return x + 1
proc add2(x: int): int =
result = x + 2
proc add3(x: int): int =
x + 3
```
The main differences are:
1. `return` allows early returns, in particular from a loop.
2. `result` offers Return Value Optimization and Copy Elision
which is particularly valuable for array types.
3. Requires the last statement to be a valid expression.
This is particularly interesting for conditional return values
as forgetting to set the value in a branch will be a compile-time error,
for example:
```Nim
proc select(ctl: bool, a, b: int): int =
if ctl:
echo "heavy processing"
a
else:
echo "heavy processing"
b
```
Omitting `a` or `b` will be a compiletime error, unlike
```Nim
proc select(ctl: bool, a, b: int): int =
if ctl:
echo "heavy processing"
return a
else:
echo "heavy processing"
# Forgot to return b
```
```Nim
proc select(ctl: bool, a, b: int): int =
if ctl:
echo "heavy processing"
result = a
else:
echo "heavy processing"
# Forgot to result = b
```
Due to the differences we prefer using the "last statement as expression" unless
- copying the type is expensive (SHA256 hash for example)
- or we need early returns
#### Ignoring return values
Unlike C, return values MUST be used or explicitly discarded.
#### Mutable return values
TODO
### At a low-level
#### Argument passing
Nim passes arguments by value if they take less than 3*sizeof(pointer) (i.e. 24 bytes on 64-bit OS) and passes them by pointer with the C backend or reference with the C++ backend if they are bigger.
Mutable arguments are always passed by pointer.
This behavior can be changed on a type-by-type bases by tagging them {.bycopy.} or {.byref.}. This is only used for interfacing with non-Nim code.
#### Stacktraces
With `--stacktrace:on`, Nim create a stackframe on `proc` entry and destroys it on exit. This is used for reporting stacktraces.
NBC is always compiled with `--stacktraces:on`
NBC uses [libbacktrace](https://github.com/ianlancetaylor/libbacktrace) to have less costly stacktraces.
#### Name in the C code or Assembly
`proc` and `func` are materialized in the produced C code with name-mangling appended at the end. For the purpose of building Nim libraries, the name can be controlled by:
- `{.exportc.}` so that the generated C name is the same as Nim
- `{.exportc: "specific_name".} to generate a specific name
## `method`
`method`s are used for dynamic dispatch when an object has an inherited subtype only known at runtime.
`method` are dispatched using a dispatch tree in the C code instead of a VTable.
There might be some cases where `method` were used not for their intended purpose
## `converter`
Converters are procedures that are implicitly called on a value to change its type.
For example with a fictional option type that automatically extracts the boxed type.
```Nim
type Option[T] = object
case hasValue: bool
of true:
value: T
else:
discard
converter get[T](x: Option[T]): T =
x.value
let x = Option[int](hasValue: true, value: 1)
let y = Option[int](hasValue: true, value: 2)
let z = x + y
```
Even though the `+` operator is not defined for `Option[int]` it is defined for `int` and Nim implicitly calls the converter.
`converter` are seldom used in the codebase as we prefer explicit over implicit.
Note that in case an operation is defined on both the convertible and the converted type, the operation without conversion should be preferred however [the compiler might throw an ambiguous call instead](https://github.com/nim-lang/Nim/issues/7416).
## Iterators
Iterators are construct that transforms a for loop.
For example to iterate on a custom array collection
```Nim
const MaxSize = 7
type SmallVec[T] = object
buffer*: array[MaxSize, T]
len*: int
iterator items*[T](a: SmallVec[T]): T =
for i in 0 ..< a.len:
yield a.data[i]
```
Now iterating becomes
```Nim
for value in a.items():
echo a
```
A singly-linked list forward iterator could be implemented as
```Nim
iterator items[T](head: ref T): ref T =
## Singly-linked list iterator
assert: not head.isNil
var cur = head
while true:
let next = cur.next
yield cur
cur = next
if cur.isNil:
break
```
a doubly-linked list backward iterator as
```Nim
iterator backward[T](tail: ptr T): ptr T =
var cur = tail
while not cur.isNil:
let prev = cur.prev
yield cur
cur = prev
```
an iterator to unpack individual bits from a byte as:
```Nim
iterator unpack(scalarByte: byte): bool =
yield bool((scalarByte and 0b10000000) shr 7)
yield bool((scalarByte and 0b01000000) shr 6)
yield bool((scalarByte and 0b00100000) shr 5)
yield bool((scalarByte and 0b00010000) shr 4)
yield bool((scalarByte and 0b00001000) shr 3)
yield bool((scalarByte and 0b00000100) shr 2)
yield bool((scalarByte and 0b00000010) shr 1)
yield bool( scalarByte and 0b00000001)
```
In all cases, the syntax to iterate on the collection remains:
```Nim
for value in a.items():
echo a
for value in b.backward():
echo b
for bit in s.unpack():
echo s
```
The echo is inlined at "yield".
Iterators are not present in the produced C code, they are always inlined at the callsite.
Iterators are prone to code bloat, for example
```Nim
iterator iterate[T](s: seq[T], backward: bool): T =
if backward:
for i in s.len-1 .. 0:
yield s[i]
else:
for i in 0 ..< s.len:
yield s[i]
for value in s.iterate(backward = false):
## Long-series of operations
echo value
```
The long series of operation will be duplicated.
### `items` and `pairs`
The `items` and `pairs` iterator are special cased and implictly call
if there is respectively one and two iteration variables hence:
```Nim
for x in collection:
echo x
```
will automatically call the `items` proc defined for the collection (or error)
```Nim
for x, y in collection:
echo x
echo y
```
will automatically call the `pairs` proc defined for the collection (or error)
### `fields` and `fieldPairs`
`fields` and `fieldsPairs` are iterator-like magic, that allow "iterating" on an object field. Note that those are unrolled at compile-time.
## Closures and closure iterators
Will be covered in a dedicated section.
They are the backbone of Chronos, our async/await framework and also
have a major potential for memory leaks.
## `template`
`template`s in Nim allows raw code substitution.
`templates` are hygienic and typechecked unlike the C preprocessor.
Also they create their own scope unless tagged with the `{.dirty.}` pragma.
A major issue with templates is that as they "copy-paste" code, it is very easy to misuse them and do a computation twice.
For instance
```Nim
proc foo(): int =
echo "launch missile"
return 1
template doSomething(a: int) =
process(a)
log(a)
```
This would be transformed to:
```Nim
process(foo())
log(foo())
```
and triggers the "launch missile" side-effect twice.
Another issue with templates is that they may not generate stacktraces properly as
they are not materialized in the C code.
### Symbol visibility and {.inject.}
TODO
## `macro`
TODO
## The `do` notation
TODO

View File

@ -0,0 +1,3 @@
# Builtin types
TODO

View File

@ -0,0 +1,23 @@
# Casting and low-level memory representation
## Conversions
Casting to or from a signed integer will lead to a range check
## Casting integers
Unsigned integer casts behave like C.
Upcasting will lead to zero extension
Downcasting will lead to truncation
Signed integer ranges will not be checked with a cast.
## Casting to/from other types
Casting to or from any other types will be done through:
- union casts by default
- or C conversion if the type is a pointer.
In practice this means that the bit-pattern will be reinterpreted as the new type, similar to C++ reinterpret cast.

View File

@ -0,0 +1,33 @@
# Nim memory management
Nim memory management is on a per-type basis.
Plain objects and char and numerical types are allocated on the stack.
Sequences and strings are allocated on the heap but have value semantics.
They are copied on assignment
Ref types are allocated on the heap and have reference semantics, i.e. an unique instance
can be held by multiple variables and only when all those variables go out-of-scope is
the ref type discarded.
By default Nim uses a deferred reference counting GC. Additionally, if the type can lead
to cycles, Nim will add "mark-and-sweep" passes to collect them.
## Destructors
TODO
## Nim allocator
Nim GCs are backed by a TLSF allocator which allows Nim to provide soft real-time guarantees if needed.
## Analyzing memory leaks
Nim can be compiled with `-d:useMalloc` to bypass the TLSF allocator and directly use malloc/free
## References
- [https://nim-lang.org/docs/gc.html](https://nim-lang.org/docs/gc.html)
- [http://www.gii.upv.es/tlsf/](http://www.gii.upv.es/tlsf/)

View File

@ -0,0 +1,30 @@
# Generics and Static types
Nim types can be parametrized by types (generics) or compile-time values (static)
For example
```Nim
type
MySeq[T] = object
len, reserved: int
data: ptr UncheckedArray[T]
```
The generics can be restricted
```Nim
type
MySeq[T: int32 or int64] = object
len, reserved: int
data: ptr UncheckedArray[T]
```
With static types
```Nim
type
SmallSeq[MaxLen: static int, T] = object
len: int
data: array[MaxLen, T]
```

View File

@ -0,0 +1,21 @@
# Arrays, openarrays, varargs
## Arrays
TODO
## Openarrays
Openarray are a parameter-only type that represent a (pointer, length) pair.
In other languages they are also known as slices, ranges, views, spans.
_The name openarray is inherited from Pascal, Oberon and Modula 2_
Arrays and sequences are implictily converted to openarray.
The compiler has a limited form of escape analysis to prevent capturing openarrays in closures
or returning them.
## UncheckedArrays
TODO

View File

@ -0,0 +1,2 @@
# Runtime types, variants, object-oriented programming
TODO

View File

@ -0,0 +1,2 @@
# Compile-time evaluation
TODO

View File

@ -0,0 +1,109 @@
# Nim data types
## Builtin types
### Numerical types
For integers, Nim offers `uint8`, `uint16`, `uint32` and `uint64` types as well as `int8`, `int16`, `int32`, `int64` types.
Operations on signed integers are overflowed and underflowed checked unless the flag `--overflowChecks:off` is used.
Nim also offers `int` and `uint` types, which have the same size as the platform word size, so 4 bytes when compiled for 32-bit OS
or 8 bytes for a 64-bit OS.
Conversion between `int` and `int32` or `int64` must be explicit except for string literals.
Integer literals default to `int`.
`float32` and `float64` maps to C `float` and `double`. `float` is an alias to `float64` whether on a 32-bit or 64-bit platform
#### Binary blobs
Nim has a specific `byte` type which behaves like uint8.
It is the preferred type to represent binary blobs, i.e. we use `seq[byte]` over `string`, `seq[char]` or `seq[uint8]` for binary data.
#### Range types
Nim allows defining ranges of valid value which will be runtime checked everytime the value changes for example
Nim defines by default `type Natural = range[0 .. high(int)]`. If the value of a Natural becomes less than 0
an RangeError exception will be thrown.
This is valuable to catch / prevent underflows.
## Sequences
Sequences are heap-allocated containers similar to C++ vectors.
They have value-semantics and are copied on assignments.
Sequences have a data pointer, reserved memory and current length
## Strings
Sequences are heap-allocated containers.
They have value-semantics and are copied on assignments.
Strings have a data pointer, reserved memory and current length.
The data is terminated by `\0`.
Nim strings automatically decays without copy to C strings in FFI calls
The representation is the same as a `seq[byte]` or `seq[char]` except for the terminating nul-byte,
including within the GC tracking data structures
Consequently, strings can be cast to `seq[byte]` but inversely casting `seq[byte]` to string
will lead to non nul-terminated string in C FFI and buffer overflows.
## Record / structs
Nim has 3 kinds of record types
- Value
```Nim
type Foo = object
field0: int
field1: float
```
- Reference
```Nim
type Foo = ref object
field0: int
field1: float
```
- Pointer
```Nim
type Foo = ptr object
field0: int
field1: float
```
### Value
A value object is allocated on the stack (unless if nested in heap-allocated types).
The equality check is structural.
Copy is done by copyMem (C memcpy)
Object variants do not have an equality operator set by default
### Reference
A reference object is allocated on the heap and managed by the GC.
They are not thread-safe.
The equality check is by reference
Copy is done by copying the reference and increment the reference count
### Pointer
A pointer object is manually managed, it can be on the heap (malloc) or on the stack (alloca)

View File

@ -0,0 +1,101 @@
# Correctness, distinct, mutability, effects, exceptions
The Nim compiler provides several constraints that can be used to enforce
proper usage of variables, types and error handling at compile-time.
One was already mentioned in previous paragraphs:
- Side-Effect analysis via using `func` or `{.noSideEffect.}` (in the routines chapter)
_Note that range types currently work at runtime only_
## Side-Effect
As mentioned in the routines chapter, using a `func` or a proc tagged with `{.noSideEffect.}` will prevent reading or writing to global variables (i.e. variables that are neither parameters or locals).
Note that side-effect analysis cannot analyse raw emitted C code
Additionally, allocating a sequence or a string, even if they technically
access a global memory allocator, is not considered a side-effect.
The compiler will ignore statements in a `{.noSIdeEffect.}` block for the purpose of side-effect analysis. At the moment this is only used for trace and some debug logs, as writing to stdout/stderr is considered writing to a global variables and so a side-effect.
## `not nil`
The compiler exposes a "not nil" annotation for reference and pointer types.
This enforces that parameters are proven always initialized in tagged:
- procedures
- types
This is not used in the codebase as a more powerful prover is required for our application.
Currently, the compiler warns when it cannot prove that a result reference is not nil.
## `distinct` types
A distinct type is a type that has the same representation as a base type at a low-level but cannot be used in its stead.
```Nim
type Miles = distinct float32
type Meters = distinct float32
```
Procedures accepting:
- `float32` will not accept `Miles` or `Meters`
- `Miles` will not accept `float32` or `Meters`
- `Meters` will not accept `float32` or `Miles`
`distinct` type can reuse the base type procedures and fields via the borrow annotation as described in the manual ([https://nim-lang.org/docs/manual.html#types-distinct-type](https://nim-lang.org/docs/manual.html#types-distinct-type))
## Enforcing exception handling
The codebase uses a mix of Result and Exceptions for error handling
with option types and bool in some cases.
As an effort to sanitize error handling and ensure that all exception paths are handled, we use the effect tracking system the following way:
```Nim
proc foo() {.raises: [].} =
discard
```
The procedure above will refuse to compile if its body can throw an unhandled exception.
```Nim
proc foo() {.raises: [ValueError].} =
discard
```
The procedure above will refuse to compile if its body can throw an exception besides a ValueError.
In particular Nim distinguishes between Defects, which are non-recoverable, and Exceptions, which we should recover from.
For our purposes, we allow all procedures to throw a Defect (for example an assertion broken), this is done by adding `{.push raises:[Defect]}` at the start of a file
```Nim
{.push raises:[Defect]}
proc foo1() =
discard
proc foo2() =
discard
{.pop.}
```
## Mutability
Only variables declared with `var` and `var` parameters are mutable.
Note that mutability analysis is not deep for reference types or pointer types.
You can always mutate through a pointer.
## Future
Significant improvements are in-progress planned to improve Nim safety:
- `lent` return values and owned reference, a limited form of borrow checker: [https://nim-lang.org/docs/destructors.html#lent-type](https://nim-lang.org/docs/destructors.html#lent-type)
- Z3 theorem prover for bound checks and nil checks at compile-time: [https://nim-lang.org/docs/drnim.html](https://nim-lang.org/docs/drnim.html)
- write-tracking to prevent deep immutability: [https://github.com/nim-lang/RFCs/issues/234](https://github.com/nim-lang/RFCs/issues/234) ([https://nim-lang.org/araq/writetracking.html](https://nim-lang.org/araq/writetracking.html))
however it is too early to use them in a production codebase.

View File

@ -0,0 +1,67 @@
# Debugging Nim
Reference article: [https://nim-lang.org/blog/2017/10/02/documenting-profiling-and-debugging-nim-code.html](https://nim-lang.org/blog/2017/10/02/documenting-profiling-and-debugging-nim-code.html)
## GDB / LLDB
Nim can be instrumented with sourcemaps:
- by passing `--debugger:native` to the compiler, so that the stacktraces in gdb shows the Nim source code
- by passing `--passC:"-g"` to the compiler, so that the stacktraces in gdb shows the C source code
## Sanitizers & Valgrind
LLVM and GCC sanitizers can be used with
```Nim
nim c --cc:clang -r -d:release --debugger:native \
--passC:"-fsanitize=address" --passL:"-fsanitize=address" \
--outdir:build target_application.nim
```
Note on deactivating Nim memory allocator:
As mentioned in the memory management section, Nim has
- a garbage collector, by default deferred reference counting + cycle detection via mark-and-sweep if the types can have cycles (and is not tagged {.acyclic.})
- an allocator based on TLSF
Instead of Nim custom allocators, the sys malloc/free can be used by passing
`-d:useMalloc` in the command-line
_Some GC/versions might not properly accept the flag, this is a Nim bug and we can patch upstream and our own fork in that case_
## Fuzzers
TODO
### Internal fuzzing
We are currently adding fuzzing to our repositories via libFuzzer and AFL.
- Fuzzing abstraction: [https://github.com/status-im/nim-testutils](https://github.com/status-im/nim-testutils)
- Network lib fuzzing: [https://github.com/status-im/nim-eth/tree/4f533eb5/tests/fuzzing](https://github.com/status-im/nim-eth/tree/4f533eb5/tests/fuzzing)
- Eth2 core/Validator core fuzzing: [https://github.com/status-im/nim-blscurve/tree/271a5738/tests/fuzzing](https://github.com/status-im/nim-blscurve/tree/271a5738/tests/fuzzing)
### External "Consensus" fuzzing
Sigma Prima is fuzzing all Ethereum 2 clients on the spec (Eth2 core/Validator core)
We provide them with a simple C API
- [https://github.com/status-im/nim-beacon-chain/blob/4a2e1806/nfuzz/libnfuzz.h](https://github.com/status-im/nim-beacon-chain/blob/4a2e1806/nfuzz/libnfuzz.h)
That we implement:
- [https://github.com/status-im/nim-beacon-chain/blob/4a2e1806/nfuzz/libnfuzz.nim](https://github.com/status-im/nim-beacon-chain/blob/4a2e1806/nfuzz/libnfuzz.nim)
And compile the nim code as a shared or static library.
On usage, the only extra limitation compared to a C library is the need to call `NimMain()` before calling any Nim function to initialize the Nim runtime.
## Inspecting the generated intermediate code and objects
The intermediate C code and object files are stored in
- `$HOME/.cache/nim/compiled_target_d` or `$HOME/.cache/nim/compiled_target_r` on UNIX (d for debug, r for release)
- `$HOME/nimcache/compiled_target_d` or `$HOME/.cache/nim/compiled_target_r` on Windows
The cache directory can be set with the `--nimcache=<targetdir>` compiler flag.
Repositories vendoring `nimbus-build-system` do not use the default setting
but generate directly in the (gitignored) `nimcache` folder of the repo.

View File

@ -0,0 +1,48 @@
# Foreign language interop
## Wrapping C
### Using shared library
Example: secp256k1
[https://github.com/status-im/nim-secp256k1/blob/master/secp256k1_abi.nim](https://github.com/status-im/nim-secp256k1/blob/master/secp256k1_abi.nim)
### Compiling directly the C files
Example: Apache Milagro Crypto
[https://github.com/status-im/nim-blscurve/blob/master/blscurve/milagro.nim](https://github.com/status-im/nim-blscurve/blob/master/blscurve/milagro.nim)
## Wrapping C++
Beyond the syntax for wrapping C, Nim offers a flexible syntax for wrapping C++,
for example for vectors:
```Nim
type
CppVector* {.importcpp"std::vector", header: "<vector>", byref.} [T] = object
proc newCppVector*[T](): CppVector[T] {.importcpp: "std::vector<'*0>()", header: "<vector>", constructor.}
proc newCppVector*[T](size: int): CppVector[T] {.importcpp: "std::vector<'*0>(#)", header: "<vector>", constructor.}
proc len*(v: CppVector): int {.importcpp: "#.size()", header: "<vector>".}
proc add*[T](v: var CppVector[T], elem: T){.importcpp: "#.push_back(#)", header: "<vector>".}
proc `[]`*[T](v: CppVector[T], idx: int): T{.importcpp: "#[#]", header: "<vector>".}
proc `[]`*[T](v: var CppVector[T], idx: int): var T{.importcpp: "#[#]", header: "<vector>".}
proc `[]=`*[T](v: var CppVector[T], idx: int, value: T) {.importcpp: "#[#]=#", header: "<vector>".}
```
Example: ttmath
[https://github.com/status-im/nim-ttmath/blob/master/src/ttmath.nim](https://github.com/status-im/nim-ttmath/blob/master/src/ttmath.nim)
## Exporting
See "Fuzzing" chapter for exporting a C API for [fuzzing](02.4_debugging_Nim_sanitizers_fuzzers.md#Fuzzers)
## References
- Nim manual:
- [https://nim-lang.org/docs/manual.html#foreign-function-interface-importc-pragma](https://nim-lang.org/docs/manual.html#foreign-function-interface-importc-pragma)
- [https://nim-lang.org/docs/manual.html#implementation-specific-pragmas-importcpp-pragma](https://nim-lang.org/docs/manual.html#implementation-specific-pragmas-importcpp-pragma)
- Nim backend integration: [https://nim-lang.org/docs/backends.html](https://nim-lang.org/docs/backends.html)

View File

@ -0,0 +1,12 @@
# Nim threat model
Nim and its standard library are not the focus of the audits.
In particular the codebase intentionally limits reliance on the standard library
so that it is not tied to Nim release schedule, instead of the standard library
we use `stew` most of the time: [https://github.com/status-im/nim-stew](https://github.com/status-im/nim-stew).
Nim standard library is implemented here:
- [https://github.com/nim-lang/Nim/tree/v1.2.4/lib](https://github.com/nim-lang/Nim/tree/v1.2.4/lib)
We target Nim v1.2.2

View File

@ -0,0 +1 @@
# Nim standard library use in Nimbus

View File

@ -0,0 +1,90 @@
# The Nim Programming Language
The Nim programming language is a compiled language, with strong static typing.
The rest of the Handbook will assume that Nim-by-example was read.
Nim compilation process is in 2 phases, first lowering the Nim code to C, C++ or Javascript. Then for machine code, rely on the C/C++ compiler to produce the final code.
Nim can target any combination of C compiler, host OS and hardware architecture as long as the C compiler supports it.
## Installing
Nim can be installed via:
- A Linux distribution package manager or Homebrew on MacOS
- Instructions at [https://nim-lang.org/install.html](https://nim-lang.org/install.html)
- Docker: [https://hub.docker.com/r/nimlang/nim/](https://hub.docker.com/r/nimlang/nim/)
- Choosenim [https://github.com/dom96/choosenim](https://github.com/dom96/choosenim)
Nim Vagrant [https://github.com/status-im/nim-vagrant](https://github.com/status-im/nim-vagrant) is unmaintained but might
help setting up your own virtualized environment.
We target Nim 1.2.2 and should be compatible with the latest stable, Nim 1.2.4
## Casings
Nim has unusual partial case insensitivity for identifiers. The rationales being:
- Preventing bugs when using `SDL_QUIT` instead of `SDL_Quit`.
- Having consistent casing in a codebase even when relying on external dependencies with different casing.
The convention used in Nim-Beacon-Chain is:
- `snake_case` for fields and procedures names from the Ethereum spec
- `MACRO_CASE` for Ethereum spec constants
- `PascalCase` for all types (Ethereum or additional)
- `camelCase` for our own additional code
- `PascalCase` for our additional constants
In summary, we respect the Ethereum spec for Ethereum specified identifiers
and use Nim NEP-1 for the rest.
## Checking the C code
By default the intermediate C code produced by the Nim compiler is available at
- `$HOME/.nim/compiled_project_d` on UNIX systems
- `$HOME/nimcache/compiled_project_d` on Windows
The suffix `_d` indicates a debug build, the suffix `_d` indicates a release build
## Compiler options
At the time of writing, NBC targets Nim v1.2.2 compiler.
The build system is at [https://github.com/status-im/nimbus-build-system](https://github.com/status-im/nimbus-build-system)
No patching is done at the moment on the Nim compiler, we use vanilla v1.2.2 upstream.
Nim compiler offers debug, release with `-d:release` and danger with `-d:danger` flag.
The debug and `-d:release` build differ by, the verbosity of stacktraces and passing `-O3` or equivalent to the C compiler.
Runtime checks (overflow, array bounds checks, nil checks, ...) are still included in `-d:release` build. We also choose to have verbose stacktraces in NBC.
A danger build optimizes away all runtime checks and debugging help like stackframes. This might have a significant impact on performance
as it may enable optimizations that were not possible like optimizing tail calls. This is not used in NBC.
## References
- Nim by example:
- [https://nim-by-example.github.io/getting_started/](https://nim-by-example.github.io/getting_started/)
- The Nim Manual, is a specification of how Nim should behave\
[https://nim-lang.org/docs/manual.html](https://nim-lang.org/docs/manual.html)
- Nim tutorials
- [https://nim-lang.org/docs/tut1.html](https://nim-lang.org/docs/tut1.html)
- [https://nim-lang.org/docs/tut2.html](https://nim-lang.org/docs/tut2.html)
- [https://nim-lang.org/docs/tut3.html](https://nim-lang.org/docs/tut3.html)
- Nim for
- the C programmer: [https://github.com/nim-lang/Nim/wiki/Nim-for-C-programmers](https://github.com/nim-lang/Nim/wiki/Nim-for-C-programmers)
- Python programmer: [https://github.com/nim-lang/Nim/wiki/Nim-for-Python-Programmers](https://github.com/nim-lang/Nim/wiki/Nim-for-Python-Programmers)
Further resources are collected at:
- [https://nim-lang.org/learn.html](https://nim-lang.org/learn.html)
### Compiler configuration
- Compiler User Guide: [https://nim-lang.org/docs/nimc.html](https://nim-lang.org/docs/nimc.html)
### Style Guide
- [https://nim-lang.org/docs/nep1.html](https://nim-lang.org/docs/nep1.html)

View File

@ -0,0 +1,99 @@
# Nimbus build system & dependencies
## Build system
### NBC repo
`nim-beacon-chain` uses a build system based on makefiles.
Usage is described in the README.
In particular a `env.sh` scripts is provided that setups the environment variable
expected.
### Libraries
Library repositories uses the official package manager called `nimble`
for building and testing.
In particular, we use by convention the following command to run the test suite via nimble task system:
```Nim
nimble test
```
The details are implemented in `<package name>.nimble` at the root of the repository.
The continuous integration setup for Travis, Azure Pipelines and/or Appveyor
are also setting up a Nim environment from scratch.
## Package dependencies
### NBC repo
For the `nim-beacon-chain` application, all dependencies are vendored
in the `vendor` folder.
With regards to the audit scope we have the following dependencies:
#### All
- Async
- asynctools
- nim-chronos
- Logging
- jswebsockets
- karax
- nim-chronicles
- nim-chronicles-tail (devtool, out-of-audit-scope)
- nim-libbacktrace
- websocket.nim
- nim-protobuf-serialization
- nim-rocksdb
- CLI interface
- nim-confutils
- nim-prompt
- nim-unicodedb
- Metrics
- nim-metrics (the server endpoint is out-of-audit-scope)
- Helpers
- nim-stew
- nim-testutils (fuzzing)
- nimYAML (should be test only)
#### Network core
- Cryptography
- nim-bearssl
- nim-secp256k1
- nimcrypto
- Networking & Discovery
- nim-eth
- nim-libp2p
- nim-nat-traversal
- Serialization
- nim-faststreams
- nim-serialization
- nim-snappy
- BigInt
- nim-stint (Note: one of the main use is 256-bit bloomfilter, a dedicated library should be used instead)
#### ETH2 Specification core
- Cryptography
- nim-blscurve
- Database
- nim-sqlite3-abi
- Eth1 monitoring
- nim-web3
- Build system
- nimbus-build-system
#### Validator core
- Cryptography
- nim-blscurve
- RPC
- news
- nim-http-utils
- nim-json-rpc
- nim-json-serialization

View File

@ -0,0 +1,42 @@
# NBC Threat model
NBC primarily targets resource restricted devices like Raspberry Pi or smartphones to desktop computers.
We focus on:
- remote attacks
This includes but not limited to:
- denial-of-services
- retrieving a secret key
- via software defect or misuse of libraries (cryptography, networking)
- or poor key management practices or UI or documentation confusing the user
- collusion of peers (eclipse, surround voting, ...)
- loss of funds
In particular, we do not defend against a malicious cloud provider which would offer
a “Rent-a-Raspberry-Pi” service for validators and would subsequently engage in attacks against
those validators, the implest being, taking the validator offline when it's their duty time.
Sensitive data:
1. Signing key: each validator has a secret signing key used to sign attestation and blocks with 32 ETH at stake (about 7500 USD at the current rate)
Each beacon node instance can have dozens to hundreds of validators attached.
Leaking the signing key would allow an attacker to double-vote with the key, leading to slashing and ultimately ejecting the validator
once its stake reaches below 16 ETH.
The signing key must be present in memory as a validator may have signing duties every 6 seconds
2. Withdrawal key: a withdrawal key allows withdrawing the stake.
A withdrawal key is unnecessary during day-to-day operation and can be stored in cold storage, possibly hardware wallet
3. IP address: if a validator IP address leaks, they might be subject to target DoS attacks taking their node offline and making them miss their duties.
In a regular setting, the network will have "relayer"/"listener" nodes and validator nodes that shouldn't be distinguishable.
4. Random number generation:
- Strong recommendations to generate the withdrawal key offline.
- The RNG is used in particular to select which peer to connect to from a pool of candidates.
## Cryptographic libraries
A review in in progress to select BLS signature backend.
See [https://notes.status.im/nim-bls-curve-backends#Threat-model](https://notes.status.im/nim-bls-curve-backends#Threat-model)
## Resources
Surround vote detection (not in audit scope): [https://github.com/protolambda/eth2-surround](https://github.com/protolambda/eth2-surround)

View File

@ -0,0 +1,63 @@
# Nimbus Beacon Chain
[https://github.com/status-im/nim-beacon-chain](https://github.com/status-im/nim-beacon-chain)
Nimbus Beacon Chain (NBC) is an implementation of an Ethereum 2 client.
## Audit scope
### Network Core (leveraging the libp2p framework)
| Sub-topic |
| -------------------------------------- |
| Discovery Protocol (discv5) |
| Publish/Subscribe protocol |
| Eth2 Request/Response protocol |
| SSZ - (De)serialization & tree hashing |
| Wire encryption |
### ETH2 Specification core
| Sub-topic |
| ------------------------------------- |
| State transition logic |
| Signature verification |
| Epoch finalisation and justification |
| Reward processing |
| Eth1 data processing |
| Fork choice logic |
| Block processing and production |
| Attestation processing and production |
| Block synchronization |
| Peer pool management |
### Validator core and user experience
| Sub-topic |
| --------------------------------- |
| Block/attestation signing |
| Slash-prevention mechanisms |
| RPC API |
| Accounts management & key storage |
| Command Line Interface (CLI) |
## High-level view of the stack
[https://miro.com/app/board/o9J_kvfytDI=/](https://miro.com/app/board/o9J_kvfytDI=/)
## Diagram
TODO
## Specifications
We target v0.12.1 phase0 of [https://github.com/ethereum/eth2.0-specs](https://github.com/ethereum/eth2.0-specs)
- [https://github.com/ethereum/eth2.0-specs/tree/v0.12.1/specs/phase0](https://github.com/ethereum/eth2.0-specs/tree/v0.12.1/specs/phase0)
The p2p-interface specs in particular describe the subset of libp2p spec that
are used to implement Ethereum 2
## Resources
Ethereum 2.0 Ask Me Anything:
- [https://docs.ethhub.io/other/ethereum-2.0-ama/](https://docs.ethhub.io/other/ethereum-2.0-ama/)

View File

@ -0,0 +1,2 @@
# Serialization

View File

@ -0,0 +1 @@
# Async/Await with the Chronos library

View File

@ -0,0 +1 @@
# Cryptography

View File

@ -0,0 +1 @@
# Ethereum Networking

View File

@ -0,0 +1,36 @@
# Summary
- [Introduction](01_introduction.md)
- [The Nim Programming Language](02_the_Nim_programming_language.md)
- [Nim routines, procedures, functions, templates, macros](02.1_nim_routines_proc_func_templates_macros.md)
- [Operators, bit manipulation](02.1.1_operators_bit_manipulation.md)
- [Closure iterators](02.1.4_closure_iterators.md)
- [Datatypes: value, ref, ptr](02.2_stack_ref_ptr_types.md)
- [Casting and low-level memory representation](02.2.2_casting_and_low_level_memory_representation.md)
- [Memory Management and Garbage Collection](02.2.3_memory_management_gc.md)
- [Generic types & static types](02.2.4_generics_types_static_types.md)
- [Arrays, openarrays, strings, C-strings](02.2.5_arrays_openarrays_strings_cstring.md)
- [Correctness: distinct, mutability, effects, exceptions](02.3_correctness_distinct_mutability_effects_exceptions.md)
- [Debugging Nim, sanitizers, fuzzers](02.4_debugging_Nim_sanitizers_fuzzers.md)
- [Foreign lang interop: C and C++](02.5_foreign_lang_to_from_interop.md)
- [Nim threat model](02.8_Nim_threat_model.md)
- [Nim FAQ](02.10_Nim_FAQ.md)
- [Nimbus NBC - The Nim-Beacon-Chain](03_nbc_nimbus_beacon_chain.md)
- [Build system and dependencies](03.2_build_system_and_dependencies.md)
- [Threat model](03.5_the_threat_model.md)
- [Serialization](04_serialization.md)
- [Async/Await with Chronos](05_async_with_chronos.md)
- [Cryptography](06_cryptography_and_rng.md)
- [Ethereum Networking](07_nim-eth.md)
<!-- Not fleshed out, out of line because mdbook bug -->
<!-- - [Pointer manipulation](02.1.2_pointer_manipulation.md) -->
<!-- - [Emitting raw C or Assembly code](02.1.3_emitting_raw_C_assembly_code.md) -->
<!-- - [Builtin types](02.2.1_builtin_types.md) -->
<!-- - [Runtime types: Variants & Object-Oriented Programming](02.2.6_runtime_types_variants_oop.md) -->
<!-- - [Compile-time Evaluation](02.2.7_compiletime_evaluation.md) -->
<!-- - [Nim standard library use in Nimbus](02.9_Nim_stdlib_use_in_nimbus.md) -->