* use LTO in release builds
This speeds up block replay and hashing significantly (about 40%) - for example, replaying the
first 1000 blocks without and with LTO:
```
[arnetheduck@tempus ncli]$ ../env.sh nim c -d:release ncli_db
[arnetheduck@tempus ncli]$ ./ncli_db bench --db:db --network:medalla --slots:1000
Loaded 215006 blocks, head slot 307400
All time are ms
Average, StdDev, Min, Max, Samples, Test
Validation is turned off meaning that no BLS operations are performed
25468.481, 0.000, 25468.481, 25468.481, 1, Initialize DB
0.297, 0.516, 0.053, 13.645, 721, Load block from database
26.458, 0.000, 26.458, 26.458, 1, Load state from database
20.737, 8.288, 11.096, 199.325, 690, Apply block
333.069, 62.798, 45.225, 429.452, 31, Apply epoch block
0.000, 0.000, 0.000, 0.000, 0, Database block store
```
```
[arnetheduck@tempus ncli]$ ../env.sh nim c -d:release --passc:-flto --passl:-flto --stacktrace:off ncli_db
[arnetheduck@tempus ncli]$ ./ncli_db bench --db:db --network:medalla --slots:1000
Loaded 215006 blocks, head slot 307400
All time are ms
Average, StdDev, Min, Max, Samples, Test
Validation is turned off meaning that no BLS operations are performed
23903.006, 0.000, 23903.006, 23903.006, 1, Initialize DB
0.253, 0.122, 0.047, 0.731, 721, Load block from database
24.455, 0.000, 24.455, 24.455, 1, Load state from database
18.734, 7.062, 10.346, 167.397, 690, Apply block
194.869, 33.175, 29.311, 226.981, 31, Apply epoch block
```
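As a rough check on the figure quoted above, the two reports can be compared row by row. The sketch below is illustrative Python, not part of `ncli_db`; the helper names `parse_bench` and `reduction_percent` are made up here, and only the two `Apply` rows are copied in.

```python
def parse_bench(text):
    """Return {test name: average ms} for every data row in a bench report."""
    averages = {}
    for line in text.strip().splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 6:
            continue  # not a "Average, StdDev, Min, Max, Samples, Test" row
        try:
            averages[parts[5]] = float(parts[0])
        except ValueError:
            continue  # the header row itself
    return averages


def reduction_percent(before, after):
    """Percentage by which the average time dropped."""
    return (before - after) / before * 100


# Rows copied from the two runs above.
WITHOUT_LTO = """
Average, StdDev, Min, Max, Samples, Test
20.737, 8.288, 11.096, 199.325, 690, Apply block
333.069, 62.798, 45.225, 429.452, 31, Apply epoch block
"""
WITH_LTO = """
Average, StdDev, Min, Max, Samples, Test
18.734, 7.062, 10.346, 167.397, 690, Apply block
194.869, 33.175, 29.311, 226.981, 31, Apply epoch block
"""

before = parse_bench(WITHOUT_LTO)
after = parse_bench(WITH_LTO)
changes = {name: reduction_percent(before[name], after[name]) for name in before}
for name, pct in changes.items():
    print(f"{name}: {pct:.1f}% less time with LTO")
```

On these numbers the average `Apply epoch block` time drops by roughly 41%, while `Apply block` improves by just under 10%.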
Epoch processing is heavy on both arithmetic and hash caching, both of which get a
significant boost here.
This makes sense: nim creates lots of small functions spread out over many C files, so without
LTO the C compiler cannot optimize across them. A much worse solution is to annotate code with
`inline`: it copies functions into multiple C files but still doesn't enable intermodule
optimizations. That significantly limits the compiler's ability to reason about the code, causes
bloat, and misrepresents the usefulness of a function to the call frequency analysis that drives
actual (C-compiler) inlining and many other optimizations.
In particular, many nim functions are part of `system` or the `C` backend (stack tracing,
memory allocation etc.), and nim's inlining system is pretty incomplete in that it does not deal
with these and many other cases.
* windows workaround
* skip LTO on windows for now
* Clean up PR + bump nimbus-build-system
* pcre on 32-bit + Improve env variable handling + cache mingw
* Add badge + fix setting env variable
* Auto cancel if commit becomes outdated
* fix shell for deriving env variable
* Add more cancellation points
* Add finalization tests to Github Action
* Fix case
* change cancel actions + fixes for windows and finalization
* have to use matrix variable for cache path/key
* ARCH_OVERRIDE=PLATFORM issue rebuild cache
* Update scripts - deactivate workflows with identified issues + reactivate caching
* workaround mac getopt
* Disable all AVX512f extensions (Error: invalid register for .seh_savexmm in Cygwin)
* Fix cross compile of libminiupnpc #1723
* Cache fetch-dlls to avoid being a drag on nim-lang.org
* Fix windows downloading DLLs twice and set CFLAGS env variable for Linux32
* fix silly yaml mistake
* .
* reactivate win32 after https://github.com/status-im/nim-beacon-chain/pull/1726
* Comment out minimal tests for now
* addPeer() and addPeerNoWait() now return PeerStatus, not bool.
Minor refactoring of PeerPool.
Fix tests.
* Refactor PeerPool.
Add lenSpace.
Add tests for lenSpace.
PeerPool.add procedures now return different error codes.
Fix SyncManager break/continue problem.
Fix connectWorker break/continue problem.
Refactor connectWorker and discoveryLoop.
Fix incoming/outgoing blocking problem.
* Refactor discovery loop.
Add checkPeer.
* Fix logic and compilation bugs.
* Adjust position of debugging log.
* Fix issue with maximum peers in PeerPool.
Optimize node record decoding.
* fix discoveryLoop.
* Remove aliases and fix tests using aliases.
Add section addressing the `address already in use` error. We already sort of cover this in the guide, but I think it makes sense to have it explicitly mentioned here.
* Bump BLST
* Test for https://github.com/supranational/blst/issues/22 regression
* Use SHA256 from BLST + bump nim-blscurve to reenable fno-tree-vectorize
* SHA256 on non-blst platforms import fixes
* import fixes again
* can't prefix with nimcrypto
* address review comment [skip ci]
* {.noInit.} on the digests
This is one way we could organize the flat file storage for blocks - the
alternative would be to not do `type` in the file itself, but have a
single type per file which arguably is simpler but may become annoying.
Another potential restriction would be to require that blocks are
ordered - with this format, it's a little bit more involved to recreate
an index file, and it's easy to accidentally build in assumptions about
the block order in the main data file.
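To make the trade-off concrete, here is a minimal Python sketch of a type-tagged flat file and of what rebuilding an index from it involves. Everything in it is an assumption for illustration: the header layout (1-byte type, 4-byte little-endian length), the tag values, and the helper names are hypothetical and are not the format used by the actual change.

```python
import io
import struct

# Hypothetical record layout: 1-byte type tag, 4-byte little-endian payload
# length, then the payload itself.
HEADER = struct.Struct("<BI")
TYPE_BLOCK = 1  # hypothetical tag values
TYPE_OTHER = 2


def append_record(f, rec_type, payload):
    """Append one tagged record and return the offset it was written at."""
    f.seek(0, io.SEEK_END)
    offset = f.tell()
    f.write(HEADER.pack(rec_type, len(payload)))
    f.write(payload)
    return offset


def rebuild_index(f):
    """Scan the whole file and return (offset, type, length) per record."""
    index = []
    f.seek(0)
    while True:
        offset = f.tell()
        header = f.read(HEADER.size)
        if len(header) < HEADER.size:
            break  # end of file (or a truncated trailing header)
        rec_type, length = HEADER.unpack(header)
        index.append((offset, rec_type, length))
        f.seek(length, io.SEEK_CUR)  # skip over the payload
    return index


data = io.BytesIO()
append_record(data, TYPE_BLOCK, b"block-at-slot-2")
append_record(data, TYPE_OTHER, b"something else")
append_record(data, TYPE_BLOCK, b"block-at-slot-1")  # out of slot order on purpose

index = rebuild_index(data)
block_offsets = [offset for offset, rec_type, _ in index if rec_type == TYPE_BLOCK]
print(block_offsets)
```

Because every record carries its own type, recreating the index means reading each header in turn and filtering by type. Nothing in the scan enforces block order, which is why an ordering assumption can creep in unnoticed.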
About 40% better slot processing times (with LTO enabled) - these don't do BLS but are used
heavily during replay (state transition = slot + block transition).
Tests using a recent medalla state and advancing it 1000 slots:
```
./ncli slots --preState2:state-302271-3c1dbf19-c1f944bf.ssz --slot:1000 --postState2:xx.ssz
```
pre:
```
All time are ms
Average, StdDev, Min, Max, Samples, Test
Validation is turned off meaning that no BLS operations are performed
39.236, 0.000, 39.236, 39.236, 1, Load state from file
0.049, 0.002, 0.046, 0.063, 968, Apply slot
256.504, 81.008, 213.471, 591.902, 32, Apply epoch slot
28.597, 0.000, 28.597, 28.597, 1, Save state to file
```
cast:
```
All time are ms
Average, StdDev, Min, Max, Samples, Test
Validation is turned off meaning that no BLS operations are performed
37.079, 0.000, 37.079, 37.079, 1, Load state from file
0.042, 0.002, 0.040, 0.090, 968, Apply slot
215.552, 68.763, 180.155, 500.103, 32, Apply epoch slot
25.106, 0.000, 25.106, 25.106, 1, Save state to file
```
cast+rewards:
```
All time are ms
Average, StdDev, Min, Max, Samples, Test
Validation is turned off meaning that no BLS operations are performed
40.049, 0.000, 40.049, 40.049, 1, Load state from file
0.048, 0.001, 0.045, 0.060, 968, Apply slot
164.981, 76.273, 142.099, 477.868, 32, Apply epoch slot
28.498, 0.000, 28.498, 28.498, 1, Save state to file
```
cast+rewards+shr:
```
All time are ms
Average, StdDev, Min, Max, Samples, Test
Validation is turned off meaning that no BLS operations are performed
12.898, 0.000, 12.898, 12.898, 1, Load state from file
0.039, 0.002, 0.038, 0.054, 968, Apply slot
139.971, 68.797, 120.088, 428.844, 32, Apply epoch slot
24.761, 0.000, 24.761, 24.761, 1, Save state to file
```
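The effect of each step on `Apply epoch slot` can be tallied from the four reports above. A small Python sketch with the averages copied by hand (the variable names are illustrative, not from the codebase):

```python
# Average "Apply epoch slot" times in ms, copied from the four reports above.
epoch_slot_ms = [
    ("pre", 256.504),
    ("cast", 215.552),
    ("cast+rewards", 164.981),
    ("cast+rewards+shr", 139.971),
]

baseline = epoch_slot_ms[0][1]
previous = baseline
rows = []
for name, avg in epoch_slot_ms[1:]:
    # Time saved relative to the step before, and relative to "pre".
    vs_previous = (previous - avg) / previous * 100
    vs_baseline = (baseline - avg) / baseline * 100
    rows.append((name, vs_previous, vs_baseline))
    print(f"{name}: {vs_previous:.1f}% vs previous step, {vs_baseline:.1f}% vs pre")
    previous = avg
```

Taken together, the three steps bring the average epoch slot time down by about 45% relative to `pre`, with the `rewards` step contributing the largest single drop.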