Premix
Premix is premium gasoline mixed with lubricant oil and it is used in two-stroke internal combustion engines. It tends to produce a lot of smoke.
This Premix is a block validation debugging tool for the Nimbus Ethereum client. Premix will query transaction execution steps from other Ethereum clients and compare them with those generated by Nimbus. It will then produce a web page to present comparison results that can be inspected by the developer to pinpoint the faulty instruction.
Premix will also produce a test case for the specific problematic transaction, complete with a database snapshot to execute transaction validation in isolation. This test case can then be integrated with the Nimbus project's test suite.
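The comparison Premix performs can be pictured as a step-by-step diff of two EVM execution traces. The sketch below is illustrative only: it assumes each trace is a list of step dicts with `op`, `pc`, and `gas` fields (the shape of `structLogs` entries returned by geth's `debug_traceTransaction`), and `first_divergence` is a hypothetical helper, not part of Premix.

```python
# Illustrative sketch only: compare two EVM execution traces step by step
# and report where they first disagree. Step shape {op, pc, gas} mirrors
# the structLogs entries of debug_traceTransaction; names are hypothetical.

def first_divergence(trace_a, trace_b, keys=("op", "pc", "gas")):
    """Return (index, field) of the first differing step, or None if equal."""
    for i, (a, b) in enumerate(zip(trace_a, trace_b)):
        for k in keys:
            if a.get(k) != b.get(k):
                return (i, k)
    if len(trace_a) != len(trace_b):
        # one client produced extra steps after the common prefix
        return (min(len(trace_a), len(trace_b)), "length")
    return None

geth_trace = [
    {"op": "PUSH1", "pc": 0, "gas": 100000},
    {"op": "MSTORE", "pc": 2, "gas": 99997},
]
nimbus_trace = [
    {"op": "PUSH1", "pc": 0, "gas": 100000},
    {"op": "MSTORE", "pc": 2, "gas": 99994},  # gas accounting disagrees here
]

print(first_divergence(geth_trace, nimbus_trace))  # -> (1, 'gas')
```

The real tool renders such divergences as a browsable report rather than a tuple, but the pinpointing idea is the same.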
Requirements
Before you can use the Premix debugging tool there are several things you need
to prepare. The first requirement is a recent version of geth, installed from
source or binary. The minimum required version is 1.8.18. Beware that the
1.8.x series contains bugs in the transaction tracer; upgrade to 1.9.x as soon
as it has been released. Afterwards, you can run it with this command:

```sh
geth --rpc --rpcapi eth,debug --syncmode full --gcmode=archive
```
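Before relying on the tracer, it may help to verify the 1.8.18 minimum programmatically, e.g. against the string returned by `geth version` or the `web3_clientVersion` RPC. The helper below is a hypothetical sketch, not part of Premix or geth:

```python
import re

# Hypothetical helper (not part of Premix): check that a geth version
# string such as "Geth/v1.9.25-stable-e7872729/linux-amd64/go1.15.6"
# meets the 1.8.18 minimum required for a usable transaction tracer.

MIN_GETH = (1, 8, 18)

def geth_version_ok(client_version: str, minimum=MIN_GETH) -> bool:
    # pull the first "major.minor.patch" triple out of the version string
    m = re.search(r"v?(\d+)\.(\d+)\.(\d+)", client_version)
    if not m:
        return False  # unrecognised version string
    return tuple(int(x) for x in m.groups()) >= minimum

print(geth_version_ok("Geth/v1.9.25-stable-e7872729/linux-amd64"))  # True
```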
You need to run it until it has fully synced past the problematic block you
want to debug (you might need to start from an empty database, because some
geth versions will keep on doing a fast sync if that's what was done before).
After that, you can stop it by pressing CTRL-C and rerun it with the
additional flag `--maxpeers 0` if you want it to stop syncing, or just let it
run as is if you want to keep syncing.
The next requirement is building Nimbus and Premix:

```sh
# in the top-level directory:
make
```

After that, you can run Nimbus with this command:

```sh
./build/nimbus --prune:archive --port:30304
```
Nimbus will try to sync up to the problematic block, then stop and execute
Premix, which will then load a report page in your default browser. If it
fails to do that, you can see the report page by manually opening
`premix/index.html`.
In your browser, you can explore the tracing result and find where the problem is.
Tools
Premix
Premix is the main debugging tool. It produces reports that can be viewed in
a browser and serialised debug data that can be consumed by the `debug` tool.
Premix consumes data produced by either `nimbus`, `persist`, or `dumper`.
You can run it manually using this command:

```sh
./build/premix debug*.json
```
Persist
Because the Nimbus P2P layer still contains bugs, you may become impatient
when trying to sync blocks. In the `./premix` directory, you can find a
`persist` tool. It will help you sync relatively quickly because it bypasses
the P2P layer and downloads blocks from `geth` via the RPC API.
When it encounters a problematic block during syncing, it will stop and
produce debugging data just like Nimbus does.

```sh
./build/persist [--dataDir:your_database_directory] [--head: blockNumber] [--maxBlocks: number] [--numCommits: number]
```
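The RPC round trip a tool like `persist` relies on looks roughly like this. The payload shape is standard JSON-RPC (`eth_getBlockByNumber`); the surrounding code is an illustrative sketch, not the tool's actual implementation, and the transport (e.g. an HTTP POST to geth's endpoint) is omitted to keep it self-contained:

```python
import json

# Illustrative sketch of the geth RPC request used to fetch a block,
# bypassing P2P sync. Actually sending it (e.g. to http://localhost:8545)
# is left out so the sketch stays self-contained.

def block_by_number_request(block_number: int, full_txs: bool = True) -> str:
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "eth_getBlockByNumber",
        # block numbers are passed as 0x-prefixed hex quantities
        "params": [hex(block_number), full_txs],
    }
    return json.dumps(payload)

print(block_by_number_request(196224))
```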
Debug
In the same `./premix` directory you'll find the `debug` tool that you can
use to process previously generated debugging info in order to work with one
block and one transaction at a time, instead of multiple confusing blocks and
transactions.

```sh
./build/debug block*.json
```

where `block*.json` contains the database snapshot needed to debug a single
block, produced by the Premix tool.
Dumper
Dumper was designed specifically to produce debugging data, from information
already stored in the database, that can be further processed by Premix. It
will create tracing information for a single block, provided that block has
already been persisted.
If you want to generate debugging data, it's better to use the Persist tool;
the data generated by Dumper is usually used to debug Premix features in
general and the report page logic in particular.

```sh
# usage:
./build/dumper [--datadir:your_path] --head:blockNumber
```
Hunter
Hunter's purpose is to track down problematic blocks and create debugging
info associated with them. It will not access your on-disk database, because
it has its own prestate construction code.
Hunter will download everything it needs from geth; just make sure your geth
version is at least 1.8.18.
Hunter depends on `eth_getProof` (EIP-1186). Make sure your installed `geth`
supports this functionality (older versions don't implement it).

```sh
# usage:
./build/hunter --head:blockNumber --maxBlocks:number
```
`blockNumber` is the starting block where the hunt begins. `maxBlocks` is the
number of problematic blocks you want to capture before stopping the hunt.
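One way to picture the `eth_getProof` dependency is a probe: a node that lacks the method answers a JSON-RPC request with error code `-32601` ("method not found"). The sketch below is illustrative only; `probe_request` and `supports_get_proof` are hypothetical helpers, and the HTTP transport is omitted so the example stays self-contained.

```python
import json

# Illustrative sketch: probe whether a geth node implements eth_getProof
# (EIP-1186). Nodes without the method reply with JSON-RPC error -32601.
# Only request construction and reply classification are shown here.

METHOD_NOT_FOUND = -32601

def probe_request(address: str) -> str:
    # minimal eth_getProof call: account proof only, no storage keys
    return json.dumps({
        "jsonrpc": "2.0",
        "id": 1,
        "method": "eth_getProof",
        "params": [address, [], "latest"],
    })

def supports_get_proof(reply_json: str) -> bool:
    reply = json.loads(reply_json)
    err = reply.get("error")
    return not (err and err.get("code") == METHOD_NOT_FOUND)

old_geth = '{"jsonrpc":"2.0","id":1,"error":{"code":-32601,"message":"method not found"}}'
print(supports_get_proof(old_geth))  # False
```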
Regress
Regress is an offline block validation tool. Unlike the Persist tool, it will
not download block information from anywhere; it validates blocks already
persisted in your database, trying to find any regression introduced either
by bug fixing or refactoring.

```sh
# usage:
./build/regress [--dataDir:your_db_path] --head:blockNumber
```