nim-codex/.github/workflows
Marcin Czenko 92a0eda79a
Validator historical state restoration (#922)
* adds a new overload of queryPastEvents allowing to query past events based on timestamp in the past

* adds state restoration to validator

* refactors a bit to get the tests back to work

* replaces deprecated generic methods from Market with methods for specific event types

* Refactors binary search

* adds market tests for querying past SlotFilled events and binary search

* Takes into account that the `earliest` block available is not necessarily the genesis block

* Adds more logging and makes testing earliest block boundary more reliable

* adds validation tests for historical state restoration

* adds mockprovider to simplify and improve testing of the edge conditions

* adds slot reservation to the new tests after rebasing

* adds validation groups and group index in logs of validator

* adds integration test with two validators

* adds comment on how to enable logging in integration test executable itself

* testIntegration: makes the list of running nodes injected and available in the body of the test

* validation: adds integration test for historical state

* adds more logging to validator

* integration test: validator only looks 30 days back for historical state

* adds logging of the slotState when removing slots during validation

* review and refactor validator integration tests

* adds validation to the set of integration tests

* Fixes mistyped name of the mock provider module in testMarket

* Fixes a typo in the name of the validation suite in integration tests

* Makes validation unit test a bit easier to follow

* better use of logScopes to reduce duplication

* improves timing and clarifies the test conditions

* uses http as default RPC provider for nodes running in integration tests as a workaround for dropped subscriptions

* simplifies the validation integration tests by waiting for failed request instead of tracking slots

* adds config option allowing to selectively set a different provider url

* Brings back the default settings for RPC provider in integration tests

* use http RPC provider for clients in validation integration tests

* fine-tune the tests

* Makes validator integration test more robust - adds extra tracking

* brings tracking of marketplace event back to validator integration test

* refactors integration tests

* deletes tmp file

* adds `return` after forcing integration test to fail preliminarily

* re-enables all integration tests and matrix

* stops debug output in CI

* allows to choose a different RPC provider for a given integration test suite

* fixes signature of `getBlock` method in mockProvider

* adds missing import which seems to be breaking integration tests on windows

* makes sure that clients, SPs, and validators use the same provider url

* makes validator integration tests use http at 127.0.0.1:8545

* testvalidator: stop resubscribing as we are now using http polling as rpc provider

* applying review comments

* groups queryPastStorage overrides together (review comment)

* groups the historical validation tests into a sub suite

* removes the temporary extensions in marketplacesuite and multinodesuite allowing to specify provider url

* simplifies validation integration tests

* Removes debug logs when waiting for request to fail

* Renaming waitForRequestFailed => waitForRequestToFail

* renames blockNumberForBlocksAgo to pastBlockTag and makes it private

* removes redundant debugging logs

* refines logging in validation

* removes dev logging from mockmarket

* improves exception handling in provider helper procs and prepares for extraction to a separate module

* Uses chronos instead of std/times for Duration

* extracts provider and binary search helpers to a separate module

* removes redundant log entry params from validator

* unifies the notation to consistently use method call syntax

* reuses ProviderError from nim-ethers in the provider extension

* clarifies the comment in multinodesuite

* uses == operator to check the predefined tags and raises exception when `BlockTag.pending` is requested.

* when waiting for request to fail, we break on any request state that is not Started

* removes tests that were moved to testProvider from testMarket

* extracts tests that use MockProvider to a separate async suite

* improves performance of the historical state restoration

* removing redundant log messages in validator (groupIndex and groups)

* adds testProvider to testContracts group

* removes unused import in testMarket
2024-12-14 05:07:55 +00:00
| File | Last commit message | Last commit date |
| --- | --- | --- |
| Readme.md | Update links to codex-storage organization (#420) | 2023-05-23 23:01:13 +03:00 |
| ci-reusable.yml | Fix concurrency issues (#993) | 2024-11-25 11:23:04 +00:00 |
| ci.yml | Validator historical state restoration (#922) | 2024-12-14 05:07:55 +00:00 |
| docker-dist-tests.yml | Run release tests for docker images (#1017) | 2024-12-04 11:52:51 +00:00 |
| docker-reusable.yml | Run release tests for docker images (#1017) | 2024-12-04 11:52:51 +00:00 |
| docker.yml | Use latest tag for Tags and Default branch only (#540) | 2023-08-30 13:09:52 +03:00 |
| docs.yml | Build Postman Collection (#973) | 2024-10-28 13:53:41 +00:00 |
| nim-matrix.yml | ci: linux ci runs on ubuntu-20.04 (#953) | 2024-10-14 11:24:53 +00:00 |
| release.yml | ci: use rust 1.7.9 for release workflow and dockerfile (#999) | 2024-11-25 16:13:14 +00:00 |

Readme.md

Tips for shorter build times

Runner availability

Currently, the biggest bottleneck when optimizing workflows is the availability of Windows and macOS runners. Therefore, anything that reduces the time spent in Windows or macOS jobs will have a positive impact on the time spent waiting for runners to become available. The usage limits for GitHub Actions are described here. You can see a breakdown of runner usage for your jobs in the GitHub Actions tab (example).

Windows is slow

Performing git operations and compilation are both slow on Windows. This can easily mean that a Windows job takes twice as long as a Linux job. Therefore it makes sense to use a Windows runner only for testing Windows compatibility, and nothing else. Testing compatibility with other versions of Nim, code coverage analysis, etc. are therefore better performed on a Linux runner.
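
A sketch of a workflow split along these lines; the job names, Nim versions, and build commands below are placeholders for illustration, not taken from the actual workflows in this directory:

```yaml
jobs:
  # The Windows runner only checks Windows compatibility, nothing else.
  tests-windows:
    runs-on: windows-latest
    steps:
      - uses: actions/checkout@v4
      - run: make -j2 test                # placeholder build/test command

  # Extra Nim versions, coverage, etc. stay on Linux, where runners are plentiful.
  tests-linux-extra:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        nim: ["1.6.x", "2.0.x"]           # placeholder compiler versions
    steps:
      - uses: actions/checkout@v4
      - run: make -j2 test                # placeholder; would build with the matrix Nim version
```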

Parallelization

Breaking up a long build job into several jobs that you run in parallel can have a positive impact on the wall clock time that a workflow runs. For instance, you might consider running unit tests and integration tests in parallel. Keep in mind, however, that the availability of macOS and Windows runners is the biggest bottleneck. If you split a Windows job into two jobs, you now need to wait for two Windows runners to become available! Therefore parallelization often only makes sense for Linux jobs.
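
As a sketch (the test commands are placeholders, not the project's actual make targets), two independent Linux jobs that run in parallel:

```yaml
jobs:
  linux-unit-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make -j2 test                # placeholder unit test command

  linux-integration-tests:
    runs-on: ubuntu-latest                # runs in parallel with the job above
    steps:
      - uses: actions/checkout@v4
      - run: make -j2 testIntegration     # placeholder integration test command
```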

Refactoring

As with any code, complex workflows are hard to read and change. You can use composite actions and reusable workflows to refactor complex workflows.
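
The generic shape of a reusable workflow and its caller looks roughly like this; the input name and value are placeholders, and the file names are only examples (ci-reusable.yml in the listing above is presumably used this way):

```yaml
# reusable workflow, e.g. ci-reusable.yml: declares the inputs it accepts
on:
  workflow_call:
    inputs:
      matrix:
        required: false
        type: string                # placeholder input
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - run: echo "matrix is ${{ inputs.matrix }}"
---
# calling workflow, e.g. ci.yml: delegates the work to the reusable one
on: [push, pull_request]
jobs:
  ci:
    uses: ./.github/workflows/ci-reusable.yml
    with:
      matrix: linux                 # placeholder value
```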

Steps for measuring time

Breaking up steps allows you to see the time spent in each part. For instance, instead of having one step where all tests are performed, you might consider having separate steps for e.g. unit tests and integration tests, so that you can see how much time is spent in each.
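
For example (commands are placeholders), splitting a single "run all tests" step into named steps makes the Actions UI report the time of each one separately:

```yaml
jobs:
  tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Unit tests
        run: make -j2 test              # placeholder command
      - name: Integration tests
        run: make -j2 testIntegration   # placeholder command
```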

Fix slow tests

Try to avoid slow unit tests. They not only slow down continuous integration, but also local development. If you encounter slow tests you can consider reworking them to stub out the slow parts that are not under test, or use smaller data structures for the test.

You can use unittest2 together with the environment variable NIMTEST_TIMING=true to show how much time is spent in every test (reference).
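
A step along these lines, inside an existing job, would print per-test timing in the CI log (the test command is a placeholder; NIMTEST_TIMING is the unittest2 variable mentioned above):

```yaml
      - name: Unit tests with per-test timing
        env:
          NIMTEST_TIMING: "true"    # makes unittest2 report time spent in each test
        run: make -j2 test          # placeholder test command
```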

Caching

Ensure that caches are updated over time. For instance if you cache the latest version of the Nim compiler, then you want to update the cache when a new version of the compiler is released. See also the documentation for the cache action.
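
A sketch of a cache step inside an existing job; the paths and the version variable in the key are placeholders, the point being that the key changes whenever the compiler version does, so a new compiler release produces a fresh cache instead of reusing a stale one:

```yaml
      - name: Cache Nim compiler and dependencies
        uses: actions/cache@v4
        with:
          path: |
            NimBinaries             # placeholder paths
            ~/.nimble
          # including the compiler version in the key invalidates the cache
          # when a new compiler version is picked up
          key: ${{ runner.os }}-nim-${{ env.NIM_VERSION }}
```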

Fail fast

By default a workflow fails fast: if one job fails, the rest are cancelled. This might seem inconvenient, because when you're debugging an issue you often want to know whether you introduced a failure on all platforms, or only on a single one. You might be tempted to disable fail-fast, but keep in mind that this keeps runners busy for longer on a workflow that you know is going to fail anyway. Subsequent runs will therefore take longer to start. Fail fast is most likely better for overall development speed.
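
For reference, the behaviour is controlled per job; the default is equivalent to the following (shown only to illustrate the trade-off), and flipping it to false is what keeps runners busy longer:

```yaml
    strategy:
      fail-fast: true    # the default: cancel the remaining matrix jobs when one fails
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
```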