The linux-amd64 release build was intermittently failing with:
'asm' operand has impossible constraints
in secp256k1's hand-written x86_64 assembly (scalar_4x64_impl.h,
USE_ASM_X86_64 path). Root cause, confirmed against upstream
(bitcoin-core/secp256k1#1623): on GCC 11 (Ubuntu 22.04's default),
-march=native's CPU autodetection misidentifies early AMD Zen4 chips
as znver3 while still enabling -mavx512f via CPUID feature probing.
-march=znver3 -mavx512f is an invalid combination matching no real
CPU -- GCC 11 predates Zen4 entirely -- and its register allocator
cannot satisfy the asm's constraints under it. Upstream confirmed
this is fixed by a GCC version that knows about znver4 (13+), and
closed the report as a GCC autodetection bug, not a secp256k1 defect.
This explains the apparent intermittency: the failure reproduces only
when GitHub's runner fleet happens to assign a physical Zen4 host to
a given job.
Two independent changes address this:
- Bump the linux-amd64 and linux-arm64 release builders from
ubuntu-22.04(-arm) to ubuntu-24.04(-arm), kept on the same Ubuntu
version across both architectures. Ubuntu 24.04 ships GCC 13 by
default (GCC isn't installed separately in this workflow -- the
runner image's preinstalled toolchain comes from its Ubuntu base),
which correctly recognizes znver4 and produces a valid -march=native
expansion.
- Also pass -d:disableMarchNative on the linux-amd64 build (matrix-
scoped; arm64/macOS don't compile the affected x86_64 asm path).
This is kept even after the GCC bump because -march=native is
independently unsound for a publicly distributed binary: it bakes
whatever instruction set extensions the build runner's CPU happens
to have (e.g. AVX-512) into the shipped binary, which would then
crash with an illegal instruction on end-user machines lacking those
extensions -- a separate, permanent concern from the GCC 11 bug.
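
The two changes above might look roughly like this in the release workflow. This is only a sketch, not the actual workflow: the job name, the matrix keys (target, runner, nimflags), the NIMFLAGS variable and the make invocation are assumptions for illustration. Only the runner labels and the -d:disableMarchNative define come from the description above.

```yaml
jobs:
  release:
    strategy:
      matrix:
        include:
          # Both architectures stay on the same Ubuntu version (GCC 13 by default).
          - target: linux-amd64
            runner: ubuntu-24.04
            # Matrix-scoped: only the x86_64 build compiles the affected asm path.
            nimflags: "-d:disableMarchNative"
          - target: linux-arm64
            runner: ubuntu-24.04-arm
            nimflags: ""
    runs-on: ${{ matrix.runner }}
    steps:
      - uses: actions/checkout@v4
      - name: Build release binary
        env:
          NIMFLAGS: ${{ matrix.nimflags }}  # hypothetical variable name
        run: make NIMFLAGS="${NIMFLAGS}"    # hypothetical build command
```
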
Tips for shorter build times
Runner availability
On the GitHub Free, Pro, or Team plan, the bottleneck when optimizing workflows is the availability of macOS runners. Anything that reduces the time spent in macOS jobs therefore also shortens the wait for runners to become available. On the GitHub Enterprise plan this is not the case, and you can parallelize more freely across multiple runners. The usage limits for GitHub Actions are described in the GitHub documentation. You can see a breakdown of runner usage for your jobs in the GitHub Actions tab.
Windows is slow
Git operations and compilation are both slow on Windows, so a Windows job can easily take twice as long as a Linux job. It therefore makes sense to use a Windows runner only for testing Windows compatibility, and nothing else. Testing compatibility with other versions of Nim, code coverage analysis, and similar tasks are better performed on a Linux runner.
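
One way to keep extra work off the Windows runner is a step-level condition. The sketch below is an illustration under assumptions: the job layout and the coverage script path are hypothetical, and it assumes Nim is already installed on the runner (setup step omitted).

```yaml
jobs:
  test:
    strategy:
      matrix:
        os: [ubuntu-latest, windows-latest, macos-latest]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - name: Run tests
        run: nimble test
      - name: Collect code coverage
        # Runs only on Linux runners; skipped on Windows and macOS.
        if: runner.os == 'Linux'
        run: ./ci/coverage.sh  # hypothetical script
```
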
Parallelization
Breaking up a long build job into several jobs that run in parallel can reduce the wall-clock time of a workflow. For instance, you might consider running unit tests and integration tests in parallel. On the GitHub Free, Pro, or Team plan, keep in mind that availability of macOS runners is a bottleneck. If you split a macOS job into two jobs, you now need to wait for two macOS runners to become available.
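
As a rough illustration, two jobs with no `needs:` dependency between them are scheduled in parallel. The job names and the nimble task names below are hypothetical, and Nim setup is omitted.

```yaml
jobs:
  unit-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: nimble unittest          # hypothetical nimble task
  integration-tests:
    # No "needs:" key, so this job does not wait for unit-tests.
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: nimble integrationtest   # hypothetical nimble task
```
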
Refactoring
As with any code, complex workflows are hard to read and change. You can use composite actions and reusable workflows to refactor complex workflows.
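
A reusable workflow could be sketched as follows. The file names, the `os` input and the job names are assumptions chosen for illustration; the two YAML documents below represent two separate files.

```yaml
# File: .github/workflows/tests.yml (hypothetical reusable workflow)
on:
  workflow_call:
    inputs:
      os:
        type: string
        required: true
jobs:
  test:
    runs-on: ${{ inputs.os }}
    steps:
      - uses: actions/checkout@v4
      - run: nimble test   # assumes Nim is already installed
---
# File: .github/workflows/ci.yml (hypothetical caller)
on: [push, pull_request]
jobs:
  linux:
    uses: ./.github/workflows/tests.yml
    with:
      os: ubuntu-latest
  windows:
    uses: ./.github/workflows/tests.yml
    with:
      os: windows-latest
```
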
Steps for measuring time
Breaking up steps allows you to see the time spent in each part. For instance, instead of having one step where all tests are performed, you might consider having separate steps for e.g. unit tests and integration tests, so that you can see how much time is spent in each.
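
A minimal sketch of such a split, with hypothetical nimble task names. Each named step gets its own duration in the job log.

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Unit tests
        run: nimble unittest          # hypothetical nimble task
      - name: Integration tests
        run: nimble integrationtest   # hypothetical nimble task
```
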
Fix slow tests
Try to avoid slow unit tests. They slow down not only continuous integration, but also local development. If you encounter slow tests, consider reworking them to stub out the slow parts that are not under test, or to use smaller data structures.
You can use unittest2 together with the environment variable
NIMTEST_TIMING=true to show how much time is spent in every test
(see the unittest2 documentation).
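
In a workflow, the variable can be set on the step that runs the tests. The sketch below assumes the test suite uses unittest2 and is started with `nimble test`; the job name is hypothetical.

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Unit tests with per-test timing
        env:
          NIMTEST_TIMING: "true"
        run: nimble test   # assumes the tests are written with unittest2
```
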
Caching
Ensure that caches are updated over time. For instance, if you cache the latest version of the Nim compiler, you want the cache to be refreshed when a new version of the compiler is released. See also the documentation for the cache action.
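
One common way to get this behaviour is to put the version into the cache key, so that a version bump produces a cache miss and a fresh entry. In the sketch below the pinned version, the install path and the variable name NIM_VERSION are hypothetical.

```yaml
env:
  NIM_VERSION: "2.0.8"   # hypothetical pinned compiler version
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Cache Nim compiler
        uses: actions/cache@v4
        with:
          path: ~/nim    # hypothetical install location
          # Changing NIM_VERSION changes the key, which invalidates the old cache.
          key: nim-${{ runner.os }}-${{ runner.arch }}-${{ env.NIM_VERSION }}
```
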
Fail fast
By default a job matrix fails fast: if one job in the matrix fails, the remaining in-progress and queued jobs of that matrix are cancelled. This might seem inconvenient, because when you're debugging an issue you often want to know whether you introduced a failure on all platforms, or only on a single one. You might be tempted to disable fail-fast, but keep in mind that this keeps runners busy for longer on a workflow that you know is going to fail anyway. Subsequent runs will therefore take longer to start. Fail fast is most likely better for overall development speed.
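
For reference, the setting lives under the `strategy` key of a matrix job. The sketch below spells out the default explicitly; the job name and matrix are hypothetical.

```yaml
jobs:
  test:
    strategy:
      # true is the default; set to false only while debugging
      # a platform-specific failure.
      fail-fast: true
      matrix:
        os: [ubuntu-latest, windows-latest, macos-latest]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - run: nimble test   # assumes Nim is already installed
```
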