7 Commits

Author SHA1 Message Date
E M
bc7a277d9b
chore: reduce GKE release test cluster provisioning time and cost
- Configure runners-ci node pool inline in the cluster resource instead
  of using remove_default_node_pool=true, eliminating the
  provision-then-delete cycle that added ~5 min to terraform apply
- Remove the separate infra pool; runners-ci is now the only pool on
  the critical path of cluster creation
- Set tests-pods pool min_node_count=0 so no node is provisioned at
  apply time — nodes scale up only when test pods are scheduled
- Enable spot instances on the tests-pods pool for ~60-91% cost saving
- Add 60 min job timeout to release-tests to bound hung cluster cost
- Add Terraform plugin cache keyed on the lock file to skip provider
  re-downloads on subsequent runs (~30-60s saved)
- Install gke-gcloud-auth-plugin via setup-gcloud to fix kubectl auth

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-28 15:41:25 +10:00
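The node pool layout described in this commit can be sketched in Terraform roughly as follows, assuming the `hashicorp/google` provider. The resource labels, cluster name, machine types, and node counts are hypothetical placeholders, not values taken from the repository. Only the pool names, `min_node_count = 0`, spot instances, and the zone come from the commit messages.

```hcl
# Hypothetical sketch; names, machine types, and counts are assumptions.
resource "google_container_cluster" "release_tests" {
  name     = "release-tests"  # placeholder cluster name
  location = "europe-west4-b" # zonal location, per the earlier zonal commit

  # Defining the pool inline replaces remove_default_node_pool = true,
  # so no default pool is provisioned and then deleted.
  node_pool {
    name       = "runners-ci"
    node_count = 1 # assumed size

    node_config {
      machine_type = "e2-standard-4" # assumed machine type
    }
  }
}

resource "google_container_node_pool" "tests_pods" {
  name     = "tests-pods"
  cluster  = google_container_cluster.release_tests.name
  location = google_container_cluster.release_tests.location

  # Start empty; the autoscaler adds nodes once test pods are scheduled.
  initial_node_count = 0

  autoscaling {
    min_node_count = 0
    max_node_count = 5 # assumed ceiling
  }

  node_config {
    machine_type = "e2-standard-8" # assumed machine type
    spot         = true            # spot instances for cost saving
  }
}
```

The provider documentation cautions that node pools defined inline cannot be changed after creation without recreating the cluster, so this layout trades flexibility for faster creation, which suits a short-lived test cluster.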
E M
cf6d93f52c
chore: use zonal GKE cluster to reduce provisioning time
Switch cluster and all node pools from regional to zonal (`europe-west4-b`) to avoid the 40+ minute provisioning time of a regional (multi-zone) cluster. Adds a `zone` variable to the GKE module and cluster config, and updates the workflow's `gcloud get-credentials` call to use `--zone` instead of `--region`.
2026-05-28 15:41:25 +10:00
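A minimal sketch of the regional-to-zonal switch, assuming the `hashicorp/google` provider. The variable name `zone` and the value `europe-west4-b` come from the commit message; the resource label, cluster name, and node count are hypothetical.

```hcl
variable "zone" {
  description = "Zone for the zonal GKE cluster and its node pools"
  type        = string
  default     = "europe-west4-b"
}

resource "google_container_cluster" "release_tests" {
  name = "release-tests" # placeholder cluster name

  # Passing a zone here creates a zonal cluster; passing a region
  # (e.g. "europe-west4") would create a regional, multi-zone one.
  location = var.zone

  initial_node_count = 1 # assumed
}
```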
E M
2e82a4a15c
rename cluster to match previous change
2026-05-28 15:41:25 +10:00
E M
ae72954b4d
reduce length of cluster name
2026-05-28 15:41:25 +10:00
E M
7c74437bb7
Port Terraform cluster creation/destruction from DigitalOcean to GCP
2026-05-28 15:41:24 +10:00
E M
1a376d80db
chore: rename Codex references to Logos Storage in release tests
Replace all "Codex" branding in the release test workflow and supporting
files: rename the K8s cluster, Terraform state key, secret, log paths,
env var (CODEXDOCKERIMAGE → STORAGEDOCKERIMAGE), and test runner image
(cs-codex-dist-tests → logos-storage-dist-tests) to align with the
already-updated logos-storage-nim-cs-dist-tests repo in
https://github.com/logos-storage/logos-storage-nim-cs-dist-tests/pull/124.
Also fix the dotnet test path to the correct
Tests/LogosStorageReleaseTests directory.
2026-05-28 15:41:24 +10:00
E M
451356e0fe
Add release tests workflow
Adds a workflow for release tests:
- builds a docker image for launching nodes in the tests (basically has additional nimflags set)
- creates a K8s cluster in DigitalOcean
- one pod in the cluster is dedicated as the test runner (uses the logos-storage-nim-cs-dist-tests:latest image)
- the release will fail if the docker image build or the release tests fail
- the K8s cluster is torn down after the tests finish (failure or not)
2026-05-28 15:41:23 +10:00