15 Commits

Author SHA1 Message Date
E M
5f2b537fcd
refactor: replace scheduling affinity with explicit node pool label selection
Replace the indirect `SetSchedulingAffinity(notIn: "false")` / `allow-tests-pods` mechanism with `ScheduleInPoolsWithLabel(key, value)` and `AddToleration(key, value, effect)` in `ContainerRecipeFactory`. The new API is easier to read: `SetSchedulingAffinity(notIn: "false")` was a double negative that was hard to reason about, and it did not make clear that the intent was to schedule on pools labelled `allow-tests-pods=true`.

Previously, pods were steered to the spot node pool via a node affinity exclusion on a boolean label (`allow-tests-pods NotIn ["false"]`), and spot taint toleration was added implicitly by using the `system-node-critical` priority class. The priority class was removed earlier because it caused a ResourceQuota admission error in GCP, which silently broke spot node scheduling.

The new API is explicit: recipes call `ScheduleInPoolsWithLabel` to set a nodeSelector label that targets the intended pool, and `AddToleration` to declare a toleration for each taint the pool carries. Tolerations are set at the recipe level so that a recipe can move back to DigitalOcean if needed, dropping the toleration it no longer requires. All four recipes (storage, prometheus, discord bot, rewarder bot) now call both.
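
A minimal sketch of the call pattern, not the actual `ContainerRecipeFactory` code. `SchedulingBuilder`, `RecipeScheduling`, `SchedulingExample`, and the taint key `example.com/spot` are hypothetical names introduced for illustration. Only the two method names, their parameter lists, and the `allow-tests-pods=true` label come from this commit.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins; the real types in the repository may differ.
public record PodToleration(string Key, string Value, string Effect);

public record RecipeScheduling(
    IReadOnlyDictionary<string, string> NodePoolLabels,
    IReadOnlyList<PodToleration> Tolerations);

public class SchedulingBuilder
{
    private readonly Dictionary<string, string> nodePoolLabels = new();
    private readonly List<PodToleration> tolerations = new();

    // Intended to become a nodeSelector entry, so the pod only lands
    // on nodes that carry this label.
    public void ScheduleInPoolsWithLabel(string key, string value)
    {
        nodePoolLabels[key] = value;
    }

    // Record equality turns the duplicate check into a plain Contains call.
    public void AddToleration(string key, string value, string effect)
    {
        var toleration = new PodToleration(key, value, effect);
        if (!tolerations.Contains(toleration))
        {
            tolerations.Add(toleration);
        }
    }

    // Returns copies, so later changes to the builder do not leak out.
    public RecipeScheduling Build()
    {
        return new RecipeScheduling(
            new Dictionary<string, string>(nodePoolLabels),
            tolerations.ToList());
    }
}

public static class SchedulingExample
{
    public static RecipeScheduling ForSpotPool()
    {
        var builder = new SchedulingBuilder();
        builder.ScheduleInPoolsWithLabel("allow-tests-pods", "true");

        // Assumed taint; substitute whatever taint the target pool carries.
        builder.AddToleration("example.com/spot", "true", "NoSchedule");

        // An identical toleration is ignored thanks to structural equality.
        builder.AddToleration("example.com/spot", "true", "NoSchedule");

        return builder.Build();
    }
}
```

Under these assumptions, moving a recipe to a pool without the taint only means removing its `AddToleration` call; the label selection stays unchanged.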

Cleanup applied alongside:
- `PodToleration` converted to a record for structural equality and simpler deduplication
- `ExposedPorts`, `InternalPorts`, `EnvVars`, `Volumes` on `ContainerRecipe` changed to
  `IReadOnlyList<T>` for consistent immutable typing
- `SetCriticalPriority` property renamed to `IsCriticalPriority`
- `GetPriorityClassName` returns `string?` instead of `null!`
- `Reset()` extracted in `ContainerRecipeFactory` to consolidate post-create state reset
- Fixed bug: `nodePoolLabels` and `tolerations` were passed by reference and then cleared,
  leaving the recipe with empty collections; now snapshotted before clearing
- `SchedulingAffinity.cs` deleted (no remaining callers)
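
The pass-by-reference bug listed above can be reproduced in a few lines. This is a simplified sketch under assumptions: `SketchRecipe`, `SketchFactory`, and `SnapshotDemo` are hypothetical types, and tolerations are reduced to plain strings. It only illustrates why the collection has to be copied before `Reset()` clears it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified recipe type for illustration only.
public class SketchRecipe
{
    public SketchRecipe(IReadOnlyList<string> tolerations)
    {
        Tolerations = tolerations;
    }

    public IReadOnlyList<string> Tolerations { get; }
}

public class SketchFactory
{
    private readonly List<string> tolerations = new();

    public void AddToleration(string key) => tolerations.Add(key);

    // Buggy shape: the recipe holds the factory's own list instance,
    // so clearing it in Reset() empties the recipe as well.
    public SketchRecipe CreateSharingList()
    {
        var recipe = new SketchRecipe(tolerations);
        Reset();
        return recipe;
    }

    // Fixed shape: ToList() copies the items before Reset() runs.
    public SketchRecipe CreateWithSnapshot()
    {
        var recipe = new SketchRecipe(tolerations.ToList());
        Reset();
        return recipe;
    }

    private void Reset() => tolerations.Clear();
}

public static class SnapshotDemo
{
    // Returns the toleration count seen by each recipe after creation.
    public static (int Shared, int Snapshotted) Run()
    {
        var buggy = new SketchFactory();
        buggy.AddToleration("example-taint");
        var sharedRecipe = buggy.CreateSharingList();

        var corrected = new SketchFactory();
        corrected.AddToleration("example-taint");
        var snapshotRecipe = corrected.CreateWithSnapshot();

        return (sharedRecipe.Tolerations.Count, snapshotRecipe.Tolerations.Count);
    }
}
```

Typing the property as `IReadOnlyList<T>` does not prevent the problem on its own, because the read-only view still points at the same underlying list.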
2026-04-29 16:45:55 +10:00
Ben
e03f5982d3
Requires new contracts image with configurable marketplace config 2025-08-25 11:11:56 +02:00
Ben
ecd0e70261
fixes serialization issue of containerAdditionals 2024-08-21 10:45:17 +02:00
benbierens
9d9f65c5a3
Fixes missing name and null events 2024-07-29 11:02:24 +02:00
benbierens
87271f4f37
Sets up starting event and bootstrap event 2024-07-29 10:16:37 +02:00
benbierens
aa416d50b3
ensuring enough mounted disk space 2024-06-08 10:36:23 +02:00
Ben
e42f1ddbd7
Adds support for command overrides to container recipes. 2024-03-13 10:01:37 +01:00
benbierens
5dc918287c
Merge branch 'master' into feature/public-testnet-deploying 2023-12-11 08:30:25 +01:00
benbierens
3761e236a3
Sets node critical priority for codex and geth nodes 2023-11-23 14:50:54 +01:00
benbierens
485e3cf02e
Merge branch 'master' into feature/public-testnet-deploying 2023-11-14 10:50:41 +01:00
benbierens
b47b596062
Fetches used external ports in order to guarantee no collisions. 2023-11-14 10:49:14 +01:00
benbierens
7de0e5a1c4
Sets up tests-runners as avoided scheduling affinity. 2023-11-14 10:16:00 +01:00
benbierens
4192952a37
Prevents reuse of external port numbers 2023-11-13 16:05:41 +01:00
benbierens
b82c74865e
Adds numberSource for correct range for external ports 2023-11-13 15:27:52 +01:00
benbierens
ed56d9edcc
Cleanup of kubernetesWorkflow assembly. 2023-11-12 10:07:23 +01:00