gmega | 4cbb401d12 | feat: add optional data removal with adjusted quotas | 2025-06-09 19:59:00 -03:00
gmega | 67ca362ee7 | misc: minor refactor, add simple network perf test deploy | 2025-04-16 12:56:34 -03:00
gmega | b2491c26f9 | fix: fix workflow expressions | 2025-02-27 18:49:48 -03:00
gmega | 81cda58a9d | feat: add download speed plot, dedup experiment datasets | 2025-02-27 18:47:36 -03:00
gmega | a366f04e7c | feat: allow re-running failed experiments from previous workflow runs | 2025-02-25 12:14:15 -03:00
gmega | 5a9543259b | feat: add support for region k8s annotations | 2025-02-24 14:16:59 -03:00
gmega | 8dbc3faed8 | feat: add tunable parallelism | 2025-02-23 11:33:49 -03:00
gmega | 73219922f6 | feat: add Codex chart values for cluster experiments | 2025-02-20 12:16:05 -03:00
gmega | 48e71a315a | feat: add support for setting the node tag in benchmark workflow | 2025-02-20 12:14:49 -03:00
gmega | 688091c965 | feat: allow use of custom runner and node tags for Codex | 2025-02-20 11:59:24 -03:00
gmega | a8c19364b7 | fix: minikube env param in workflow | 2025-02-20 10:21:45 -03:00
gmega | 0d08814929 | feat: generalize benchmark workflow to run Codex in addition to Deluge | 2025-02-17 10:44:00 -03:00
gmega | 38434f4590 | fix container label for codex experiment runner | 2025-02-14 15:59:59 -03:00
gmega | e8441b7bea | fix: respect logger increments even when stream returns less data than expected | 2025-02-14 15:59:28 -03:00
gmega | f7adf878eb | feat: add memory parameter to Deluge values file | 2025-02-14 14:30:56 -03:00
gmega | 205f926f89 | feat: add stable bootstrap node | 2025-02-14 14:30:18 -03:00
gmega | f336df8da7 | fix: adjust Codex logging cooldown, insert polling backoff on download completion, define default Codex experiment | 2025-02-14 12:14:52 -03:00
gmega | 68ee1bad87 | feat: add working Codex helm chart | 2025-02-14 11:00:17 -03:00
gmega | 74ee71889e | feat: add Codex node and initial integration tests | 2025-02-04 19:18:58 -03:00
gmega | 99992d2e7e | fix: enable cleanup on failure by default | 2025-02-03 15:46:26 -03:00
gmega | 61f2172304 | feat: add workflow for the final experiment | 2025-01-30 11:48:09 -03:00
gmega | 94893c0f93 | fix: conditional expression for cleanup | 2025-01-29 20:35:26 -03:00
gmega | a29c010e7a | feat: allow keeping pods around on failure, add optional log parsing at end of experiment run | 2025-01-29 08:47:01 -03:00
gmega | 7ed29ddb4c | fix: add RAM settings on deluge node | 2025-01-28 20:33:13 -03:00
gmega | 1b83f8047c | feat: update RBAC for codex workflows | 2025-01-28 18:20:47 -03:00
gmega | ee67a92726 | feat: grant codex runner permissions to launch subworkflows | 2025-01-27 18:07:56 -03:00
gmega | ba1b93d77c | feat: add structured experiment iteration logs | 2025-01-27 17:26:09 -03:00
gmega | 90dda4f932 | fix: add -C so tars do not include parent folders | 2025-01-24 19:19:54 -03:00
gmega | 4d4d06e7a9 | feat: add log parsing workflow with upload to hetzner storage bucket | 2025-01-24 18:28:28 -03:00
gmega | fdac384ad8 | fix: add autoscaler eviction annotations to prevent pods from being relocated mid-experiment | 2025-01-23 12:12:42 -03:00
gmega | a9b9fd8332 | fix: quotation so argo does not screw up the value array | 2025-01-23 08:06:43 -03:00
gmega | 8096c9f4e0 | feat: add ordering to parameter matrix expander | 2025-01-22 17:12:46 -03:00
gmega | d70b87d2bb | fix: production values for Argo workflows and RBAC | 2025-01-22 10:31:08 -03:00
gmega | aeb2f044c8 | chore: remove leftover values from chart | 2025-01-20 20:01:48 -03:00
gmega | 882392bef2 | fix: add missing parameters to cleanup hook | 2025-01-20 18:41:11 -03:00
gmega | 6ae5b1620f | chore: add missing EOL | 2025-01-20 17:59:07 -03:00
gmega | 7e07eda3c2 | feat: allow running workflows from locally loaded images under Minikube | 2025-01-20 17:57:21 -03:00
gmega | 5a203fad18 | chore: eliminate 5GB experiment for now | 2025-01-20 15:29:27 -03:00
gmega | ab100c4841 | feat: runnable experiment with working test runner and agents | 2025-01-20 15:24:03 -03:00
gmega | 94556d7a53 | working deployment of agents on minikube | 2025-01-20 11:39:43 -03:00
gmega | 60fd274b18 | feat: add node affinity/anti-affinity and storage class knobs to run this on a cluster | 2025-01-15 11:52:32 -03:00
gmega | fc0630224f | fix: remove redundant group suffix from node ID | 2025-01-10 16:31:12 -03:00
gmega | b505e7a3e1 | fix: fix README link, add missing precommit config, bump ruff | 2025-01-09 16:48:44 -03:00
gmega | bfabd1c4c8 | feat: label components with /component label, use /name to refer to benchmark pods; add README | 2025-01-09 09:27:21 -03:00
gmega | a4fe12e620 | feat: add new Helm chart parameters to workflow | 2025-01-08 16:43:01 -03:00
gmega | 4d1eef9d53 | feat: standardize labelling in Helm chart to facilitate log consumption | 2025-01-08 15:10:10 -03:00
gmega | d417f55ffd | add config sketch for setting up vector on minikube | 2025-01-07 18:59:19 -03:00
gmega | 59f3a9a584 | fix: remove useless sync point which was causing issues | 2024-12-20 18:00:32 -03:00
gmega | 470e9a989e | feat: add standard labels to chart resources to facilitate log querying | 2024-12-20 14:09:54 -03:00
gmega | f3a66d9637 | fix: workaround for broken Argo exit hooks | 2024-12-20 07:51:58 -03:00