logos-storage-nim

mirror of https://github.com/logos-storage/logos-storage-nim.git synced 2026-07-30 05:23:30 +00:00

Author	SHA1	Message	Date
E M	db73173ebd	re-enable release test workflow	2026-05-28 16:17:30 +10:00
E M	63a16e249a	Make summary more readable	2026-05-28 15:41:30 +10:00
E M	2a514e379c	update workflow run summary: add retention date; update titles and links for readability	2026-05-28 15:41:29 +10:00
E M	a021908788	Read test result ConfigMap instead of trying to scrape logs for test result info	2026-05-28 15:41:29 +10:00
E M	640d7f0943	more accurate name for step	2026-05-28 15:41:29 +10:00
E M	01677bf9cb	move test summary generation above check job status. If check job status was a failure (e.g. when a test failed), the test summary generation was being skipped. Moving the test summary generation step above the job status check avoids this.	2026-05-28 15:41:29 +10:00
E M	41be805412	Increase runner node disk space. Attempt to avoid insufficient disk space errors in tests	2026-05-28 15:41:29 +10:00
E M	1ad70fec22	don't wait for pvc disks to be deleted, delete all at end in case runner crashes	2026-05-28 15:41:29 +10:00
E M	d0b794ff4f	increase memory of the runner pod. Was seeing exit code 137 (OOM)	2026-05-28 15:41:29 +10:00
E M	8e9d39197f	Keep 1 node alive so autoscaler doesn't scale to 0. Should help speed up startup, avoiding errors like "pod couldn't be scheduled"	2026-05-28 15:41:29 +10:00
E M	2d7aca1054	wait for pvcs to be deleted before destroying the cluster	2026-05-28 15:41:29 +10:00
E M	17a1c556cc	use on-demand VMs instead of spot instances. Attempting to fix a lot of errors in the console relating to spot instances being unschedulable.	2026-05-28 15:41:29 +10:00
E M	f84fd7f25c	do not generate test summary if previous steps were skipped/cancelled	2026-05-28 15:41:29 +10:00
E M	5203cf93e4	fix error in "print storage logs url" step	2026-05-28 15:41:29 +10:00
E M	b4180c471b	delete terraform state lock. When the workflow is cancelled, either manually or automatically from a long-running step (timeout), the terraform state lock had to be manually deleted, or else the next workflow run would never succeed. This change ensures that the state lock file is always deleted after each run.	2026-05-28 15:41:28 +10:00
E M	0e46c9f684	generate test summary to show in workflow summary	2026-05-28 15:41:28 +10:00
E M	37f14a6821	Move from a single zone to multiple zones to increase spot instance availability	2026-05-28 15:41:28 +10:00
E M	8fccef9fb2	Reduce nodes in pool from 10 to 5. Reduces resource contention. 2 parallel tests x 10 containers => 2-3 nodes needed, 5 gives room	2026-05-28 15:41:28 +10:00
E M	c1855fb13a	put cluster name in an env var	2026-05-28 15:41:28 +10:00
E M	10ca94261b	avoid sleeping a full 60s to wait for job completion. Instead, wait for a job condition using kubectl wait	2026-05-28 15:41:28 +10:00
E M	3679040178	try to ensure the log stream survives long silences	2026-05-28 15:41:28 +10:00
E M	e58c8f93c7	add starttime param to logging URL	2026-05-28 15:41:28 +10:00
E M	f72dbb9c9d	cap boot drive size to 20gb (default is 100gb) to avoid resource exhaustion	2026-05-28 15:41:28 +10:00
E M	b04672ebce	Add a "Delete PVCs before cluster teardown" step to the workflow to prevent future PVC leaks	2026-05-28 15:41:28 +10:00
E M	eac099b819	try zone a one more time	2026-05-28 15:41:28 +10:00
E M	2c627c9ed2	hanging at 64% deploying again, trying zone c	2026-05-28 15:41:28 +10:00
E M	c520e79383	fix encoding of logging url	2026-05-28 15:41:27 +10:00
E M	be582eca17	move back to europe-west4-b zone due to exhausted quota	2026-05-28 15:41:27 +10:00
E M	82630eead6	refactor: remove allow-tests-pods node label from GKE node pools. The `allow-tests-pods` boolean label was used by the test framework to steer pods away from runner nodes via a node affinity exclusion. Pod scheduling now uses the existing `workload-type` label directly as a nodeSelector, making the boolean label redundant.	2026-05-28 15:41:27 +10:00
E M	68c319863a	Logging URL filters by RUNID instead of namespace/container name	2026-05-28 15:41:27 +10:00
E M	4f86040c2c	fix: avoid building in parallel. Avoids "file in use" errors while building in CI	2026-05-28 15:41:27 +10:00
E M	db5eada055	remove unneeded priority request	2026-05-28 15:41:27 +10:00
E M	fd5c29db31	set cluster creation timeout to 20 mins. Temporary timeout so we can see if the latest commits work without waiting too long between tries	2026-05-28 15:41:27 +10:00
E M	97750a47ca	Try changing zones in case the cluster deployment stall is due to a zonal unavailability.	2026-05-28 15:41:27 +10:00
E M	5616b50bfb	change monitoring to default service. Cluster deployment seems to be stalling because the metrics service is not started. So returning it to default to see if that fixes the issue.	2026-05-28 15:41:27 +10:00
E M	7d6701d444	inline node pools so they can be created in parallel. Speeds up cluster creation	2026-05-28 15:41:27 +10:00
E M	bbc4b1caf3	remove unneeded setup	2026-05-28 15:41:27 +10:00
E M	898010d58f	move state bucket from gh secret to variable	2026-05-28 15:41:27 +10:00
E M	77e8d6d64a	create the terraform cache dir first	2026-05-28 15:41:26 +10:00
E M	0e298bddbd	add debug output	2026-05-28 15:41:26 +10:00
E M	48b444d8fe	change script so it doesn't non-zero exit when no pods exist	2026-05-28 15:41:26 +10:00
E M	7a9b93a981	fix terraform cache, should remove warning	2026-05-28 15:41:26 +10:00
E M	d4d52c008a	fix polling script	2026-05-28 15:41:26 +10:00
E M	3ed677c9d1	check pod phase instead	2026-05-28 15:41:26 +10:00
E M	cd972ef9bb	refactor polling loop	2026-05-28 15:41:26 +10:00
E M	1696aa83a9	temp comment out release workflow	2026-05-28 15:41:26 +10:00
E M	a901e1495c	temp comment out build workflow	2026-05-28 15:41:26 +10:00
E M	7f782cf6a1	temp comment out build to make testing ci changes faster	2026-05-28 15:41:26 +10:00
E M	dabdc6d3e9	Keeps timing out waiting for start, so try polling loop	2026-05-28 15:41:26 +10:00
E M	3cb3a176b2	wait for runners-ci node to be ready before continuing workflow	2026-05-28 15:41:26 +10:00

991 Commits