Chrysostomos Nanakos 0c8e28fa46
feat(k8s): add Vector logging infrastructure for benchmarks
Add Vector agent/aggregator deployment for collecting logs from Codex
benchmark experiments in K8s. Includes PVC for log storage, S3 secret
template and RBAC.

Vector collects logs from benchmark pods and writes JSONL files for
post-processing by the log-parsing workflow.

Signed-off-by: Chrysostomos Nanakos <chris@include.gr>
2025-10-21 13:13:49 +03:00

This folder contains the Kubernetes and Argo Workflows resources required to run experiments in Kubernetes, in both local (e.g. Minikube, Kind) and remote clusters.

Prerequisites

Argo Workflows

Whatever cluster you choose must be running Argo Workflows.

Local clusters. For local clusters, you can follow the instructions in the Argo Workflows Quickstart Guide to get Argo Workflows running.

For remote clusters, it's best to consult the Argo Workflows Operator Manual.

Argo CLI Tool. You will also need to install the Argo CLI tool to submit workflows.

Permissions. Codex workflows assume that they are running in a namespace called codex-benchmarks. We have a sample manifest which creates the namespace, as well as a service account with the required RBAC permissions, here. For local clusters, you can apply this manifest as is. For remote clusters, you might need to customize it to your needs.
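After applying the manifest, it can be useful to confirm that the namespace and its RBAC objects exist. The following is a minimal sketch, not part of this repository: the helper name check_codex_namespace is hypothetical, and it assumes kubectl is already pointed at the target cluster. Only the namespace name comes from the text above.

```shell
# Namespace that Codex workflows expect (taken from this README).
CODEX_NAMESPACE="codex-benchmarks"

# Hypothetical helper: fails if the namespace is missing, then lists the
# service accounts and role bindings found inside it.
check_codex_namespace() {
  kubectl get namespace "$CODEX_NAMESPACE" &&
    kubectl -n "$CODEX_NAMESPACE" get serviceaccounts,rolebindings
}
```

Run check_codex_namespace after applying the manifest. A non-zero exit status most likely means the manifest was not applied to the cluster that kubectl is currently using.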

Logs

Experiments require logs to be stored for later parsing during analysis. For local clusters, this can be achieved by running Vector and writing pod logs to a persistent volume. The manifests for setting up the persistent volume, as well as Vector, can be found here.

Submitting Workflows

Once everything is set up, workflows can be submitted with:

argo submit -n argo ./deluge-benchmark-workflow.yaml

For local clusters, you should add the --insecure-skip-verify flag:

argo submit -n argo ./deluge-benchmark-workflow.yaml --insecure-skip-verify
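If you switch between local and remote clusters often, the two invocations above can be folded into one small wrapper. This is only a sketch: the function name submit_benchmark and the LOCAL_CLUSTER variable are assumptions made for illustration and do not exist in this repository.

```shell
# Hypothetical wrapper around the two submit commands shown above.
# Set LOCAL_CLUSTER=1 to append --insecure-skip-verify (local clusters).
submit_benchmark() {
  if [ "${LOCAL_CLUSTER:-0}" = "1" ]; then
    # Put the flag in front of any extra arguments given by the caller.
    set -- --insecure-skip-verify "$@"
  fi
  argo submit -n argo ./deluge-benchmark-workflow.yaml "$@"
}
```

For example, LOCAL_CLUSTER=1 submit_benchmark submits to a local cluster, while a plain submit_benchmark submits to a remote one. Any further arguments are passed through to argo submit unchanged.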

To observe progress, you can use the Argo Workflows UI, which can be accessed by port-forwarding the Argo Workflows server.
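As a sketch of the port-forwarding step, the helper below assumes the defaults used by the Argo Workflows Quickstart Guide: a service named argo-server in the argo namespace, listening on port 2746. If your installation differs, adjust these values. The function name forward_argo_ui is hypothetical.

```shell
# Assumed quickstart defaults; change them if your install differs.
ARGO_NAMESPACE="argo"

# Hypothetical helper: forwards local port 2746 to the Argo server.
# The command stays in the foreground until interrupted with Ctrl-C.
forward_argo_ui() {
  kubectl -n "$ARGO_NAMESPACE" port-forward service/argo-server 2746:2746
}
```

With the port-forward running, the UI should be reachable at https://localhost:2746. On a quickstart install the server uses a self-signed certificate, so your browser will likely show a certificate warning.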