
4 Commits (8ecee4133f70696dd00d0dbabcc3e155568bb356)

Author SHA1 Message Date
Dustin 2d7fec1cdf v-m: vmstorage: Add pod anti-affinity
One of the reasons for moving to 4 `vmstorage` replicas was to ensure
that the load was spread evenly between the physical VM host machines.
To ensure that is the case as much as possible, we need to keep one
pod per Kubernetes node.
2024-06-26 18:29:49 -05:00
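
Illustration for 2d7fec1cdf (not taken from the commit itself): a minimal sketch of what a pod anti-affinity rule for `vmstorage` might look like. The label key and value are assumptions, and the actual change may use a preferred rather than a required rule.

```yaml
# Hypothetical excerpt from the vmstorage StatefulSet pod template.
# The app.kubernetes.io/name label is assumed, not confirmed by the commit.
spec:
  template:
    spec:
      affinity:
        podAntiAffinity:
          # Refuse to schedule two vmstorage pods onto the same node.
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app.kubernetes.io/name: vmstorage
            # One pod per distinct value of this node label, i.e. per node.
            topologyKey: kubernetes.io/hostname
```

With a required rule, a pod stays Pending if no node without a `vmstorage` pod is available; `preferredDuringSchedulingIgnoredDuringExecution` would relax this to a best-effort spread.
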
Dustin ab458df415 v-m/vmstorage: Start pods in parallel
By default, Kubernetes waits for each pod in a StatefulSet to become
"ready" before starting the next one.  If there is a problem starting
that pod, e.g. data corruption, then the others will never start.  This
sort of defeats the purpose of having multiple replicas.  Fortunately,
we can configure the pod management policy to start all the pods at
once, regardless of the status of any individual pod.  This way, if
there is a problem with the first pod, the others will still come up
and serve whatever data they have.
2024-06-26 18:29:49 -05:00
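
Illustration for ab458df415 (not taken from the commit itself): a minimal sketch of the pod management policy setting described above. The resource name and `serviceName` are assumptions; the replica count follows the 4 `vmstorage` replicas mentioned in 2d7fec1cdf.

```yaml
# Hypothetical excerpt from the vmstorage StatefulSet.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: vmstorage          # assumed name
spec:
  serviceName: vmstorage   # assumed headless Service name
  replicas: 4
  # The default is OrderedReady, which starts pods one at a time and
  # waits for each to become Ready.  Parallel starts them all at once.
  podManagementPolicy: Parallel
```

Note that `podManagementPolicy` only affects scaling operations such as initial creation; rolling updates are still governed by `updateStrategy`.
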
Dustin 54e7a25f93 v-m: vmstorage: Remove startup/ready probes
Kubernetes will not start additional Pods in a StatefulSet until the
existing ones are Ready.  This means that if there is a problem bringing
up one Pod, e.g. `vmstorage-0`, Kubernetes will never start `vmstorage-1`
or `vmstorage-2`.  Since this pretty much defeats the purpose of having a
multi-node `vmstorage` cluster, we have to remove the readiness probe,
so the Pods will be Ready as soon as they start.  If there is a problem
with one of them, it will matter less, as the others can still run.
2024-01-22 16:43:46 -06:00
Dustin 8f088fb6ae v-m: Deploy (clustered) Victoria Metrics
Since *mtrcs0.pyrocufflink.blue* (the Metrics Pi) seems to be dying,
I decided to move monitoring and alerting into Kubernetes.

I was originally planning to have a single, dedicated virtual machine
for Victoria Metrics and Grafana, similar to how the Metrics Pi was set
up, but running Fedora CoreOS instead of a custom Buildroot-based OS.
While I was working on the Ignition configuration for the VM, it
occurred to me that monitoring would be interrupted frequently, since
FCOS updates weekly and all updates require a reboot.  I would rather
not have that many gaps in the data.  Ultimately I decided that
deploying a cluster with Kubernetes would probably be more robust and
reliable, as updates can be performed without any downtime at all.

I chose not to use the Victoria Metrics Operator, but rather handle
the resource definitions myself.  Victoria Metrics components are not
particularly difficult to deploy, so the overhead of running the
operator and using its custom resources would not be worth the minor
convenience it provides.
2024-01-01 17:48:10 -06:00