5 Commits (7158ff89df72c8f170fbfce672137c32c17f1186)

Author SHA1 Message Date
Dustin 4c1992b3c9 v-m/vmagent: Start in parallel
As with AlertManager, the point of having multiple replicas of `vmagent`
is so that one is always running, even if the other fails.  Thus, we
want to start the pods in parallel so that if the first one does not
come up, the second one at least has a chance.
2025-09-07 10:49:22 -05:00
Dustin 7ad8fff7c6 v-m/vmagent: Use ephemeral storage
The `vmagent` needs a place to spool data it has not yet sent to
Victoria Metrics, but it doesn't really need to be persistent.  As long
as all of the `vmagent` nodes _and_ all of the `vminsert` nodes do not
go down simultaneously, there shouldn't be any data loss.  If they are
all down at the same time, there's probably something else going on and
lost metrics are the least concerning problem.
2025-09-07 08:27:19 -05:00
Dustin 8491d2ded7 v-m: Switch to quay.io for container images
Docker Hub has blocked ("rate limited") my IP address, so I am moving as
many images as I can to other sources.  Hopefully they'll unblock me
soon and I can deploy a caching proxy.
2025-07-07 08:43:20 -05:00
Dustin 225fd8469c v-m/vmagent: Allow listing all pods in cluster
The original RBAC configuration allowed `vmagent` only to list the pods
in the `victoria-metrics` namespace.  In order to allow it to monitor
other applications' pods, it needs to be assigned permission to list
pods in all namespaces.
2024-01-02 11:25:54 -06:00
Dustin 8f088fb6ae v-m: Deploy (clustered) Victoria Metrics
Since *mtrcs0.pyrocufflink.blue* (the Metrics Pi) seems to be dying,
I decided to move monitoring and alerting into Kubernetes.

I was originally planning to have a single, dedicated virtual machine
for Victoria Metrics and Grafana, similar to how the Metrics Pi was set
up, but running Fedora CoreOS instead of a custom Buildroot-based OS.
While I was working on the Ignition configuration for the VM, it
occurred to me that monitoring would be interrupted frequently, since
FCOS updates weekly and all updates require a reboot.  I would rather
not have that many gaps in the data.  Ultimately I decided that
deploying a cluster with Kubernetes would probably be more robust and
reliable, as updates can be performed without any downtime at all.

I chose not to use the Victoria Metrics Operator, but rather handle
the resource definitions myself.  Victoria Metrics components are not
particularly difficult to deploy, so the overhead of running the
operator and using its custom resources would not be worth the minor
convenience it provides.
2024-01-01 17:48:10 -06:00