The `vmagent` needs a place to spool data it has not yet sent to
Victoria Metrics, but it doesn't really need to be persistent. As long
as all of the `vmagent` nodes _and_ all of the `vminsert` nodes do not
go down simultaneously, there shouldn't be any data loss. If they are
all down at the same time, something bigger is probably going on, and
lost metrics will be the least of my concerns.
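Ephemeral storage is enough for that, then. A minimal sketch of what
this could look like in the `vmagent` pod spec, using an `emptyDir`
volume for the spool (the image reference, volume name, and
`vminsert` URL here are illustrative, not the actual configuration):

```yaml
# Hypothetical excerpt from a vmagent Deployment pod template:
# spool unsent remote-write data on an emptyDir volume, which
# lives only as long as the pod itself.
containers:
- name: vmagent
  image: victoriametrics/vmagent
  args:
  - -remoteWrite.url=http://vminsert:8480/insert/0/prometheus/api/v1/write
  - -remoteWrite.tmpDataPath=/spool
  volumeMounts:
  - name: spool
    mountPath: /spool
volumes:
- name: spool
  emptyDir: {}
```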

Docker Hub has blocked ("rate limited") my IP address. I am moving as
much as I can to images from other sources. Hopefully they'll unblock
me soon and I can deploy a caching proxy.

The original RBAC configuration allowed `vmagent` to list pods only in
the `victoria-metrics` namespace. To allow it to monitor other
applications' pods, it needs permission to list pods in all
namespaces.
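A cluster-wide grant along these lines would cover that; this is a
sketch, and the object names, ServiceAccount, and namespace are
assumptions rather than the deployed configuration:

```yaml
# Hypothetical ClusterRole/ClusterRoleBinding giving vmagent
# read access to pods in every namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: vmagent
rules:
- apiGroups: [""]
  resources: [pods]
  verbs: [get, list, watch]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: vmagent
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: vmagent
subjects:
- kind: ServiceAccount
  name: vmagent
  namespace: victoria-metrics
```

A ClusterRole (rather than a namespaced Role) is what makes the
permission apply across all namespaces.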

Since *mtrcs0.pyrocufflink.blue* (the Metrics Pi) seems to be dying,
I decided to move monitoring and alerting into Kubernetes.
I was originally planning to have a single, dedicated virtual machine
for Victoria Metrics and Grafana, similar to how the Metrics Pi was set
up, but running Fedora CoreOS instead of a custom Buildroot-based OS.
While I was working on the Ignition configuration for the VM, it
occurred to me that monitoring would be interrupted frequently, since
FCOS updates weekly and all updates require a reboot. I would rather
not have that many gaps in the data. Ultimately I decided that
deploying a cluster with Kubernetes would probably be more robust and
reliable, as updates can be performed without any downtime at all.

I chose not to use the Victoria Metrics Operator, but rather to handle
the resource definitions myself. Victoria Metrics components are not
particularly difficult to deploy, so the overhead of running the
operator and using its custom resources would not be worth the minor
convenience it provides.