From 3d40424cf77526114858c0c97eb22aee3bb11939 Mon Sep 17 00:00:00 2001
From: "Dustin C. Hatch"
Date: Tue, 5 Nov 2024 07:05:55 -0600
Subject: [PATCH] fleetlock: Use patched server from GitHub PR

The _fleetlock_ server drains all pods from a node before allocating
the reboot lock to that node.  Unfortunately, it doesn't actually wait
for those pods to be completely evicted.  If some pods take too long to
shut down, they may get stuck in `Terminating` state once the machine
starts rebooting.  As a result, those pods cannot be replaced on another
node while the original one is offline, which pretty much defeats the
purpose of using Fleetlock in the first place.

It seems upstream has abandoned this project, as there is an open
[Pull Request][0] to fix this issue that has so far been ignored.
Fortunately, building a new container image containing the patch is
easy enough, so we can run our own patched build.

[0]: https://github.com/poseidon/fleetlock/pull/271
---
 fleetlock/kustomization.yaml | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/fleetlock/kustomization.yaml b/fleetlock/kustomization.yaml
index 333cffa..81b425f 100644
--- a/fleetlock/kustomization.yaml
+++ b/fleetlock/kustomization.yaml
@@ -19,3 +19,8 @@ patches:
       name: fleetlock
     spec:
       clusterIP: 10.96.1.15
+
+images:
+- name: quay.io/poseidon/fleetlock
+  newName: git.pyrocufflink.net/containerimages/fleetlock
+  newTag: vadimberezniker-wait_evictions