d4638239b3bd89cee7af9e0110fc12869582f664
There was a race condition while waiting for a node to be drained, especially when some pods could not be evicted immediately at the start of the wait. It was possible for the `wait_drained` function to return before all of the pods had been deleted, if the wait list temporarily became empty at some point.

This could happen, for example, if multiple `WatchEvent` messages were processed from the stream before any messages were processed from the channel: even though pod identifiers were waiting in the channel to be added to the wait list, if the wait list became empty after processing the watch events, the loop would complete. This was made much more likely when a PodDisruptionBudget temporarily prevented a pod from being evicted; it could take 5 or more seconds for that pod's identifier to be pushed to the channel, and in that time the rest of the pods could be deleted.

To resolve this, `wait_drained` now never returns until the sender side of the channel has been dropped. At that point no more pods can be added to the wait list, so when the list empties, the node is actually drained.
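A minimal sketch of the fixed loop, with assumed details: it uses tokio `mpsc` channels, the second channel stands in for the Kubernetes pod watch stream, and the signature and names (`pending`, `deleted`) are illustrative rather than the actual code.

```rust
use std::collections::HashSet;
use tokio::sync::mpsc;

/// Wait until every pod pushed on `pending` has been observed as deleted.
/// Hypothetical signature; `deleted` stands in for the pod watch stream.
async fn wait_drained(
    mut pending: mpsc::Receiver<String>, // pod identifiers still being evicted
    mut deleted: mpsc::Receiver<String>, // pods observed as deleted
) {
    let mut wait_list: HashSet<String> = HashSet::new();
    let mut pending_closed = false;

    loop {
        tokio::select! {
            id = pending.recv(), if !pending_closed => match id {
                Some(id) => { wait_list.insert(id); }
                // Sender side dropped: no more pods will ever be added.
                None => pending_closed = true,
            },
            ev = deleted.recv() => match ev {
                Some(id) => { wait_list.remove(&id); }
                None => return, // watch stream ended; nothing more to observe
            },
        }

        // The fix: an empty wait list only means "drained" once the
        // sender side of the channel has been dropped.
        if pending_closed && wait_list.is_empty() {
            return;
        }
    }
}
```

Under this design, each eviction task would hold a clone of the sender; when the last task finishes and drops it, `pending.recv()` yields `None`, and only then does an empty wait list count as done.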