The Promtail container image is pretty big, so it takes quite some time
to pull on a slow machine like a Raspberry Pi. Let's increase the
startup timeout so the service is less likely to fail while the image is
still being pulled.
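For reference, the change amounts to a drop-in along these lines (the unit name and the exact value are assumptions):

```ini
# /etc/systemd/system/promtail.service.d/timeout.conf
[Service]
# allow plenty of time for the image pull on slow hardware
TimeoutStartSec=15min
```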
[Promtail][0] is the log collection agent for Grafana Loki. It reads
logs from various locations, including local files and the _systemd_
journal, and sends them to Loki via HTTP.
Promtail's configuration is a highly structured YAML document. Thus, instead
of using Tera template syntax for loops, conditionals, etc., we can use
the full power of CUE to construct the configuration. Using the
`Marshal` function from the built-in `encoding/yaml` package, we
serialize the final configuration structure as a string and write it
verbatim to the configuration file.
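A minimal sketch of the idea, using Promtail's real YAML field names but otherwise hypothetical values:

```cue
import "encoding/yaml"

// the serialized document that gets written verbatim to the config file
configFile: yaml.Marshal(config)

// built with ordinary CUE values, constraints, and references
config: {
	positions: filename: "/var/lib/promtail/positions.yaml"
	clients: [{url: "https://loki.pyrocufflink.blue/loki/api/v1/push"}]
}
```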
I have modeled most of the Promtail configuration schema in the
`du5t1n.me/cfg/app/promtail/schema` package. Having the schema modeled
will ensure the generated configuration is valid during development
(i.e. `cue export` will fail if it is not), which saves the round trip
of pushing changes to machines only to have Promtail reject them.
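The schema package boils down to definitions along these lines (only a fragment; the field names mirror Promtail's documented YAML structure, everything else is illustrative):

```cue
package schema

#Config: {
	server?: {
		http_listen_port?: int
		grpc_listen_port?: int
		...
	}
	positions?: {
		filename?: string
	}
	clients: [...#Client]
	scrape_configs?: [...]
	...
}

#Client: {
	url: string
	...
}
```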
The `#promtail` "function" in `du5t1n.me/cfg/env/prod` makes it easy to
build our desired configuration. It accepts an optional `#scrape`
field, which can be used to provide specific log scraping definitions.
If it is unspecified, the default configuration is to scrape the systemd
journal. Hosts with additional needs can supply their own list, likely
including the `promtail.scrape.journal` object to keep the default
journal scrape job.
[0]: https://grafana.com/docs/loki/latest/send-data/promtail/
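A rough sketch of the function pattern (only the `#scrape` input is described above; the output shape and the stand-in objects here are assumptions):

```cue
// stand-in for the shared journal scrape job (hypothetical contents)
promtail: scrape: journal: {
	job_name: "journal"
	journal: labels: job: "systemd-journal"
}

#promtail: {
	// callers may override this; the default is just the journal job
	#scrape: [...] | *[promtail.scrape.journal]

	// the rest of the configuration is derived from the input
	config: scrape_configs: #scrape
}

// a host with extra needs supplies its own list
webserver: #promtail & {
	#scrape: [promtail.scrape.journal, {job_name: "nginx"}]
}
```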
According to the [Grafana Loki documentation][0], sending SIGHUP to the
Loki process will instruct it to reload its configuration. This is
necessary in order for it to re-read its server certificate after it has
been renewed.
[0]: https://grafana.com/docs/loki/latest/configure/#reload-at-runtime
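For example, via systemd (the unit name is an assumption):

```sh
systemctl kill --kill-who=main --signal=SIGHUP loki.service
```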
Before going into production with Grafana Loki, I want to set it up to
use TLS. To that end, I have configured _cert-manager_ to issue it a
certificate, signed by _DCH CA_. In order to use said certificate,
we need to configure `fetchcert` to run on the Loki server.
The `fetchcert` tool is a short shell script that fetches an X.509
certificate and corresponding private key from a Kubernetes Secret,
using the Kubernetes API. I originally wrote it for the Frigate server
so it could fetch the _pyrocufflink.blue_ wildcard certificate, which is
managed by _cert-manager_. Since then, I have adapted it to be more
generic, so it will be useful to fetch the _loki.pyrocufflink.blue_
certificate for Grafana Loki.
Although the script is rather simple, it does have several required
configuration parameters. It needs to know the URL of the Kubernetes
API server and have the certificate for the CA that signs the server
certificate, as well as an authorization token. It also needs to know
the namespace and name of the Secret from which it will fetch the
certificate and private key. Finally, it needs to know the paths to the
files where the fetched data will be written.
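In other words, the configuration amounts to something like the following environment file (every variable name and path here is hypothetical):

```sh
# /etc/fetchcert/fetchcert.conf (hypothetical)
KUBERNETES_URL=https://kubernetes.example.net:6443
KUBERNETES_CA=/etc/fetchcert/kubernetes-ca.crt
TOKEN_FILE=/etc/fetchcert/token
SECRET_NAMESPACE=loki
SECRET_NAME=loki-cert
CERT_PATH=/etc/loki/certs/tls.crt
KEY_PATH=/etc/loki/certs/tls.key
```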
Generally, after certificates are updated, some action needs to be
performed in order to make use of them. This typically involves
restarting or reloading a daemon. Since the `fetchcert` tool runs in
a container, it can't directly perform those actions, so it simply
indicates via a special exit code that the certificate has been updated
and some further action may be needed. The
`/etc/fetchcert/postupdate.sh` script is executed by _systemd_ after
`fetchcert` finishes. If the `EXIT_STATUS` environment variable (which
is set by _systemd_ to the return code of the main service process)
matches the expected code, the configured post-update actions will be
executed.
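Roughly, the hook looks like this (the specific exit code, hook mechanism, and reload action are assumptions):

```sh
#!/bin/sh
# /etc/fetchcert/postupdate.sh -- run by systemd (e.g. ExecStopPost=),
# which sets EXIT_STATUS to the main process's exit code
UPDATED=91  # the "certificate was updated" code; the value is illustrative

if [ "${EXIT_STATUS}" = "${UPDATED}" ]; then
    # whatever post-update action the host needs, e.g.:
    systemctl kill --kill-who=main --signal=SIGHUP loki.service
fi
```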
Since *systemd* starts the *reload-udev-rules.service* unit as soon as
any file in the `/run/containers/udev-rules` directory changes, the `cp`
command may start before all of the files have been copied out of the
container. If this happens, some of the rules will not get copied to
the final path, and thus will not be processed by *udev*.
To give the container a chance to finish copying all of the files before
we process them, we add a short delay. Obviously, this is not a
perfect solution, as it could potentially take longer than 250ms to copy
the files in some cases, but hopefully those cases are rare enough to
not worry about.
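Concretely, the delay is just a short sleep before the copy (a sketch, assuming the service described above):

```ini
# reload-udev-rules.service (fragment)
[Service]
ExecStartPre=/usr/bin/sleep 0.25
```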
I do not like how Fedora CoreOS configures `sudo` to allow the *core*
user to run privileged processes without authentication. Rather than
assign the user a password, which would then have to be stored
somewhere, we'll install *pam_ssh_agent_auth* and configure `sudo` to
use it for authentication. This way, only users with the private key
corresponding to one of the configured public keys can run `sudo`.
Naturally, *pam_ssh_agent_auth* has to be installed on the host system.
We achieve this by executing `rpm-ostree` via `nsenter` to escape the
container. Once it is installed, we configure the PAM stack for
`sudo` to use it and populate the authorized keys database. We also
need to configure `sudo` to keep the `SSH_AUTH_SOCK` environment
variable, so *pam_ssh_agent_auth* knows where to look for the private
keys. Finally, we disable the default NOPASSWD rule for `sudo`, if
and only if the new configuration was installed.
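The result boils down to a couple of configuration lines (the authorized-keys path is illustrative):

```
# /etc/pam.d/sudo -- consult the SSH agent before other auth methods
auth sufficient pam_ssh_agent_auth.so file=/etc/security/sudo.authorized_keys

# /etc/sudoers.d/ssh-agent-auth -- let sudo see the forwarded agent socket
Defaults env_keep += "SSH_AUTH_SOCK"
```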
Setting `AutoUpdate=registry` will tell Podman to automatically fetch
an updated container image from its corresponding registry and restart
the container. The `podman-auto-update.timer` systemd unit needs to be
active for this to happen on a schedule.
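With a Quadlet-style container unit, that is a single key (the image name is illustrative):

```ini
[Container]
Image=docker.io/grafana/promtail:latest
AutoUpdate=registry
```

Enabling the schedule is then just `systemctl enable --now podman-auto-update.timer`.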
`dest` is not a valid option for the `--mount` argument to `podman`. To
specify the target path, only `target`, `destination`, and `dst` are
valid.
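For example:

```sh
# any of target=, destination=, or dst= works here; dest= does not
podman run --rm \
  --mount type=bind,source=/srv/data,destination=/data \
  registry.fedoraproject.org/fedora:latest ls /data
```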
`upsmon` is the component of NUT that tracks the status of UPSs and
reacts to changes in their status by sending notifications and/or shutting down
the system. It is a networked application that can run on any system;
it can run on a different system than `upsd`, and indeed can run on
multiple systems simultaneously.
Each system that runs `upsmon` will need a username and password for
each UPS it will monitor. Using the CUE [function pattern][0], I've
made it pretty simple to declare the necessary values under
`nut.monitor`.
[0]: https://cuetorials.com/patterns/functions/
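A host then only has to declare something like this (the entry name and the fields inside it are assumptions):

```cue
nut: monitor: "ups1": {
	host:     "nut-server.example.net"
	username: "upsmon"
	// a literal here just for illustration; the real value comes from
	// wherever the rest of the secrets are kept
	password: "s3cret"
}
```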
*collectd* logs to syslog, so its output is lost when it's running in a
container. We can capture messages from it by mounting the journald
syslog socket into the container.
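One way to do that, assuming the usual journald socket path, is to bind it to `/dev/log` inside the container (the image name is illustrative):

```sh
podman run --rm \
  --mount type=bind,source=/run/systemd/journal/dev-log,destination=/dev/log \
  quay.io/example/collectd:latest
```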
The `/run/udev/rules.d` directory may not always exist, especially at
boot. We need to ensure that it does before we try to copy rules
exported by containers into it, or the unit will fail.
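A sketch of the fix, as a fragment of the unit that does the copying:

```ini
[Service]
ExecStartPre=/usr/bin/mkdir -p /run/udev/rules.d
```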
Even with *collectd* configured to report filesystem usage by device, it
still only reports filesystems that are mounted (in its namespace).
Thus, in order for it to report filesystems like `/boot`, these need to
be mounted in the container.
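For example, as a read-only bind mount (the image name is illustrative):

```sh
podman run --rm \
  --mount type=bind,source=/boot,destination=/boot,ro=true \
  quay.io/example/collectd:latest
```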
I keep going back-and-forth on whether or not collectd should run in a
container on Fedora CoreOS machines. On the one hand, running it
directly on the host allows it to monitor filesystem usage by mount
point, which is consistent with how non-FCOS machines are monitored.
On the other hand, installing packages on FCOS with `rpm-ostree` is a
nightmare. It's _incredibly_ slow. There are also occasional issues
installing packages if the base layer has not been updated in a while
and the new packages require an existing package to be updated.
For the NUT server specifically, I have changed my mind again: the
*collectd-nut* package depends on *nut-client*, which in turn depends on
Python. I definitely want to avoid installing Python on the host, but I
do not want to lose the ability to monitor the UPSs via collectd. Using
a container, I can strip out the unnecessary bits of *nut-client* and
avoid installing Python at all. I think that's worth the trade-off of
monitoring filesystem usage by device instead of by mount point.
The only privilege NUT needs is access to the USB device nodes. Using a
device cgroup rule to allow this is significantly better than disabling
all restrictions, especially since I discovered that `--privileged`
implies `--security-opt label=disable`, effectively disabling SELinux
confinement of the container.
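A sketch of what that looks like (USB device nodes are character devices with major number 189; the image name and the bind mount are illustrative):

```sh
podman run --rm \
  --device-cgroup-rule='c 189:* rwm' \
  --mount type=bind,source=/dev/bus/usb,destination=/dev/bus/usb \
  quay.io/example/nut-server:latest
```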
NUT needs some udev rules to set the proper permissions on USB (and
similar) devices so it can run as an otherwise unprivileged user. Since
udev rules can only be processed on the host, these rules need to be
copied out of the container and evaluated before the NUT server starts.
To enable this, the *nut-server* container image copies the rules it
contains to `/etc/udev/rules.d` if that directory is a mount point. By
bind mounting a directory on the host at that path, we can get a copy of
the rules files outside the container. Then, using a systemd path unit,
we can tell the udev daemon to reload and reevaluate its rules.
SELinux prevents processes in containers from writing to
`/etc/udev/rules.d` directly, so we have to use an intermediate location
and then copy the rules files to their final destination.
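Put together, the host side looks roughly like this (a sketch; the paths follow the ones mentioned above, everything else is an assumption):

```ini
# reload-udev-rules.path
[Path]
PathChanged=/run/containers/udev-rules

# reload-udev-rules.service
[Service]
Type=oneshot
ExecStart=/bin/sh -c 'cp /run/containers/udev-rules/*.rules /run/udev/rules.d/'
ExecStart=/usr/bin/udevadm control --reload
ExecStart=/usr/bin/udevadm trigger
```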