The rule is "if it is accessible on the Internet, its name ends in
.net."
Although Vaultwarden can be accessed by either name, the one specified
in the Domain URL setting is the only one that works for WebAuthn.
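Vaultwarden takes this setting from the `DOMAIN` environment variable,
so the role just needs to point it at the Internet-facing name. A
minimal sketch, with a placeholder hostname:

```yaml
# Hypothetical excerpt from the vaultwarden role's variables; the
# DOMAIN environment variable backs the "Domain URL" setting and must
# match the origin the browser uses for WebAuthn.
vaultwarden_env:
  DOMAIN: https://vaultwarden.example.net
```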
The HTTP->HTTPS redirect for chmod777.sh was only working by
coincidence. It needs its own virtual host to ensure it works
irrespective of how other websites are configured.
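Sketched as an Ansible task (the destination path and handler name are
assumptions, not the actual policy):

```yaml
- name: Install a dedicated virtual host for the chmod777.sh redirect
  ansible.builtin.copy:
    dest: /etc/httpd/conf.d/chmod777.sh.conf
    content: |
      <VirtualHost *:80>
          ServerName chmod777.sh
          # Send all plain-HTTP requests to HTTPS
          Redirect permanent / https://chmod777.sh/
      </VirtualHost>
  notify: reload httpd
```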
Tabitha's Hatch Learning Center site has two user submission forms: one
for signing in/out students for class, and another for parents to
register new students for the program. These are handled by
*formsubmit* and store the submitted data in CSV files.
Domain controllers only allow users in the *Domain Admins* AD group to
use `sudo` by default. *dustin* and *jenkins* need to be able to apply
configuration policy to these machines, but they are not members of said
group.
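A sudoers drop-in takes care of this; the rules below are a
hypothetical sketch (the actual privileges may be scoped more
narrowly):

```yaml
- name: Allow dustin and jenkins to use sudo on domain controllers
  ansible.builtin.copy:
    dest: /etc/sudoers.d/config-policy
    # Hypothetical rules; jenkins gets NOPASSWD so unattended policy
    # runs are not blocked waiting for a password
    content: |
      dustin  ALL=(ALL) ALL
      jenkins ALL=(ALL) NOPASSWD: ALL
    mode: "0440"
    validate: visudo -cf %s
```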
If the Python bindings for SELinux policy management are not installed
when Ansible gathers host facts, no SELinux-related facts will be set.
Thus, any tasks conditioned on these facts will not run. Typically,
such tasks are required on SELinux-enabled hosts but must not be
performed on non-SELinux hosts. If they do not run when they should,
the deployment may fail or applications may experience issues at
runtime.
To avoid these potential issues, the *base* role now forces Ansible to
gather facts again if it installed the Python SELinux bindings.
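The relevant tasks look roughly like this (task names and the exact
package name are approximations):

```yaml
- name: Install the Python bindings for SELinux
  ansible.builtin.package:
    name: python3-libselinux
  register: selinux_bindings

- name: Gather facts again so the SELinux facts get set
  ansible.builtin.setup:
  when: selinux_bindings is changed
```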
Note: one might suggest using `meta: clear_facts` instead of `setup`
and letting Ansible decide if and when to gather facts again.
Unfortunately, that does not work: the `clear_facts` meta task just
causes Ansible to crash with a "shared connection to {host} closed"
error.
Some playbooks/roles require facts from machines other than the target.
The `facts.yml` playbook can be used to gather facts from machines
without running any other tasks.
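There is not much to such a playbook; a minimal version might look
like this:

```yaml
# facts.yml: gather facts from every host without applying any roles
- name: Gather facts
  hosts: all
  gather_facts: true
  tasks: []
```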
The *dch-selinux* package contains a SELinux policy module for Samba AD
DC. This policy defines a `samba_t` domain for the `samba` process.
While the domain is (currently) unconfined, it is necessary in order to
provide a domain transition rule for `winbindd`. Without this rule,
`winbindd` would run in `unconfined_service_t`, which causes its IPC
pipe files to be incorrectly labelled, preventing other confined
services like `sshd` from accessing them.
The *dch-selinux* package contains customized SELinux policy modules.
I haven't worked out exactly how to build and publish it through a
continuous integration pipeline yet, so for now it's just hosted in my
user's `public_html` folder on the main file server.
Samba AD DC does not implement [DFS-R for replication of the SYSVOL][0]
contents. This does not make much of a difference to me, since
the SYSVOL is really only used for Group Policy. Windows machines may
log an error if they cannot access the (basically empty) GPO files, but
that's pretty much the only effect if the SYSVOL is out of sync between
domain controllers.
Unfortunately, there is one side-effect of the missing DFS-R
functionality that does matter. On domain controllers, all user,
computer, and group accounts need to have Unix UID/GID numbers mapped.
This is different from regular member machines, which only need UID/GID
numbers for the users that are allowed to log into them. LDAP entries
only have ID numbers mapped for the latter class of users, which does
not include machine accounts. As a result, Samba falls back to
generating local ID numbers for the rest of the accounts. Those ID
numbers are stored in a local database file,
`/var/lib/samba/private/idmap.ldb`. It would seem that it wouldn't
actually matter if accounts have different ID numbers on different
domain controllers, but there are evidently [situations][1] where DCs
refuse to allocate ID numbers at all, which can cause authentication to
fail. As such, the `idmap.ldb` file needs to be kept in sync.
If we're going to go through the effort of synchronizing `idmap.ldb`, we
might as well keep the SYSVOL in sync as well. To that end, I've
written a script to synchronize both the SYSVOL contents and the
`idmap.ldb` file. It performs a simple one-way synchronization using
`rsync` from the DC with the PDC emulator role, as discovered using DNS
SRV records. To ensure the `idmap.ldb` file is in a consistent state,
it only copies the most recent backup file. If the copied file differs
from the local one, the script stops Samba and restores the local
database from the backup. It then flushes Samba's caches and restarts
the service. Finally, it fixes the NT ACLs on the contents of the
SYSVOL.
Since the contents of the SYSVOL are owned by root, naturally the
synchronization process has to run as root as well. To limit the scope
of control this gives the process, we use as many of systemd's
sandboxing features as possible. Further, the SSH key pairs
the DCs use to authenticate to one another are restricted to only
running rsync. As such, the `sysvolsync` script itself cannot run
`tdbbackup` to back up `idmap.ldb`. To handle that, I've created a
systemd service and corresponding timer unit to run `tdbbackup`
periodically.
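Those units might look something like this (the names, paths, and
schedule here are assumptions, not the actual policy):

```yaml
- name: Install tdbbackup service unit
  ansible.builtin.copy:
    dest: /etc/systemd/system/tdbbackup.service
    content: |
      [Unit]
      Description=Back up Samba idmap.ldb

      [Service]
      Type=oneshot
      # tdbbackup writes idmap.ldb.bak next to the database
      ExecStart=/usr/bin/tdbbackup -s .bak /var/lib/samba/private/idmap.ldb

- name: Install tdbbackup timer unit
  ansible.builtin.copy:
    dest: /etc/systemd/system/tdbbackup.timer
    content: |
      [Unit]
      Description=Periodically back up Samba idmap.ldb

      [Timer]
      OnCalendar=hourly

      [Install]
      WantedBy=timers.target
```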
I considered for a long time how best to implement this process, and
although I settled on this naïve implementation, I am not exactly happy
with it. Since I do not fully understand *why* keeping
the `idmap.ldb` file in sync is necessary, there are undoubtedly cases
where blindly copying it from the PDC emulator is not correct. There
are definitely cases where the contents of the SYSVOL can be updated on
a DC besides the PDC emulator, but again, we should not run into them
because we don't really use the SYSVOL at all. In the end, I think this
solution is good enough for our needs without being overly complicated.
[0]: https://wiki.samba.org/index.php?title=SysVol_replication_(DFS-R)&oldid=18120
[1]: https://lists.samba.org/archive/samba/2021-November/238370.html
We need to import the `dyngroups.yml` playbook so that the dynamic host
groups are populated. Without this, the *RedHat* group is empty, so the
*collectd-version* role is never applied.
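In other words, the playbook needs something along these lines before
any play that uses the dynamic groups:

```yaml
# Populate the dynamic host groups (e.g. RedHat) first
- import_playbook: dyngroups.yml

# Hypothetical play that depends on them
- hosts: RedHat
  roles:
    - collectd-version
```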
I changed the naming convention for domain controller machines. They
are no longer "numbered," since the plan is to rotate through them
quickly. For each release of Fedora, we'll create two new domain
controllers, replacing the existing ones. Their names are now randomly
generated and contain letters and numbers, so the Blackbox Exporter
check for DNS records needs to account for this.
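Instead of expecting fixed names, the probe has to validate the SRV
answers against a pattern. A hedged sketch of the module definition
(the domain and regular expression are placeholders):

```yaml
# Hypothetical blackbox exporter module definition
modules:
  dns_ad_srv:
    prober: dns
    dns:
      query_name: _ldap._tcp.ad.example.com
      query_type: SRV
      validate_answer_rrs:
        # DC names are now random strings of letters and digits
        fail_if_none_matches_regexp:
          - '.*\s[a-z0-9]+\.ad\.example\.com\.$'
```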
Zigbee2MQTT now has a web GUI, which makes it *way* easier to manage the
Zigbee network. Now that I've got all the Philips Hue bulbs controlled
by Zigbee2MQTT instead of the Hue Hub, having access to the GUI is
awesome.
The latest version of the *ansible* container runs processes as the
unprivileged *jenkins* user, provides its own "sleep forever" default
command, and sets the correct `LANG` environment variable. Since it
runs processes as *jenkins*, we need to override `HOME` and set it to
the `WORKSPACE` to ensure Jenkins has a writable path for arbitrary
files.
Gitea package names (e.g. OCI images, etc.) can contain `/` characters.
These are encoded as `%2F` in request paths. Apache needs to forward
these sequences to the Gitea server without decoding them.
Unfortunately, the `AllowEncodedSlashes` setting, which controls this
behavior, is a per-virtualhost setting that is *not* inherited from the
main server configuration, and therefore must be explicitly set inside
the `VirtualHost` block. This means Gitea needs its own virtual host
definition, and cannot rely on the default virtual host.
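The resulting virtual host looks roughly like this (the server name
and backend address are placeholders, and the TLS directives are
omitted):

```yaml
- name: Install a dedicated virtual host for Gitea
  ansible.builtin.copy:
    dest: /etc/httpd/conf.d/gitea.conf
    content: |
      <VirtualHost *:443>
          ServerName git.example.net
          # Not inherited from the main server configuration; must be
          # set inside the VirtualHost block
          AllowEncodedSlashes NoDecode
          # nocanon passes the raw request path through to Gitea
          ProxyPass / http://127.0.0.1:3000/ nocanon
          ProxyPassReverse / http://127.0.0.1:3000/
      </VirtualHost>
```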
Hopefully this will fix the following warning from Ansible:
> [WARNING]: An error occurred while calling
> ansible.utils.display.initialize_locale (unsupported locale setting).
> This may result in incorrectly calculated text widths that can cause
> Display to print incorrect line lengths
I don't know why I didn't think of this before! There's no reason the
`ssh_known_hosts` file has to be copied to `/etc/ssh` before running
`ansible-playbook`. In fact, keys just end up getting copied from
`/etc/ssh/ssh_known_hosts` into `~/.ssh/known_hosts` anyway. So let's
make that step unnecessary: copy the host key database directly to
`~/.ssh` and avoid the trouble.
We'll use the `podTemplate` block to define an ephemeral agent running in
a Kubernetes pod as the node for this pipeline. This takes the place of
the Docker container we used previously.
I moved the metrics Pi from the red network to the blue network. I
started to get uncomfortable with the firewall changes that were
required to host a service on the red network. I think it makes the
most sense to define the red network as egress only.
The only major change that affects the configuration policy is the
introduction of the `webhook.ALLOWED_HOST_LIST` setting. For some dumb
reason, the default value of this setting *denies* access to machines on
the local network. This makes no sense; why do they expect you to host
your CI or whatever on a *public* network? Of course, the only reason
given is "for security reasons."
This work-around is no longer necessary as the default Fedora policy now
covers the Samba DC daemon. It never really worked correctly, anyway,
because Samba doesn't start `winbindd` fast enough for the
`/run/samba/winbindd` directory to be created before systemd spawns the
`restorecon` process, so it would usually fail to start the service the
first time after a reboot.
Sometimes, Frigate crashes in situations that should be recoverable or
temporary. For example, it will fail to start if the MQTT server is
unreachable initially, and does not attempt to connect more than once.
To avoid having to manually restart the service once the MQTT server is
ready, we can configure the systemd unit to enable automatic restarts.
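A drop-in along these lines does the job (the file name and retry
interval are assumptions):

```yaml
- name: Enable automatic restarts for the Frigate service
  ansible.builtin.copy:
    dest: /etc/systemd/system/frigate.service.d/restart.conf
    content: |
      [Service]
      # Restart automatically, e.g. when the MQTT server is not up yet
      Restart=on-failure
      RestartSec=30
```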
If the *vaultwarden* service terminates unexpectedly, e.g. due to a
power loss, `podman` may not successfully remove the container. We
therefore need to try to delete it before starting it again, or `podman`
will exit with an error because the container already exists.
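This can live in the unit itself; a sketch of the relevant drop-in
(the unit and container names are assumed):

```yaml
- name: Remove any leftover vaultwarden container before starting
  ansible.builtin.copy:
    dest: /etc/systemd/system/vaultwarden.service.d/cleanup.conf
    content: |
      [Service]
      # The leading "-" tolerates failure; --ignore makes podman rm
      # succeed even if the container does not exist
      ExecStartPre=-/usr/bin/podman rm --ignore vaultwarden
```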
When I added the *systemd-networkd* configuration for the Kubernetes
network interface on the VM hosts, I only added the `.netdev`
configuration and forgot the `.network` part. Without the latter,
*systemd-networkd* creates the interface, but does not configure or
activate it, so it is not able to handle traffic for the VMs attached to
the bridge.
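The missing half is just a `.network` file matching the bridge so
*systemd-networkd* will actually bring it up; roughly (the bridge name
is a placeholder):

```yaml
- name: Configure and activate the Kubernetes bridge interface
  ansible.builtin.copy:
    dest: /etc/systemd/network/kube0.network
    content: |
      [Match]
      Name=kube0

      [Network]
      # The host does not need an address on the bridge; managing the
      # link is enough for it to carry the VMs' traffic
      LinkLocalAddressing=no
      ConfigureWithoutCarrier=yes
```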
Both *zwavejs2mqtt* and *zigbee2mqtt* have various bugs that can cause
them to crash in the face of errors that should be recoverable.
Specifically, the processes do not always handle network errors well;
especially during initial startup, they tend to crash instead of
retrying. Thus, we'll move the retry logic into systemd.
The *zwavejs2mqtt* and *zigbee2mqtt* services need to wait until the
system clock is fully synchronized before starting. If the system clock
is wrong, they may fail to validate the MQTT server certificate.
The *time-sync.target* unit is not started until after the services
that synchronize the clock, e.g. using NTP. Notably, the
*chrony-wait.service* unit delays *time-sync.target* until
`chronyc waitsync` returns.
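So each of these services gets an ordering drop-in along these lines
(the file name is an assumption), and *chrony-wait.service* has to be
enabled for the target to actually be delayed:

```yaml
- name: Start zigbee2mqtt only after the clock is synchronized
  ansible.builtin.copy:
    dest: /etc/systemd/system/zigbee2mqtt.service.d/time-sync.conf
    content: |
      [Unit]
      # Wants= pulls in the target; After= provides the ordering
      Wants=time-sync.target
      After=time-sync.target

- name: Enable chrony-wait so time-sync.target is actually delayed
  ansible.builtin.systemd:
    name: chrony-wait.service
    enabled: true
```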
The *vlan99* interface needs to be created and activated by
`systemd-networkd` before `dnsmasq` can start and bind to it. Ordering
the *dnsmasq.service* unit after *network.target* and
*network-online.target* should ensure that this is the case.