configpolicy

dustin

Author	SHA1	Message	Date
Dustin	fefa85c83b	gw1: squid: Allow access to PXE/kickstarts The PXE server now hosts the kickstart scripts.	2025-07-12 16:12:23 -05:00
Dustin	a23bb1f043	r/pxe: Add directory for serving kickstarts Now that kickstart scripts are generated from templates by a Jenkins job, they need to be stored somewhere besides Gitea. It makes sense to serve them from the PXE server, since it's involved in the installation process anyway (at least for physical machines). Thus, we need a path where the generated files can be uploaded by Jenkins and served by Apache.	2025-07-12 16:12:23 -05:00
Dustin	6447ff5f4b	v-l: Add data volume for logs storage	2025-07-12 16:08:40 -05:00
Dustin	4218137e1e	r/minio-backups-cert: Fix nsupdate kinit for f42 The version of Samba in Fedora 42 has got some really weird bugs. In this case, it seems `net ads kerberos kinit -P` no longer works. It prints a vague `NT_STATUS_INTERNAL_ERROR` message, with no other indication of what went wrong. Fortunately, it's still possible to get a ticket-granting ticket for the machine account using the host keytab.	2025-07-12 16:08:21 -05:00
Dustin	87d90a617d	minio-backups: Disable nginx access logs entirely The _nginx_ access log files are absolutely spammed with requets from Restic and WAL-G, to the point where they fill the log volume on _chromie_ every day. They're not particularly useful anyway; I've never looked at them, and any information they contain can be obtained in another way, if necessary, for troubleshooting.	2025-07-03 11:15:40 -05:00
Dustin	f3c432dbff	r/minio: Do not pull images automatically We don't want `podman` pulling a new container image and updating without our concent. The image will already be there on the first start, since we pulled it in an Ansible task.	2025-07-02 09:23:18 -05:00
Dustin	5edfbf2408	r/minio: Do not mount storage volume with :Z The `:Z` flag tells the container runtime to run `chcon` recursively on the specified path, in order to ensure that the files are accessible inside the container. For a very large volume like the MinIO storage directory, this can take an extremely long time. It's really only necessary on the first startup anyway, because the context won't change after that. To avoid spending a bunch of time, we can set the context correctly when we create the directory, and then not worry about it after that.	2025-07-02 09:21:57 -05:00
Dustin	84cd6022c0	r/k8s-worker: Use K8s API to create join token Using the Kubernetes API to create bootstrap tokens makes it possible for the host-provisioner to automatically add new machines to the Kubernetes cluster. The host provisioner cannot connect to existing machines, and thus cannot run the `kubeadm token create` command on a control plane node. With the appropriate permissions assigned to the service account associated with the pod it runs in, though, it can directly create the secret via the API. There are actually two pieces of information required for a node to join a cluster, though: a bootstrap token and the CA certificate. When using the `kubeadm token create` command to issue a bootstrap token, it also provides (a hash of) the CA certificate with the command it prints. When creating the token manually, we need an alternative method for obtaining and distributing the CA certificate, so we use the `cluster-info` ConfigMap. This contains a stub `kubeconfig` file, which includes the CA certificate, which can be used by the `kubeadm join` command with a join configuration file. Generating both of these files may be a bit more involved than computing the CA certificate hash and passing that on the command line, but there are a couple of advantages. First, it's more extensible, as the join configuration file can specify additional configuration for the node (which we may want to use later). It's also somewhat more secure, since the token is not passed as a command-line argument. Interestingly, the most difficult part of this implementation was getting the expiration timestamp. Ansible exposes very little date math capability; notably lacking is the ability to construct a `timedelta` object, so the only way to get a timestamp in the future is to convert the `datetime` object returned by `now` to a Unix timestamp and add some number of seconds to it. Further, there is no direct way to get a `datetime` object from the computed Unix timestamp value, but we can rely on the fact that Python class methods can be called on instances, too, so `now().fromtimestamp()` works the same as `datetime.fromtimestamp()`.	2025-07-01 08:09:11 -05:00
Dustin	a399591f16	hosts: Decommission node-refrain.k.p.b I did something stupid to this machine trying to clear up its `/var/lib/containers/storage` volume and now it won't start any new pods. Killing it and replacing.	2025-06-21 17:51:06 -05:00
Dustin	025f2ddd8c	hosts: Remove VM hosts from AD domain Having the VM hosts as members of the domain has been troublesome since the very beginning. In full shutdown events, it's often difficult or impossible to log in to the VM hosts while the domain controller VMs are down or still coming up, even with winbind caching. Now that we have the `users.yml` playbook, the SSH certificate authority, and `doas`+pam_ssh_agent_auth, we really don't need the AD domain for centralized authentication.	2025-06-08 09:04:27 -05:00
Dustin	2d4eb76f24	users: Do not clear supplemental groups To ensure the `users.yml` playbook is idempotent in cases where the users it manages are also managed by other playbooks, we have to set `append: true`. This prevents the managed user(s) from being removed from additional groups other playbooks may have added them to.	2025-06-08 09:00:16 -05:00
Dustin	a5e2920223	r/victoria-logs: Update to v1.23.3	2025-06-03 18:59:25 -05:00
Dustin	3df5b06169	site: Import victoria-logs PB	2025-05-30 21:52:23 -05:00
Dustin	d4d3f0ef81	r/victoria-logs: Deploy VictoriaLogs I've become rather frusted witih Grafana Loki lately. It has several bugs that affect my usage, including issues with counting and aggregation, completely broken retention and cleanup, spamming itself with bogus error log messages, and more. Now that VitoriaLogs has first-class support in Grafana and support for alerts, it seems like a good time to try it out. It's under very active development, with bugs getting fixed extremely quickly, and new features added constantly. Indeed, as I was experimenting with it, I thought, "it would be nice if the web UI could decode ANSI escapes for terminal colors," and just a few days later, that feature was added! Native support for syslog is also a huge benefit, as it will allow me to collect logs directly from network devices, without first collecting them into a file on the Unifi controller. This new role deploys VictoriaLogs in a manner very similar to how I have Loki set up, as a systemd-managed Podman container. As it has no built-in authentication or authorization, we rely on Caddy to handle that. As with Loki, mTLS is used to prevent anonymous access to querying the logs, however, authentication via Authelia is also an option for human+browser usage. I'm re-using the same certificate authority as with Loki to simplify Grafana configuration. Eventually, I would like to have a more robust PKI, probably using OpenBao, at which point I will (hopefully) have decided which log database I will be using, and can use a proper CA for it.	2025-05-30 21:19:05 -05:00
Dustin	1768678213	frigate: Set logout URL Although I'm sure it will never be used, we might as well set the logout URL to the correct value. When the link is clicked, the browser will navigate to the Authelia logout page, which will invalidate all SSO sessions.	2025-04-21 08:28:49 -05:00
Dustin	57c5afc0c8	r/frigate: Fix Authelia redirect HTTP 301 is "moved permanently." Browsers will cache this response and never send the request to the real server again. We need to use a temporary redirect, such as "see other" to avoid getting stuck in a login loop.	2025-04-21 08:27:34 -05:00
Dustin	113ffa2b96	r/frigate: Update to v0.15 Frigate has evolved a lot over the past year or so since v0.13. Notably, some of the configuration options have been renamed, and _events_ have become _alerts_ and _detections_. There's also now support for authenication, though we don't need it because we're using Authelia.	2025-04-20 16:23:04 -05:00
Dustin	1b94530b1f	frigate: Add front yard camera We're trying to sell the Hustler lawn mower, so we plan to set it out at the end of the driveway for passers-by to see. I've temporarily installed one of the Annke cameras in the kitchen, pointed out the front window, to monitor it.	2025-04-20 14:10:27 -05:00
Dustin	641ddf8613	site: Import restic playbook This will automatically configure backups on new machines that should have them.	2025-03-29 09:37:55 -05:00
Dustin	6df0cc39da	unifi: Back up with Restic The Unifi Network data will now be backed up by Restic.	2025-03-29 09:36:37 -05:00
Dustin	572022b557	restic: Trust dch-root-ca certificate Since the MinIO server that Restic uses to store snapshots has a certificate signed by the DCH CA, we need to trust the root certificate in order to communicate with it. Existing servers already had this CA trusted by the `pyrocufflink.yml` playbook, but new servers are not (usually) AD domain members anymore, so we need to be explicit now.	2025-03-29 09:34:17 -05:00
Dustin	daa59bdba5	r/useproxy: Configure dnf to use proxy Although running `dnf` from the command line works without explicitly configuring the proxy, because it inherits the environment variables set by PAM on login from the user's shell, the `dnf` Ansible module does not, as it does not inherit those variables. Thus, we need to explicitly configure the `proxy` setting in `dnf.conf` in order to be able to install packages via Ansible. Since `dnf` does not have separate settings for different protocols (e.g. HTTP, HTTPS, FTP), we need a way to specify which of the configured proxies to use if there are multiple. As such, the useproxy role will attempt to use the value of the `dnf_proxy` variable, if it is set, falling back to `yum_proxy` and finally `http_proxy`. This should cover most situations without any explicit configuration, but allows flexibility for other cases.	2025-03-29 09:30:08 -05:00
Dustin	cdd64b6309	unifi: Fix Promtail log scrape paths The linuxserver.io Unifi container stored Unifi server and device logs under `/var/lib/unifi/logs`, while the new container stores them under `/var/log/unifi`.	2025-03-29 09:28:48 -05:00
Dustin	923c8a3ebc	r/unifi: Open firewall port for syslog server The Unifi Network controller runs a syslog server (listening on UDP port 5514) where Unifi devices can send their logs. We need to open the port in the firewall in order for it to receive log messages and write them to disk.	2025-03-29 09:27:28 -05:00
Dustin	00acc54402	Merge branch 'unifi-redux'	2025-03-29 08:03:39 -05:00
Dustin	0c070c9807	gw1/squid: Allow Unifi controller to internal repos I've move the Unifi controller back to running on a Fedora Linux machine. It therefore needs access to Fedora RPM repositories, as well as the internal "dch" RPM repository, for system packages. I also created a new custom container image for the Unifi Network software (the linuxserver.io one sucks), so the server needs access to the OCI repo on Gitea.	2025-03-29 08:01:50 -05:00
Dustin	0a956297c1	r/vmhost: Update for latest libvirt Some time ago, _libvirt_ was refactored to use separate daemons and sockets for each of its responsibilities, and the original "monolithic" `libvirtd` was made obsolete. The Fedora packages have more recently been adjusted to favor this new approach, and now default to omitting the monolithic daemon entirely (when `install_weak_deps` is disabled). One interesting packaging snafu, though, is that without the weak dependencies, there is _no_ way for clients to connect by default. Clients run `which virt-ssh-helper` to see if it is installed, which it is, but `which` is not. They then fall back to running `nc`, which is _also_ not installed. So even though the tools they actually need are present, their logic for detecting this is broken. As such, we need to explicitly install `which` to satisfy them.	2025-03-29 07:34:58 -05:00
Dustin	78d70af574	hosts: Add Unifi controllers to needproxy group Since the network device management network does not have access to the Internet, the Unifi controller machines must access it via the proxy.	2025-03-19 07:50:52 -05:00
Dustin	3877547c03	bootstrap: Import useproxy playbook Hosts that must use the proxy in order to access the Internet need to have that configured very early on, before any package installation is attempted.	2025-03-19 07:46:28 -05:00
Dustin	e76bc2c36c	host-setup: Import users playbook This ensures that default users are created on all new hosts during initial provisioning.	2025-03-16 17:17:00 -05:00
Dustin	2e1cc6a130	site: Import UniFi playbook	2025-03-16 17:17:00 -05:00
Dustin	db5d1fb91a	unifi: Switch from nginx to Caddy Mostly for built-in ACME support.	2025-03-16 17:17:00 -05:00
Dustin	fbbe86c651	r/unifi: Do not deploy exporter The _unifi_exporter_ has been broken since several versions of UniFi Network ago.	2025-03-16 16:40:57 -05:00
Dustin	db54b03aa8	r/unifi: Switching to custom container image The _linuxserver.io_ image for UniFi Network is deprecated. It sucked anyway. I've created a simple image based on Debian that installs the _unifi_ package from the upstream apt repository. This image doesn't require running anything as _root_, so it doesn't need a user namespace.	2025-03-16 16:40:57 -05:00
Dustin	a423826fcd	newvm: Add host to some groups by default There are some groups that all hosts should belong to in almost all cases. Rather than have to remember to add the `--group` arguments for each of these, the `newvm.sh` script will now enable them by default. For hosts that should _not_ belong to (at least one of) these groups, the `--no-default-groups` argument can be provided to suppress that behavior. The default groups, initially, are _chrony_ and _collectd_.	2025-03-16 16:37:19 -05:00
Dustin	c300dc1b6c	chrony: Add role/PB for chrony I continually struggle with machines' (physical and virtual, even the Roku devices!) clocks getting out of sync. I have been putting off fixing this because I wanted to set up a Windows-compatible NTP server (i.e. on the domain controllers, with Kerberos signing), but there's really no reason to wait for that to fix the clocks on all the non-Windows machines, especially since there are exactly 0 Windows machines on the network right now. The chrony role and corresponding `chrony.yml` playbook are generic, configured via the `chrony_pools`, `chrony_servers`, and `chrony_allow` variables. The values for these variables will configure the firewall to act as an NTP server, synchronizing with the NTP pool on the Internet, while all other machines will synchronize with it. This allows machines on networks without Internet access to keep their clocks in sync.	2025-03-16 16:37:19 -05:00
Dustin	4ba5f2ced0	hosts.gw: Add gw1 to sudo group The default privilege elevation tool is now `doas`, but _gw1_ uses `sudo`.	2025-03-16 16:37:19 -05:00
Dustin	e4a4944fbc	postgresql: Add receipts/user DB The Receipt application needs a PostgreSQL database on the central server.	2025-03-16 14:47:30 -05:00
Dustin	fd59f3ebb2	users: Do not create users on domain members The point of the `users.yml` playbook is to manage static users for machines that are not members of the AD domain. Since this playbook is included in `site.yml`, it gets applied to _all_ machines, even those that _are_ (or will become) domain members. Thus, we want to avoid actually doing anything on those machines.	2025-02-25 21:03:59 -06:00
Dustin	5f4b1627db	hosts: Add nut1.p.b to pyrocufflink group nut1.pyrocufflink.blue is a member of the pyrocufflink.blue AD domain. I'm not sure how it got to be so without belonging to the _pyrocufflink_ Ansible group...	2025-02-25 21:03:14 -06:00
Dustin	9ea8756610	inventory: Exclude test machines by default We don't want Jenkins attemptying to manage test VMs. I thought of various ways to exclude them, but in the end, I think a simple name match will work fine. The host provisioner _should_ manage test VMs, though, so it will need to be configured to set the `PYROCUFFLINK_EXCLUDE_TEST` environment variable to `false` to override the default behavior.	2025-02-14 10:04:48 -06:00
Dustin	6ae3404b3a	vmhost: Allow host provisioner to log in This commit adds tasks to the `vmhost.yml` playbook to ensure the jenkins user has the Host Provisioner's SSH key in its `authorized_keys` file. This allows the Host Provisioner to log in and access the read-only _libvirt_ socket in order to construct the dynamic Ansible inventory.	2025-02-08 16:49:14 -06:00
Dustin	757494b48b	newvm: Use kickstart from Gitea by default The canonical location for kickstart scripts has _finally_ moved to a real server instead of my desktop...	2025-02-08 16:49:14 -06:00
Dustin	e7de5142f3	newvm: Allow setting cfgpol branch The script that runs on first boot of a new machine that triggers host provisioning can read the name of the configuration policy branch to checkout from the QEMU firmware configuration option. This commit adds a `--cfg-branch` argument to `newvm.sh` that sets that value. This will be useful for testing new policy on a new VM.	2025-02-08 16:49:14 -06:00
Dustin	4d30798f54	newvm: Configure VM for dynamic inventory This commit adds a new `--group` argument to the `newvm` script, which adds the host to an Ansible group by listing it in the _libvirt_ domain metadata. Multiple groups can be specified by repeating the argument. Additionally, the VM title is now always set to machine's FQDN, which is what the dynamic inventory plugin uses to determine the inventory hostname. The dynamic inventory plugin parses the _libvirt_ domain metadata and extracts group membership from the `<dch:groups>` XML element. Each `<dch:group>` sub-element specifies a group to which the host belongs. Unfortunately, `virt-install` does not support modifying the `<metadata>` element in the _libvirt_ domain XML document, so we have to resort to using `virsh`. To ensure the metadata are set before the guest OS boots and tries to access them, we fork and run `virsh` in a separate process.	2025-02-08 15:35:34 -06:00
Dustin	7ff18ab75e	Introduce dynamic inventory In order to fully automate host provisioning, we need to eliminate the manual step of adding hosts to the Ansible inventory. Ansible has had the _community.libvirt.libvirt_ inventory plugin for quite a while, but by itself it is insufficient, as it has no way to add hosts to groups dynamically. It does expose the domain XML, but parsing that and extracting group memberships from that using Jinja templates would be pretty terrible. Thus, I decided the easiest and most appropriate option would be to develop my own dynamic inventory plugin. * Supports multiple _libvirt_ servers * Can connect to the read-only _libvirt_ socket * Can optionally exclude VMs that are powered off * Can exclude VMs based on their operating system (if the _libosinfo_ metadata is specified in the domain metadata) * Can add hosts to groups as specified in the domain metadata * Exposes guest info as inventory host variables (requires QEMU guest agent running in the VM and does not work with a read-only _libvirt_ connection)	2025-02-08 15:29:58 -06:00
Dustin	e9d6020563	all: Set root authorized keys The `root_authorized_keys` variable was originally defined only for the pyrocufflink group. This used to effectively be "all" machines, since everything was a member of the AD domain. Now that we're moving away from that deployment model, we still want to have the break-glass option, so we need to define the authorized keys for the _all_ group.	2025-02-08 15:29:57 -06:00
Dustin	d916545e29	synapse: Remove group variables This was the last group that had an entire file encrypted with Ansible Vault. Now that the Synapse server is long gone, rather than convert it to having individually-encrypted values, we can get rid of it entirely.	2025-02-08 15:29:57 -06:00
Dustin	8bd0722422	pyrocufflink: Remove root password While having a password set for _root_ provides a convenient way of accessing a machine even if it is not available via SSH, using a static password in this way is quite insecure and not worth the risk. I may try to come up with a better way to set a unique password for each machine eventually, but for now, having this password here is too dangerous to keep.	2025-02-08 15:29:57 -06:00
Dustin	6f37f1417f	site: Add master site playbook The `site.yml` playbook imports all of the other playbooks, providing a way to deploy _everything_. Normally, this would only be done for a single host, as part of its initial provisioning, to quickly apply all common configuration and any application-specific configuration for whatever roles the host happens to hold. The `host-setup.yml` playbook provides an entry point for configuring all common configuration. Basically anything we want to do to _every_ machine, regardless of its location or role.	2025-02-08 15:29:55 -06:00

1 2 3 4 5 ...

1083 Commits (fefa85c83bc43dc6fcfcae4d33ead5e75295bd57) All Branches Search

1083 Commits (fefa85c83bc43dc6fcfcae4d33ead5e75295bd57)

All Branches