Commit Graph

242 Commits (7b61a7da7e95734c95d8fff9e63b9057cdf732b7)

Author SHA1 Message Date
Dustin 61844e8a95 pyrocufflink: Add Luma SSH keys for root
Sometimes I need to connect to a machine when there is an AD issue (e.g.
domain controllers are down, clocks are out of sync, etc.) but I can't
do it from my desktop.
2023-07-05 16:35:57 -05:00
Dustin 0a68d84121 metricspi: Scrape hatchlearningcenter.org
To monitor site availability and certificate expiration.
2023-06-21 14:31:33 -05:00
Dustin 4e608e379f metricspi/alerts: Correct BURP archive alert query
When the RAID array is being resynchronized after the archived disk has
been reconnected, md changes the disk status from "missing" to "spare."
Once the synchronization is complete, it changes from "spare" to
"active."  We only want to trigger the "disk needs archived" alert once
the synchronization process is complete; otherwise, both the "disks need
swapped" and "disk needs archived" alerts would be active at the same
time, which makes no sense.  By adjusting the query for the "disk needs
archived" alert to consider disks in both "missing" and "spare" status,
we can delay firing that alert until the proper time.
2023-06-20 11:58:35 -05:00
Dustin bf4d57b5cb frigate: Configure journal2ntfy for MD RAID
The Frigate server has a RAID array that it uses to store video
recordings.  Since there have been a few occasions where the array has
suddenly stopped functioning, probably because of the cheap SATA
controller, it will be nice to get an alert as soon as the kernel
detects the problem, so as to minimize data loss.
2023-06-08 10:05:36 -05:00
Dustin 87e8ec2ed4 synapse: Back up data using BURP
Most of the Synapse server's state is in its SQLite database.  It also
has a `media_store` directory that needs to be backed up, though.

In order to back up the SQLite database while the server is running, the
database must be in "WAL mode."  By default, Synapse leaves the database
in the default "rollback journal mode," which disallows multiple
processes from accessing the database, even for read-only operations.
To change the journal mode:

```sh
sudo systemctl stop synapse
sudo -u synapse sqlite3 /var/lib/synapse/homeserver.db 'PRAGMA journal_mode=WAL;'
sudo systemctl start synapse
```
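
As a rough way to confirm the change took effect, the journal mode can also be set and read back from Python's standard `sqlite3` module.  This is only a sketch: it works on a throwaway database in a temporary directory (the `example.db` name is made up), not on the real `homeserver.db`, and the Synapse service would still need to be stopped before touching the real file.

```python
import sqlite3
import tempfile
from pathlib import Path


def enable_wal(db_path: str) -> str:
    """Switch the database at db_path to WAL mode and return the mode in effect."""
    conn = sqlite3.connect(db_path)
    try:
        # The PRAGMA returns one row holding the journal mode now in use.
        (mode,) = conn.execute("PRAGMA journal_mode=WAL;").fetchone()
        return mode
    finally:
        conn.close()


def demo() -> str:
    # Use a throwaway database so the sketch never touches real data.
    with tempfile.TemporaryDirectory() as tmp:
        return enable_wal(str(Path(tmp) / "example.db"))
```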
2023-05-23 09:52:50 -05:00
Dustin 78296f7198 Merge branch 'journal2ntfy' 2023-05-23 08:31:52 -05:00
Dustin 347cda74fd metrics: Scrape metrics from Kubernetes API server
Kubernetes exports a *lot* of metrics in Prometheus format.  I am not
sure what all is there, yet, but apparently several thousand time series
were added.

To allow anonymous access to the metrics, I added this ClusterRole (the
accompanying binding is not shown here):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups:
  - ""
  resources:
  - nodes/metrics
  verbs:
  - get
- nonResourceURLs:
  - /metrics
  verbs:
  - get
```
2023-05-22 21:21:08 -05:00
Dustin c0bb387b18 metricspi: Scrape metrics from MinIO backup storage
MinIO exposes metrics in Prometheus exposition format.  By default, it
requires an authentication token to access the metrics, but I was unable
to get this to work.  Fortunately, it can be configured to allow
anonymous access to the metrics, which is fine, in my opinion.
2023-05-22 21:19:25 -05:00
Dustin a7319c561d journal2ntfy: Script to send log messages via ntfy
The `journal2ntfy.py` script follows the systemd journal by spawning
`journalctl` as a child process and reading from its standard output
stream.  Any command-line arguments passed to `journal2ntfy` are passed
to `journalctl`, which allows the caller to specify message filters.
For any matching journal message, `journal2ntfy` sends a message via
the *ntfy* web service.
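
The flow described above could look roughly like the sketch below.  It is not the actual `journal2ntfy.py`: the topic URL, the choice of JSON output, and the function names are all assumptions made for illustration.

```python
import json
import subprocess
import urllib.request

NTFY_URL = "https://ntfy.sh/example-topic"  # hypothetical topic URL


def build_command(filters):
    """Return the journalctl command line, appending caller-supplied filters."""
    return ["journalctl", "--follow", "--lines=0", "--output=json", *filters]


def extract_message(line):
    """Return the MESSAGE field of one JSON journal entry, or None."""
    try:
        entry = json.loads(line)
    except json.JSONDecodeError:
        return None
    if not isinstance(entry, dict):
        return None
    message = entry.get("MESSAGE")
    # journalctl emits non-text messages as byte arrays; skip those here.
    return message if isinstance(message, str) else None


def notify(message, url=NTFY_URL):
    """Publish the message by POSTing it as the request body to an ntfy topic."""
    request = urllib.request.Request(
        url, data=message.encode("utf-8"), method="POST"
    )
    with urllib.request.urlopen(request, timeout=10):
        pass


def follow(filters):
    """Spawn journalctl as a child process and forward each message it prints."""
    command = build_command(filters)
    with subprocess.Popen(command, stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            message = extract_message(line)
            if message:
                notify(message)
```

Calling `follow(sys.argv[1:])` from a small entry point would pass the command-line arguments straight through to `journalctl`, as described.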

For the BURP server, we're going to use `journal2ntfy` to generate
alerts about the RAID array.  When I reconnect the disk that was in the
fireproof safe, the kernel will log a message from the *md* subsystem
indicating that the resynchronization process has begun.  Then, when
the disks are again in sync, it will log another message, which will
let me know it is safe to archive the other disk.
2023-05-17 14:51:21 -05:00
Dustin 2c002aa7c5 alerts: Add alert to archive BURP disk
This alert will fire once the MD RAID resynchronization process has
completed and both disks in the array are online.  It will clear when
one disk is disconnected and moved to the safe.
2023-05-16 08:33:13 -05:00
Dustin 877dcc3879 alerts: Add alerts for missed client backups
When BURP fails to even *start* a backup, it does not trigger a
notification at all.  As a result, I may not notice for a few days when
backups are not happening.  That was the case this week, when clients'
backups were failing immediately, because of a file permissions issue on
the server.  To hopefully avoid missing backups for too long in the
future, I've added two new alerts:

* The *no recent backups* alert fires if there have not been *any* BURP
  backups recently.  This may also fire, for example, if the BURP
  exporter is not working, or if there is something wrong with the BURP
  data volume.
* The *missed client backup* alert fires if an active BURP client (i.e.
  one that has had at least one backup in the past 90 days) has not been
  backed up in the last 24 hours.
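
The logic of the two alerts can be sketched in plain Python.  The real alerts are queries evaluated against exported metrics, so this only illustrates the conditions; the function names are made up, and the threshold used for "recently" in the first alert is an assumption (24 hours).

```python
from datetime import datetime, timedelta

ACTIVE_WINDOW = timedelta(days=90)  # "active" means a backup in the past 90 days
MAX_AGE = timedelta(hours=24)       # each active client should back up daily


def missed_client_backups(last_backup, now):
    """Return active clients whose newest backup is older than MAX_AGE."""
    missed = []
    for client, finished in sorted(last_backup.items()):
        age = now - finished
        # Clients silent for longer than ACTIVE_WINDOW are treated as retired.
        if MAX_AGE < age <= ACTIVE_WINDOW:
            missed.append(client)
    return missed


def no_recent_backups(last_backup, now, threshold=MAX_AGE):
    """Return True if no client at all has a backup newer than threshold."""
    return all(now - finished > threshold for finished in last_backup.values())
```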
2023-05-14 11:48:36 -05:00
Dustin a2bcd5ccbb alerts: Adjust BURP RAID disk swap alert
Using a 30-day window for the `tlast_change_over_time` function
effectively "caps out" the value at 30 days.  Thus, the alert reminding
me to swap the BURP backup volume will never fire, since the value will
never be greater than the 30-day threshold.  Using a wider window
resolves that issue (though the query will still produce inaccurate
results beyond the window).
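
The capping effect can be reproduced with a toy model.  This is not how Victoria Metrics implements the function; it assumes, as a simplification, that when no change is visible inside the lookbehind window, the oldest sample in the window is taken as the last change.  Time is measured in whole days to keep the arithmetic obvious.

```python
def days_since_last_change(samples, now, window):
    """Model the age of the last value change as seen through a lookbehind window.

    samples is a list of (day, value) pairs in ascending day order.
    """
    in_window = [(day, value) for day, value in samples if now - window <= day <= now]
    if not in_window:
        return None
    # Assumed fallback: with no visible change, use the oldest sample in the window.
    last_change = in_window[0][0]
    for (_, previous), (day, current) in zip(in_window, in_window[1:]):
        if current != previous:
            last_change = day
    return now - last_change


# One sample per day; the value changed on day 55, i.e. 45 days before "now".
SAMPLES = [(day, 1 if day < 55 else 2) for day in range(0, 101)]
NOW = 100
```

With a 30-day window the model reports 30 days, so a "greater than 30 days" threshold is never crossed; with a 60-day window it reports the true 45 days.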
2023-05-14 11:38:00 -05:00
Dustin ad9fb6798e samba-dc: Omit tls cafile setting
The `tls cafile` setting in `smb.conf` is not necessary.  It is used for
verifying peer certificates for mutual TLS authentication, not to
specify the intermediate certificate authority chain like I thought.

The setting cannot simply be left out, though.  If it is not specified,
Samba will attempt to load a file from a built-in default path, which
will fail, causing the server to crash.  This is avoided by setting the
value to the empty string.
2023-05-10 08:28:49 -05:00
Dustin 9722fed1b8 metricspi: Scrape dustinandtabitha.com 2023-05-09 21:30:11 -05:00
Dustin f6f286ac24 alerts: Correct BURP volume swap alert
The `tlast_change_over_time` function needs an interval wide enough to
consider the range of time we are interested in.  In this case, we want
to see if the BURP volume has been swapped in the last thirty days, so
the interval needs to be `30d`.
2023-05-03 11:06:34 -05:00
Dustin 5ed3ee525e synapse: Update LDAP server URI 2023-05-01 12:36:33 -05:00
Dustin a4cc9d0c46 metricspi: Scrape tabitha.biz 2023-04-23 20:03:43 -05:00
Dustin 6c68126a3a grafana: Update LDAP server host name
*dc0.p.b* has been gone for a while now.  All the current domain
controllers use LDAPS certificates signed by Let's Encrypt and include
the *pyrocufflink.blue* name, so we can now use the apex domain A record
to connect to the directory.
2023-04-12 14:07:51 -05:00
Dustin 78f65355fa gitea: Back up with BURP 2023-04-12 14:07:51 -05:00
Dustin 1da4c17a8c alerts: Add alerts for HTTPS certificates
These alerts will generate notifications when websites' HTTPS
certificates are not properly renewed automatically and become in danger
of expiring.
2023-04-12 13:55:31 -05:00
Dustin bf4133652c metrics: Scrape Jenkins with blackbox exporter
This is mostly to monitor the HTTPS certificate expiration.
2023-04-12 13:55:31 -05:00
Dustin dc2a05dc8f alerts: Add alert for BURP RAID array swap
This alert counts how long it's been since the number of "active" disks
in the RAID array on the BURP server has changed.  The assumption is
that the number will typically be `1`, but it will be `2` when the
second disk is synchronized before the swap occurs.
2023-04-11 22:25:36 -05:00
Dustin 2394bf7436 metricspi: Fix vmalert links
1. Grafana 8 changed the format of the query string parameters for the
   Explore page.
2. vmalert no longer needs the http.pathPrefix argument when behind a
   reverse proxy; instead, it uses the request path like the other
   Victoria Metrics components.
2023-04-11 21:46:43 -05:00
Dustin 6c562c9821 alerts: Ignore missing mdraid disk for BURP
The way I am handling swapping out the BURP disk now is by using the
Linux MD RAID driver to manage a RAID 1 mirror array.  The array
normally operates with one disk missing, as it is in the fireproof safe.
When it is time to swap the disks, I reattach the offline disk, let the
array resync, then disconnect and store the other disk.

This works considerably better than the previous method, as it does not
require BURP or the NFS server to be offline during the synchronization.
2023-04-11 20:08:07 -05:00
Dustin a59f24a8b5 metricspi: Stop scraping speedtest
Running the speed test periodically was just wasting bandwidth.  It
failed frequently, and generally did not provide useful information.
2023-04-02 11:05:16 -05:00
Dustin 94de5d6067 samba-dc: Decrease Samba log level
The default log level (3) produces too much output and quickly fills the
`/var/log` volume on the domain controllers.
2023-03-08 11:26:57 -06:00
Dustin 748c432334 vaultwarden: Change Domain URL
The rule is "if it is accessible on the Internet, its name ends in .net"

Although Vaultwarden can be accessed by either name, the one specified
in the Domain URL setting is the only one that works for WebAuthn.
2023-03-03 11:17:07 -06:00
Dustin 632e1dd906 metricspi: Update LDAP configuration
All domain controllers now use the Let's Encrypt wildcard certificate
for the *pyrocufflink.blue* domain.  Further, *dc2.p.b* is
decommissioned.
2023-01-09 12:23:54 -06:00
Dustin 90f9e5eba5 samba-dc: Manage sudoers
Domain controllers only allow users in the *Domain Admins* AD group to
use `sudo` by default.  *dustin* and *jenkins* need to be able to apply
configuration policy to these machines, but they are not members of said
group.
2022-12-23 08:47:31 -06:00
Dustin 9408ee31c3 home-assistant: Back up Zigbee/ZWave/Mosquitto
Mosquitto, Zigbee2MQTT, and ZWaveJS2MQTT all have persistent state that
needs to be backed up in addition to Home Assistant's own data.
2022-12-23 06:56:52 -06:00
Dustin 77191c8b5a Fedora37: Set collectd SELinux domain permissive
*collectd* is broken by default on Fedora 36 and 37.  Several plugins
generate AVC denials.
2022-12-19 10:22:00 -06:00
Dustin 637289036a blackbox: Update pyrocufflink DNS check
I changed the naming convention for domain controller machines.  They
are no longer "numbered," since the plan is to rotate through them
quickly.  For each release of Fedora, we'll create two new domain
controllers, replacing the existing ones.  Their names are now randomly
generated and contain letters and numbers, so the Blackbox Exporter
check for DNS records needs to account for this.
2022-12-19 09:04:37 -06:00
Dustin caef7f342b vm-hosts: Update autostart list
* Remove DC0 (decommissioned)
* Remove Jenkins and its build VMs (Migrated to Kubernetes)
* Add pxe0 (Required for Basement HUD)
2022-12-18 19:55:48 -06:00
Dustin 77c6408187 metricspi: Remove sensors scrape job
Sensor data are retrieved via Home Assistant.
2022-12-18 19:16:10 -06:00
Dustin 244482ac52 websites: Add hatchlearningcenter.org
This is the website for Tabitha's new hybrid private school! 👩‍🎓
2022-11-30 22:04:29 -06:00
Dustin 772f669ab2 r/gitea: Handle encoded / characters in HTTP paths
Gitea package names (e.g. OCI images, etc.) can contain `/` characters.
These are encoded as %2F in request paths.  Apache needs to forward
these sequences to the Gitea server without decoding them.
Unfortunately, the `AllowEncodedSlashes` setting, which controls this
behavior, is a per-virtualhost setting that is *not* inherited from the
main server configuration, and therefore must be explicitly set inside
the `VirtualHost` block.  This means Gitea needs its own virtual host
definition, and cannot rely on the default virtual host.
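
To see why decoding matters, here is a small sketch using Python's `urllib.parse`.  The `/packages/` prefix and the `library/nginx` name are made-up examples, not Gitea's actual URL layout.

```python
from urllib.parse import quote, unquote


def encode_package_name(name):
    """Percent-encode a package name so that '/' becomes %2F."""
    return quote(name, safe="")


# Hypothetical request path with the package name as a single path segment.
PATH = "/packages/" + encode_package_name("library/nginx")

# Forwarded untouched, the name stays one segment; decoded, it splits in two.
SEGMENTS_ENCODED = PATH.split("/")
SEGMENTS_DECODED = unquote(PATH).split("/")
```

A proxy that decodes the sequence before forwarding changes the number of path segments, so the backend can no longer tell where the package name begins and ends.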
2022-11-27 17:21:03 -06:00
Dustin 4511d5447e vm-hosts: Add missing kube.network config
When I added the *systemd-networkd* configuration for the Kubernetes
network interface on the VM hosts, I only added the `.netdev`
configuration and forgot the `.network` part.  Without the latter,
*systemd-networkd* creates the interface, but does not configure or
activate it, so it is not able to handle traffic for the VMs attached to
the bridge.
2022-08-22 20:00:47 -05:00
Dustin b8b8ae5798 vm-hosts: Define machines to auto start 2022-08-20 21:19:01 -05:00
Dustin bc60451949 metricspi: Update DNS server address
DNS is now handled by the border firewall.
2022-08-20 18:19:13 -05:00
Dustin 4622240c6c r/netboot/jenkins-agent: Configure NBD exports
The *netboot/jenkins-agent* Ansible role configures three NBD exports:

* A single, shared, read-only export containing the Jenkins agent root
  filesystem, as a SquashFS filesystem
* For each defined agent host, a writable data volume for Jenkins
  workspaces
* For each defined agent host, a writable data volume for Docker

Agent hosts must have some kind of unique value to identify their
persistent data volumes.  Raspberry Pi devices, for example, can use the
SoC serial number.
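
One possible way to obtain such a value on a Raspberry Pi is to read the `Serial` line from `/proc/cpuinfo`.  The sketch below assumes that source and uses a made-up function name; the role itself may identify hosts differently.

```python
def parse_soc_serial(cpuinfo_text):
    """Return the value of the 'Serial' line in /proc/cpuinfo text, or None."""
    for line in cpuinfo_text.splitlines():
        key, separator, value = line.partition(":")
        if separator and key.strip() == "Serial":
            return value.strip()
    return None


def read_soc_serial(path="/proc/cpuinfo"):
    """Read the serial number from the running system, if it exposes one."""
    with open(path, encoding="ascii", errors="replace") as cpuinfo:
        return parse_soc_serial(cpuinfo.read())
```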
2022-08-15 17:14:06 -05:00
Dustin dbc18022f2 metricspi: Increase scrape_timeout for speedtest
Running the Internet speed test can often take longer than a minute.
2022-08-12 14:54:49 -05:00
Dustin ce3e88932d vmalert: Allow configuring http.pathPrefix
*vmalert* requires explicit configuration when it is behind a reverse
proxy.
2022-08-12 13:10:36 -05:00
Dustin fe87edea21 r/vmalert: Allow configuring external source URLs
The `-external.url` and `-external.alert.source` command line arguments
and their corresponding environment variables can be used to configure
the "Source" links associated with alerts created by `vmalert`.
2022-08-12 12:58:53 -05:00
Dustin c57500a9f4 metricspi: Update speedtest scrape target
The firewall hardware is too slow to run the *prometheus_speedtest*
program.  It always showed *way* lower speeds than were actually
available.  I've moved the service to the Kubernetes cluster and it
works a lot better there.
2022-08-12 12:55:52 -05:00
Dustin 4ddbc9f256 hosts: Add mtrcs0.p.r
*mtrcs0.pyrocufflink.red* is a Raspberry Pi CM4 on a Waveshare
CM4-IO-BASE-B carrier board with a NVMe SSD.  It runs a custom OS built
using Buildroot, and is not a member of the *pyrocufflink.blue* AD
domain.

*mtrcs0.p.r* hosts Victoria Metrics/`vmagent`, `vmalert`, AlertManager,
and Grafana.  I've created a unique group and playbook for it,
*metricspi*, to manage all these applications together.
2022-08-11 21:40:19 -05:00
Dustin 4aedeef546 grafana: Redirect HTTP to HTTPS 2022-08-10 21:55:54 -05:00
Dustin c48cc985b2 r/collectd: Ignore filesystems by path
In addition to ignoring particular types of filesystems, e.g. OverlayFS,
we can also ignore filesystems by their mount point.  This could be
useful, for example, for bind-mounted directories, such as those used on
Kubernetes nodes.
2022-08-05 18:56:48 -05:00
Dustin c8e89a4b16 hosts: Add Kubernetes machines
There is no specific playbook or role for Kubernetes.  All OS
configuration is done at install time via kickstart scripts, and
deploying Kubernetes itself is done (manually) using `kubeadm init` and
`kubeadm join`.
2022-08-03 20:52:01 -05:00
Dustin 3b692a9de8 vm-hosts: Add Kubernetes VLAN configuration 2022-08-03 20:51:33 -05:00
Dustin 6f11a4cf3a grafana: Set Grafana domain
Necessary for Grafana CSRF protection.
2022-07-24 10:31:46 -05:00
Dustin 3e8da609e7 frigate: Keep front porch recordings for 2 days
Now that there is plenty of storage in the new video server, let's keep
24/7 recordings from the front porch camera, too.
2022-07-23 17:52:26 -05:00
Dustin c1c28a51b5 frigate: Use native MQTT/TLS support
Frigate has native support for MQTT over TLS now, so there is no longer
any need to use stunnel.
2022-07-23 17:27:02 -05:00
Dustin d5ef18ccc3 frigate: Split camera config into separate file
This will make it easier to manage Frigate camera settings.
2022-07-23 17:26:19 -05:00
Dustin 41582beef9 group_vars/frigate: Add second back yard camera
Adding a second camera to the back yard, on the north side of the porch,
to try and figure out how the possums keep getting under the porch even
with the chicken wire around it!
2022-07-18 18:25:20 -05:00
Dustin 82f9ce0797 group_vars/frigate: Keep back yard recordings
We're trying to discover how the possums are getting into and out of the
house.  Let's enable continuous video recording from the back yard
camera so we can observe them and come up with a plan to get rid of
them.
2022-07-18 18:20:21 -05:00
Dustin a3608f187c home-assistant: Enable Mosquitto persistence
Configuring Mosquitto to persist its state to the filesystem will keep
retained messages from MQTT sensors, etc.
2022-05-29 11:26:39 -05:00
Dustin 3c8e576841 grafana: Enable anonymous access
Allow unauthenticated users to view dashboards.  Useful for Heads-Up
Displays.
2022-03-07 20:10:13 -06:00
Dustin 5485fc6f93 websites/d…and…t: Configure formsubmit
To handle the RSVP form on *dustinandtabitha.com*, we are going to use
*formsubmit*.  It runs on the same machine that hosts the website, so
there's no dealing with CORS.  The */submit/rsvp* path, which is proxied
to the backend, is the RSVP form's target.
2022-02-27 17:56:54 -06:00
Dustin 3632698f37 websites/dustinandtabitha.com: Add role
Wedding website 😍
2022-02-27 17:41:40 -06:00
Dustin c12da40228 home-assistant: Correct BURP exclude syntax
BURP does not support relative paths or globs in `exclude` values.
2022-01-16 10:08:27 -06:00
Dustin 5efbee725e home-assistant: Omit history DB from backups
The state history database is entirely too big.  It takes over an hour
to create a backup of it, which usually causes BURP to time out.  The
data it stores isn't particularly interesting anyway.  Instead of trying
to back it up and ultimately not getting any backup at all, we'll just
skip it altogether to ensure we have a consistent backup of everything
else that is actually important.
2022-01-02 12:07:12 -06:00
Dustin 2b27a31bee frigate: Update config syntax for 0.9.x
There were several backward-incompatible changes introduced in Frigate
[0.9.0](https://github.com/blakeblackshear/frigate/releases/tag/v0.9.0).
Notably, recordings and clips are now configured together.
2021-12-30 09:33:58 -06:00
Dustin 6acb25e309 nextcloud: Trust headers from public rev proxy
If Nextcloud does not have the Internet-facing reverse proxy listed in
its "trusted proxies" setting, it will mark all traffic as being from
the proxy itself.  This breaks brute force detection, etc.
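
The general idea behind a trusted-proxy list can be sketched as follows.  This is not Nextcloud's implementation; the function name is made up and the addresses come from the reserved documentation ranges.

```python
import ipaddress


def client_address(peer, forwarded_for, trusted_proxies):
    """Pick the client address, honoring X-Forwarded-For only from trusted peers."""
    networks = [ipaddress.ip_network(proxy) for proxy in trusted_proxies]

    def is_trusted(address):
        return any(ipaddress.ip_address(address) in network for network in networks)

    # An untrusted peer could forge the header, so its own address is used.
    if not forwarded_for or not is_trusted(peer):
        return peer
    # Walk the header right to left and stop at the first untrusted hop.
    for hop in reversed([part.strip() for part in forwarded_for.split(",")]):
        if hop and not is_trusted(hop):
            return hop
    return peer
```

Without the proxy in the trusted list, every request appears to come from the proxy's address, which is why per-client measures such as brute force detection stop working.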
2021-12-20 22:20:09 -06:00
Dustin 74deb895ae pyrocufflink-dns: Remove dc0 forwarder
Decommissioning *dc0.pyrocufflink.blue*.  Do not forward requests for
internal domain names to it.
2021-12-18 16:44:48 -06:00
Dustin 62ca80a5f0 pyrocufflink-dns: Remove FireMon zones
There is no longer any point to having forward zones in the main DNS
server for FireMon domains, since we don't have a network-wide VPN
anymore.
2021-12-18 10:51:17 -06:00
Dustin 739ffb2845 home-assistant: Configure BURP backups
Take a snapshot of the history database first, then back up everything
in `/var/lib/homeassistant`.
2021-12-17 20:57:38 -06:00
Dustin fdfdaa6fe6 bitwarden_rs: Update burp backup path
Vaultwarden data are stored in a different location since the migration
to Podman.
2021-12-17 20:33:31 -06:00
Dustin 14c7b1fcc1 bitwarden_rs: Update collectd process name
`bitwarden_rs` is now named `vaultwarden`.
2021-11-06 19:42:07 -05:00
Dustin c882ac45e7 nut: Add playbook for NUT
NUT runs on *serial0.pyrocufflink.blue* and monitors the two UPSes on
the server rack.
2021-10-31 14:28:27 -05:00
Dustin 881c8de625 Switch Prometheus/collectd to pull
Transitioning from push-based to pull-based monitoring with
Prometheus/collectd.  The *write_prometheus* plugin will be installed on
all hosts, and Prometheus will be configured to scrape them directly.
2021-10-30 16:41:17 -05:00
Dustin 8e9699810b burp-server: Monitor burp process with collectd 2021-10-16 21:53:51 -05:00
Dustin 8e35c57f53 samba-dc: Monitor processes with collectd
Monitor the state of Samba server processes via the *processes* plugin
for *collectd*.
2021-10-16 16:33:23 -05:00
Dustin d032e1b89d bitwarden_rs: Monitor processes with collectd
Monitor the state of the `bitwarden_rs` server process via the
*processes* plugin for *collectd*.
2021-10-16 16:32:22 -05:00
Dustin 014043909a vm-hosts: Enable multicast querier on bridge
Having this option enabled dramatically improves the reliability of
collectd multicast traffic from physical machines and VMs on a separate
VM host from the receiving machine.
2021-10-16 16:12:15 -05:00
Dustin ce0dac983f pyrocufflink: Set root password and SSH keys
1. Set a password for *root* on all machines (useful for logging in via
   serial console if network is down)
2. Set an authorized SSH key for root on all machines:

   * For Fedora 34, use my FIDO2 security token key
   * For all other hosts, use my ED25519 key
2021-10-16 15:40:20 -05:00
Dustin f8f19405c7 vm-hosts: Add camera virtual network
The *camera* virtual network is needed for VMs (i.e. *nvr0*) to connect
directly to security cameras.
2021-10-10 16:09:15 -05:00
Dustin 4d5076ced9 vm-hosts: Remove hass VM network
There are no longer any virtual machines connected to the Home Assistant
network.  Home Assistant runs on a physical machine.
2021-10-10 16:09:15 -05:00
Dustin 3be9c40f2b r/vmhost: Install mount helpers
Filesystems like NFS and CIFS require "helper" utilities (i.e.
`mount.nfs` and `mount.cifs`, respectively).  These need to be installed
in order for a system to be able to mount those filesystems.

The current shared storage system uses NFSv4, and as such, the
*nfs-utils* package needs to be installed on the VM hosts.
2021-10-10 16:09:15 -05:00
Dustin 55920c0025 vmhost: Define VM/storage networks
Originally, the network configuration for the VM networks and the
storage network was configured using the *netifaces* role.  This has
effectively stopped working in recent versions of Fedora, as it sort of
relied on `dhcpcd`, which has not been maintained in Fedora for a while
and no longer behaves correctly.  After evaluating *NetworkManager* as a
replacement, I decided that *systemd-networkd* is a more appropriate
solution.

There are effectively two "layers" of network configuration needed for
the VM hosts: the host-specific settings, and the common settings.  The
host-specific settings include such properties as the IP address of the
management interface and the names of the physical ports that make up
the bonded interfaces.  The common settings are the bonded interfaces,
the VLAN interfaces created on top of the bond, and the bridges that
provide access to VMs.

To configure the host-specific settings, each host simply needs the
appropriate `networkd_*` variables in its `host_vars` file.  For the
common settings, we apply the *systemd-networkd* role again in the
`vmhost.yml` with different values for these variables.  Thus,
effectively, `systemd-networkd.yml` manages the host-specific settings,
while `vmhost.yml` manages the common settings.
2021-10-10 16:09:15 -05:00
Dustin 9868873860 frigate: Enable RTMP on Back Yard camera
I couldn't get RTMP to work on the Back Yard camera because the `ffmpeg`
process kept crashing:

```
ffmpeg.back_yard.clips_rtmp    ERROR   : av_interleaved_write_frame(): Connection reset by peer
ffmpeg.back_yard.clips_rtmp    ERROR   : [flv @ 0x5562090c8ec0] Failed to update header with correct duration.
ffmpeg.back_yard.clips_rtmp    ERROR   : [flv @ 0x5562090c8ec0] Failed to update header with correct filesize.
ffmpeg.back_yard.clips_rtmp    ERROR   : Error writing trailer of rtmp://127.0.0.1/live/back_yard: Connection reset by peer
watchdog.back_yard             INFO    : Terminating the existing ffmpeg process...
watchdog.back_yard             INFO    : Waiting for ffmpeg to exit gracefully...
```

I thought increasing the value of `--shm-size` argument for `podman`
would help, but even going as high as 1024 mebibytes did not resolve the
problem.

Ultimately, I decided that it is not really necessary to view the full
4k stream in real time.  The back yard camera supports three streams, so
I set them all up for different roles.  I briefly considered using a
single 1080p stream for both object detection and RTMP streaming, but
this consumed considerable CPU time, so I decided against it for now.  I
may re-evaluate that option if I decide to purchase a TPU.
2021-08-22 20:31:59 -05:00
Dustin 544810ad34 home-assistant: Omit overlayfs from collectd
Podman containers create a *lot* of overlay filesystems.  There is no
reason to report these with collectd.
2021-08-22 11:38:40 -05:00
Dustin b7ba6a59ab hosts: Add nvr0.p.b
*nvr0.pyrocufflink.blue* hosts Frigate.  It is deployed on a separate
subnet, for two reasons:

* To avoid streaming video from the cameras through the firewall
* To prevent any hosts on the LAN except Home Assistant from
  communicating with Frigate, since it does not have any kind of
  authentication or access control
2021-08-21 17:20:19 -05:00
Dustin 911b86e694 prometheus: collectd: Listen on unicast socket
For hosts that cannot send metrics via multicast (e.g. because they are
on a different subnet), *collectd* needs to listen on the all-hosts
unicast address.
2021-08-21 17:15:21 -05:00
Dustin 7d2b3887c2 Add ability to update HA-related containers
Home Assistant, Zigbee2MQTT, and ZWaveJS2MQTT can now be updated by
setting the corresponding Ansible variable.
2021-08-12 19:02:34 -05:00
Dustin d78257326a Merge branch 'tabitha-website' 2021-07-24 18:37:05 -05:00
Dustin 910d430e1e website: Deploy Tabitha's website
Tabitha's website is a very simple, static HTML site.  It is uploaded via
SFTP and served at *tabitha.biz*.
2021-07-24 18:36:13 -05:00
Dustin ceeb61cdb0 roles/homeassistant: Proxy ZwaveJS2Mqtt Web UI
ZwaveJS2Mqtt includes a very powerful web-based UI for configuring and
controlling the Z-Wave network.  This functionality is no longer
available within Home Assistant itself, so being able to access the
ZwaveJS2Mqtt UI is crucial to operating the network.

I wanted to make the UI available at */zwave/*, which requires using
*mod_rewrite* to conditionally proxy requests based on the `Connection`
HTTP header, since the UI passes both HTTP and WebSocket requests to the
same paths.  *mod_rewrite* configuration is not inherited from the main
server configuration to virtual hosts, so the
`RewriteRule`/`RewriteCond` directives have to be specified within the
`<VirtualHost>` block.  This means that the Home Assistant proxy
configuration has to be within its own virtual host, and the
ZwaveJS2Mqtt configuration has to be there as well.
2021-07-19 15:58:58 -05:00
Dustin 57b3039f2c roles/mosquitto: Update for Mosquitto 2.x
Mosquitto 2.x included two significant changes from 1.6:

* There is no longer a "default" listener; all listeners are configured
  in the same way
* The daemon drops privileges *before* reading TLS certificates and
  private keys
2021-07-19 15:58:58 -05:00
Dustin 3d9d7423ef hosts: Add stats0.p.b to prometheus group
Although configuration policy is not yet available for Prometheus
itself, the `collectd.yml` playbook also uses the *prometheus* host
group.  Specifically, hosts in this group are configured to receive
collectd data from other hosts and expose those data through the
`write_prometheus` plugin.
2021-07-05 09:34:25 -05:00
Dustin 5e61e93cea roles/grafana: Deploy Grafana
This commit introduces the *grafana* role and the corresponding
`grafana.yml` playbook.  The role installs Grafana using the system
package manager, and configures the server (including LDAP
authentication).
2021-07-02 21:47:33 -05:00
Dustin 6b9b87a406 roles/nextcloud: Configure outbound email
Since the Nextcloud configuration file is managed by the configuration
policy, all of the settings configurable through the web UI need to be
templated.  One important group of settings is the outbound email
configuration.  This can now be configured using the `nextcloud_smtp`
Ansible variable.
2021-06-25 11:12:38 -05:00
Dustin b86e0d8f29 roles/nextcloud: Switch to Fedora package
Fedora now includes a packaged version of Nextcloud.  This will be
_much_ easier to maintain than the tarball-based distribution method.
There are some minor differences in how the Fedora package works,
compared to the upstream tarball.  Notably, it puts the configuration
file in `/etc/` and makes it read-only, and it stores persistent data
separate from the application.  These differences require modifications
to the Apache and PHP-FPM configuration, but the package also included
examples to make this easier.  Since the `config.php` is read-only now,
it has to be managed by the configuration policy; it cannot be modified
by the Administration web UI.
2021-06-24 20:21:48 -05:00
Dustin bb6186b90e roles/mosquitto: Add role to deploy MQTT server
*Mosquitto* implements an MQTT server.  It is the recommended
implementation for using MQTT with Home Assistant.

I have added this role to deploy Mosquitto on the Home Assistant server.
It will be used to send data from custom sensors, such as the
temperature/pressure/humidity sensor connected to the living room wall
display.
2021-05-02 19:10:17 -05:00
Dustin 90421e77d2 pyrocufflink-dhcp: DHCP reservations for VM hosts
When there is a network issue that prevents DNS names from being
resolved, it can be difficult to troubleshoot.  For example, last night,
the Samba domain controller crashed, so *pyrocufflink.blue* names were
unavailable.  Furthermore, the domain controller VM was apparently
locked up, so I could not SSH into it directly, and it needed to be
rebooted.  Since the VM host's name did not resolve, I could not find
its address to log into it and reboot the VM.  I resorted to scanning
the SSH keys of every IP address on the network until I found the one
that matched the cached key in ~/.ssh/known_hosts.  This was cumbersome
and annoying.

Assigning DHCP reservations to the VM hosts will ensure that when a
situation like this arises again, I can quickly connect to the correct
VM host and manage its virtual machines, as its address is recorded in
the configuration policy.
2021-02-17 20:33:41 -06:00
Dustin ac516ce09d protonvpn: Switch to US-TX#5
US-IL#41 seems down...
2021-02-13 08:33:59 -06:00
Dustin 371305bed4 roles/synapse: Deploy the Matrix homeserver
The *synapse* role and the corresponding `synapse.yml` playbook deploy
Synapse, the reference Matrix homeserver implementation.

Deploying Synapse itself is fairly straightforward: it is packaged by
Fedora and therefore can simply be installed via `dnf` and started by
`systemd`.  Making the service available on the Internet, however, is
more involved.  The Matrix protocol mostly works over HTTPS on the
standard port (443), so a typical reverse proxy deployment is mostly
sufficient.  Some parts of the Matrix protocol, however, involve
communication over an alternate port (8448).  This could be handled by a
reverse proxy as well, but since it is a fairly unique port, it could
also be handled by NAT/port forwarding.  In order to support both
deployment scenarios (as well as the hypothetical scenario wherein the
Synapse machine is directly accessible from the Internet), the *synapse*
role supports specifying an optional `matrix_tls_cert` variable.  If
this variable is set, it should contain the path to a certificate file
on the Ansible control machine that will be used for the "direct"
connections (i.e. on port 8448).  If it is not set, the default Apache
certificate will be used for both virtual hosts.

Synapse has a pretty extensive configuration schema, but most of the
options are set to their default values by the *synapse* role.  Other
than substituting secret keys, the only exposed configuration option is
the LDAP authentication provider.
2020-12-30 21:54:02 -06:00
Dustin 1c575c4340 protonvpn: Connect to server by IP address
Since DNS is only allowed to be sent over the VPN, it is not possible to
resolve the VPN server name unless the VPN is already connected.  This
naturally creates a chicken-and-egg scenario, which we can resolve by
manually providing the IP address of the server we want to connect to.
2020-09-23 18:50:06 -05:00
Dustin 8ca093050b pyrocufflink-dns: Cloudflare over ProtonVPN
This commit adds a new playbook, `protonvpn.yml`, and its supporting
roles *strongswan-swanctl* and *protonvpn*.  This playbook configures
strongSwan to connect to ProtonVPN using IPsec/IKEv2.

With this playbook, we configure the name servers on the Pyrocufflink
network to route all DNS requests through the Cloudflare public DNS
recursive servers at 1.1.1.1/1.0.0.1 over ProtonVPN.  Using this setup,
we have the benefit of the speed of using a public DNS server (which is
*significantly* faster than running our own recursive server, usually by
1-2 seconds per request), and the benefit of anonymity from ProtonVPN.

Using the public DNS server alone is great for performance, but allows
the server operator (in this case Cloudflare) to track and analyze usage
patterns.  Using ProtonVPN gives us anonymity (assuming we trust
ProtonVPN not to do the very same tracking), but can have a negative
performance impact if it's used for all Internet traffic.  By combining
these solutions, we can get the benefits of both!
2020-09-06 11:06:58 -05:00
Dustin a7b8e2fbfa pyrocufflink-dhcp: Remove obsolete networks
* pyrocufflink.jazz has been gone for a while now
* tachyglossus.net DHCP is managed by the USG now
2020-09-06 10:40:27 -05:00
Dustin f536c9633e roles/named: Support logging queries to syslog
This commit adds two new variables to the *named* role:
`named_queries_syslog` and `named_rpz_syslog`.  These variables control
whether BIND will send query and RPZ log messages to the local syslog
daemon, respectively.
2020-09-06 10:40:27 -05:00