Since there are no other plain HTTP virtual hosts, the one defined for
chmod777.sh became the "default." Because it explicitly redirects all
requests to https://chmod777.sh, all non-HTTPS requests were redirected
there, regardless of the requested name. This was
particularly confusing for Tabitha, as she frequently forgets to put
https://…, and would find herself at my stupid blog instead of
Nextcloud.
The *esp4* kernel module does not load automatically on Fedora. Without
this module, strongSwan can establish IKE SAs, but not ESP SAs. Listing
the module name in a file in `/etc/modules-load.d` configures the
*systemd-modules-load* service to load it at boot.
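A minimal sketch of such a task (the drop-in file name is an
assumption):

```
- name: Load the esp4 module at boot
  copy:
    content: "esp4\n"
    dest: /etc/modules-load.d/esp4.conf
```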
I believe the reason the VPN was not auto-restarting was that I had
incorrectly specified the `keyingtries` and `dpd_delay` configuration options.
These are properties of the top-level connection, not the child. I must
have placed them in the `children` block by accident.
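For illustration, a sketch of where these options belong in
`swanctl.conf` syntax (the connection/child names and file path are
hypothetical):

```
- name: Define retry/DPD options at the connection level
  copy:
    dest: /etc/strongswan/swanctl/conf.d/vpn.conf
    content: |
      connections {
        vpn {
          keyingtries = 0   # 0 = retry forever
          dpd_delay = 30s
          children {
            vpn {
              # keyingtries/dpd_delay do not belong here
            }
          }
        }
      }
```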
The *cert* role must be defined as a role dependency now, so that the
role can define a handler to "listen" for the "certificate changed"
event. This change happened on *master*, before the *matrix* branch was
merged.
Graylog 3.3 is currently installed on logs0. Attempting to install the
*graylog-3.1-repository* package causes a transaction conflict, making
the task and playbook fail.
Before the advent of `ansible-vault`, and long before `certbot`/`lego`,
I used to keep certificate files (and especially private key files) out
of the Git repository. Now that certificates are stored in a separate
repository, and only symlinks are stored in the configuration policy,
this no longer makes any sense. In particular, it prevents the continuous
enforcement process from installing Let's Encrypt certificates that have
been automatically renewed.
The *websites/proxy-matrix* role configures the Internet-facing reverse
proxy to handle the *hatch.chat* domain. Most Matrix communication
happens over the default HTTPS port, and as such will be directed
through the reverse proxy.
The *synapse* role and the corresponding `synapse.yml` playbook deploy
Synapse, the reference Matrix homeserver implementation.
Deploying Synapse itself is fairly straightforward: it is packaged by
Fedora and therefore can simply be installed via `dnf` and started by
`systemd`. Making the service available on the Internet, however, is
more involved. The Matrix protocol mostly works over HTTPS on the
standard port (443), so a typical reverse proxy deployment is largely
sufficient. Some parts of the Matrix protocol, however, involve
communication over an alternate port (8448). This could be handled by a
reverse proxy as well, but since it is a fairly unique port, it could
also be handled by NAT/port forwarding. In order to support both
deployment scenarios (as well as the hypothetical scenario wherein the
Synapse machine is directly accessible from the Internet), the *synapse*
role supports specifying an optional `matrix_tls_cert` variable. If
this variable is set, it should contain the path to a certificate file
on the Ansible control machine that will be used for the "direct"
connections (i.e. on port 8448). If it is not set, the default Apache
certificate will be used for both virtual hosts.
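For example, a hypothetical play (the certificate path is an
assumption):

```
- hosts: synapse
  roles:
    - role: synapse
      matrix_tls_cert: certs/hatch.chat/hatch.chat.cer
```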
Synapse has a pretty extensive configuration schema, but most of the
options are set to their default values by the *synapse* role. Other
than substituting secret keys, the only exposed configuration option is
the LDAP authentication provider.
Since the *bitwarden_rs* role relies on Docker for distribution and process
management (at least for now), it needs to ensure that the `docker`
service starts automatically.
Because the various "webapp.*" users' home directories are under
`/srv/www`, the default SELinux context type is `httpd_sys_content_t`.
The SSH daemon is not allowed to read files with this label, so it
cannot load the contents of these users' `authorized_keys` files. To
address this, we have to explicitly set the SELinux type to
`ssh_home_t`.
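A sketch of how this might be done with Ansible (the path pattern is an
assumption):

```
- name: Set the SELinux type for webapp SSH directories
  sefcontext:
    target: '/srv/www/[^/]+/\.ssh(/.*)?'
    setype: ssh_home_t

- name: Apply the new file contexts
  command: restorecon -R /srv/www
```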
BIND sends its normal application logs (as opposed to query logs) to the
`default_debug` channel. By sending these log messages to syslog, they
can be routed and rotated using the normal system policies. Using a
separate dedicated log file just ends up consuming a lot of space, as it
is not managed by any policy.
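The corresponding `named.conf` fragment would look roughly like this
(shown with `blockinfile` for illustration only; the *named* role
actually generates `named.conf` from a template):

```
- name: Route named application logs to syslog
  blockinfile:
    path: /etc/named.conf
    block: |
      logging {
        // default_syslog is a built-in channel that writes to syslog
        category default { default_syslog; };
      };
```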
I am not sure what the point is of having both `ssl_request_log` and
`ssl_access_log`. The former includes the TLS ciphers used in the
connection, which is not particularly interesting information. To save
space on the log volume of web servers using Apache, we should just stop
creating this log file.
Changing/renewing a certificate generally requires restarting or
reloading some service. Since the *cert* role is intended to be generic
and reusable, it naturally does not know what action to take to effect
the change. It works well for the initial deployment of a new
application, since the service is reloaded anyway in order for the new
configuration to be applied. It fails, however, for continuous
enforcement, when a certificate is renewed automatically (i.e. by
`lego`) but no other changes are being made. This has caused a number
of disruptions, where a certificate expired and its replacement was
available but had not yet been loaded.
To address this issue, I have added a handler "topic" notification to
the *cert* role. When either the certificate or private key file is
replaced, the relevant task will "notify" a generic handler "topic."
This allows some other role to define a specific handler, which
"listens" for these notifications, and takes the appropriate action for
its respective service.
For this mechanism to work, though, the *cert* role can only be used as
a dependency of another role. That role must define the handler and
configure it to listen to the generic "certificate changed" topic. As
such, each of the roles that are associated with a certificate deployed
by the *cert* role now declare it as a dependency, and the top-level
playbooks only include those roles.
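A sketch of what a dependent role might define (the role name, paths,
and service are hypothetical):

```
# roles/myapp/meta/main.yml
dependencies:
  - role: cert
    cert_src: myapp.cer
    cert_dest: /etc/pki/tls/certs/myapp.cer

# roles/myapp/handlers/main.yml
- name: restart myapp
  service:
    name: myapp
    state: restarted
  listen: certificate changed
```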
The *collectd-prometheus* role configures the *write_prometheus* plugin
for collectd. This plugin exposes data collected or received by the
collectd process in the Prometheus Exposition Format over HTTP. It
provides the same functionality as the "official" collectd Exporter
maintained by the Prometheus team, but integrates natively into the
collectd process, and is much more complete.
The main intent of this role is to provide a mechanism to combine the
collectd data from all Pyrocufflink hosts and insert it into Prometheus.
By configuring the collectd instance on the Prometheus server itself to
enable and use the *write_prometheus* plugin and to receive the
multicast data from other hosts, collectd itself provides the desired
functionality.
For hosts with multiple network interfaces, collectd may not send
multicast messages through the correct interface. To ensure that it
does, the `Interface` configuration option can be specified with each
`Server` option. To define this option, entries in the
`collectd_network_servers` list can now have an `interface` property.
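For example (the `server` key and interface name are assumptions;
*ff18::efc0:4a42* is collectd's default IPv6 multicast group):

```
collectd_network_servers:
  - server: 'ff18::efc0:4a42'
    interface: eth0
```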
The *collectd* role, with its corresponding `collectd.yml` playbook,
installs *collectd* onto the managed node and manages basic
configuration for it. By default, it will enable several plugins,
including the `network` plugin. The `collectd_disable_plugins` variable
can be set to a list of names of plugins that should NOT be enabled.
The default configuration for the `network` plugin instructs *collectd*
to send metrics to the default IPv6 multicast group. Any host that has
joined this group and is listening on the specified UDP port (default
25826) can receive the data. This allows for nearly zero configuration,
as the configuration does not need to be updated if the name or IP
address of the receiver changes.
This configuration is ready to be deployed without any variable changes
to all Pyrocufflink servers. Once *collectd* is running on the servers,
we can set up a *collectd* instance to receive the data and store them
in a time series database (i.e. Prometheus).
Since Apache HTTPD does not have any built-in log rotation capability,
we need `logrotate`. Somewhere along the line, the *logrotate* package
stopped being installed by default. Additionally, with Fedora 30, it
changed from including a drop-in file for (Ana)cron to providing a
systemd timer unit.
The *logrotate* role will ensure that the *logrotate* package is
installed, and that the *logrotate.timer* service is enabled and
running. This in turn will ensure that `logrotate` runs daily. Of
course, since the systemd units were added in Fedora 30, machines to
which this role is applied must be running at least that version.
By listing the *logrotate* role as a dependency of the *httpd* role, we
can ensure that `logrotate` manages the Apache error and access log
files on any server that runs Apache HTTPD.
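i.e. in the *httpd* role's `meta/main.yml`:

```
dependencies:
  - role: logrotate
```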
By default, strongSwan will only attempt key negotiation once and then
give up. If the VPN connection is closed because of a network issue, it
is unlikely that a single attempt to reconnect will work, so let's keep
trying until it succeeds.
The *motioneye* role installs motionEye on a Fedora machine using `pip`.
It configures Apache to proxy for motionEye for outside (HTTPS) access.
The official installation instructions and default configuration for
motionEye assume it will be running as root. There is, however, no
specific reason for this, as it works just fine as an unprivileged user.
The only minor surprise is that the `conf_path` configuration setting
must be writable, as this is where motionEye places generated
configuration for `motion`. This path does not, however, have to
include the `motioneye.conf` file itself, which can still be read-only.
This commit adds a new playbook, `protonvpn.yml`, and its supporting
roles *strongswan-swanctl* and *protonvpn*. This playbook configures
strongSwan to connect to ProtonVPN using IPsec/IKEv2.
With this playbook, we configure the name servers on the Pyrocufflink
network to route all DNS requests through the Cloudflare public DNS
recursive servers at 1.1.1.1/1.0.0.1 over ProtonVPN. Using this setup,
we have the benefit of the speed of using a public DNS server (which is
*significantly* faster than running our own recursive server, usually by
1-2 seconds per request), and the benefit of anonymity from ProtonVPN.
Using the public DNS server alone is great for performance, but allows
the server operator (in this case Cloudflare) to track and analyze usage
patterns. Using ProtonVPN gives us anonymity (assuming we trust
ProtonVPN not to do the very same tracking), but can have a negative
performance impact if it's used for all Internet traffic. By combining
these solutions, we can get the benefits of both!
This commit adds two new variables to the *named* role:
`named_queries_syslog` and `named_rpz_syslog`. These variables control
whether BIND will send query and RPZ log messages to the local syslog
daemon, respectively.
BIND response policy zones (RPZ) support provides a mechanism for
overriding the responses to DNS queries based on a wide range of
criteria. In the simplest form, a response policy zone can be used to
provide different responses to different clients, or "block" some DNS
names.
For the Pyrocufflink and related networks, I plan to use an RPZ to
implement ad/tracker blocking. The goal will be to generate an RPZ
definition from a collection of host lists (e.g. those used by uBlock
Origin) periodically.
This commit introduces basic support for RPZ configuration in the
*named* role. It can be activated by providing a list of "response
policy" definitions (e.g. `zone "name"`) in the `named_response_policy`
variable, and defining the corresponding zones in `named_zones`.
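A hypothetical configuration activating a single blocking zone:

```
named_response_policy:
  - 'zone "rpz.pyrocufflink.blue"'
named_zones:
  - name: rpz.pyrocufflink.blue
```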
Normally, Home Assistant uses a SQLite database for storing state
history. On a Raspberry Pi with only an SD card for storage like
*hass1.pyrocufflink.blue*, this can become extremely slow, especially
for large data sets. To speed up features like history and logbook,
Home Assistant supports using an external database engine such as
PostgreSQL or MariaDB.
The *hassdb* role and corresponding `hassdb.yml` playbook deploy a
PostgreSQL server for Home Assistant to use. It needs only to create
the role and database, as Home Assistant manages its own schema.
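A minimal sketch of those tasks (names and password variable are
placeholders):

```
- name: Create the Home Assistant database role
  postgresql_user:
    name: hass
    password: "{{ hassdb_password }}"

- name: Create the Home Assistant database
  postgresql_db:
    name: hass
    owner: hass
```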
The *postgresql-setup* service is no longer necessary, as upstream has
fixed the SELinux policy to allow root to invoke the `postgresql-setup`
command directly.
This commit adds a task to generate a PostgreSQL configuration file from
a template. Previously, the default configuration file generated by
`initdb` was sufficient, but in order to enable SSL connections, some
changes to it are required.
Naturally, SSL connections require a server certificate, so the
*postgresql-server* role will now also copy certificate files to the
managed node, if any.
Fedora has renamed the *strongswan* service to *strongswan-starter*.
The *strongswan* service now controls strongSwan via Vici, which uses a
different configuration format and is not compatible with the files in
`/etc/strongswan/ipsec.d`. As I am migrating everything to WireGuard
now, it does not make sense to rewrite all of the IPsec configuration in
this new format, so using the legacy format with the renamed service
makes more sense.
Because the Home Assistant user's home directory is on `/var`, Python
packages installed in the "user site" do not get the correct SELinux
labels and thus run in the wrong domain. This causes a lot of AVC
denials and other issues that prevent Home Assistant from working
correctly.
To resolve this issue, Home Assistant is now installed in a virtual
environment at `/usr/local/homeassistant`. This directory is still
owned by the Home Assistant user, allowing Home Assistant to manage
packages installed there. Since it is rooted under `/usr`, files are
labelled correctly and processes launched from executables there will
run in the correct domain.
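A sketch of the installation task under this scheme (the user name is
an assumption):

```
- name: Install Home Assistant in a virtual environment
  pip:
    name: homeassistant
    virtualenv: /usr/local/homeassistant
    virtualenv_command: /usr/bin/python3 -m venv
  become: true
  become_user: homeassistant
```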
The main network, *pyrocufflink.blue* (172.30.0.0/26) is now on VLAN 1
instead of VLAN 30. This changed when I replaced the Cisco SG200-26
with the UniFi Switch 48, to simplify configuration of all of the
Ubiquiti devices.
For some time, I have been trying to design a new configuration for the
reverse proxy on port 443 to correctly handle all the types of traffic
on that port. In the original implementation, all traffic on port 443
was forwarded by the gateway to HAProxy. HAProxy then used TLS SNI to
route connections to the correct backend server based on the requested
host name. This allowed both HTTPS and OpenVPN-over-TLS to use the same
port; however, it was not without issues. A layer 4 (TCP) proxy like
this "hides" the real source address of clients connecting to the
backend, which makes IP-based security (e.g. rate limiting, blacklists,
etc.) impossible at the application level. In particular, Nextcloud,
which implements rate limiting, was constantly imposing login delays on
all users, because legitimate traffic was indistinguishable from
Internet background noise.
To alleviate these issues, I needed to change the proxy to operate in
layer 7 (HTTP) mode, so that headers like *X-Forwarded-For* and
*X-Forwarded-Host* could be added. Unfortunately, this was not easy,
because of the simultaneous requirement to forward OpenVPN traffic.
HAProxy can only do SNI inspection in TCP mode. So, I began looking for
an alternate way to proxy both HTTP and non-HTTP traffic on the same
port.
The HTTP protocol defines the `CONNECT` method, which is used by forward
proxies to tunnel HTTPS over plain HTTP. OpenVPN clients support
tunneling OpenVPN over HTTP using this method as well. HAProxy has
limited support for the CONNECT method (i.e. it doesn't do DNS
resolution, and I could find no way of restricting the destination) with
the `http_proxy` option, so I looked for alternate proxy servers that
had more complete support. Unsurprisingly, Apache HTTPD has the most
complete implementation of the `CONNECT` method (Nginx doesn't support
it at all). Using a name-based virtual host on port 443, Apache will
accept requests for *vpn.pyrocufflink.net* (using TLS SNI) and allow the
clients to use the `CONNECT` method to create a tunnel to the OpenVPN
server. This requires OpenVPN clients to a) use *stunnel* to wrap plain
HTTP proxy connections in TLS and b) configure OpenVPN to use the
TLS-wrapped HTTP proxy.
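A sketch of the relevant virtual host (the drop-in path and backend
port are assumptions, and access control is omitted):

```
- name: Configure the CONNECT tunnel for OpenVPN
  copy:
    dest: /etc/httpd/conf.d/vpn.pyrocufflink.net.conf
    content: |
      <VirtualHost *:443>
        ServerName vpn.pyrocufflink.net
        SSLEngine on
        # Act as a forward proxy on this vhost only
        ProxyRequests On
        # Only allow CONNECT tunnels to the OpenVPN port
        AllowCONNECT 1194
      </VirtualHost>
```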
With Apache accepting all incoming connections, it was trivial to also
configure it as a layer 7 reverse proxy for Bitwarden, Gitea, Jenkins,
and Nextcloud. Unfortunately, proxying for the other websites
(darkchestofwonders.us, chmod777.sh, dustin.hatch.name) was not quite as
straightforward. These websites would need to have an internal name
that differed from their external name, and thus a certificate valid for
that name. Rather than reconfigure all of these sites and set all of
that up, I decided to just move the responsibility for handling direct
connections from outside to *web0* and eliminate the dedicated
reverse proxy. This was not possible before, because Apache could not
forward the OpenVPN traffic directly, but now with the forward proxy
configuration, there is no reason to have a separate server for these
connections.
Overall, I am pleased with how this turned out. It makes the OpenVPN
configuration simpler (*stunnel* no longer needs to run on the OpenVPN
server itself, since Apache is handling TLS termination), eliminates a
network hop for the websites, makes the reverse proxy configuration for
the other web applications much easier to understand, and resolves the
original issue of losing client connection information.
A name-based HTTP (not HTTPS) virtual host for *pyrocufflink.net* is
necessary to ensure requests are handled properly, now that there is
another HTTP virtual host (chmod777.sh) defined on the same server.
This commit updates the configuration for *pyrocufflink.net* to use the
wildcard certificate managed by *lego* instead of a unique certificate
managed by *certbot*.
*chmod777.sh* is a simple static website, generated by Hugo. It is
built and published from a Jenkins pipeline, which runs automatically
when new commits are pushed to Gitea.
The HTTPS certificate for this site is signed by Let's Encrypt and
managed by `lego` in the `certs` submodule.
This commit adds front-end and back-end configuration for HAProxy to
proxy HTTP/HTTPS for
*nextcloud.pyrocufflink.net*/*nextcloud.pyrocufflink.blue* to
*cloud0.pyrocufflink.blue*.
The *nextcloud* role installs Nextcloud from the specified release
archive, downloading it to the control machine first if necessary, and
configures Apache and PHP-FPM to serve it.
The `nextcloud.yml` playbook uses the *cert* role to install the X.509
certificate for the Nextcloud server, sets up Apache HTTPD with the
*apache* role, and installs Nextcloud using the *nextcloud* role.
The host *cloud0.pyrocufflink.blue* is the Nextcloud server for
Pyrocufflink.
The *cert* role is intended to be a generic, reusable role to copy an
X.509 certificate and/or private key file to managed nodes. It is
intended to be included in a playbook with at least the `cert_src` and
`cert_dest` variables defined, e.g.:
```
- hosts: whatever
  roles:
    - role: cert
      cert_src: whatever.cer
      cert_dest: /path/to/whatever.cer
```
The `haproxy_ssl_default_bind_options` variable is not defined for
machines running Fedora, because this parameter is not used in the
default configuration file there.
I seem to have forgotten how I got the RPM for Gitea. I think I built
it, but I cannot find the spec file, nor the RPM package. Since this is
clearly not reproducible, I decided to switch to using the binary
provided by upstream for now, until either I or Fedora get around to
making a better RPM.
Installing Gitea from the upstream binary is simple: just download it
and copy it to `/usr/local/bin`. Of course, the OS user and systemd
unit have to be managed by configuration policy when it's installed this
way.
*burp1.pyrocufflink.blue* will replace *burp0.pyrocufflink.blue* as the
BURP server for Pyrocufflink. It is a physical machine (Fitlet), making
it simpler to manage the USB drives. The old virtual machine will be
decommissioned soon.
Ansible replaced the `version_compare` filter with a `version_compare`
test that does the same thing. The former is completely gone now,
causing the template to fail to render, so the template's use of the
filter needs to be updated.
Using the generic *burp.pyrocufflink.blue* name will allow easier
transition to a new BURP server. However, since this is not the actual
name, it cannot be used for task delegation, so a separate variable is
required to store the real name of the BURP server. This is only used
during client deployment, and not by BURP itself.
The *graylog* role installs Graylog from the *graylog2.org* Yum
repository and manages basic server configuration. It augments the
default systemd unit to provide the `CAP_NET_BIND_SERVICE` capability to
the Graylog server process via ambient capabilities, thereby allowing
the server to bind to the privileged Syslog UDP port.
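The unit extension amounts to something like this (the drop-in file
name is an assumption):

```
- name: Allow Graylog to bind privileged ports
  copy:
    dest: /etc/systemd/system/graylog-server.service.d/capabilities.conf
    content: |
      [Service]
      AmbientCapabilities=CAP_NET_BIND_SERVICE
```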
The `Alias` directive for Certbot needs to appear before any other
locations, to ensure the `/.well-known` path is always served from
the local filesystem. If another drop-in configuration file (e.g.
`bitwarden.conf`) is ordered before it, it may override this
configuration and prevent Let's Encrypt from working.
In order to allow Jenkins to connect to the Docker daemon socket, the
socket must be owned by the *docker* group, and the *jenkins* user must
be a member of it.
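e.g.:

```
- name: Add the jenkins user to the docker group
  user:
    name: jenkins
    groups: docker
    append: true
```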
This commit adds an HAProxy backend for Bitwarden, and adds ACL rules to
the frontend to proxy traffic to *bitwarden.pyrocufflink.blue* or
*bitwarden.pyrocufflink.net* to it.
Since the same certificate is used for LDAPS and RADIUS (EAP-TLS), it
makes more sense to store it only once, with the latter file as a symlink
to the former.
This commit configures *bw0.pyrocufflink.blue* as a BURP client, so that
the Bitwarden data can be backed up. A pre-backup script is used to
take a consistent snapshot of the SQLite database before copying it to
the BURP server.
The BURP server runs as user *burp*, and as such, requires that the
client-specific configuration files be owned by that user so they can be
read when a client connects.
Newer versions of the BURP client require `status_port` to be set. This
commit updates the `burp.conf.j2` template to more closely match the
default configuration shipped with the *burp* package, including setting
this new value.
Newer versions of Gitea need a JWT secret for OAuth2. Gitea will
attempt to generate one at startup if it is not already specified in the
configuration file, but this will fail since the file is not writable by
the user running the service. As such, it must be set via configuration
policy.
The point of the "wheel host" is to serve as a repository of Python
packages (wheels) built by Jenkins for consumption by `pip` et al. For
applications and libraries that do not provide all of their dependencies
as binary packages, this makes a convenient way to install them without
requiring all of the build tools and dependencies on the destination
machine.
The idea here is that a Jenkins job runs `pip wheel` for a distribution
package name or `requirements.txt` file and then uploads the resulting
wheel files using `rsync`. Apache is configured to serve the upload
directory with an index compatible with `pip`'s `--find-links`.
The *hass-dhcp* role installs dnsmasq and configures it to serve DHCP
requests on the Home Assistant network. Since this network is not
routed, the regular DHCP relay/server setup will not work.
This commit adds a systemd unit to enable the Kernel Same-page Merging
daemon on VM hosts. This allows much greater virtual machine density,
especially when many VMs are running the same guest OS.
Debian, of course, does not support system-wide SSL cipher suite profiles,
so these options need to be specified explicitly when deploying HAProxy
on Debian-based machines.
This commit updates the net-ifaces scripts for both *vmhost0* and
*vmhost1* to create VLAN and bridge interfaces for the Management and
Home Assistant networks.
The *taiga* role installs the three components of Taiga:
* taiga-back
* taiga-events
* taiga-front
*taiga-back* is a Python application. Its dependencies are installed via
`pip` in the *taiga* user's site-packages, and the application itself is
installed by unpacking the archive. *taiga-events* is a Node.js
application. Its dependencies are installed by `npm`, and is itself
installed by unpacking the archive. Finally, *taiga-front* is a
single-page browser application that is installed by unpacking the
archive, and served by Apache.
Taiga requires PostgreSQL and RabbitMQ.
The *websites/pyrocufflink.net* role configures the public web server to
host *pyrocufflink.net*. This site has two functions:
* It redirects `/` to http://dustin.hatch.name/
* It proxies user home directories (i.e. /~dustin/) to the file server
The `ServerName` directive needs to be set inside the default SSL vhost,
as this property does not get inherited from the global configuration,
and it needs to be set in order for SNI to work correctly.
By default, per-user directories (i.e. `/~username/`) are disabled in
Fedora's configuration of Apache. This commit introduces a new variable,
`apache_userdir`, which can be used to enable this feature. It should be
set to a string other than *disabled*, which is the path under users'
home directories that will be served, if it is accessible. Normally, the
value would be `public_html`.
To support multiple websites with separate Let's Encrypt certificates,
the *certbot* role needs to be applied as a dependency of each
individual website role. This will allow each application to specify a
different value for `certbot_domains`.
The default systemd unit configuration for *certbot-renew.service* runs
the `certbot renew …` command as root. This can cause permissions
issues, since this Ansible role expects the *certbot* user to be able to
access all configuration, data, and log files. As such, this commit adds
a systemd unit extension for *certbot-renew.service* to run the command
as *certbot*.
Moving the route definitions to global scope, and defining an address
pool, will allow other clients besides *dhatch-d4b* to connect to and
use the OpenVPN tunnel service. This may be useful in situations where
IPsec is blocked.
The VPN capability of the UniFi Security Gateway is extremely limited.
It does not support road-warrior IPsec/IKEv2 configuration, and its
OpenVPN configuration is inflexible. As with DHCP, the best solution is
to simply move the service to another machine.
To that end, I created a new VM, *vpn0.pyrocufflink.blue*, to host both
strongSwan and OpenVPN. For this to work, the necessary TCP/UDP ports
need to be forwarded, of course, and all of the remote subnets need
static routes on the gateway, specifying this machine as the next hop.
Additionally, ICMP redirects need to be disabled, to prevent confusing
the routing tables of devices on the same subnet as the VPN gateway.
The routes to FireMon networks are now defined using the
`firemon_networks` Ansible variable. The global `iroute` and
client-specific `route` options are generated from the CIDR blocks
specified in this list.
The *aria2* role installs the *aria2* download manager and sets it up to
run as a system service with RPC enabled. It also sets up the web UI,
though that must be installed manually from an archive, for now.
For hosts that do not have any TSIG keys, the `named_keys` variable
still must be defined (to an empty iterable) in order for the template
to expand properly.
To avoid having a single point of failure, a second recursive DNS server
is necessary. This will be useful in cases where the VM hosts must both
be taken offline, but Internet access is still required.
The new server, *dns1.pyrocufflink.blue*, has all the same zones defined
as the original. It forwards the *pyrocufflink.blue* zone and
corresponding reverse zones to the domain controllers, and acts as a
slave for the *pyrocufflink.red* zone.
*smtp1.pyrocufflink.blue* is a VM that will replace
*smtp0.pyrocufflink.blue*, a Raspberry Pi.
I decided that there is little use in having the availability guarantee of
a discrete machine for the SMTP relay. The only system that would NEED
to send mail if the VM host fails is Zabbix, which operates as its own
relay anyway. As such, the main relay can be a VM, and the Raspberry Pi
can be repurposed as a recursive DNS server.
The *koji-web* role installs and configures the Koji Web GUI front-end
for Koji. It requires Apache and mod_wsgi. A client certificate is
required for authentication to the hub, and must be placed in the
host-specific subdirectory of `certs/koji`.
The *koji-client* role is a generic role that can be used to configure
the Koji client library/`koji` CLI tool. By default, it manages the
default configuration at `/etc/koji`, but by using the
`koji_client_dir`, `koji_client_user`, and `koji_client_id` variables,
it can be used to configure per-user client configuration as well.
The *kojira* role sets up the Koji repository agent to manage
repository metadata for build tags. It runs as a daemon, usually on the
same machine as the Koji hub. A client certificate is required for
authentication, and must be supplied by placing it in the
`certs/koji/{{ inventory_hostname }}` directory.
The *koji-gc* role sets up the Koji garbage collector utility to run
periodically. It uses cron for scheduling. A client certificate is
required for authentication, and must be supplied by placing it in the
`certs/koji/{{ inventory_hostname }}` directory.
The minimal Fedora installation does not include a cron implementation.
The *cronie* role can be applied to hosts installed in this way to
ensure that cron is available for task scheduling.
The *burp-client* role installs and configures a BURP client. It should
support RHEL/CentOS/Fedora and Gentoo.
To manage the client password and other server-mandated configuration,
the role uses Ansible's delegation feature to generate a configuration
file in the "clientconfdir" on the BURP server.
A cron task is scheduled to run `burp -a t` every hour. This
allows the server to configure backup timebands and intervals.
The *burp-server* role installs and configures a BURP server. It is
adapted from a previous iteration, and should support CentOS/RHEL/Fedora
and Gentoo, as well as both BURP 1.x and 2.x (depending on which version
gets installed by the system package manager).
To manage the certificate authority, the *burp-server* role uses the
`burp_ca` command. This has the advantage of not requiring any external
certificate management, but effectively binds the CA to a specific
machine.
The value of the `shlib_directory` parameter depends on the system
architecture.
Specifically, x86_64 machines use `/usr/lib64/postfix`, while everything
else uses `/usr/lib/postfix`. This role was originally deployed on a
Raspberry Pi, so the original path was correct. Attempting to deploy it
on an x86_64 machine revealed the error.
This commit adds a new task that loads a variables file based on the
architecture. Each option defines an `arch_libdir` variable, which can
be expanded in the `postfix_shlib_directory` variable as needed.
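A sketch of the mechanism (the vars file layout is an assumption):

```
- name: Load architecture-specific variables
  include_vars: "{{ ansible_architecture }}.yml"

# vars/x86_64.yml:
#   arch_libdir: lib64
# vars/armv6l.yml:
#   arch_libdir: lib
```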
*myala.pyrocufflink.jazz* no longer hosts any public-facing websites,
and is in fact shut down. To prevent HAProxy from failing to start
because it cannot resolve the name, this backend needs to be removed.
The *fileserver* role configures Samba as a file sharing server. It uses
the *samba* role to handle cross-distribution installation of Samba
itself, and is focused primarily on configuring shared folders.
This commit adds an *after* ordering dependency on the network device
unit to the *wait-global-address@.service* template unit. Without this
dependency, the service will wait forever for a global address if the
device does not exist. With the dependency, though, if the device does
not appear within the default timeout, the wait service will never
start, causing all dependent services to fail, but allowing the boot
process to continue.
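The added dependency looks roughly like this (shown as a drop-in for
illustration; the device unit name follows the systemd convention):

```
- name: Order the wait service after its network device
  copy:
    dest: /etc/systemd/system/wait-global-address@.service.d/device.conf
    content: |
      [Unit]
      After=sys-subsystem-net-devices-%i.device
```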
The *websites/darkchestofwonders.us* role prepares a machine to host
http://darkchestofwonders.us/. The website itself is published via rsync
by Jenkins.
The *websites/dustin.hatch.name* role configures a server to host
http://dustin.hatch.name/. The role only applies basic configuration;
the actual website application is published by Jenkins.
The Apache service needs to be reloaded after the *certbot* role
configures it to serve the `/.well-known/acme-challenge` path, so that
the changes can take effect before the `certbot` command is run to
request the certificate(s).
If another role that depends on the *apache* role accidentally creates
an invalid configuration, it will be impossible to correct it by
subsequent invocations of its playbook. This is because the *apache*
role always tries to start the service, which will fail if the
configuration is invalid, thus aborting the playbook. With this early
abort, there is no way for later tasks to correct the error.
Playbooks that include the *apache* role should have a task that is
executed after all the roles have been applied to ensure the service is
running.
The *dch-storage-net* role configures a machine to connect to the
storage network and mount shared folders from the storage appliance.
The `wait-global-address.sh` script and corresponding
*wait-global-address@.service* systemd unit template are necessary to
ensure that the storage network is actually available before attempting
to mount the shared volumes. This is particularly important at boot,
since `dhcpcd` does not implement any kind of signaling that can be used
by *network-online.target*, so the network is considered "online" as
soon as the `dhcpcd` process has started. This typically results in
"network unreachable" errors.
The *net-ifaces* role manages a script that creates virtual network
interfaces, such as bridge, bond, and VLAN, that `dhcpcd`/`dhclient`
alone cannot. This provides a lightweight alternative to
*systemd-networkd* and *NetworkManager*.
Though the default for the `fqdn` value is listed as `both` in
*dhcpcd.conf(5)*, the current behavior of `dhcpcd` suggests that it may
actually be `none`. Without explicitly setting `fqdn both`, the value of
the kernel node name is sent as-is in the *hostname* option (12). If the
node name is set to the FQDN, then dynamic DNS gets broken, since the
DHCP server always appends its domain name to the provided hostname.
Setting `fqdn both` causes `dhcpcd` to send the FQDN in the *FQDN*
option (81), which the DHCP server interprets correctly.
Using a list to specify the values for the `allowinterfaces` and
`denyinterfaces` parameters in `dhcpcd.conf` makes the configuration
policy cleaner and more type-safe.
Today I realized that `dhcpcd` has been logging several hundred thousand
of these messages every second:

    libudev: received NULL device

This was causing both `dhcpcd` and `systemd-journald` to consume 100%
CPU.
I am not entirely sure what a "device management" module is in the
context of `dhcpcd`, but it does not seem to be required. Setting the
`nodev` option in `dhcpcd.conf` suppresses the messages, and seems to
have no effect on the operation of the daemon.
Traffic from the management network is not allowed except for specific
services. NTP is required, of course, for time synchronization with the
pyrocufflink.blue domain controllers. RADIUS is necessary for WiFi
authentication, which is also handled by the DCs.
The `ifconfig` global directive specifies the IP address added to the
tunnel interface device, not the network. The `push route` directives
need to include this address to correctly send route information to
clients.
The *dch-openvpn-server* role installs and configures OpenVPN and
stunnel to provide both native OpenVPN service as well as
OpenVPN-over-TLS. The latter uses stunnel, listening on TCP port 9876,
to allow better firewall traversal and TCP port sharing via reverse
proxy.
The `apache_server_tokens` variable can now be set, which controls the
value of the `ServerTokens` directive. If the variable is set, the
`ServerTokens` directive will be added to the `00-servername.conf` file.
The `samba_interfaces` variable can now be defined to populate the
`interfaces` global configuration parameter in `smb.conf`. This
parameter controls the interfaces or addresses to which the Samba server
binds, and also the IP addresses that are registered in DNS.
The *certbot* role now supports copying the data for an existing Let's
Encrypt account to the managed node using an archive. If an archive
named for the inventory hostname (typically the FQDN) of the managed
node is found in the `accounts` directory under the `files` directory of
the *certbot* role, it will be copied to the managed node and extracted
at `/var/lib/letsencrypt/accounts`. This takes the place of running
`certbot register` to sign up for a new account.
The *install* tag is applied to any task that installs a package.
The *user* tag is applied to any task that creates an OS user or group.
The *group* tag is applied to any task that creates an OS user group.
For machines that do not use firewalld, the *zabbix-agent* role will now
skip attempting to open the Zabbix agent port using the `firewalld`
module. The `host_uses_firewalld` variable controls this behavior.
The *certbot* role installs and configures the `certbot` ACME client. It
adjusts the default configuration to allow the tool to run as an
unprivileged user, and then configures Apache to work with the *webroot*
plugin. It registers for an account and requests a certificate for the
domains specified by the `certbot_domains` Ansible variable. Finally, it
enables the *certbot-renew.timer* systemd unit to schedule automatic
renewal of all Let's Encrypt certificates.
The *dch-proxy* role sets up HAProxy to provide a reverse proxy for all
public-facing web services on the Pyrocufflink network. It uses the TLS
Server Name Indication (SNI) extension to determine the proper backend
server based on the name requested by the client.
For now, only Gitea is configured; the name *git.pyrocufflink.blue* is
proxied to *git0.pyrocufflink.blue*. All other names are proxied to
Myala.
The *haproxy* role installs HAProxy and sets up basic configuration for it.
It configures the systemd unit to launch the service with the `-f
/etc/haproxy` arguments, which will cause it to load all files from the
`/etc/haproxy` directory, instead of just `/etc/haproxy/haproxy.cfg`.
This will allow other roles to add frontend and backend configuration by
adding additional files to this directory.
The *sshd* role can be used to configure the OpenSSH daemon. It supports
configuring a few options globally, as well as a limited set of options
in `Match` blocks (e.g. per-user/group configuration).
The `trustca` role can be used to add CA certificates to the system
trust store. It requires a variable, `ca`, to be defined, referring to
the name of a file containing a CA certificate to install.
The `gitea_ssh_domain` and `gitea_http_domain` variables can be used to
configure the host portion of the URLs for cloning Git repositories over
SSH and HTTPS, respectively. By default, both values are the FQDN of the
machine hosting Gitea.
The *gitea* role installs Gitea using the system package manager and
configures Apache as a reverse proxy for it.
The configuration file requires a number of "secret" values that need to
be unique. These must be specified as Ansible variables:
* `gitea_internal_token`
* `gitea_secret_key`
* `gitea_lfs_jwt_secret`
The `gitea generate` command can be used to create these values.
Normally, Gitea expects to run its own setup tool to generate the
configuration file and create the administrative user. Since the
configuration file is generated from the template instead, no
administrative user is created automatically. Luckily, the `gitea`
command includes a tool to create users, so the administrator can be
created manually, e.g.:
    sudo -u gitea gitea admin create-user -c /etc/gitea/app.ini \
        --admin \
        --name giteadmin \
        --password giteadmin \
        --email giteadmin@example.org
In order to enable LDAPS/STARTTLS support in Samba, the `tls enabled`
option must be set to `yes` and the `tls keyfile` and `tls certfile`
options must be set to the path of the private key and certificate
files, respectively, that Samba will use. The `samba_tls_enabled`,
`samba_tls_keyfile`, and `samba_tls_certfile` Ansible variables can be
used to control these values.
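For example (paths hypothetical):

```
samba_tls_enabled: yes
samba_tls_keyfile: /etc/pki/tls/private/samba.key
samba_tls_certfile: /etc/pki/tls/certs/samba.crt
```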
The `socket options` directive does not need to be specified in
`smb.conf`. I think I copied it from an example many years ago, and
never bothered to remove it. It is definitely not required, most likely
not helping performance at all, and most likely hindering it.
This commit adjusts the firewall and networking configuration on dc0 to
host the Pyrocufflink remote access IPsec VPN locally instead of
forwarding it to the internal VPN server.
The *dch-vpn-server* role configures strongSwan to act as an IPsec
responder for `vpn.pyrocufflink.net` and provide an IKEv2/IPsec VPN for
remote access clients, as well as the reverse VPN to FireMon.
The *strongswan* role is intended to be used as a dependency of other
roles that use strongSwan for IPsec configuration. It deploys some basic
configuration and configures the *strongswan* service, but does not
configure any connections, secrets, etc.
Using `state=absent` with the `file` module in a `with_items` loop to
delete the "default" module and site configuration files and the example
certificates is incredibly slow. Especially on the Raspberry Pi, it can
take several minutes to apply this role, even when there are no changes
to make. Using the `command` module and running `rm` to remove these
files, while not as idempotent, is significantly faster. The main
drawback is that each item in the list is not checked, so new items to
remove have to be added to the end of the list instead of in
alphabetical order.
The *freeradius* role is used to install and configure FreeRADIUS. The
configuration system for it is extremely complicated, with dozens of
files in several directories. The default configuration has a plethora
of options enabled that are not needed in most cases, so they are
disabled here. Since the initial (and perhaps only) use case I have for
RADIUS is WiFi authentication via certificates, only the EAP-TLS
mechanism is enabled currently.
The *postfix* role installs and configures the Postfix MTA. It currently
supports a number of modes, including direct transfer and relay. Relay
mode supports STARTTLS security and PLAIN authentication.
Since the location of the configuration drop-in directory can vary by
distribution, it is important to expand the `zbx_agent_config_dir`
variable in the `Include` parameter.
The *zabbix-agent* role installs the Zabbix monitoring agent on the
managed node, and sets it up to communicate with the Zabbix server
specified by the `zabbix_server` variable. This role "should" be
compatible with most distributions; it has been tested with Fedora and
Gentoo.
The *zabbix-server* role deploys the Zabbix server database, daemon, and
web interface. It requires the *apache* role to configure Apache HTTPD
to serve the web UI.
The *apache* role installs and configures the Apache HTTPD server and
its *mod_ssl* module. It currently only works on Fedora/RHEL-based
distributions.
The `ad` identity mapper backend is apparently the only one that can
use shell, home directory, etc. attributes from the directory now (as of
Samba 4.6).
The *ssh-hostkeys* role is used to manage the global SSH host key
database. This file is consulted by the `ssh` command when verifying
remote host keys on first connect. If the host key is found here, it is
copied to the user's host key database file without prompting for
verification.
The *jenkins-slave* role prepares a host to have the Jenkins slave
agent deployed on it. Deploying the agent itself is done by the Jenkins
master, through the web UI.
The service principal name added to `/etc/krb5.keytab` had a trailing
`}` character because of a typo in the Ansible task. This resulted in
GSSAPI authentication failing because server processes could not find
the host key in the key table.
This commit introduces a new role, *hostname*, that is used by the
`hostname.yml` playbook to set the hostname. It also writes
`/etc/hosts` using a template.
It is occasionally necessary to advertise multiple prefixes on the same
interface, particularly when those prefixes are not on-link. The *radvd*
role thus now expects each item in the `radvd_interfaces` list to have a
`prefixes` property, which itself is a list of prefixes to advertise.
Prefixes can specify properties such as `on_link`, `autonomous`,
`preferred_lifetime`, etc.
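A hypothetical definition advertising two prefixes, one of them
off-link (the `name` property is an assumption):

```
radvd_interfaces:
  - name: eth0
    prefixes:
      - prefix: '2001:db8:0:1::/64'
        on_link: true
      - prefix: '2001:db8:0:2::/64'
        on_link: false
        autonomous: true
```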
Marking packets matching port-forwarding rules, and then allowing
traffic carrying that mark, did not seem to work well. Often, packets
seemed to get dropped for no apparent reason, and outside connections to
NAT'd services were sometimes slow as a result. Explicitly listing every
destination host/port in the `forward` table seems to resolve this
issue.
The *filter* table is responsible for deciding which packets will be
accepted and which will be rejected. It has three chains, which classify
packets according to whether they are destined for the local machine
(input), passing through this machine (forward) or originating from the
local machine (output).
The *dch-gw* role now configures all three chains in this table. For
now, it defines basic rules, mostly based on TCP/UDP destination port:
* Traffic destined for a service hosted by the local machine (DNS, DHCP,
SSH), is allowed if it does not come from the Internet
* Traffic passing through the machine is allowed if:
* It is passing between internal networks
* It is destined for a host on the FireMon network (VPN)
* It was NATed to an internal host (marked 323)
* It is destined for the Internet
* Only DHCP, HTTP, and DNS are allowed to originate from the local
machine
This configuration requires an `internet_iface` variable, which
indicates the name of the network interface connected to the Internet
directly.
`dhcpcd` needs to start after the `network` service has started, as the
latter creates the interfaces to which the former needs to delegate IPv6
prefixes.
The *nftables* role handles installation and basic configuration of the
userspace components for nftables.
Note that this role currently only works on Fedora, and requires
*nftables* 0.8 or later for wildcard includes.
The *networking* service, which is actually a legacy init script, is
provided by the *initscripts* package on RHEL and its derivatives. This
service needs to be running in order for the configuration generated by
the *rhel-network* role to be applied to the managed node.
The `network.yml` playbook is used to configure the network interfaces
on a managed node. Currently, it only supports the Red Hat configuration
style (i.e. `/etc/sysconfig/network-scripts/ifcfg-*` files).
The *samba-dc* role now configures `winbindd` on domain controllers to
support identity mapping on the local machine. This will allow domain
users to log into the domain controller itself, e.g. via SSH.
The Fedora packaging of *samba4* still has some warts. Specifically, it
does not have a proper SELinux policy, so some work-arounds need to be
put into place in order for confined processes to communicate with
winbind.
The *samba* role provides general configuration for Samba. Other roles
will provide configuration for specific features such as Active
Directory membership, file shares, etc.
The *system-auth* role deploys PAM configuration for system-wide user
authentication. It is specifically focused on Active Directory
authentication using Samba/Winbind.
The *nsswitch* role can be used to configure the name service switch on
glibc-based distributions, including Gentoo, Fedora, and CentOS. It is
specifically focused on Active Directory authentication via
Samba/Winbind.
Only *master* zones need zone files pre-populated, as the other types of
zones are populated by data that `named` receives from queries and transfers.
Other types of zones require other options, however, to be usable. This
commit introduces minimal support for specifying *slave*, *forward*, and
*stub* zones.
Items in the `allow_update` property can use the address match list
syntax to specify arbitrary restrictions, including TSIG key names.
There is really no need for a special case for key names.
To support signing of updates, TSIG keys can be defined using the
`named_keys` variable. This variable takes a list of objects with the
following properties:
* `name`: The name of the key
* `algorithm`: The signature algorithm (default: `hmac-md5`)
* `secret`: The base64-encoded key material
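For example (the key name and secret are placeholders):

```
named_keys:
  - name: ddns-key
    algorithm: hmac-sha256
    secret: "{{ vault_ddns_key_secret }}"
```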
The *named* role now supports generating configuration for authoritative
DNS zones and TSIG keys. Zones are defined by populating the
`named_zones` variable with a list of objects describing the zone. Zone
properties can include:
* `name`: The DNS domain name
* `type`: The zone type, defaults to `master`
* `allow_update`: A list of hosts/networks or TSIG key names (which
must be specified as an object with a `key` property)
* `update_policy`: A list of BIND update policy statements
* `ttl`: The default (minimum) TTL for the zone
* `origin`: The authoritative name server for the zone
* `refresh`, `retry`, `expire`: Record cache timeout values
* `default_records`: A list of default records, defined as objects with
the following properties:
* `name`: The RR name
* `type`: The RR type (default: `A`)
* `value`: The RR value
Zone files will be created in `/var/named/dynamic`. Existing zone files
will **not** be overwritten; management of zone records is done using
`nsupdate` or similar.
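A hypothetical zone definition using these properties:

```
named_zones:
  - name: example.net
    ttl: 3600
    allow_update:
      - key: ddns-key
    default_records:
      - name: www
        value: 192.0.2.10
```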
This commit adjusts the tasks in the *samba-dc* role to use a
conditional include to restrict tasks relating to the BIND9_DLZ plugin
only to hosts that are configured to use it.
The *bind-utils* package contains `dig` and `nsupdate`, which are used
to query and manage DNS records.
The *cyrus-sasl-gssapi* package contains the GSSAPI plugin for
SASL-aware applications, including `ldapsearch`.
The *ldb-tools* package contains `ldbsearch` and other tools for
directly using Samba database files.
The `/var/lib/samba/bind-dns` directory contains files that are
hard-linked to files in the `/var/lib/samba/private` directory. All
paths for a file must have the same context, or `restorecon` will
effectively "toggle" the labels each time it is run.
Evidently, some files in `/var/lib/samba` match multiple file context
rules. Thus, when running `restorecon` against the entire
`/var/lib/samba` directory, files in the `bind-dns` subdirectory may end
up with the wrong label. To work around this issue, `restorecon` is now
run only on that subdirectory to ensure the correct labels are applied.
This is likely to cause problems when a full filesystem relabel is
scheduled.
The `named` daemon does not seem to pick up all changes to the
configuration file during a graceful reload. To avoid strange states,
the daemon is now fully restarted after the configuration file is
regenerated.
The `allow-update` block in `named.conf` enumerates the hosts/networks
that are allowed to issue dynamic DNS updates. This is required in
Active Directory and other environments where clients and/or DHCP
servers create DNS records automatically.
By default, the block is omitted from the generated configuration file.
The `named_allow_update` variable can be set to a list of patterns (e.g.
CIDR blocks, ACL names, etc.) to populate it.
The *samba-dc* role now supports joining an existing Active Directory
domain as an additional domain controller. The `samba_is_first_dc` variable
controls whether the machine will be provisioned with a new domain (when
true) or added to an existing domain (when false).
Joining an existing domain naturally requires credentials of a user with
permission to add a new DC; the `samba_dc_join_username` and
`samba_dc_join_password` variables can be used to specify them.
Alternatively, if these variables are not defined, then the process will
attempt to use Kerberos credentials. This would require playbooks to
make a ticket-granting-ticket available somehow, such as by executing
`kinit` prior to applying the *samba-dc* role.
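e.g., for an additional domain controller:

```
samba_is_first_dc: false
samba_dc_join_username: Administrator
samba_dc_join_password: "{{ vault_dc_join_password }}"
```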
The *named* role configures the BIND DNS server on managed nodes. It
writes `/etc/named.conf`, using a template that supports most of the
commonly-used options. The configuration can be augmented by other
templates, etc. by specifying file paths in the `named_options_include`
or `named_global_include` variables, both of which are lists.
The *samba-dc* role installs Samba on the managed node and configures it
as an Active Directory Domain Controller. A custom module,
`samba_domain`, handles the provisioning using `samba-tool domain
provision` in an idempotent way.
The *kerberos* role configures the MIT Kerberos library. Specifically,
it creates `/etc/krb5.conf` and populates it with some basic default
options. It also creates the `/etc/krb5.conf.d` directory, into which
other roles can write additional configuration files.
The *base* role performs the basic tasks needed to manage a node using
Ansible. Specifically, it installs the necessary packages for
manipulating SELinux policy.