This commit adds an *after* ordering dependency on the network device
unit to the *wait-global-address@.service* template unit. Without this
dependency, the service will wait forever for a global address if the
device does not exist. With the dependency, though, if the device does
not appear within the default timeout, the wait service will never
start, causing all dependent services to fail, but allowing the boot
process to continue.
The *websites/darkchestofwonders.us* role prepares a machine to host
http://darkchestofwonders.us/. The website itself is published via rsync
by Jenkins.
The *websites/dustin.hatch.name* role configures a server to host
http://dustin.hatch.name/. The role only applies basic configuration;
the actual website application is published by Jenkins.
The Apache service needs to be reloaded after the *certbot* role
configures it to serve the `/.well-known/acme-challenge` path, so that
the changes can take effect before the `certbot` command is run to
request the certificate(s).
If another role that depends on the *apache* role accidentally creates
an invalid configuration, it will be impossible to correct it by
subsequent invocations of its playbook. This is because the *apache*
role always tries to start the service, which will fail if the
configuration is invalid, thus aborting the playbook. With this early
abort, there is no way for later tasks to correct the error.
Playbooks that include the *apache* role should have a task that is
executed after all the roles have been applied to ensure the service is
running.
The *dch-storage-net* role configures a machine to connect the storage
network and mount shared folders from the storage appliance.
The `wait-global-address.sh` script and corresponding
*wait-global-address@.service* systemd unit template are necessary to
ensure that the storage network is actually available before attempting
to mount the shared volumes. This is particularly important at boot,
since `dhcpcd` does not implement any kind of signaling that can be used
by *network-online.target*, so the network is considered "online" as
soon as the `dhcpcd` process has started. This typically results in
"network unreachable" errors.
The *net-ifaces* role manages a script that creates virtual network
interfaces, such as bridge, bond, and VLAN, that `dhcpcd`/`dhclient`
alone cannot. This provides a lightweight alternative to
*systemd-networkd* and *NetworkMangager*.
Though the default for the `fqdn` value is listed as `both` in
*dhcpcd.conf(5)*, the current behavior of `dhcpcd` suggests that it may
actually be `none`. Without explicitly setting `fqdn both`, the value of
the kernel node name is sent as-is in the *hostname* option (12). If the
node name is set to the FQDN, then dynamic DNS gets broken, since the
DHCP server always appends its domain name to the provided hostname.
Setting `fqdn both` causes `dhcpcd` to send the FQDN in the *FQDN*
option (81), which the DHCP server interprets correctly.
Using a list to specify the values for the `allowinterfaces` and
`denyinterfaces` parameters in `dhcpcd.conf` makes the configuration
policy cleaner and more type-safe.
Today I realized that `dhcpcd` has been logging several hundred thousand
of these messages every second:
libudev: received NULL device
This was causing both `dhcpcd` and `systemd-journald` to consume 100%
CPU.
I am not entirely sure what a "device management" module is in the
context of `dhcpcd`, but it does not seem to be required. Setting the
`nodev` option in `dhcpcd.conf` suppresses the messages, and seems to
have no effect on the operation of the daemon.
Traffic from the management network is not allowed except for specific
services. NTP is required of course, for time synchronization with the
pyrocufflink.blue domain controllers. RADIUS is necessary for WiFi
authentication, which is also handled by the DCs.
The UniFi controller has been moved to a Raspberry Pi on the Management
network. This machine needs a static address to use in the "inform URL"
it sends to managed devices.
The Management network (VLAN 10, 172.30.0.240/28) will be used for
communication with and configuration of network devices including
switches and access points. This keeps configuration separate from
normal traffic, and allows complete isolation of infrastructure devices.
The `ifconfig` global directive specifies the IP address added to the
tunnel interface device, not the network. The `push route` directives
need to include this address to correctly send route information to
clients.
The *dch-openvpn-server* role installs and configures OpenVPN and
stunnel to provide both native OpenVPN service as well as
OpenVPN-over-TLS. The latter uses stunnel, listening on TCP port 9876,
to allow better firewall traversal and TCP port sharing via reverse
proxy.
The `apache_server_tokens` variable can now be set, which controls the
value of the `ServerTokens` directive. If the variable is set, the
`ServerTokens` directive will be added to the `00-servername.conf` file.
The `samba_interfaces` variable can now be defined to populate the
`interfaces` global configuration parameter in `smb.conf`. This
parameter controls the interfaces or addresses to which the Samba server
binds, and also the IP addresses that are registered in DNS.
The *certbot* role now supports copying the data for an existing Let's
Encrypt account to the managed node using an archive. If an archive
named for the inventory hostname (typically the FQDN) of the managed
node is found in the `accounts` directory under the `files` directory of
the *certbot* role, it will be copied to the managed node and extracted
at `/var/lib/letsencrypt/accounts`. This takes the place of running
`certbot register` to sign up for a new account.
The *install* tag is applied to any task that installs a package.
The *user* tag is applied to any task that creates an OS user or group.
The *group* tag is applied to any task that creates an OS user group.
Since the host *gw0* is not a member of the *pyrocufflink.blue* domain,
GSSAPI authentication does not work. As such, the SSH private key has to
be made available to the `ansible-playbook` process for authentication
to that host.
The `zabbix.yml` playbook applies to hosts that are not members of the
*pyrocufflink.blue* domain, and thus have different passwords for
`sudo`. Using the `-e` argument to `ansible-playbook` and specifying a
single Vault-encrypted file that defines the `ansible_become_password`
variable effectively forces Ansible to try to use that password on every
host. This is because variables defined on the command line, or read
from a file specified on the command line, have the highest precedence.
To use different passwords on different hosts, the normal variable
scoping rules have to be used. To that end, one `sudo-pass` file is
created in the `group_vars/pyrocufflink` directory, so it will apply to
all machines that are members of the *pyrocufflink.blue* domain.
Additionally, another `sudo-pass` file is created in the `host_vars/gw0`
directory; it will only apply to the gateway device.