+++
title = 'FireMon'
date = 2013-12-01
[extra]
title = 'Principal Engineer'
years = '2013–Present'
+++
FireMon is a software development company based in Overland Park, KS. As the
System Architect, I focus on building a scalable, easy-to-use platform for
delivering FireMon software to customers. FMOS, the FireMon Operating System,
is both a mechanism for delivering the FireMon <abbr title="Security
Intelligence Platform">SIP</abbr> to customers and a collection of tools for
deploying and managing the software in a wide array of environments, ranging
from a single server to massive multi-node ecosystems.
<!-- more -->
# FMOS: FireMon Operating System
## Ansible Configuration Policy
* Configuration policy for deployment of all FireMon software and
third-party dependencies
* Support for single-server and distributed deployments
* Automatically computes JVM heap sizes for each process based on available
resources
* Configures Elasticsearch in single-node or clustered mode
* Configures PostgreSQL with optional replication to standby servers
* Configures kernel NFS server and client to share filesystem data between
machines
* Configures FireMon application server processes, including connection and
authentication information for PostgreSQL and Elasticsearch
* Configures strongSwan IPsec/IKEv2 key management daemon for opportunistic
encryption of Elasticsearch communication
* Configures operating system login and password policy, including support for
external authentication providers such as LDAP or Kerberos
* Sets up *collectd* and Carbon (Graphite data storage engine) to track
system performance metrics, optionally replicating metrics data to a
FireMon-managed central storage for real-time review
* Optionally configures *rsyslog* to send log messages to remote destinations
over UDP, TCP, or TCP+TLS
* Configures *tmux* to automatically launch at user login
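
As an illustration of the JVM heap sizing item above, the following is a minimal Python sketch, not the actual FMOS policy logic. The function name `compute_heap_sizes`, the per-process weights, the 25% operating system reserve, and the 31 GiB cap are all assumptions made for the example.

```python
def compute_heap_sizes(total_mem_mib, weights, reserve_fraction=0.25,
                       cap_mib=31 * 1024):
    """Split the memory left after an OS reserve among JVM processes.

    All parameters are hypothetical: `weights` maps a process name to its
    relative share, `reserve_fraction` is memory held back for the OS, and
    `cap_mib` is an assumed upper bound (heaps below about 32 GiB let the
    JVM keep using compressed object pointers).
    """
    budget = int(total_mem_mib * (1 - reserve_fraction))
    total_weight = sum(weights.values())
    return {
        name: min(cap_mib, budget * weight // total_weight)
        for name, weight in weights.items()
    }


# Example: a 16 GiB machine running three JVM processes.
sizes = compute_heap_sizes(16384, {"app": 2, "elasticsearch": 1, "worker": 1})
```

Under these assumptions, the heaviest process would receive 6144 MiB and the other two 3072 MiB each. Each value could then be rendered into a `-Xmx` option for the corresponding service.
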
## Deployment and Maintenance Tools
* Python software for configuring and managing machines running FireMon
software (`fmos` command)
* Critical functionality for application maintenance:
  * Updating OS and software
  * Backing up and restoring data
  * Capturing diagnostic information for technical support
  * Modifying configuration settings
  * Managing server certificates and private keys
* D-Bus daemon to handle privileged operations
* Unprivileged command-line interface
* HTTP API developed with FastAPI
## Generation II Platform
* Based on CentOS 7
* Full-disk encryption using LUKS
* Anaconda installer with custom addon for generating machine-specific LUKS
master key passphrase
* Kickstart script for fully-automated installation
* Used Koji to build RPM packages for first- and third-party software
* Distribution included Ansible for configuration management
* systemd units for controlling FireMon application services
## Generation III Platform
* Based on CentOS 7, later CentOS 8 (Stream)
* Immutable SquashFS root filesystem image
* Full-disk encryption using LUKS
* Custom Dracut modules to verify the image's OpenPGP signature, mount it as
the root filesystem, and initialize the LUKS-encrypted persistent data volume
with LVM
* Custom SELinux policy to confine FireMon software
# DevOps Team Lead
* Exclusively managed all resources using Ansible configuration management
* Deployed and maintained hundreds of internal and cloud systems running
RHEL/CentOS Linux (5, 6, 7, 8)
* PXE provisioning of all on-premises virtual machines
* All machines Active Directory domain members using Samba/Winbind
* Zabbix system monitoring
  * Agent installed on all machines
  * Collects system availability and performance metrics
  * Custom templates for basic application availability metrics
* Atlassian Bitbucket (Stash) Git repository host
* Jenkins continuous integration platform
  * Integrated with Bitbucket for project discovery and change events
  * Jobs configured using `Jenkinsfile` pipeline definition files within
    repositories
  * Build environments defined as container images; jobs run in Docker
    containers on Jenkins agents
  * Ephemeral agents using vSphere plugin, various virtual machine templates
    for different project needs
* Application data backups using *BURP*: Back Up and Restore Program
* Graylog log aggregation
  * All machines send system and application logs via syslog over TLS, using
    *rsyslog*
  * Custom pipelines for parsing and indexing fields from log messages
  * Alerts based on log message contents and frequency
* Prometheus application monitoring
  * VictoriaMetrics time-series database
  * Prometheus exporters for many applications (Jenkins, Bitbucket,
    Elasticsearch, GlusterFS, HAProxy, Nginx, Redis)
  * Custom Grafana dashboards for status display and performance analysis
  * *collectd* monitors system performance from ephemeral Jenkins worker nodes
    via multicast, exposes Prometheus metrics
  * Alertmanager notifications to e-mail and Slack for application availability
    and performance alerts
* HashiCorp Vault HA cluster for secret storage, including Jenkins credentials
# Internal Tools
## FMOS Web Tools
* Internal application used by software developers and support agents
* Multi-tiered architecture with multiple nodes at each tier to avoid any
  single point of failure
  * Application Server Tier: Python 3.6/FastAPI
  * Storage Tier: GlusterFS
  * Index Tier: Elasticsearch
  * Cache Tier: Redis
  * Message Tier: RabbitMQ
  * Worker Tier: Python 3.6/Celery
  * Ingress: HAProxy
  * User Interface: TypeScript/Vue+Vuetify
## PR Bot
* Implements a webhook for Atlassian Bitbucket (Stash)
* Reacts to new and updated Pull Requests
* Automatically checks Git commits and changed code to enforce style guide and
other project-specific requirements
* Adds comments to Pull Requests indicating check results, marks PR as approved
or needs work
* Written in Python, no external dependencies
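
The kind of commit check described above can be sketched with the Python standard library alone. This is a hypothetical illustration, not the bot's real rule set: the function names, the 72-character subject limit, and the specific rules are assumptions, and the verdict strings are plain labels chosen for the example.

```python
MAX_SUBJECT_LEN = 72  # assumed limit, not taken from the actual style guide


def check_commit_message(message):
    """Return a list of style problems found in one commit message."""
    problems = []
    lines = message.splitlines()
    subject = lines[0] if lines else ""
    if not subject.strip():
        problems.append("subject line is empty")
    if len(subject) > MAX_SUBJECT_LEN:
        problems.append(f"subject line exceeds {MAX_SUBJECT_LEN} characters")
    if subject.endswith("."):
        problems.append("subject line ends with a period")
    if len(lines) > 1 and lines[1].strip():
        problems.append("second line is not blank")
    return problems


def review(messages):
    """Combine per-commit results into one verdict for the pull request."""
    problems = [p for m in messages for p in check_commit_message(m)]
    verdict = "NEEDS_WORK" if problems else "APPROVED"
    return verdict, problems
```

A webhook handler would call `review()` with the commit messages of the pull request, post the returned problems as a comment, and set the pull request status from the verdict.
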
## QEMU VM Log Socket Proxy
* Component of FMOS End-to-End tests running on-premises using QEMU/libvirt
* Uses kernel *inotify(7)* events to detect virtual machine log channel socket
files appearing on the VM host
* Automatically connects to sockets as they appear
* Receives all data from each channel socket and writes it to a file in the
libvirt storage pool
* Written in Rust
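
The flow above (detect a new socket, connect, copy its data to a file) can be sketched in Python for illustration. This is not the Rust implementation: the sketch swaps *inotify(7)* for a simple directory scan so that it needs only the standard library, and the function names `find_new_sockets` and `drain_socket` are hypothetical.

```python
import os
import socket
import stat


def find_new_sockets(directory, seen):
    """Return socket files in `directory` that are not yet in `seen`.

    Stand-in for inotify: the caller would invoke this periodically
    instead of waiting for kernel events.
    """
    new_paths = []
    for entry in sorted(os.scandir(directory), key=lambda e: e.name):
        if entry.path in seen:
            continue
        if stat.S_ISSOCK(entry.stat().st_mode):
            seen.add(entry.path)
            new_paths.append(entry.path)
    return new_paths


def drain_socket(sock_path, log_path):
    """Connect to a Unix socket and append all received data to a file."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(sock_path)
        with open(log_path, "ab") as log:
            while True:
                chunk = sock.recv(4096)
                if not chunk:  # peer closed the connection
                    break
                log.write(chunk)
```

A real proxy would run `drain_socket` concurrently for each discovered socket; the sketch handles one socket at a time to keep the example short.
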
## FMOS ISO Writer
* Internal application used by development and QA teams to write FMOS installer
images to USB disks attached to remote physical appliances
* Accessible via purpose-built, ultra-minimal Linux distribution (kernel and
BusyBox only) delivered by network boot/PXE
* Written in Rust
# FireMon-as-a-Service
* Cloud-hosted FireMon software deployment
* Deployed backend infrastructure for federated authentication using OpenLDAP,
MIT Kerberos
* Followed Infrastructure-as-Code principles using Ansible
* Developed custom integrated authentication solution for FireMon Security
Manager software to provide full-featured account and credential management
using Kerberos protocol (Authgate)
* Python bindings for *mit-kerberos* using Cython