Hmmm… almost two years since a post. I’ve been pretty busy.
Anyway, I’ve been making some major changes to how I manage my home server, so I thought a write-up would be interesting. I’ve had some form of home server for over 10 years. It’s always been Linux, but it has morphed in many ways over the years. It began as a frontend for XBMC (before it was Kodi), but after migrating to Plex, it’s become completely headless. Over the years, I stacked software on it, installing OS-level packages and letting them run behind the scenes or through Apache. Quite honestly, it became a mess. I took inventory of it one day and realized that should this thing ever die a horrible fiery death, I’d have weeks of rebuilding ahead of me. Plus, I frequently found myself in dependency hell, trying to track down what was needed to do the thing I wanted to do.
So, I took a step back, remembered some of those buzzwords that I’d heard around work and the internet, studied them, compared them, decided on a toolset, and put them into practice.
One thing that was clear to me in doing my analysis: I am not running the data center for a Fortune 500 company. I only manage three small Linux servers for personal use, so lightweight, simple options often work great for me. That said, many of these solutions can scale quite large with some additional thought toward configuration or add-on services.
Ansible – Configuration Management
First of all, I’m tired of manually editing configuration files, documenting my changes, and hoping I can remember what I did next time. That’s silly. I needed a configuration management system. I considered a few alternatives (Chef, Salt, and Puppet were the main competitors), but I chose Ansible. It has a simple push-based architecture that relies on SSH and Python, two technologies I already know very well. It also does not require any special infrastructure: I could run it from my server, a Raspberry Pi, or a laptop.
I’d actually done some work on my home server in Ansible before, but I’d automated simple tasks, not the state. This time I started with a blank slate, following the Ansible best practices to define the state I wanted rather than the tasks I needed to run on my server. Then I stuck this in a Git repo. I’m currently using GitLab, because their free accounts offer the most flexibility for a hobbyist.
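To give a rough idea of what state-oriented means, a top-level playbook in this style can be tiny, with all the detail pushed down into roles. The host group and role names here are made up for the example, not taken from my actual repo:

```yaml
# site.yml — declare the desired state via roles, not one-off tasks.
# Host group and role names are illustrative.
- hosts: homeserver
  become: true
  roles:
    - common      # base packages, users, sshd settings
    - docker      # Docker engine and containers
    - monitoring  # exporters, Prometheus, Grafana
```

Running the play repeatedly converges the machine to this state instead of replaying a history of manual changes.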
Semaphore – Ansible Frontend
Ansible’s full capabilities are available from the command line alone, but sometimes it’s easier to just open up a web page and click a button. The enterprise solution is Ansible Tower, and its open source upstream is AWX. I did play with AWX and found a lot of good features, but it was very heavy for what I needed: it requires four Docker containers (web UI, worker(s), PostgreSQL, and RabbitMQ). Semaphore, by contrast, is simple, lightweight, and does everything I need. It can manage SSH keys, the Git repos holding your playbooks, users, and projects. On any playbook execution, it updates from Git, then performs the requested action. There is currently no internal scheduling mechanism, but there is a REST API available for externally triggered jobs.
At the end of the day, it accomplishes my goal rather well. I can edit, commit, merge, and run all in a handful of minutes (more if I actually test first).
Docker – Application Management
One of my bigger frustrations had become managing software dependencies. I often found myself troubleshooting dependency conflicts, manually editing configuration files, and juggling Linux users and groups to allow shared file access.
Why keep doing this? Most mainstream Linux services have an image available on Docker Hub, usually with the Dockerfile published alongside it. The Dockerfile is easily readable, so even if you don’t like some of the practices in an image, you can always build your own. Additionally, Ansible has great Docker modules, so containers can be managed by the same configuration management system as the rest of the system. Some of the services I’m running in Docker are:
- Airsonic – Free, web-based media streamer. Fork of Subsonic.
- Grafana – Analytics and Monitoring
- Plex – Media Server
- Portainer – Management UI for Docker – Useful for inspecting and viewing logs
- Prometheus (and add ons) – Monitoring System and Time Series database
- RabbitMQ – Message broker – Used in a Django/Celery project I’m working on. Prime candidate for Docker due to Erlang requirements.
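Here’s a sketch of what managing one of these looks like with Ansible’s docker_container module. The image is the official Plex one, but the host paths under /srv are placeholders rather than my real layout:

```yaml
# Run Plex as a Docker container via an Ansible task.
# Host volume paths are illustrative.
- name: Ensure the Plex container is running
  docker_container:
    name: plex
    image: plexinc/pms-docker:latest
    state: started
    restart_policy: unless-stopped
    network_mode: host
    volumes:
      - /srv/plex/config:/config
      - /srv/media:/data:ro
```

Reruns are idempotent: if the running container already matches the definition, Ansible leaves it alone.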
Prometheus – Monitoring
Previously, I had been relying on Icinga2. While stable, its configuration was a pain, and it relied on OK/Warn/Critical limits that had to be configured remotely on each node. I felt like I had to re-learn the configuration schema every time I added a new custom alert. Icinga2 also had limited out-of-the-box options for history and graphing, and it depended on Apache and MySQL, so what would alert me if those went down?
After analyzing my options, I gravitated towards Prometheus. It doesn’t come pre-configured with a bunch of fancy dashboards and alerts like some other offerings, but it’s easy to manage, and there are many add-ons to enrich the experience. Data is gathered through exporters, which Prometheus scrapes over HTTP (it can even scrape HTTPS URLs with authentication). I’m currently using a few exporters to gather information on my systems:
- Prometheus node_exporter – runs as a service on all nodes to collect system metrics. It can also read metrics from text files, which I’ve used to check for available apt packages on my Ubuntu systems.
- cAdvisor – Analyzes resource usage and performance characteristics of running containers. (Offered by Google)
- Blackbox exporter – Allows blackbox probing of endpoints over HTTP, HTTPS, DNS, TCP and ICMP.
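Wiring these up takes only a few lines in prometheus.yml. The hostnames below are placeholders; the ports are the exporters’ defaults:

```yaml
# prometheus.yml fragment — one scrape job per exporter.
# Target hostnames are illustrative.
scrape_configs:
  - job_name: node
    static_configs:
      - targets: ['server1:9100', 'server2:9100']  # node_exporter default port
  - job_name: cadvisor
    static_configs:
      - targets: ['server1:8080']                  # cAdvisor default port
```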
I plan to retire Icinga2 soon, once I’ve tuned my alerting thresholds and gained a little more confidence in the new system.
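As for the apt-package check mentioned above: node_exporter’s textfile collector reads *.prom files from a directory you point it at with --collector.textfile.directory. A small cron-driven script along these lines can produce that file; the metric name and output directory here are my own illustrative choices:

```shell
#!/bin/sh
# Write the number of pending apt upgrades as a Prometheus metric.
# Metric name and output directory are illustrative choices.
# A dry-run dist-upgrade lists pending packages as "Inst ..." lines;
# if apt-get is unavailable, fall back to 0.
pending=$(apt-get -s dist-upgrade 2>/dev/null | grep -c '^Inst ') || pending=0
outdir="${TEXTFILE_DIR:-/tmp}"
printf 'apt_upgrades_pending %s\n' "$pending" > "$outdir/apt.prom"
```

node_exporter picks the file up on its next scrape, and the metric shows up alongside the regular system metrics.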
Grafana – Analytics and Alerting for Prometheus
Prometheus is great for storing and querying data. It can graph data too, but its interface is best used for developing new queries and graphs. I found Grafana to be the best package deal to pair with Prometheus, as it can generate graphs and send alerts to multiple channels. I’ve tried my hand at building my own dashboards, but the shared ones available on Grafana.com are much better than anything I’ve been able to create quickly. So far, I’ve set up whole-system dashboards to help me monitor and alert on various metrics. The big ones for me are filesystem space, backup status, and security patch requirements. As a bonus, I’ve also been able to create dashboards for others that show only the metrics they care about (and automate a nagging email when disk space runs low).
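For reference, the filesystem-space check boils down to a PromQL expression along these lines. The 10% threshold and label filters are just a starting point, and the metric names assume a node_exporter version recent enough to use the _bytes suffix:

```promql
# Fires when less than 10% of a filesystem is free.
# tmpfs and overlay mounts are excluded as noise.
100 * node_filesystem_avail_bytes{fstype!~"tmpfs|overlay"}
    / node_filesystem_size_bytes{fstype!~"tmpfs|overlay"}
  < 10
```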
I’ll end this post with some of the graphs I have configured in my Grafana Dashboards: