Matthias Nehlsen

Software, Data and Stuff

SystemD and Clojure

Oh hey, I’m back. Been a while. Today, I want to share with you how I’m using systemd to start my Clojure applications on matthiasnehlsen.com, and keep them alive, in case anything should go wrong. These are the applications managed this way:

Also, I’m using systemd to start up sse-chat, a Scala demo application which you can also find on GitHub. This application is only started by systemd, though, and not restarted when anything goes wrong.

The background for this post is that I recently ordered a new Skylake Intel® Xeon® E3-1275 v5 based server at Hetzner, and I felt it was finally time to retire the manual process startup approach I had used before. Servers should be updated as often as possible, but who does that often enough when it takes 10-15 minutes to wait for a reboot and then manually restart the processes? Certainly not me. So instead, all process startup should be automatic. Initially, I considered using Docker, but when it comes to checking that an application is alive and restarting it if not, systemd has the better story to offer. Also, I wasted way too much time on a Docker environment in my last client project, so I’m a little cured of the snake oil. [1]

So what I wanted was to restart the machine and have all services come up automatically. I also wanted to use the watchdog functionality, which expects the monitored application to send systemd a heartbeat message and restarts the application if no heartbeat has arrived for, say, 20 seconds, or whatever interval you define. You can read all about this mechanism in this blog post by one of the original authors of systemd.

While my applications were running rock solid for months in a row until I finally managed to update the server and restart it, it is certainly appealing from an operations perspective to have a mechanism in place that listens for a heartbeat and restarts a process when the heartbeat does not come as expected. So I thought this might be a good opportunity to write a small library that takes care of emitting said heartbeat when an application is monitored by systemd. You can find this library on GitHub here.

This library also happens to be a sweet opportunity to write a minimal systems-toolbox system, with a scheduler component that emits messages every so often and a second component that then calls systemd via JNA.

This is the entire library:

(ns matthiasn.systemd-watchdog.core
  (:require [matthiasn.systems-toolbox.switchboard :as sb]
            [matthiasn.systems-toolbox.scheduler :as sched])
  (:import [info.faljse.SDNotify SDNotify]))

(defn start-watchdog!
  "Call systemd's watchdog every so many milliseconds.
   Requires the NOTIFY_SOCKET environment variable to be set, otherwise does
   nothing. Fires up a minimal systems-toolbox system with two components:
    * a scheduler component
    * a component notifying systemd.
   Then, the scheduler will emit messages every so often, and upon receiving,
   the notifying component will call the sendWatchdog function.
   Takes the timeout in milliseconds."
  [timeout]
  (when (get (System/getenv) "NOTIFY_SOCKET")
    (sb/send-mult-cmd
      (sb/component :wd/switchboard)
      [[:cmd/init-comp (sched/cmp-map :wd/scheduler-cmp)]
       [:cmd/init-comp
        {:cmp-id      :wd/notify-cmp
         :handler-map {:wd/send (fn [_] (SDNotify/sendWatchdog))}}]
       [:cmd/send {:to  :wd/scheduler-cmp
                   :msg [:cmd/schedule-new
                         {:timeout timeout
                          :message [:wd/send]
                          :repeat  true}]}]])))

It fires up three things: a switchboard, which manages and wires systems; the :wd/notify-cmp, which calls (SDNotify/sendWatchdog) from the SDNotify library; and a scheduler component, which emits :wd/send messages every timeout milliseconds. You can build much more complex applications with the systems-toolbox, e.g. BirdWatch. The 14 lines above (plus comments and imports), however, are about the minimum case when some scheduling is desired.
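If you strip away the systems-toolbox wiring, the pattern is just a timer that calls a function. Here is a rough sketch of that idea using nothing but a ScheduledExecutorService from the JDK. This is not how the library is implemented; start-heartbeat! and notify-fn are made-up names for illustration, and in a real service notify-fn would presumably wrap (SDNotify/sendWatchdog).

```clojure
(ns example.plain-heartbeat
  (:import [java.util.concurrent Executors ScheduledExecutorService TimeUnit]))

(defn start-heartbeat!
  "Calls notify-fn every interval-ms milliseconds on a single scheduler
   thread. Returns the executor so the caller can shut it down again.
   Hypothetical helper for illustration, not part of systemd-watchdog."
  [notify-fn interval-ms]
  (let [^ScheduledExecutorService executor
        (Executors/newSingleThreadScheduledExecutor)
        ;; An exception escaping the task would cancel all future runs,
        ;; so catch and report it instead of letting it propagate.
        ^Runnable task (fn []
                         (try
                           (notify-fn)
                           (catch Exception e
                             (println "heartbeat failed:" (.getMessage e)))))]
    (.scheduleAtFixedRate executor task 0 (long interval-ms)
                          TimeUnit/MILLISECONDS)
    executor))

;; Possible usage, with a stand-in for the real notification call:
;; (def hb (start-heartbeat! #(println "still alive") 5000))
;; (.shutdownNow hb)
```

The scheduler thread is not a daemon thread, so it keeps the JVM running until the executor is shut down. For a long-running server that is usually what you want anyway.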

You can have a look at the mentioned examples if you’re interested in building systems with the systems-toolbox. In subsequent articles, I will introduce them in detail. For now, you can just use the library in your projects if you want to have your application monitored by systemd. It’s just a one-liner, as you can see for example in the trailing mouse pointer example:

  (wd/start-watchdog! 5000)

This simple command calls systemd every 5 seconds, but only if the NOTIFY_SOCKET environment variable is set, which would only be the case if systemd had started the application.
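One thing to watch is that the heartbeat interval has to be comfortably shorter than the watchdog timeout systemd enforces. systemd passes the configured timeout to the service in the WATCHDOG_USEC environment variable (in microseconds), and the sd_watchdog_enabled documentation recommends sending the heartbeat at half that time. Below is a small sketch of deriving the interval that way instead of hard-coding it. heartbeat-interval-ms is a hypothetical helper, not part of the library.

```clojure
(ns example.watchdog-interval)

(defn heartbeat-interval-ms
  "Takes the value of WATCHDOG_USEC (a string of microseconds, or nil) and
   returns half of it, converted to milliseconds. Falls back to default-ms
   when the value is missing or is not a positive number.
   Hypothetical helper for illustration only."
  [watchdog-usec default-ms]
  (let [usec (try
               (Long/parseLong (str watchdog-usec))
               (catch NumberFormatException _ 0))]
    (if (pos? usec)
      (quot usec 2000) ; microseconds -> milliseconds, then halved
      default-ms)))

(heartbeat-interval-ms "20000000" 5000) ;; => 10000
(heartbeat-interval-ms nil 5000)        ;; => 5000

;; Possible usage, assuming wd is the alias for matthiasn.systemd-watchdog.core:
;; (wd/start-watchdog!
;;   (heartbeat-interval-ms (System/getenv "WATCHDOG_USEC") 5000))
```

With a 20 second watchdog timeout this yields a heartbeat every 10 seconds. A shorter fixed interval such as the 5000 used above works just as well; it only costs a few more notifications.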

Here’s the service configuration:

[Unit]
Description=systems-toolbox websocket latency visualization example

[Service]
Type=simple
User=bw
Group=bw
Environment=PORT=8010
Environment=HOST=0.0.0.0
WorkingDirectory=/home/bw/run
ExecStart=/usr/bin/java -jar /home/bw/bin/trailing-mouse-pointer.jar
WatchdogSec=20s
Restart=on-failure

# Give a reasonable amount of time for the server to start up/shut down
TimeoutSec=300

[Install]
WantedBy=multi-user.target

You can find all the service configurations for my server in my conf project, together with some install scripts which allow me to set up a new server with little effort. I hope this helps you in your deployments. It certainly helps me with mine.

Would you like to know when there’s a new article? Subscribe to the newsletter and I’ll let you know.

Cheers, Matthias


  1. There, the problem was that silly Docker service that frequently hung, which, for whatever reason, required a REBOOT of the whole machine. As you can imagine, this was very annoying, as that, of course, meant ALL services would become unavailable until the machine was back up.
