I recently built an architecture prototype for DIMOCO aimed at answering roughly the following questions:

  • what does it entail to run an Akka cluster, how fast can new nodes join, how manageable is such a cluster?
  • are traditional publish-subscribe message distribution semantics supported (one-to-one, one-to-N, one-to-any-of-N)?
  • how can message loss be averted / minimized?
  • how fast is message sending, and is it fast enough?
  • does it feel “Java-like” enough (since Akka has strong ties to Scala and the aim is to use Java for this project)?
  • how well does it run in the on-premise data centers, and how easy is it to operate?

Since I’ve been itching to find an excuse to try out the new Raspberry PI 3 Model B, this was the perfect opportunity to order three of them and use them for testing the prototype.

Innocent and unsuspecting Raspberry PIs, unaware of their fate as Akka cluster nodes

The following post gathers some of the impressions gained while using Akka Cluster with the Java API and in combination with the Raspberry PI 3.

Akka with Java

Akka, like all of Lightbend’s (formerly Typesafe) core technologies, has both a Java and a Scala API. In the early days of some of these technologies, the Java API wasn’t always on par with the Scala one, for a simple reason: it is actually not that easy to provide a “fluent” Java API on top of a Scala core implementation, since Java supports only a subset of Scala’s features. Doing it the other way around (devising the Java API first and then building the core implementation with Scala) turns out to be much easier, which is the approach Lightbend has been taking for their latest reactive microservice framework, Lagom.

Since I hadn’t been using the Java API extensively before and had in general been taking a long break from Java, this was a good opportunity to get back to the JVM’s first resident language.

My experience in one word is: smooth. Java’s support for lambdas takes away much of the pain experienced when coming back from Scala. The Streams API feels somewhat clunky and don’t even get me started on Java’s Optional (it really should have been called OptionalReturn to make the intent clear), but those are bearable constraints to work with. Also, there’s the awesome Javaslang library which has a much nicer functional feel to it than the APIs provided by the Java language itself. The only thing I sorely missed was Scala’s beloved case classes, since defining protocols with them is so much nicer than using immutable POJOs.
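
To give an idea of the boilerplate involved, here is a minimal sketch of a single protocol message written as an immutable POJO; the JobCompleted name and its fields are made up for illustration and are not part of the actual prototype’s protocol.

    // Hypothetical protocol message as an immutable POJO.
    // The Scala equivalent would be a one-liner:
    //   case class JobCompleted(jobId: String, durationMillis: Long)
    public final class JobCompleted implements java.io.Serializable {
        private static final long serialVersionUID = 1L;

        private final String jobId;
        private final long durationMillis;

        public JobCompleted(String jobId, long durationMillis) {
            this.jobId = jobId;
            this.durationMillis = durationMillis;
        }

        public String getJobId() { return jobId; }
        public long getDurationMillis() { return durationMillis; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof JobCompleted)) return false;
            JobCompleted other = (JobCompleted) o;
            return durationMillis == other.durationMillis && jobId.equals(other.jobId);
        }

        @Override
        public int hashCode() {
            return java.util.Objects.hash(jobId, durationMillis);
        }

        @Override
        public String toString() {
            return "JobCompleted(" + jobId + ", " + durationMillis + ")";
        }
    }

Every message needs its constructor, getters, equals, hashCode and toString spelled out by hand (or generated), which is exactly what a case class gives you for free.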

I used sbt as the build tool for the prototype. During a two-day handover workshop we turned that sbt build into a Maven project. This too worked out very nicely, including the creation of a ZIP file with a startup script for easy deployment on the nodes (sbt has the sbt-native-packager plugin for this purpose, which lets you easily package applications as ZIP, Docker, DEB, RPM, etc.).

All in all, the experience of using Akka with Java has been really positive and the overall feedback was that it feels very familiar and not in any way peculiar to Java. The only tool that I missed in Java (and hence didn’t show) is the multi-JVM testing support, which lets you define test scenarios that are automatically run on multiple JVMs - a perfect way of automating the testing of an Akka cluster application (then again, I can see why there is no Java API for it, as Java’s constraints would really get in the way in this case).

Akka on the Raspberry PI 3 Model B

Assembled Raspberry PIs next to a Sinclair ZX 81 for scale

Running an Akka cluster on the Raspberry PI is nothing new. The two technologies seem to go hand-in-hand for one reason or another. The Raspberry PI 3 Model B has a 1.2 GHz 64-bit quad-core ARMv8 CPU and 1 GB of LPDDR2 RAM at 900 MHz, as well as built-in 2.4 GHz 802.11n wireless (in addition to Ethernet) - more than enough to run Akka (which can run on 128 MB of heap, or even less if I remember correctly).

There were some gotchas I ran into while setting up the Akka cluster to run on them.

Power supply and networking

For the Raspberry’s Ethernet networking to run smoothly you need a good power supply. Even though it seems tempting to just plug the PI into any USB port, that won’t do so well. Since I didn’t have good enough power supplies at hand (I only had mobile phone USB chargers, which are not strong enough), I resorted to using wireless for the demonstration instead, as it requires less power. But even when using wireless, it is important to plug each PI into its own power source, since the power demand seems to increase with network traffic, and not having enough power results in packet loss. It took me some time and some investigating with Wireshark to figure out what was going on while running load tests, as the behaviour of the cluster was erratic to say the least.

Akka Persistence and ARM

Akka Persistence’s default journal implementation (perfect for small prototypes) uses LevelDB, a lightweight key-value storage library written in C++. The trouble is that there is no ARM version of LevelDB JNI yet. After trying to build it myself, I quickly figured out that too many of its dependencies lack ARM support. So I gave up and used a store that does run on ARM: MongoDB. The Raspberry PIs are now web scale.
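
For reference, switching the journal backend is only a matter of pointing Akka Persistence at a different plugin. The sketch below overrides the default journal programmatically when creating the actor system; the plugin ID and connection URI key are the ones documented by the community akka-persistence-mongo plugin and should be treated as assumptions to verify against the plugin version you actually depend on (the “raspi-1” host name is made up as well).

    import akka.actor.ActorSystem;
    import com.typesafe.config.Config;
    import com.typesafe.config.ConfigFactory;

    public class PersistenceBootstrap {
        public static void main(String[] args) {
            // Replace the default LevelDB journal with a MongoDB-backed one.
            // Plugin ID and URI key follow the akka-persistence-mongo docs;
            // verify them against the plugin version you use.
            Config journalOverride = ConfigFactory.parseString(
                "akka.persistence.journal.plugin = \"akka-contrib-mongodb-persistence-journal\"\n"
              + "akka.contrib.persistence.mongodb.mongo.mongouri = \"mongodb://raspi-1:27017/prototype\"\n");

            // Overrides take precedence over application.conf / reference.conf.
            Config config = journalOverride.withFallback(ConfigFactory.load());
            ActorSystem system = ActorSystem.create("prototype", config);
        }
    }

The same two settings can of course simply go into application.conf instead of being set programmatically.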

Conclusion

All in all, this project was a good showcase of how lightweight and flexible Akka is as a technology:

  • demonstrating the various usage scenarios was rather easy thanks to the built-in support for distributed publish-subscribe (see the sketch after this list)
  • the performance was not an issue either. In order to simulate network I/O, the prototype nodes wrote one file per message. On the Raspberry PIs the nodes were able to process up to roughly 800 messages / second over wireless and with at-least-once delivery semantics enabled. In the client’s data centers the maximum throughput was 5000 messages / second, the limiting factor being the storage (the test machines used a network-attached drive rather than a local one)
  • deploying and running the cluster on premise was no problem either, since Akka uses TCP/IP in combination with seed nodes as its default discovery mechanism (as opposed to e.g. UDP multicast, which isn’t readily available in all data centers).
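
To give a flavour of the publish-subscribe part, here is a minimal sketch of a topic subscriber using Akka’s cluster-aware Distributed Publish Subscribe with the Java API, assuming a recent “classic” Akka version; the actor name, topic name and message type are made up for illustration.

    import akka.actor.AbstractActor;
    import akka.actor.ActorRef;
    import akka.cluster.pubsub.DistributedPubSub;
    import akka.cluster.pubsub.DistributedPubSubMediator;

    // Subscribes itself to a topic and logs whatever gets published to it.
    public class TransactionSubscriber extends AbstractActor {

        public TransactionSubscriber() {
            ActorRef mediator = DistributedPubSub.get(getContext().getSystem()).mediator();
            // One-to-N: every subscriber of the topic receives published messages.
            mediator.tell(new DistributedPubSubMediator.Subscribe("transactions", getSelf()), getSelf());
        }

        @Override
        public Receive createReceive() {
            return receiveBuilder()
                .match(DistributedPubSubMediator.SubscribeAck.class,
                    ack -> System.out.println("Subscribed to topic"))
                .match(String.class,
                    msg -> System.out.println("Received: " + msg))
                .build();
        }
    }

Publishing from any node in the cluster then boils down to sending a DistributedPubSubMediator.Publish("transactions", message) to the local mediator; the mediator also supports point-to-point delivery (Send) and delivery to one subscriber per group, which covers the one-to-one and one-to-any-of-N scenarios.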