The case for document-oriented databases

Published on 28 July 2013, last updated on 8 May 2014

Last week, I started working with AngularJS for a real-world project – I had played with Angular before, mainly to see how it differs from KnockoutJS which I had been using intensively before. This post is not about AngularJS itself (I may write one later on when I have a better perspective about the framework), I’ll just say that my first impressions about it are:

That being said, there’s one thing that AngularJS seems to get really right: it makes it possible to map documents to the UI without imposing its own design philosophy on the said document. With KnockoutJS for example, you’d have to make sure certain fields were provided as arrays, and generally had to know really well how the different bindings worked before knowing into what shape to bring your data so that the UI would do what you wanted.

Let’s say for example that you want to make it possible to add advertisements about wine on an application (who wouldn’t). To spice things up a bit, let’s say that the content should be multilingual. We’d have a model looking like this:

So our document mainly consists of a link and to multilingual content (in reality we may want to add more things to it, such as special characteristics, the price, etc., but let’s keep things simple for the sake of the example).

It is possible to write up the user-interface for the above model like this:

Some comments about this code:

What I find really interesting in comparison to KnockoutJS is the possibility to bind values against a map: using the binding expression wine.translations[$parent.currentLang], we can dynamically navigate the document hierarchy without having to add any special lookup code. Of course, when e.g. creating a new document, we need to provide a scaffold for those documents, where translation would be an empty object, but this is a small price to pay in comparison to the flexibility this provides in regards to data design.

So what does this all have to do with document-oriented databases? Well, let’s see what we need to represent the above object in a relational database, using e.g. Slick. Given that we have multi-lingual content, we have several options as to how to model it, let’s go for an approach where each entity requiring multi-lingual content gets its own translation table. Using lifted embedding, we now get the following:

Next, let’s see how to turn this into the document we want to display. For this purpose, we need to create a model on the server-side to represent the document to which to map:

Now, let’s write the code to get back a full representation of a Wine, including the translations, added to the Wines table object defined above:

Great, now we’ve got our data, let’s give that to the client (assuming we use the Play framework):

If we want to create or update a new wine entry, we need to do the mapping the other way around:

I think that by now you may guess what I’m getting at: these mapping gymnastics to go from a hierarchical document to a relational database are extremely verbose. Let’s see in contrast how the same would look like by putting things into e.g. MongoDB using the Salat case class mapper.

First, the data definition:

Next, the controller code:

Much simpler, isn’t it?

I think what this boils down to is the following: AngularJS and other modern client-side frameworks are built around the fundamental concept of manipulating and working with hierarchical documents. Relational databases, on the other hand, aren’t all that great at representing hierarchies – those need to be turned into relations, which are fundamentally different things. Of course, there is a way to get from one paradigm to the other, however, as we’ve seen, it is up to the developer to do the mapping from one to the other. In the Java space, this mapping has traditionally been done using ORMs such as Hibernate, however, these tools bring another set of problems with them, as soon as the data model gets bigger.

I’m pretty sure that I’m doing some things the wrong way in regards to Slick; the query above can probably be simplified somewhat. Yet the issue remains that overall, representing a document in a relational database does require a non-trivial amount of mapping effort.

On the other hand, relational databases are (still to this day) percevied as safe and easy to operate, and the hype that came with the NoSQL movement has essentially focused on how fast and scalable those technologies are. And yet, I think that at the end of the day, there’s only a few players that really require the speed and scalability that’s been hyped so much, and a lot more that could benefit from using document stores for their “document” aspect.

Liked this article? Subscribe to the mailing list to get regular updates on similar topics.

Leave a Reply