The case for document-oriented databases
Last week, I started working with AngularJS on a real-world project - I had played with Angular before, mainly to see how it differs from KnockoutJS, which I had been using intensively. This post is not about AngularJS itself (I may write one later on, when I have a better perspective on the framework); I’ll just say that my first impressions of it are:
- it lacks good documentation - the existing documentation doesn’t exactly have the right “feel”, and is incomplete
- when coming from Knockout, Angular feels rather invasive, given that it is a “full-blown” framework rather than “just” an MVVM mapping tool
- my current impression is that the learning curve looks like a roller-coaster - there are moments of intense productivity interrupted by long stretches of “not getting anywhere, why on earth am I using this sh*t”
That being said, there’s one thing that AngularJS seems to get really right: it makes it possible to map documents to the UI without imposing its own design philosophy on those documents. With KnockoutJS, for example, you’d have to make sure certain fields were provided as arrays, and you generally had to know really well how the different bindings worked before you knew what shape to bring your data into so that the UI would do what you wanted.
Let’s say, for example, that you want to make it possible to add advertisements about wine to an application (who wouldn’t?). To spice things up a bit, let’s say that the content should be multilingual. We’d have a model looking like this:
{
  "translations": {
    "fr": {
      "title": "Côtes de Provence rosé",
      "description": "Robe claire et limpide, le nez en accords avec la bouche, dévoile des notes originales et plaisantes de fraise écrasée et de fruits rouges."
    },
    "en": {
      "title": "Rosé wine Côtes de Provence",
      "description": "..."
    }
  },
  "url": "http://www.cotes-de-provence-domaine-grand-rouviere.com/c%C3%B4tes-de-provence-ros%C3%A9/"
}
So our document mainly consists of a link and multilingual content (in reality we may want to add more things to it, such as special characteristics, the price, etc., but let’s keep things simple for the sake of the example).
It is possible to write up the user-interface for the above model like this:
Some comments about this code:
- the control-group and validation-error-for directives are custom directives that render boilerplate Twitter Bootstrap markup, inspired by this article
- the ng-options directive fetches the languages and displays them appropriately in the language drop-down. The languages are made available in the root scope of the application and have the value [{value: "en", label: "English"}, {value: "fr", label: "French"}]
What I find really interesting in comparison to KnockoutJS is the possibility to bind values against a map: using the binding expression wine.translations[$parent.currentLang], we can dynamically navigate the document hierarchy without having to add any special lookup code. Of course, when e.g. creating a new document, we need to provide a scaffold for it, in which translations would be an empty object, but this is a small price to pay for the flexibility this provides with regard to data design.
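Stripped of the Angular machinery, the lookup in that binding expression is ordinary bracket-notation property access on the document. Here is a minimal sketch in plain JavaScript, outside of Angular; newWineScaffold and translationFor are hypothetical helper names made up for this illustration, and creating a missing language entry on first access is an assumption of the sketch rather than something the framework is claimed to do:

```javascript
// Hypothetical scaffold for a new wine document: `translations` starts
// out as an empty object, entries are added per language as needed.
function newWineScaffold() {
  return { translations: {}, url: "" };
}

// Plain-JavaScript counterpart of wine.translations[currentLang]:
// the language code is simply used as a key into the translations map.
// Assumption of this sketch: a missing entry is created on first access,
// so that a form field always has an object to write to.
function translationFor(wine, currentLang) {
  if (!wine.translations[currentLang]) {
    wine.translations[currentLang] = { title: "", description: "" };
  }
  return wine.translations[currentLang];
}

const wine = newWineScaffold();
translationFor(wine, "fr").title = "Côtes de Provence rosé";
translationFor(wine, "en").title = "Rosé wine Côtes de Provence";

// The document keeps the shape of the model shown earlier.
console.log(JSON.stringify(wine, null, 2));
```

Switching the language in the drop-down only changes the key; the document itself never has to be reshaped for the UI.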
So what does all this have to do with document-oriented databases? Well, let’s see what we need in order to represent the above object in a relational database, using e.g. Slick. Given that we have multilingual content, we have several options as to how to model it; let’s go for an approach where each entity requiring multilingual content gets its own translation table. Using lifted embedding, we now get the following:
Next, let’s see how to turn this into the document we want to display. For this purpose, we need to create a server-side model that represents the document we want to map to:
Now, let’s write the code to get back a full representation of a Wine, including the translations, added to the Wines table object defined above:
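Independently of Slick, the shape of that read path can be sketched in plain JavaScript. The sketch assumes that a join between a wines table and a wine_translations table returns one flat row per language; the column names (id, url, lang, title, description), the example.com URL, and the function name rowsToWine are all assumptions made for illustration, not the schema or the query from this post:

```javascript
// Assumed result of joining `wines` with `wine_translations`:
// one flat row per (wine, language) pair, with the wine columns repeated.
const joinedRows = [
  { id: 1, url: "http://example.com/rose", lang: "fr",
    title: "Côtes de Provence rosé", description: "..." },
  { id: 1, url: "http://example.com/rose", lang: "en",
    title: "Rosé wine Côtes de Provence", description: "..." }
];

// Fold the flat rows of a single wine back into the hierarchical document.
// Returns null when the wine was not found (no rows).
function rowsToWine(rows) {
  if (rows.length === 0) return null;
  const translations = {};
  for (const row of rows) {
    translations[row.lang] = { title: row.title, description: row.description };
  }
  // The wine-level columns are identical on every row, so take the first.
  return { translations, url: rows[0].url };
}

console.log(JSON.stringify(rowsToWine(joinedRows), null, 2));
```

Even in this toy form, the regrouping step exists only because the hierarchy was flattened for storage.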
Great, now we’ve got our data, let’s give that to the client (assuming we use the Play framework):
If we want to create a new wine entry or update an existing one, we need to do the mapping the other way around:
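The reverse direction can be sketched the same way, again in plain JavaScript rather than Slick. The function name wineToRows, the row field names, and the example.com URL are hypothetical and only mirror the two-table layout described above:

```javascript
// Split a wine document into one row for the `wines` table and
// one row per language for the `wine_translations` table.
function wineToRows(wineId, wineDoc) {
  const wineRow = { id: wineId, url: wineDoc.url };
  const translationRows = Object.entries(wineDoc.translations).map(
    ([lang, t]) => ({ wineId, lang, title: t.title, description: t.description })
  );
  return { wineRow, translationRows };
}

const submitted = {
  translations: {
    fr: { title: "Côtes de Provence rosé", description: "..." },
    en: { title: "Rosé wine Côtes de Provence", description: "..." }
  },
  url: "http://example.com/rose"
};

const { wineRow, translationRows } = wineToRows(1, submitted);
console.log(wineRow);
console.log(translationRows);
```

An update additionally has to decide what happens to rows for languages that are no longer present in the submitted document; the sketch leaves that part out.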
I think that by now you may guess what I’m getting at: these mapping gymnastics to go from a hierarchical document to a relational database are extremely verbose. In contrast, let’s see what the same would look like when putting things into e.g. MongoDB using the Salat case class mapper.
First, the data definition:
Next, the controller code:
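What the document-store variant saves is the mapping step itself: the document is persisted and returned in the shape the UI already works with. A toy in-memory stand-in, written in plain JavaScript, illustrates the idea; InMemoryCollection and its methods are hypothetical and are not the MongoDB or Salat API:

```javascript
// Toy stand-in for a document collection: documents are stored and
// returned whole, with no per-field mapping code in between.
class InMemoryCollection {
  constructor() {
    this.docs = new Map();
    this.nextId = 1;
  }

  // Store a deep copy so that later changes by the caller do not leak in.
  insert(doc) {
    const id = this.nextId++;
    this.docs.set(id, JSON.parse(JSON.stringify(doc)));
    return id;
  }

  // Return a deep copy of the stored document, or null if the id is unknown.
  findById(id) {
    const doc = this.docs.get(id);
    return doc ? JSON.parse(JSON.stringify(doc)) : null;
  }
}

const wines = new InMemoryCollection();
const wineId = wines.insert({
  translations: {
    fr: { title: "Côtes de Provence rosé", description: "..." }
  },
  url: "http://example.com/rose"
});

console.log(wines.findById(wineId).translations.fr.title);
```

The nested translations map goes in and comes out unchanged, which is exactly the part that needed hand-written code in the relational version.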
Much simpler, isn’t it?
I think what this boils down to is the following: AngularJS and other modern client-side frameworks are built around the fundamental concept of manipulating and working with hierarchical documents. Relational databases, on the other hand, aren’t all that great at representing hierarchies - those need to be turned into relations, which are fundamentally different things. Of course, there is a way to get from one paradigm to the other, but as we’ve seen, it is up to the developer to do the mapping. In the Java space, this mapping has traditionally been done using ORMs such as Hibernate; these tools, however, bring another set of problems with them as soon as the data model gets bigger.
I’m pretty sure that I’m doing some things the wrong way with regard to Slick; the query above can probably be simplified somewhat. Yet the issue remains that, overall, representing a document in a relational database does require a non-trivial amount of mapping effort.
On the other hand, relational databases are (still to this day) perceived as safe and easy to operate, and the hype that came with the NoSQL movement has essentially focused on how fast and scalable those technologies are. And yet, I think that at the end of the day, there are only a few players that really require the speed and scalability that’s been hyped so much, and a lot more that could benefit from using document stores for their “document” aspect.