MongoDB (from “humongous”, according to its website) is a document-driven (the emphasis will be explained later) NoSQL database written in C++. It advertises short query times, flexibility and automatic sharding, but is it really a solution for every application?

Fast and Furious?

The speed of MongoDB depends largely on its data structure: it uses documents instead of SQL’s records. Please note the word ‘document’, because that is exactly how you should treat Mongo data: as a structured, self-contained element that you could print on a sheet of paper and stick to the fridge with a magnet. You can build it however you want, and it can contain direct quotes, images, titles and paragraphs, but you cannot easily reference another document in it. And that’s where matters get complicated.

Let’s suppose we are building an application for a school. We want to be able to view teachers, their students and the students’ grades.

Since MongoDB represents its documents as a clump of JSON, that’s the notation I will be using in this post. A typical record in our database would look something like this:

teachers: [
  {
    name: "Jane Doe",
    students: [
      {
        name: "Harry Potter",
        grades: [
          {
            math: 4.5
          }
        ]
      }
    ]
  }
]

Already we can see one flaw of Mongo: since it doesn’t enforce a unified data structure (which is actually good in our case so far, since students may attend different classes), it has to keep the key names in every document. This takes up precious space, but let’s leave that for now. Our record is simple enough, and we can easily get all of a teacher’s pupils and their grades with one database query. So far so good, but the functionality is a bit limited. To expand on the possibilities, we’ll add a Subject model, assign it to a teacher, and list in it all the students that attend that subject. We’ll get something like this:

teachers: [
  {
    name: "Jane Doe",
    subjects:[
      {
        name: "Math",
        students: [{...}]
      },
      {
        name: "Computer science",
        students: [{...}]
      }
    ]
  }
]

See the problem here? Already we may be including some of the students twice. What if we added some extracurricular courses? Or a blackboard duty schedule? Our document keeps growing because the database structure has become relational – and MongoDB is not designed for that. Add this to the space taken by key names, and suddenly we have a big storage problem.
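To make the redundancy visible, here is a minimal Python sketch that builds the record above as plain dictionaries and counts how many embedded copies of one student it holds. The sizes are measured on JSON text, which is only a rough stand-in for BSON, and the variable names are made up for this illustration.

```python
import json
from collections import Counter

# One student, embedded by value wherever it is needed.
harry = {"name": "Harry Potter", "grades": [{"math": 4.5}]}

teachers = [
    {
        "name": "Jane Doe",
        "subjects": [
            {"name": "Math", "students": [harry]},
            {"name": "Computer science", "students": [harry]},
        ],
    }
]

# Count the embedded copies of each student across all subjects.
copies = Counter(
    student["name"]
    for teacher in teachers
    for subject in teacher["subjects"]
    for student in subject["students"]
)

# JSON length as a rough proxy for storage size (assumption: not real BSON).
one_copy = len(json.dumps(harry))
whole_document = len(json.dumps(teachers))

print(copies["Harry Potter"], one_copy, whole_document)
```

Every extra subject adds another full copy of the student, key names included, so the document grows with each new relation rather than with each new student.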

There is a workaround for it. We can avoid embedding all the students by referencing their ids instead. These are of the BSON (Binary JSON) ObjectId type and are usually displayed as strings. This solves the problem of redundancy, but creates a new one, to which I’ll devote the next section.
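The referenced layout could look roughly like the sketch below, again in plain Python. The short strings stand in for real ObjectId values, and the `student_ids` field name is an assumption made for this example, not something Mongo prescribes.

```python
# Students live in their own collection; "s1" stands in for an ObjectId.
students = [
    {"_id": "s1", "name": "Harry Potter", "grades": [{"math": 4.5}]},
]

# Subjects only point at students instead of embedding them.
teachers = [
    {
        "name": "Jane Doe",
        "subjects": [
            {"name": "Math", "student_ids": ["s1"]},
            {"name": "Computer science", "student_ids": ["s1"]},
        ],
    }
]

# The student document exists once, however many subjects refer to it.
references = sum(
    subject["student_ids"].count("s1")
    for teacher in teachers
    for subject in teacher["subjects"]
)

print(len(students), references)
```

The data is stored once, but the teacher document no longer answers the question “who attends Math?” on its own.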

Come join us!

A quick, punk-rock introduction to this short section:

https://www.youtube.com/watch?v=jqwvU8QhlYQ

Okay, you get the point. MongoDB doesn’t support joins, which means that complex queries spanning multiple collections (Mongo’s counterpart of tables) have to be done manually, in your application code. That can’t be good. First of all, you have to write ugly code for something the database engine should provide. Second, you have to make multiple requests to the database and therefore lose time, especially if sharding is in use and you have to query multiple servers. If you ever find yourself in that spot, there is only one solution:

Change the database as soon as possible.
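For the record, this is roughly what such a manual join looks like. The sketch uses in-memory lists instead of a real driver, and `find_subjects`, `find_students` and the counter are hypothetical helpers that only imitate one database round trip each.

```python
# In-memory stand-ins for two collections (names and ids are made up).
STUDENTS = {
    "s1": {"_id": "s1", "name": "Harry Potter"},
}
SUBJECTS = [
    {"teacher": "Jane Doe", "name": "Math", "student_ids": ["s1"]},
    {"teacher": "Jane Doe", "name": "Computer science", "student_ids": ["s1"]},
]

stats = {"queries": 0}  # counts simulated round trips to the database


def find_subjects(teacher_name):
    """Simulates one query against the subjects collection."""
    stats["queries"] += 1
    return [s for s in SUBJECTS if s["teacher"] == teacher_name]


def find_students(ids):
    """Simulates one query against the students collection."""
    stats["queries"] += 1
    return [STUDENTS[i] for i in ids if i in STUDENTS]


def students_by_subject(teacher_name):
    """The 'join', done by hand: one query per subject on top of the first."""
    roster = {}
    for subject in find_subjects(teacher_name):
        found = find_students(subject["student_ids"])
        roster[subject["name"]] = [student["name"] for student in found]
    return roster


roster = students_by_subject("Jane Doe")
print(roster, stats["queries"])
```

With two subjects this already costs three round trips where a relational database would need a single query with a join.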

Slow and mellow?

MongoDB boasts flexibility, allowing for inconsistent data. But in reality, in most cases you want to keep your data consistent anyway, since it really is good practice and makes for easier validation. I personally get the feeling that MongoDB is just trying to kick open an unlocked door.

Another issue is concurrency. As of MongoDB version 2.2, every write you make takes a database-wide lock, so no other process can write even to a completely unrelated collection in the same database. It seems they are working on a more granular locking system, as seen here, but any MongoDB user will have to suffer that performance hit for at least a while longer.
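The difference between the two locking schemes can be shown with a toy model. This is not how the MongoDB server implements its locks; it is a sketch built on Python’s `threading.Lock`, with collection names borrowed from the school example.

```python
import threading


def can_write_concurrently(lock_for):
    """True if a write to 'students' can start while 'teachers' is being written."""
    with lock_for("teachers"):  # a write to 'teachers' is in progress
        other = lock_for("students")
        acquired = other.acquire(blocking=False)  # try without waiting
        if acquired:
            other.release()
        return acquired


# Coarse scheme: a single lock guards the whole database.
database_lock = threading.Lock()

# Granular scheme: every collection has its own lock.
collection_locks = {"teachers": threading.Lock(), "students": threading.Lock()}

coarse = can_write_concurrently(lambda name: database_lock)
granular = can_write_concurrently(lambda name: collection_locks[name])

print(coarse, granular)  # False True
```

Under the single lock the second writer has to wait even though it touches unrelated data. With per-collection locks it can proceed.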

Much to learn you still have

Summing up, we arrive at a simple conclusion: Mongo is not ready for production yet. It’s still young, has a relatively small community (at least compared to MySQL or even Postgres), and its documentation is somewhat lacking. Does that mean it’s a bad database? No. When it’s a bit more complete and battle-tested, it will be viable, but only for very specific use cases. It’s not a badly designed database, it’s just not very versatile. It doesn’t scale very well vertically, but it’s beautifully designed for horizontal expansion. It knows what it’s good at. And so should anyone who chooses to use it in their application.
