For decades now, the global standard for data storage has been relational databases. These use 'tables' linked by 'keys' to define the relationships between 'rows' of data.
Scalability has long been an issue with relational databases. Storage and performance limits are not easily remedied: for most organisations the only practical option is vertical scaling, i.e. adding more disk space, memory or processing power to a single server. This can be very expensive. As the amount of data grows, interaction with the database inevitably becomes slower. In particular, the speed of inserts and updates can fall drastically due to the overhead of maintaining the relationships between tables (and their associated indexes) and of guaranteeing atomic transactions.
This is where MongoDB comes in.
MongoDB is an open-source, document-oriented, 'NoSQL' database first released in 2009 by 10gen, the company now known as MongoDB Inc.
Unlike in traditional relational databases, relationships are not defined between rows in tables. Rather, data is stored as JSON-like documents in a binary format known as BSON, which allows objects to be embedded within objects hierarchically.
These documents are stored in 'collections', which are similar to tables in SQL but without a strictly enforced schema. Embedding objects can allow one document to hold all the necessary data, which can speed up queries, inserts and updates.
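To make the embedding idea concrete, here is a minimal sketch in Python using plain dictionaries, which map naturally onto JSON-like documents. The customer and order data and all field names are invented for illustration, and no MongoDB server or driver is involved.

```python
import json

# Relational style: the same data split across two "tables" linked by a key.
customers = [{"customer_id": 1, "name": "Ada"}]
orders = [
    {"order_id": 10, "customer_id": 1, "item": "keyboard"},
    {"order_id": 11, "customer_id": 1, "item": "monitor"},
]


def orders_for(customer_id):
    """Emulate a join: scan the orders table for rows with a matching key."""
    return [o for o in orders if o["customer_id"] == customer_id]


# Document style: the orders are embedded inside the customer document,
# so a single lookup returns everything that belongs together.
customer_doc = {
    "_id": 1,
    "name": "Ada",
    "orders": [
        {"order_id": 10, "item": "keyboard"},
        {"order_id": 11, "item": "monitor"},
    ],
}

joined_items = [o["item"] for o in orders_for(1)]
embedded_items = [o["item"] for o in customer_doc["orders"]]

print(json.dumps(customer_doc, indent=2))
```

Both approaches yield the same list of items, but the document version needs no second lookup, which is the effect the embedded model is aiming for.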
One of the major selling points for MongoDB (and other NoSQL database solutions) is the ability to scale horizontally. By distributing the database across multiple, cheaper servers, an organisation can both save money and improve the performance of the database.
MongoDB doesn’t have an enforced schema, allowing you to easily map database objects to your application's objects, including any nested objects. This flexibility is, however, a double-edged sword: without careful control, it opens the door to inconsistent data.
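Because a schemaless collection will accept documents of any shape, the careful control mentioned above often has to live in the application. The following is a minimal sketch of such a check; the `REQUIRED_FIELDS` map and the `validate` helper are hypothetical names chosen for this example, not part of MongoDB.

```python
# Hypothetical application-level schema: field name -> expected Python type.
REQUIRED_FIELDS = {"name": str, "email": str, "age": int}


def validate(doc):
    """Return a list of problems found in doc; an empty list means it passes."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in doc:
            problems.append(f"missing field: {field}")
        elif not isinstance(doc[field], expected):
            problems.append(f"{field} should be {expected.__name__}")
    return problems


good = {"name": "Ada", "email": "ada@example.com", "age": 36}
# Without an enforced schema, this document would be stored as readily as the first.
bad = {"name": "Bob", "age": "thirty"}

good_problems = validate(good)
bad_problems = validate(bad)
print(good_problems)
print(bad_problems)
```

Running the check before every write keeps documents in a collection consistent, at the cost of moving schema enforcement out of the database and into each application that writes to it.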
Using MongoDB’s 'sharding' functionality, a collection can be distributed across multiple servers. MongoDB’s query routers send each query to the shards that hold the relevant data, based on the user-defined 'shard key'. The results from the different shards are then merged into a single result set.
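The routing and merging described above can be sketched with a toy in-memory model. This is an illustration only: the shard count, the `user_id` shard key and the simple hash-modulo placement are assumptions made for the example, and real MongoDB distributes data in a more sophisticated way.

```python
import hashlib

NUM_SHARDS = 3          # assumed cluster size for this toy model
SHARD_KEY = "user_id"   # hypothetical user-defined shard key

# Each "shard" is just a list of documents held in memory.
shards = [[] for _ in range(NUM_SHARDS)]


def shard_for(key_value):
    """Map a shard-key value to a shard index using a stable hash."""
    digest = hashlib.sha256(str(key_value).encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS


def insert(doc):
    """Store the document on the shard chosen by its shard-key value."""
    shards[shard_for(doc[SHARD_KEY])].append(doc)


def find(query):
    """Return the matching documents and the number of shards consulted."""
    if SHARD_KEY in query:
        # The query names the shard key, so only one shard can hold a match.
        targets = [shards[shard_for(query[SHARD_KEY])]]
    else:
        # No shard key: ask every shard and merge the partial results.
        targets = shards
    merged = [
        doc
        for shard in targets
        for doc in shard
        if all(doc.get(k) == v for k, v in query.items())
    ]
    return merged, len(targets)


for i in range(12):
    insert({"user_id": i, "country": "IE" if i % 2 == 0 else "UK"})

by_key, shards_hit_by_key = find({"user_id": 7})
by_country, shards_hit_by_country = find({"country": "IE"})
print(by_key, shards_hit_by_key)
print(len(by_country), shards_hit_by_country)
```

The sketch also shows why the choice of shard key matters: a query that includes the key touches a single shard, while any other query has to be sent to all of them.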
MongoDB runs on all the main platforms, including Windows, Linux, OS X, Solaris and FreeBSD.
MongoDB can be significantly faster than SQL databases when inserting or updating large amounts of data. The lack of fully atomic transactions and table joins allows for very fast updates. The downside is the possible loss of data if something goes wrong.
Likely due to its distributed nature, each insert or update is treated individually. There is currently no way to perform an all-or-nothing bulk insert in MongoDB.
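The practical consequence is that a batch can fail part-way through and leave some of its documents written. The sketch below imitates that behaviour with a toy in-memory `Collection` class; the class, its `bulk_insert` method and the `DuplicateKeyError` exception are stand-ins written for this example, not a real driver API.

```python
class DuplicateKeyError(Exception):
    """Raised when a document reuses an existing _id (toy stand-in)."""


class Collection:
    """Toy in-memory collection with a unique _id; not a real driver class."""

    def __init__(self):
        self.docs = {}

    def insert_one(self, doc):
        if doc["_id"] in self.docs:
            raise DuplicateKeyError(doc["_id"])
        self.docs[doc["_id"]] = doc

    def bulk_insert(self, docs):
        """Insert documents one by one; earlier writes are NOT rolled back."""
        inserted = 0
        for doc in docs:
            try:
                self.insert_one(doc)
            except DuplicateKeyError:
                # Stop at the first failure and report how far we got.
                return inserted, doc["_id"]
            inserted += 1
        return inserted, None


coll = Collection()
batch = [{"_id": 1}, {"_id": 2}, {"_id": 1}, {"_id": 3}]
inserted, failed_id = coll.bulk_insert(batch)
print(inserted, failed_id, sorted(coll.docs))
```

Here the third document fails, yet the first two remain in the collection and the fourth is never attempted. With no rollback available, the caller has to detect the partial write and decide whether to retry or clean up.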
MongoDB has security features similar to those in SQL databases, offering the ability to manage different users with different permission levels and access to databases. It does not, however, enforce this by default, causing many users to leave their databases wide open to attack.
To process queries faster, MongoDB tries to keep its indexes in memory. This is not strictly required, but unless you impose limits explicitly, MongoDB will use as much memory as it can to fulfil its needs.
In conclusion, MongoDB is a very interesting, scalable and powerful, but immature, technology. Its distributed architecture is woven heavily into its DNA. It suits companies whose data is growing very quickly or is scattered over multiple locations. Database scalability is generally difficult, but MongoDB has a built-in and comparatively easy solution to address it.
Its youth does hurt it, and it took a while for the development team to learn from their predecessors. Originally MongoDB didn’t even feature single-document atomicity, and would return a success status before committing an insert, causing much confusion when things went wrong. This has since been resolved.
Its lack of enforced default security measures has led to many issues. In January of this year (2017) some 28,000 public MongoDB databases were hacked in a single day, with the hackers ransoming the data in exchange for bitcoin. These were all ‘exposed’ databases, with missing or misconfigured security.
Despite various missteps, MongoDB continues to evolve and grow, aiming to make distributed NoSQL databases a realistic alternative to traditional relational databases in the future.