The recent rise of the NoSQL movement (MongoDB, Cassandra, HBase, CouchDB
and many more) can be attributed to two factors:
-
Adoption of agile, aka "write code first, design later", methodology:
With an RDBMS you have to design your tables, data structures and
relations first; only then can you start coding. With NoSQL you can start
coding without worrying about tables, and you can modify your objects
later at a lower development cost.
-
Impedance mismatch between object-oriented programming and relational
databases:
Using object-relational mapping software (e.g. Hibernate) is a major pain
point for developers. MongoDB, being a document-oriented database, fits
directly into the object-oriented paradigm.
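To make the mismatch concrete, here is a minimal sketch in Python. The User and Address classes and both helper functions are invented for illustration and no database driver is involved. It only shows that a nested object can be turned into one document as-is, while the relational form has to split it into rows for two tables.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class Address:
    city: str
    country: str


@dataclass
class User:
    name: str
    age: int
    addresses: list = field(default_factory=list)


def to_document(user):
    """Document style: the nested object becomes one nested dict."""
    return asdict(user)


def to_rows(user_id, user):
    """Relational style: the same object is split into rows for two tables."""
    user_row = (user_id, user.name, user.age)
    address_rows = [(user_id, a.city, a.country) for a in user.addresses]
    return user_row, address_rows


# Hypothetical sample data
user = User("Asha", 52, [Address("Pune", "India"), Address("Oslo", "Norway")])

document = to_document(user)              # one nested structure
user_row, address_rows = to_rows(1, user)  # one row plus two child rows
```

An ORM such as Hibernate exists to automate the second function and its inverse; with a document store the first function is close to all the mapping you need.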
Both MongoDB and RDBMS databases have their uses. Some use cases are common
to both and some are exclusive to one.
Consider the example of a transaction-based system. This is what the
MongoDB people say on their website:
"While most modern applications require a flexible, scalable system like
MongoDB, there are use cases for which a relational database like MySQL
would be better suited. Applications that require complex, multi-row
transactions (e.g., a double-entry bookkeeping system) would be good
examples. MongoDB is not a drop-in replacement for legacy applications
built around the relational data model and SQL."
(https://www.mongodb.com/compare/mongodb-mysql)
Let us compare MongoDB and MySQL (any RDBMS will do; I have chosen MySQL
just for the sake of it) on various features.
-
Philosophy:
MySQL is relation oriented while MongoDB is document oriented. If you
manipulate a single type of object most of the time, a document-oriented
database will suit you better. A relational database is more suitable if
you frequently manipulate more than one type of object at a time.
-
Querying through user interface:
MySQL uses standard SQL for querying, while MongoDB uses its own
JSON-based query language. It provides a JavaScript shell for querying,
which is very different from SQL. For example, the MySQL query
"SELECT id, name FROM users WHERE age>50 LIMIT 10" translates into
"db.users.find({"age":{"$gt":50}}, {"id":1,"name":1}).limit(10)".
SQL has a steep learning curve, but it is versatile.
-
Ease of Use:
You have to give MongoDB full marks here. MongoDB is very easy to set up
and get running. Its Java client is also very simple. Since the document
structure maps easily onto an object structure, users of OO languages such
as Java, C# and C++ will find less mental friction when using a Mongo
client.
-
Maturity:
Since MongoDB is not even a decade old, release stability is a problem.
There were major changes between Mongo 2 and Mongo 3. Compare this to
RDBMSs, whose structure has remained more or less unchanged for the last
two decades.
-
Multi table query:
MongoDB is not an RDBMS and was not designed around joins: an ordinary
find() query reads from a single collection. (Since version 3.2 the
aggregation framework offers a $lookup stage that performs a left outer
join between two collections.) The usual alternative is to change your
document structure and embed the other document inside the first one.
This sometimes leads to very large documents, and MongoDB limits a
document to 16 MB, so you cannot keep fattening your document
indefinitely.
-
Foreign Keys:
MongoDB does not support foreign keys. If you need this type of
constraint, you have to enforce it in application code, which adds
complexity at implementation time.
-
SQL injection vulnerability:
As the name indicates, SQL injection applies only to SQL, so MongoDB is
unaffected by it. MongoDB is, however, susceptible to other kinds of
attack, such as query operators injected through unvalidated input.
-
Lock granularity:
MySQL provides very fine-grained locking, while MongoDB offers fewer
levels. Through the 2.x series MongoDB locked at the database level (and,
before 2.2, the whole instance). In MongoDB 3.0 the MMAPv1 engine locks at
the collection (table) level, while the WiredTiger engine provides
document (row) level concurrency control.
-
User Privilege:
MongoDB provides role-based access control. It does not support
query-based privileges.
-
Indexes:
MongoDB supports multiple types of indexes but, much like an RDBMS, it
limits the size of an index entry, the number of indexes per collection
and the number of fields in a compound index. Without indexes, MongoDB
queries become very slow once a collection (table) grows beyond 2-3
million documents (rows). If you add a document whose indexed field is
larger than 1 KB, MongoDB raises an error.
-
Scaling:
Horizontal scaling is more convenient in practice with MongoDB, and this
has been a major selling point of MongoDB. But if you look at the top 10
most popular websites, 3 of them (YouTube, Wikipedia and Twitter) use
MySQL and none use MongoDB. It is not so much a matter of what to use as
of how to use it. Still, there appears to be a compelling reason why
MongoDB was left out of their design decisions.
-
Error reporting:
Error reporting while writing to MongoDB leaves a lot to be desired. It
does not return all the errors, only the last one. Also, if replication
to some of the slaves fails, the failure is simply ignored, so
consistency is another issue.
-
Performance degradation:
If the load on the database is heavy and the data is not properly
indexed, then all databases, RDBMS and NoSQL alike, suffer performance
degradation, but they behave differently. Older MongoDB releases (before
2.6) capped the maximum number of connections at 20000. When such a
connection limit is reached, MongoDB becomes very unwieldy; sometimes the
only option is to restart.
-
Data durability:
Before MongoDB 3.2, MMAP (MMAPv1) was the default storage engine of
MongoDB. MMAP writes through memory-mapped files, which are synced to
disk at regular intervals. If journaling is not enabled and a crash
happens, the data written during that interval is lost. Enabling the
journal slows down writes dramatically.
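On the foreign key point in the list above, here is a rough sketch of the difference. Two assumptions to keep it runnable anywhere: SQLite (from the Python standard library) stands in for MySQL, and plain dicts stand in for MongoDB collections, so no driver is involved. The table, collection and function names are made up for this example.

```python
import sqlite3

# Relational side: the database itself rejects an orphan row.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " user_id INTEGER NOT NULL REFERENCES users(id))"
)
conn.execute("INSERT INTO users (id, name) VALUES (1, 'Asha')")


def sql_insert_order(order_id, user_id):
    """Return True if the database accepted the row, False if it refused."""
    try:
        conn.execute(
            "INSERT INTO orders (id, user_id) VALUES (?, ?)", (order_id, user_id)
        )
        return True
    except sqlite3.IntegrityError:
        return False


# Document side: dicts keyed by _id play the role of collections (assumption).
users = {1: {"_id": 1, "name": "Asha"}}
orders = {}


def doc_insert_order(order_id, user_id):
    """The referential check has to be written by hand in application code."""
    if user_id not in users:
        return False
    orders[order_id] = {"_id": order_id, "user_id": user_id}
    return True
```

In the first half the constraint is declared once in the schema. In the second half every code path that writes an order has to remember to call the check, which is the implementation-time complexity mentioned above.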
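On the injection point in the list above, one well-known "other kind of attack" is operator injection: a client sends a JSON object where the application expected a string. The sketch below only builds the filter documents as Python dicts and does not talk to a MongoDB server. The function names and the login scenario are hypothetical, and comparing a raw password in a query is done here only to keep the example short.

```python
import json


def build_login_filter(username, password):
    """Naive: puts whatever the client sent straight into the filter."""
    return {"username": username, "password": password}


def build_safe_login_filter(username, password):
    """Accept plain strings only, so no operator object can slip in."""
    for name, value in (("username", username), ("password", password)):
        if not isinstance(value, str):
            raise ValueError(f"{name} must be a plain string")
    return {"username": username, "password": password}


# A request body an attacker might send instead of a real password
payload = json.loads('{"username": "admin", "password": {"$ne": null}}')

# The naive filter now asks for "admin with any password that is not null"
naive_filter = build_login_filter(payload["username"], payload["password"])
```

Passing the same payload to build_safe_login_filter raises ValueError, because the password is a dict rather than a string. Type checks like this play the role that parameterized queries play for SQL.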
In part 2 we will discuss the factors responsible for the rise in
adoption of MongoDB.