Wednesday, February 9, 2011

Document Store Databases

Today I am doing some testing on MongoDB. We are getting ready to launch a new project and so I have spent the first part of the month getting things configured. I now have our staging area completely set up and ready as the rest of the software comes online.

Most database software, such as Oracle or PostgreSQL, is relational. That means data is stored in tables, and those tables can be related to other tables. If you have a PERSON table with five attributes (last name, first name, gender, and so on), those same five attributes are stored for every person in the table. MongoDB is a document store. This means that you can keep track of different attributes for each person. You may store that Jimmy has green eyes and not care about eye color for any other person in the database.
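To make the difference concrete, here is a minimal sketch in plain Python that models documents as dictionaries. It does not talk to MongoDB at all; the field names and the `find` helper are made up for illustration. The point is only that Jimmy's record carries an eye color while the other record does not.

```python
# Each "document" is a plain dict. The field names are hypothetical.
people = [
    {"first_name": "Jimmy", "last_name": "Smith", "gender": "M", "eye_color": "green"},
    {"first_name": "Anna", "last_name": "Jones", "gender": "F"},  # no eye_color stored
]


def find(documents, **criteria):
    """Return the documents whose fields match every given criterion.

    A document that lacks a field simply does not match a criterion on it.
    """
    return [
        doc
        for doc in documents
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


green_eyed = find(people, eye_color="green")
print([doc["first_name"] for doc in green_eyed])
```

In a relational table, adding eye color would mean adding a column that is empty for everyone but Jimmy.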

We are using MongoDB because of its speed. While we can get 1,000 transactions per second out of Oracle or PostgreSQL, we are getting 25,000 from MongoDB on equivalent hardware. That is a 25-fold difference.

MongoDB also has the ability to automatically partition the data across separate servers. If your database starts getting too big, you add another server and the data is automatically partitioned onto it. Then when you go to look for Jimmy's data, each machine looks through only about half as much data, so you get even more speed.
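The partitioning idea can be sketched in a few lines of Python. This is a toy model, not MongoDB's actual mechanism: it assumes documents are assigned to a server by hashing a key field, and the function names (`shard_for`, `partition`, `find_by_name`) are invented here. MongoDB's real partitioning is driven by a shard key and is considerably more involved.

```python
import hashlib


def shard_for(key, shard_count):
    """Pick a shard index from a stable hash of the key (an assumed scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


def partition(documents, shard_count, key_field="first_name"):
    """Split the documents into shard_count lists, one per pretend server."""
    shards = [[] for _ in range(shard_count)]
    for doc in documents:
        shards[shard_for(doc[key_field], shard_count)].append(doc)
    return shards


def find_by_name(shards, name, key_field="first_name"):
    """Ask every shard; each one scans only its own slice of the data."""
    results = []
    for shard in shards:
        results.extend(doc for doc in shard if doc[key_field] == name)
    return results


people = [{"first_name": f"person{i}"} for i in range(1000)]
people.append({"first_name": "Jimmy", "eye_color": "green"})

shards = partition(people, shard_count=2)
print([len(shard) for shard in shards])  # roughly half of the data on each
print(find_by_name(shards, "Jimmy"))
```

Here the shards are searched one after another, but on real hardware each server would scan its slice at the same time, which is where the extra speed comes from.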

Of course, MongoDB does have drawbacks. For starters, there are not really any reporting tools. In our system we will be doing nightly pulls of the data and inserting it into a PostgreSQL database. Then we can run reports on a daily basis using tools like BIRT or JasperReports. There is also a bit of a learning curve for database administration. However, these issues are a small price to pay for such speed.
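A nightly pull like this mostly comes down to flattening documents into fixed columns. The sketch below is only an illustration of that step, not our actual job: it uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL, the documents are hard-coded rather than read from MongoDB, and the `person` table and `export_people` function are hypothetical names.

```python
import sqlite3

# The fixed set of columns the reporting table keeps (an assumed schema).
COLUMNS = ("first_name", "last_name", "gender")


def export_people(documents, connection):
    """Flatten documents into rows and insert them into the person table.

    Attributes outside COLUMNS are dropped; missing ones become NULL.
    """
    connection.execute(
        "CREATE TABLE IF NOT EXISTS person "
        "(first_name TEXT, last_name TEXT, gender TEXT)"
    )
    rows = [tuple(doc.get(column) for column in COLUMNS) for doc in documents]
    connection.executemany("INSERT INTO person VALUES (?, ?, ?)", rows)
    connection.commit()
    return len(rows)


documents = [
    {"first_name": "Jimmy", "last_name": "Smith", "gender": "M", "eye_color": "green"},
    {"first_name": "Anna", "last_name": "Jones", "gender": "F"},
]

connection = sqlite3.connect(":memory:")  # stand-in for the PostgreSQL database
exported = export_people(documents, connection)

# Once the data is relational, an ordinary SQL report works.
report = connection.execute(
    "SELECT gender, COUNT(*) FROM person GROUP BY gender ORDER BY gender"
).fetchall()
print(exported, report)
```

The trade-off is that anything outside the fixed columns, such as Jimmy's eye color, does not make it into the reporting database unless the schema is extended.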
