
Innovations in Data Management Tony Bain is an expat Kiwi, Father, Entrepreneur, Angel Investor, Blogger, and occasional Writer for Read Write Web. He is an associate director for Red Rock and the founder of Tony Bain Group.

15th December 2009, 01:40 PM
Is Cassandra winning the NoSQL race?




Cassandra is fast emerging as one of the key NoSQL databases. While we often say that the point of NoSQL is to offer more choice than an “RDBMS” hammer for every nail, there are practical reasons why a small number of stack technologies gain dominance while others circle on the sidelines.


Cassandra has already ticked many of the boxes needed to shoot it into the stratosphere as a widely used, default database platform. This is especially so in the web world, where high scalability, high availability, open source and being proven by a bigger fish all matter. Specifically, Cassandra has:
  • The ability to scale across many nodes
  • The ability to scale to many hundreds of gigabytes of data
  • High availability: losing a node doesn’t take down the cluster, and nodes can be provisioned online with data distributed (and copied) automatically. It is also decentralized (every node is the same as every other, so there is no single point of failure).
  • Bigtable-like “Column Families” (more advanced schema control than a DHT)
  • Dynamo-like eventual consistency (not a plus, but a trade-off required for scalability), log-based recovery, and the ability to write either asynchronously or synchronously
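To make the last two points concrete, here is a minimal Python sketch of a column-family row and of how replicas that temporarily disagree can be reconciled. It is a toy model under stated assumptions, not Cassandra's API: the names `Replica`, `write`, `read` and `reconcile` are hypothetical, and "newest timestamp wins" is assumed as the conflict rule.

```python
from dataclasses import dataclass, field

# Toy model only: names and structure are illustrative, not Cassandra's API.
# Layout: row key -> column family -> column name -> (value, timestamp).


@dataclass
class Replica:
    data: dict = field(default_factory=dict)

    def write(self, row_key, family, column, value, timestamp):
        columns = self.data.setdefault(row_key, {}).setdefault(family, {})
        current = columns.get(column)
        # Keep the newest version of each column (assumed last-write-wins rule).
        if current is None or timestamp > current[1]:
            columns[column] = (value, timestamp)

    def read(self, row_key, family):
        # Return a copy of the columns stored for this row and family.
        return dict(self.data.get(row_key, {}).get(family, {}))


def reconcile(versions):
    """Merge column maps from several replicas; the newest timestamp wins."""
    merged = {}
    for columns in versions:
        for name, (value, ts) in columns.items():
            if name not in merged or ts > merged[name][1]:
                merged[name] = (value, ts)
    return merged


replicas = [Replica(), Replica(), Replica()]

# The first write reaches every replica.
for r in replicas:
    r.write("user:42", "profile", "email", "old@example.com", timestamp=1)

# A later write has reached only one replica so far; the others are stale.
replicas[0].write("user:42", "profile", "email", "new@example.com", timestamp=2)

stale = replicas[2].read("user:42", "profile")
merged = reconcile(r.read("user:42", "profile") for r in replicas)
print(stale["email"][0])   # old@example.com
print(merged["email"][0])  # new@example.com
```

A read that asks only one replica may see the old value, while a read that consults all of them and merges by timestamp sees the new one. That gap is the trade-off the eventual consistency bullet refers to.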
Cassandra, if you’re not familiar, was originally built by Facebook as an internal database system to help them scale to their massive data demands. It was then thrown over the wall and made open source, where the community picked it up and ran with it. Cassandra is capable of supporting transaction processing workloads at large scale and has found favor at RackSpace, Twitter, Digg and others.

Interestingly, I understand Facebook forked the code and have continued to develop their own internal version independently of the open source version. The open source Cassandra is now largely developed by RackSpace, where 3 people work on it full time led by Jonathan Ellis, along with Digg and the community at large. The reasons behind the fork aren’t entirely clear, but one may assume that Facebook were happy to share their work with the community but don’t have the time or interest to manage the ongoing development of an open source project.

Scale is the primary reason why you would choose a platform like Cassandra. Traditional RDBMSs start to struggle when you want to go beyond one node, and big clusters are currently only really possible using expensive shared disk technology or when targeting specialized analytical workloads (MPP RDBMS). I understand Facebook is running a 150 node Cassandra cluster, and others have 30+ node clusters in production as well.
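As a rough illustration of why spreading data over many nodes can be cheap to grow, here is a small Python sketch of a hash ring, the partitioning idea associated with Dynamo-style systems. This is an assumption for illustration only: the `Ring` class, the `token` helper, the node names and the choice of SHA-256 are all hypothetical and are not taken from Cassandra's code.

```python
import bisect
import hashlib

# Toy hash ring: an assumed partitioning scheme for illustration, not Cassandra's code.


def token(value: str) -> int:
    # Map any string to a position on the ring.
    return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    def __init__(self, nodes):
        self._ring = sorted((token(n), n) for n in nodes)
        self._tokens = [t for t, _ in self._ring]

    def owner(self, key: str) -> str:
        # The first node clockwise from the key's token owns it (wrapping around).
        i = bisect.bisect_right(self._tokens, token(key)) % len(self._ring)
        return self._ring[i][1]


keys = [f"user:{i}" for i in range(1000)]
before = Ring(["node-a", "node-b", "node-c"])
after = Ring(["node-a", "node-b", "node-c", "node-d"])

moved = [k for k in keys if before.owner(k) != after.owner(k)]
# Every key that changed owner went to the new node; all other keys stayed put.
print(len(moved), all(after.owner(k) == "node-d" for k in moved))
```

Adding a fourth node only takes over one arc of the ring, so the existing nodes keep most of their data. With plain `hash(key) % node_count` placement, by contrast, changing the node count would reshuffle most keys.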

What Cassandra is majorly lacking right now (apart from secondary indexes, which I think they are working on) is the backing of a commercial vendor providing product support (RackSpace is not doing this). But I am sure this will be addressed in the near future, with either RackSpace spinning something up or someone like Cloudera adding it to their responsibilities.






