
Amazon opens their real NoSQL database, awesome potential for data mining and busines



Old 20th January 2012, 03:15 PM   #1
Administrator
 

Amazon has turned on access to DynamoDB, its hosted NoSQL database. It builds on Dynamo, the internal system that runs much of the infrastructure behind Amazon Web Services and provided the inspiration for Cassandra. All your data is hosted on SSDs (solid-state drives), which means it should be pretty damn fast.

In tech terms this is a key-value store with tuneable consistency and throughput. That means you can dial the resources (and cost) up or down depending on your performance needs and your consistency needs, where consistency means whether or not a query is guaranteed to see the very latest data.
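To make the consistency dial concrete, here is a minimal toy sketch in Python. It is not how Amazon's service is implemented; the class `ToyReplicatedStore` and its `consistent` flag are made up for illustration. It assumes one leader copy that always has the latest write and follower copies that only catch up when `replicate()` runs.

```python
import random


class ToyReplicatedStore:
    """Toy model (hypothetical): a leader copy plus followers that lag behind."""

    def __init__(self, n_followers=2, seed=0):
        self.leader = {}
        self.followers = [{} for _ in range(n_followers)]
        self.pending = []  # writes not yet copied to the followers
        self.rng = random.Random(seed)

    def put(self, key, value):
        # A write lands on the leader immediately; followers see it later.
        self.leader[key] = value
        self.pending.append((key, value))

    def replicate(self):
        # Copy all pending writes to every follower, in write order.
        for key, value in self.pending:
            for follower in self.followers:
                follower[key] = value
        self.pending.clear()

    def get(self, key, consistent=False):
        # consistent=True reads the leader (always the latest value).
        # consistent=False reads a random follower (may be stale).
        if consistent:
            return self.leader.get(key)
        return self.rng.choice(self.followers).get(key)


store = ToyReplicatedStore()
store.put("page_views", 1)
fresh = store.get("page_views", consistent=True)  # latest value
stale = store.get("page_views")                   # followers have not caught up
store.replicate()
caught_up = store.get("page_views")               # now matches the leader
```

In this toy, the cheaper read can miss a write that has not been copied yet, which is the trade-off the paragraph above describes.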

This kind of thing is useful in applications where data turns up in unknown and changing volumes, you want to mine it in real time, and it will eventually grow to enormous volumes that you still need to store and mine quickly. Web analytics is a good example: you don't know how many people will be hitting your web site, so you start with the default "5 writes/second" and dial it up and back down when there are peaks of load. If you use something like NodeDB to do the data collection and inserting, any backlog of writes will just queue up until it gets in during momentary peaks in demand.
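The queue-up-during-peaks behaviour can be sketched with a small simulation. This is an illustrative model only: `backlog_over_time` is a made-up helper, the arrival numbers are invented, and it assumes a collector that buffers writes in memory and pushes at most `write_capacity` of them per second.

```python
from collections import deque


def backlog_over_time(arrivals_per_second, write_capacity):
    """Return the number of writes still queued at the end of each second."""
    queue = deque()
    backlog = []
    for second, arrivals in enumerate(arrivals_per_second):
        # New events arrive and join the back of the queue.
        queue.extend((second, i) for i in range(arrivals))
        # The store accepts at most write_capacity writes this second.
        for _ in range(min(write_capacity, len(queue))):
            queue.popleft()
        backlog.append(len(queue))
    return backlog


traffic = [3, 12, 4, 0, 0]  # hypothetical hits per second, with one spike

print(backlog_over_time(traffic, write_capacity=5))   # [0, 7, 6, 1, 0]
print(backlog_over_time(traffic, write_capacity=10))  # [0, 2, 0, 0, 0]
```

At 5 writes/second the spike takes three more seconds to drain; dialing the capacity up to 10 clears it within a second, which is the knob the post is talking about.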

Querying these key-value stores isn't as easy as querying relational databases, which have had over 40 years of development and can be queried very flexibly. The advantage key-value stores hold over relational databases is that they scale linearly. If a query on 100 records takes a second, a query on 100,000 records will also take a second, provided you throw 1,000x the hardware at it. Relational databases are very hard to scale this way once you grow past small clusters of machines: keeping every replica of the data consistent takes up an ever-increasing fraction of the resources, so the scaling curve is anything but linear.
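The arithmetic behind the linear-scaling claim can be sketched as follows, assuming records are spread across machines by hashing their keys. The helpers `node_for` and `records_per_node` are invented for this example, and the plain modulo hashing used here is a simplification, not the consistent hashing scheme described in the Dynamo paper.

```python
import hashlib


def node_for(key, n_nodes):
    """Pick a node for a key by hashing it (simple modulo placement)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_nodes


def records_per_node(n_records, n_nodes):
    """Count how many of n_records land on each of n_nodes."""
    counts = [0] * n_nodes
    for i in range(n_records):
        counts[node_for(f"record-{i}", n_nodes)] += 1
    return counts


small = records_per_node(100, 1)         # 100 records on one machine
large = records_per_node(100_000, 1000)  # 1,000x the data on 1,000x the machines

print(small[0])                 # 100
print(sum(large) / len(large))  # 100.0
```

With 1,000x the data and 1,000x the machines, each machine still holds about 100 records on average, so the work per machine stays roughly the same. The hash does not split the data perfectly evenly, so individual machines will hold a little more or less than the average.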
The full blog post from Amazon.





More from the Datalicious Blog...
All times are GMT +11. The time now is 02:16 PM.

© The Business Intelligence Group
