CORTEX Forums > Vendors and Service Providers > Open Source Analytics

Open Source Data Warehouses?

22nd October 2009, 08:10 AM   #1
Steve Bennett (Member)
Open Source Data Warehouses?

I wanted to start a thread to discuss our choices when selecting an open source data warehouse. At a minimum, we have to choose which of the following to use:
  1. Methodology
  2. Tools

I'm interested in hearing not just about off-the-shelf solutions (what are they?) but also about any open methodologies that exist.

Are there any live examples out there?

Here is one methodology I am aware of.

MIKE2.0

I know that the MIKE2.0 (Method for an Integrated Knowledge Environment) Methodology community is working on what they call their Data Warehousing Solution Offering. This is a method for developing:
"... a Data Mart, an Enterprise Data Warehouse or variants of these systems that are more departmentally-focused, operational in nature or application-specific. The focus of this solution offering is around the “back end” of Data Warehousing and the overall delivery process as opposed the “front end” Business Intelligence aspect that is provided through other offerings. Therefore, the enablers for this offering are techniques related to data modelling, data integration, metadata management and data quality management as well as Data Warehouse strategy and architectural techniques."
It is still early days, and although there are some implementations, they have all been done by BearingPoint, the instigator of the whole MIKE2.0 concept.

The MIKE2.0 community is generating some real IP on its website, and if you are starting a warehouse, ODS or data mart project, I can recommend a visit.

In the interests of disclosure: I am a member of the MIKE2.0 community but I have no financial relationship with BearingPoint.

So, who knows of other methodologies and what are my tool choices?
28th November 2009, 10:01 AM   #2
JohnG (Member)
Crunching data on the cheap

Data warehousing vendors are offering free, open-source versions that actually pack some heat

ARN, Chris Kanaracus (IDG News Service) 26 November, 2009 07:03:00

Data-warehousing software systems are expensive, but many enterprises have nonetheless been willing to dig deep, betting that analytics will provide new insights into their business and a competitive advantage.

In a report released earlier this year, research firm IDC predicted the data-warehousing platform market will grow from roughly US$7.9 billion in 2009 to about $10.8 billion in 2013.

The good news for IT shops that want to get started in analytics, but don't have the budget right now, is the recent emergence of free software options that pack fairly serious data-crunching firepower.

In October, Greenplum announced a Single Node Edition of its MPP (massively parallel processing) database. MPP architectures split up data workloads into multiple pieces that are managed independently on an array of servers.
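
To make the MPP idea concrete, here is a minimal Python sketch of splitting a workload into pieces that are processed independently and then combined. It is only an illustration, not how Greenplum itself is implemented: the `partition`, `local_sum` and `mpp_sum` names are hypothetical, threads stand in for the array of servers, and rows are distributed by a simple modulo on an integer `id`.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n_segments):
    """Distribute rows across segments (assumption: modulo on an integer id)."""
    segments = [[] for _ in range(n_segments)]
    for row in rows:
        segments[row["id"] % n_segments].append(row)
    return segments


def local_sum(segment):
    """Work done independently on one segment: a partial aggregate."""
    return sum(row["amount"] for row in segment)


def mpp_sum(rows, n_segments=4):
    """Fan the query out to every segment, then combine the partial results."""
    segments = partition(rows, n_segments)
    # Threads stand in for separate servers in this sketch.
    with ThreadPoolExecutor(max_workers=n_segments) as pool:
        partials = list(pool.map(local_sum, segments))
    return sum(partials)


# Toy data set: 100 rows with ids 1..100.
rows = [{"id": i, "amount": i * 10} for i in range(1, 101)]
total = mpp_sum(rows)
print(total)
```

The answer is the same whatever the number of segments; only the amount of work each segment does changes, which is the point of the architecture.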

The Single Node version can be used in production mode on one x86 server with up to two CPU sockets and unlimited cores. It can also be deployed in a single virtual machine with up to eight virtual cores. There is no storage cap. Single Node Edition can also be tied back into a broader Greenplum implementation.

Also in October, Calpont released InfiniDB Community Edition, an open-source, column-oriented database. The columnar method can in many cases greatly reduce disk I/O demand compared to systems that store data in rows, and also achieve higher levels of compression, said analyst Curt Monash of Monash Research.
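
A rough Python sketch of why a columnar layout can help, under toy assumptions: the table, the `values_read_*` helpers and the use of run-length encoding are all illustrative choices, not a description of how InfiniDB stores or compresses data. A query that sums one column only has to touch that column's values, and a column holding repeated values compresses well.

```python
from itertools import groupby

# Row layout: each record is stored together (day, country, amount).
rows = [
    ("2009-10-01", "AU", 120),
    ("2009-10-01", "AU", 80),
    ("2009-10-01", "NZ", 50),
    ("2009-10-02", "NZ", 70),
]

# Column layout: one list per column.
columns = {
    "day": [r[0] for r in rows],
    "country": [r[1] for r in rows],
    "amount": [r[2] for r in rows],
}


def values_read_row_store(rows):
    """A row store reads every field of every row, even for a one-column query."""
    return sum(len(r) for r in rows)


def values_read_column_store(columns, needed):
    """A column store reads only the columns the query names."""
    return sum(len(columns[name]) for name in needed)


def run_length_encode(values):
    """Collapse runs of equal adjacent values into (value, count) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


total = sum(columns["amount"])
print(values_read_row_store(rows), values_read_column_store(columns, ["amount"]))
print(run_length_encode(columns["day"]))
```

In this toy table the row layout reads 12 values to answer the query while the column layout reads 4, and the `day` column collapses from four entries to two runs.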

InfiniDB Community Edition is also limited to a single server, but has no cap on CPU count, the number of cores, memory, data volume or concurrent users.

Calpont also has a commercial edition of InfiniDB, now in early adopter stage, that allows users to scale out the system to multiple servers.

InfiniDB Community Edition follows the release last year of another open-source columnar data-warehousing platform from Infobright. The latter may have an edge for now over InfiniDB in terms of community support; Infobright recently said the software has been downloaded more than 15,000 times.

Ultimately, though, these free data-warehousing options have their limits and likely usage scenarios, according to Monash.

"If you have a single analyst or small team of analysts doing early exploratory querying against some terabytes of data or less, then these products are likely to do the job," he said.

Companies may also not have the budget to procure anything else, or can't get funding without conducting an initial proof of concept, Monash added.

"There certainly are workloads for which they are insufficient, and you'll have to pay money for a product that will do the job for you," he said. "But if you want to get more value out of your data, these free products could be a great place to start."
25th January 2010, 02:40 AM   #3
Arthur (New Member)
Open Source Data Warehouses?

Open source column-oriented DBMS based on MySQL. ... and Talend Announce Integrated Open Source Solutions for Data Warehousing.









20th July 2010, 04:56 PM   #4
albertdeny (Guest)

I think this is great stuff, especially for those who have not yet joined the data analysis bandwagon but, ironically, also for those who joined it with much fanfare.


All times are GMT +11.

© The Business Intelligence Group
