Related Posts
This is a discussion on Related Posts within the Open Source Analytics forums, part of the Vendors and Service Providers category.
| | #1 |
| Member Join Date: Oct 2008
Posts: 26
| Open source is also being discussed in other forums. Post links to those discussions here. |
| | |
| | #3 |
| Member | Pentaho and Amazon.com deliver BI to the cloud
Companies will be able to 'rent' Pentaho Version 3.0 via Amazon's EC2.
Eric Lai 24/03/2009 08:59:00
Open-source business intelligence application Pentaho is joining the roster of applications available via Amazon.com Inc.'s EC2 Web hosting service. Companies will be able to "rent" the new release of Pentaho, Version 3.0, via EC2. That arrangement should lower the upfront start-up costs of using Pentaho -- though those costs were already low for its on-site version, according to Lance Walter, vice president of marketing at Orlando-based Pentaho Corp. The on-site version of Pentaho's open-source software can be used for free, though many business customers subscribe to Pentaho support.
Pentaho is not the first "BI as a service" provider. Cambridge, Mass.-based start-up Good Data Corp. began testing a cloud-based BI service last fall. Good Data's offering is also on EC2. Other enterprise applications available via EC2 include open-source ERP application Compiere, application servers such as JBoss, and a plethora of databases, including Oracle, MySQL and Microsoft Corp.'s SQL Server.
Other new features in Pentaho 3.0 include redesigned dashboards that incorporate Adobe Flash technology for better visuals and are now easy enough for most business end users to build themselves, said Walter.
Pentaho and JasperSoft Corp. are the two most popular vendors of open-source BI offerings. JasperSoft is backed by Linux vendor Red Hat Inc. Pentaho's open-source community includes 40,000 registered members and, according to Walter, "hundreds of active contributors." |
| | |
| | #4 |
| Guru Join Date: Oct 2007
Posts: 101
| From Techworld: Greenplum touts super-quick data loading
Database "fastest in the industry"
Tom Jowitt (Techworld) 18/03/2009
Greenplum has released new technology which it says can speed the loading of data into large-scale databases, without compromising overall performance. San Mateo, California-based Greenplum provides a high-performance database (DBMS) typically used in data warehousing and large-scale analytical processing (or business intelligence) applications. It powers the Sun Data Warehouse Appliance, and customers include the likes of LinkedIn, Nasdaq, NYSE Euronext, Fox Interactive Media, and MySpace.
Data loading is rapidly becoming an issue for companies increasingly facing exponential data growth. "For many companies data loading is a bottleneck," said Ben Werther, director of product marketing at Greenplum. "Data loading is traditionally done at night, but more data and longer loading cycles sometimes mean this extends into the working day."
"The amount of data is growing on a daily or weekly basis," said Paul Salazar, VP of corporate marketing. "Companies are seeking to gain competitive advantage from analysing the data they capture and they are also choosing to store more data about specific events." Salazar said that if customers can gain field intelligence quickly, by shortening data loading times to a couple of hours instead of overnight or longer, then there is a definite competitive advantage to be had.
To this end, Greenplum has introduced technology it is calling 'MPP Scatter/Gather Streaming' (or SG Streaming for short). SG Streaming technology is available immediately with the Greenplum Database. It is included at no extra charge to Greenplum customers, and the company says it eliminates the bottlenecks associated with other approaches to data loading. Indeed, Greenplum cites customers that are achieving production loading speeds of over 4TB per hour.
"The loading capabilities of this database are remarkable," said Brian Dolan, director of research analytics at Fox Interactive Media. "We're loading at rates of four terabytes an hour, consistently."
"This is definitely the fastest in the industry," said Greenplum's Werther. "Netezza for example quotes 500GB an hour, and we have not seen anyone doing more than 1TB an hour."
According to Werther, Greenplum utilises a "parallel-everywhere" approach to loading in which data flows from one or more source systems to every node of the database without any sequential choke points. This differs from traditional "bulk loading" technologies, used by most mainstream database and MPP appliance vendors, which push data from a single source, often over a single channel or a small number of parallel channels, and result in fundamental bottlenecks and ever-increasing load times. Greenplum's approach also avoids the need for a "loader" tier of servers, as required by some other MPP database vendors.
The SG Streaming technology ensures parallelism by "scattering" data from all source systems across hundreds or thousands of parallel streams that simultaneously flow to all nodes of the Greenplum Database. Performance scales with the number of Greenplum Database nodes, and the technology supports both large batch and continuous near-real-time loading patterns with negligible impact on concurrent database operations. Another useful feature is that the data can be transformed and processed in-flight, utilising all nodes of the database in parallel, for extremely high-performance ELT (extract-load-transform) and ETLT (extract-transform-load-transform) loading pipelines.
Of course, this means that Greenplum competes against hardware-based players like NCR's Teradata and Netezza, as well as other mainstream players such as Oracle. But Greenplum says that its ability to utilise off-the-shelf servers, storage, and networking means that customers are not tied into any particular hardware configuration, and instead are offered cost-effective scaling on commodity hardware.
Greenplum launched version 3.2 of its database software back in September last year. Greenplum Database 3.2 was the first database to include MapReduce, a parallel computing technique pioneered by Google for analysing the web, which boosted the data analytics capabilities of the new DBMS. |
| | |
| | #5 | ||
| Member Join Date: Oct 2008
Posts: 26
| Quote:
Quote:
| ||
| | |
| | #6 | |
| Administrator | Cross post from latest news (whole article is here). Quote:
| |
| | |