
Last year, at about this time of the year, I was deeply involved in writing the book
"Pentaho Solutions: Business Intelligence and Data Warehousing with Pentaho and MySQL" for
Wiley. To date, "Pentaho Solutions" is still the only all-round book on the open source
Pentaho Business Intelligence suite.
It was an extremely interesting project to participate in, full of new experiences. Although the act of writing was time-consuming and at times very trying for me as well as my family, it was completely worth it. I have nothing but happy memories of the collaboration with my full co-author
Jos van Dongen, our technical editors Jens Bleuel, Jeroen Kuiper,
Tom Barber and
Thomas Morgner, several of the Pentaho developers, and last but not least, the team at Wiley, in particular Robert Elliott and Sara Shlaer.
When the book was finally published, late August 2009, I was very proud - as a matter of fact, I still am :) Both Jos and I have been rewarded with a lot of positive feedback, and so far, book sales are meeting the publisher's expectations. We've had mostly positive
reviews on places like Amazon, and
elsewhere on the web. I'd like to use this opportunity to thank everybody who took the time to review the book: Thank you all - it is very rewarding to get this kind of feedback, and I appreciate it enormously that you all took the time to spread the word. Beer is on me next time we meet :)
Announcing "Pentaho Kettle Solutions" 
In the early autumn of 2009, just a month after "Pentaho Solutions" was published, Wiley contacted Jos and me to find out if we were interested in writing a more specialized book on
ETL and data integration using Pentaho. I felt honoured, and took the fact that Wiley, an experienced and renowned publisher in the field of data warehousing and business intelligence, voiced interest in another Pentaho book by Jos and me as a token of confidence and encouragement that I value greatly. (For Pentaho Solutions, we heard that Wiley was interested, but we contacted them.) At the same time, I admit I had my share of doubts, having the memories of what it took to write Pentaho Solutions still fresh in my mind.
As it happens, Jos and I both attended the 2009 Pentaho Community Meeting, and there we seized the opportunity to talk to
Matt Casters, chief of Pentaho Data Integration and founding developer of
Kettle (a.k.a. Pentaho Data Integration). Neither Jos nor I expected Matt to be able to free up any time in his ever-busy schedule to help us write the new book. Needless to say, he made us both very happy when he rather liked the idea, and expressed immediate interest in becoming a full co-author!
Together, the three of us made a detailed outline and wrote a formal proposal for Wiley. Our proposal was accepted in December 2009, and we have been writing since. The tentative title of the book is
"Pentaho Kettle Solutions: Building Open Source ETL Solutions with Pentaho Data Integration". It is scheduled for publication in September 2010, and it will have approximately 750 pages.
Our working copy of the outline is quite detailed but may still change in the future, which is why I won't publish it here until we have finished our first draft of the book. I am 99% confident that the top level of the outline is stable, and I have no reservations about releasing that already:
- Part I: Getting Started
  - ETL Primer
  - Kettle Concepts
  - Installation and Configuration
  - Sample ETL Solution
- Part II: ETL Subsystems
  - Overview of the 34 Subsystems of ETL
  - Data Extraction
  - Cleansing and Conforming
  - Handling Dimension Tables
  - Fact Tables
  - Loading OLAP Cubes
- Part III: Management and Deployment
  - Testing and Debugging
  - Scheduling and Monitoring
  - Versioning and Migration
  - Lineage and Auditing
  - Securing your Environment
  - Documenting
- Part IV: Performance and Scalability
  - Performance Tuning
  - Parallelization and Partitioning
  - Dynamic Clustering in the Cloud
  - Realtime and Streaming Data
- Part V: Integrating and Extending Kettle
  - Pentaho BI Integration
  - Third-party Kettle Integration
  - Extending Kettle
- Part VI: Advanced Topics
  - Web Services and Web APIs
  - Complex File Handling
  - Data Vault Management
  - Working with ERP Systems
Feel free to ask me any questions about this new book. If you're interested, stay tuned - I will probably be posting 2 or 3 updates as we go.