| |
| ||||||
Managing your history dataThis is a discussion on Managing your history data within the BI Monkey forums, part of the CORTEX Blogs category; This post is to an extent a small rant about some design decisions I have been constrained by on my current project. These decisions were made predominantly for*one fairly bad ... |
![]() |
| | LinkBack | Thread Tools | Search this Thread | Display Modes |
| | #1 |
| Guru Join Date: Jun 2009
Posts: 122
![]() | This post is to an extent a small rant about some design decisions I have been constrained by on my current project. These decisions were made predominantly for*one fairly bad reason: it made the architect’s life easier (apologies to the architects if they are reading – but these were bad choices!) The design choices in question are around the managing of history data. In one component of the system it relates to Database storage design, the other relates to Cube storage design. In both cases the history data is stored in a separate location to the “current” data. Databases: Why separate history*tables are*a bad idea The first – and most compelling – reason for not storing your history data in*separate tables to your current*tables is that it increases complexity for users. Instead of having one location to look for data, your users now have to use two. The second compelling reason is that there is no point to doing this from a storage point of view. SQL 2005 & 2008 (Enterprise*editions only, admittedly) provide partitioning. This enables the contents of an individual*table to be stored in separate locations on different filegroups. This means that you can store your current days data in one location and your history in a different one. The reason for doing this*is the same as splitting it into separate tables – that querying the current section will be faster than the historic section. *In theory*queries against partitioned tables*should in fact be faster as the current data is now no longer in the same filegroup as the history data. Now, there is an overhead associated with designing and maintaining partitions but I don’t see that it is significantly larger than that required to*deal with the process required to archive data into separate tables*on a daily basis. Additionally when maintaining separate history tables, you need to separate out every single table, whether it gets 10 rows a day or 10 million. With partitioning you can just target the large tables that need that focus. There are*other downsides to maintaining separate*tables. If you make a*change to a table design, you need to do it in 2 places. *You also need to remember to update your history processes. If your history process fails, you can end up with users getting unexpected query results or ETL process failures when the system loads the next day’s data into the current tables – *and untangling it becomes a real mess. If your partition processes fail to run, you just have too much data in one filegroup for a while – unlikely to be*fatal. So if you have large tables you need to split out for performance purposes – do it at the back end, using the power of the database – which is designed to store data efficiently. Keep it away from the users – they neither need to know or care about your need to keep the data separate. If you want to give them a single object to query with the current day’s data, just use views. Cubes: Why a separate history*Cube is a bad idea Much of the above applies here – SSAS also has partitions – so you can again store your historic and current data in separate physical locations with the users being totally unaware of this. Again there is overhead in maintenance, but this will also balance out with the maintenance and risks associated with maintaining two identical cubes that only differ in terms of data source. Use your storage options! So without banging on about the same things any further, please consider the following two points whenever considering managing your history data:
Get More from the original blog... |
| | |
![]() |
| Bookmarks |
| Thread Tools | Search this Thread |
| Display Modes | |
| |
Similar Threads | ||||
| Thread | Thread Starter | Forum | Replies | Last Post |
| Cartographies of Time: A History of the Timeline | admin | Analytic News Feeds | 0 | 20th April 2010 04:29 AM |
| Christian Bartens, Managing Director of Datalicious, new chair of ADMA Data & Analyti | admin | Presentation News Feeds | 0 | 14th March 2010 06:00 PM |
| IBM Announces New Software for Managing Data Centers | Latest News Headlines | IBM and Cognos Forum | 0 | 20th October 2009 04:06 PM |
| Interactive Data Promotes Liz Duggan to Managing Director, Evaluations | Latest News Headlines | 2009 Q4 News Headlines | 0 | 6th October 2009 06:57 AM |
| Excellent video on the History of BI | James Beresford | BI Monkey | 3 | 1st July 2009 12:58 PM |
| | |
| | |