In SQL 2008 a new T-SQL construct was added -*the
MERGE operation. (Ok, pedants will know this wasn’t new to Oracle, *but it was new to SQL Server).
This operation allows for the merging of*a dataset into a reference dataset – which can be remarkably similar to Insert / Update operations*effected by*the
Slowly Changing Dimension transformation. However the way it operates is*very different. Instead of the SCD’s row by row evaluation approach, the MERGE operation is a set based operation. What this means is it compares the whole of the source dataset to the reference dataset in a single pass.*This has significant implications for performance – on a site where I implemented this the operation which took 1,200 seconds in the SCD cut down to 51 seconds using a Merge.
There are limitations and differences to be aware of:
- You cannot directly*return row counts for Insert / Update / Ignore operations in the Merge
- As it is*a bulk operation a single row will cause failure of the whole batch
- There’s no GUI – just hand crafted SQL
- Less error trapping / logging options
- More flexibility in terms of actions when matches / non matches are found
The main reason why you would consider the SQL Merge – it handles Type 1, and with a little cunning, Type 2 dimensions – in a fraction of the time it takes the SCD to plod through. It’s still not as fast as a proper in memory comparison using something such as
TableDifference*- but it’s always good to know you have something else available*in your toolkit.
Further information:
Get More from the original blog...