Go Back   CORTEX Forums > Local Happenings > CORTEX Blogs > BI Monkey
Register Blogs FAQ Members List Calendar Search Today's Posts Mark Forums Read

The Sort Transformation

This is a discussion on The Sort Transformation within the BI Monkey forums, part of the CORTEX Blogs category; Fig 1: The Sort Transformation In this post I will be covering the Sort Transformation. The sample package can be found here for 2005 and guidelines on use are here ...


Reply
 
LinkBack Thread Tools Search this Thread Display Modes
Old 22nd July 2009, 10:03 PM   #1
Guru
 
Join Date: Jun 2009
Posts: 122
James Beresford is on a distinguished road
Thumbs up The Sort Transformation

Fig 1: The Sort Transformation


In this post I will be covering the Sort Transformation. The sample package can be found here for 2005 and guidelines on use are here.

What does the Sort Transformation do?

You will be unsurprised to know that it sorts data based on a specified key, with the additional option to remove duplicates. The Sort Transformation is a blocking component within the Data Flow - what this means is that it needs to consume all data fed to it before it will output any data - this can lead to performance issues if you have large amounts of data to sort, or many Sort components in your data flow.

Fig 2: Configuring the Sort Transformation


To configure the Sort Transformation, you simply select the keys that you want to sort the data set on, set the priority (Sort Order) in which you want the sort applied to the keys and the direction you want to sort them in (Sort Type). If needed you can change the name of the sort keys in the output (Output Alias). You can optionally drop columns from the input set (just uncheck the Pass Through checkbox) - though if you’re doing this, you should check if that column even needed to be passed there in the first place. Finally you can remove duplicates by checking the “Remove rows with duplicate sort values”. Be aware this will only check for duplicates on the keys, so if you have duplicate sort keys with different associated data you will only get one of those rows in the output. If sorting text data you may also need to set the Comparison Flags to manage how the component deals with certain special cases.

When would you use the Sort Transformation?

You would use it when you want to sort or deduplicate data! However, it’s always worth questioning if you really need to sort data - it makes it easier for developers to review but from a machine point of view, it often makes little difference. If you are using the Sort to deduplicate data, have a look at voting on Jamie Thompson’s suggestion for a DISTINCT component on Connect. However Sort and Distinct operations are best pushed back to a database if at all possible. If you are sorting large amounts of data and hitting performance problems, you may want to take a look at the custom NSort component - i’ve not used it but it may be worth a look if the default component is proving inadequate.

MSDN Documentation for the Sort Transformation can be found here for 2008 and here for 2005.



Get More from the original blog...
James Beresford is offline  
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiTweet this Post!
Reply With Quote
Reply

Bookmarks

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is On
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
The Merge Transformation James Beresford BI Monkey 0 20th July 2009 01:11 PM
The Multicast Transformation James Beresford BI Monkey 0 7th July 2009 05:48 AM
The Unpivot Transformation James Beresford BI Monkey 0 4th July 2009 02:20 PM
The Conditional Split Transformation James Beresford BI Monkey 0 23rd June 2009 09:34 PM
The Pivot Transformation James Beresford BI Monkey 0 23rd June 2009 09:34 PM


All times are GMT +11. The time now is 09:37 AM.

© The Business Intelligence Group

Search Engine Optimization by vBSEO