| |
| ||||||
The Derived Column TransformationThis is a discussion on The Derived Column Transformation within the BI Monkey forums, part of the CORTEX Blogs category; Fig 1: The Derived Column Transformation In this post I will be covering the Derived* Column Transformation. The sample package can be found here for 2005 and guidelines on use ... |
![]() |
| | LinkBack | Thread Tools | Search this Thread | Display Modes |
| | #1 |
| Guru Join Date: Jun 2009
Posts: 122
![]() | Fig 1: The Derived Column TransformationIn this post I will be covering the Derived* Column Transformation. The sample package can be found here for 2005 and guidelines on use are here. What does the Derived Column Transformation do? The Derived Column Transformation provides a means to change column data as it passes through the data flow. It uses the SSIS Expression Language to transform the data and allows you to replace an existing column’s value or create a new column which can use values from other columns or variables to create a new data item. Fig 2: Configuring the Derived Column TransformationIn the Derived Column editor, there are three panes to work with. The top left pane has two folders, one for Variables and one for Columns. The top right pane contains folders for the various functions available. The bottom pane is where you define the Derived Columns. Both the top panes have drag and drop functionality, so you can click and drag a function, column or variable into the Expression area of the Derived Column configuration pane. Setting up a Derived column is straightforward – if adding a new column, provide a name for it in the Derived Column Name column. If replacing the content of an existing column, select its name in the Derived Column dropdown, and the Derived Column Name will lock down to the name of the column being replaced – in the example above see the ListPrice column for an example. The next part is the Expression itself. For details of the language for the expression, see my posts on the* SSIS Expression Language. The options you have are basic expressions, such as ROUNDing numbers, YEAR functions for dates and SUBSTRING for strings. Two things that I will raise are NULL Functions and Type Casts. Because SSIS is so strict about data typing, if you evaluate or return a Null in your expression, you have to pull a Null from the Null functions list – for example if assessing if a Date value is null, you would need to use the Null function NULL (DT_DATE). Similarly Type Casts convert data to a specific type, so for example if you wanted to put a string source type value into a float column, you would have to convert it using the Type Cast (DT_R8) [Sting Column]. The Data Type, Length, Precision, Scale and Code pages are determined automatically from the type of data you are working with – note that you cannot change the data type of an existing column. The only room you really have to move here is to change string lengths and code pages, or precision and scale for numerics when you create a new column. In the sample package I show a few simple examples of Column manipulation, using column on column, variable on column, pure variable and pure column operations. What are the Derived Column Transformation’s limitations? My biggest problem with the Derived Column Transformation is that the function list is small, and worst of all, fixed. In a rare example of Cognos Data Manager being better than SSIS, Data Manager allows for the creation of custom functions that can then be re-used. SSIS offers no such flexibility, which means if you have complex operations that need to be done repeatedly (e.g. T******* & Nulling incoming strings to clean input data) you can’t create a custom function to simplify the operation and make the operation reuseable. I’ve added a Feature Suggestion on Connect to request an package level extensible function library – please vote for it if you agree this is a big hole in the component. One thing which trips up a few people is that you cannot use the result of one derived column in another derived column within the same transformation. The logic behind this is pretty simple – each column is treated as a separate independent item within the data flow and can only consume columns that are input to the transformation. A derived column is effectively an output and so cannot be referenced within the component it was created in. Lesser gripes are that the Editor is too small. There is no call out box like when setting component properties using expressions, so complex expressions quickly become difficult to read and debug. Add to that the only way to get the syntax error messages is to hover over the function that is invalid and try to read the note that appears for about 4 seconds, meaning the only real way to read long error messages is to hover over it with the mouse and do a screen capture. Where should you use the Derived Column Transformation? It is best used when you have to perform simple operations to change data values – for example T******* strings, simple IF statements and SUBSTRINGs. It also reproduces all of the functionality in the Audit and Data Conversion tasks, so if you are using those anywhere you may want to consider replacing them with a Derived Column. If your expressions are getting complicated or you repeat alot of operations, you may want to move these operations to the uglier but more powerful Script Component, which I will be covering soon. MSDN Documentation for the Derived Column Transformation can be found here for 2008 and here for 2005. If you are still struggling, try these additional resources:
Get More from the original blog... |
| | |
| | #2 |
| New Member Join Date: Jun 2009
Posts: 2
![]() | Interesting... seeing how "Trim"ming gets censored because of the sexual context of a substring of it... |
| | |
![]() |
| Bookmarks |
| Thread Tools | Search this Thread |
| Display Modes | |
| |
Similar Threads | ||||
| Thread | Thread Starter | Forum | Replies | Last Post |
| The Sort Transformation | James Beresford | BI Monkey | 0 | 22nd July 2009 10:03 PM |
| The Merge Transformation | James Beresford | BI Monkey | 0 | 20th July 2009 01:11 PM |
| The Unpivot Transformation | James Beresford | BI Monkey | 0 | 4th July 2009 02:20 PM |
| Flat File Source Error: The column delimiter for column [ColumnName] was not found | James Beresford | BI Monkey | 0 | 30th June 2009 12:21 PM |
| The Copy Column Transformation | James Beresford | BI Monkey | 0 | 25th June 2009 10:12 PM |
| | |
| | |