| |
| ||||||
Replacing multiple characters in a Text String in SSISThis is a discussion on Replacing multiple characters in a Text String in SSIS within the BI Monkey forums, part of the CORTEX Blogs category; Today as part of some ongoing painful text file imports, I had to clean up some phone numbers that had been entered into a free text field. Consequently the data ... |
![]() |
| | LinkBack | Thread Tools | Search this Thread | Display Modes |
| | #1 |
| Guru Join Date: Jun 2009
Posts: 135
![]() | Today as part of some ongoing painful text file imports, I had to clean up some phone numbers that had been entered into a free text field. Consequently the data contained this kind of garbage data: (03) 8765 4321 0387654321 03-87654321 - Bob’s mums number (03)87654321 - or use 04 87654321 during day Which didn’t lend itself to being fixed in a Derived Column, as every possible replace character would have to be hardcoded, like so: REPLACE(REPLACE([PhoneNumberText],”(”,”"),”(”,”") - and that’s just to clear the brackets. Doing this for evey possible stray character was out of the question. What I wanted was just to get rid of any character which wasn’t a numeric. My mind immediately leapt to Regular Expressions, but couldn’t see how to do these easily in T-SQL against the source or at all in a standard SSIS component. Fortunately Brian Knight has a solution - using Regular Expressions within a script task. So, quoting him almost verbatim: In the Imports area, add the following line of code: Imports System.Text.RegularExpressionsNext, replace the Input0_ProcessInputRow subroutine with the following code: Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)Which worked beautifully and saved me writing some truly awful code. However it does lead me to gripe about an absent feature of SSIS that bugs me. The limited number of functions and inability to create custom functions to use in the Derived Column transformation means it is easy to fall back on to the Script Component to do your dirty work. This annoys me because a) i’m not a VB programmer and b) it reduces the transparency of what is going on in the data flow. Fingers crossed this will improve in 2011 - i’ve added feedback on Connect to request this - please vote it up if you consider it important. More... |
| | |
![]() |
| Bookmarks |
| Thread Tools | Search this Thread |
| Display Modes | |
| |
Similar Threads | ||||
| Thread | Thread Starter | Forum | Replies | Last Post |
| Debugging SSIS ScriptTasks and ScriptComponents with Information Messages | Darren Gosbell | Random Procrastination | 0 | 23rd June 2009 07:30 PM |
| Teragram unveils Sentiment Analysis Manager at Text Analytics Summit* | Latest News Headlines | SAS Forum | 0 | 3rd June 2009 12:54 AM |
| How to split Cognos report (PDF) into multiple pdf's | bujji | IBM and Cognos Forum | 0 | 28th February 2009 01:00 AM |
| | |
| | |