One of the ongoing challenges with SSIS is its difficulty in handling complex or damaged text files. One approach to dealing with such files is to
bring them all in as one*wide text column and then split them using code. Sometimes, the file is too wide for that approach, so below is an extension of that method where you import the column as a text stream (DT_TEXT, or Unicode DT_NTEXT) and then split the text stream in a script transformation:
******* ‘ Declare variables
******* Dim TextStream As Byte()*********** ‘ To hold Text Stream
******* Dim TextStreamAsString As String*** ‘ To Hold Text Stream converted to String
******* Dim StringArray() As String******** ‘ To contain split Text Stream
******* ‘ Load Text Stream into variable
******* TextStream = Row.TextStreamColumn.GetBlobData(0, CInt(Row.Column0.Length))
******* ‘ Convert Text Stream to string
******* TextStreamAsString = System.Text.Encoding.ASCII.GetString(TextStream)
******* ‘ Split string into array and output
******* StringArray = TextStreamAsString.Split(“#”) *******
******* Row.Column1 = StringArray(1).ToString
******* Row.Column2 = StringArray(2).ToString
******* Row.Column3 = StringArray(3).ToString**
An important thing to note*is that in the step where the Text Stream is converted to a string, the Encoding will depend on the type of text stream you are bringing in – Unicode files will need*”Unicode” instead of “ASCII”. Also I have used a hash (”#”) as the*column delimiter but that value will vary depending on what type of file you are bringing in.
Get More from the original blog...