| |
| ||||||
SQL Server Data Quality Services & SSISThis is a discussion on SQL Server Data Quality Services & SSIS within the BI Monkey forums, part of the CORTEX Blogs category; So far in my posts on SSDQS we’ve looked at the Data Quality Services Client and building SSDQS Knowledge Bases. Now in practice when handling bulk data a need to ... |
![]() |
| | LinkBack | Thread Tools | Search this Thread | Display Modes |
| | #1 |
| Guru Join Date: Jun 2009
Posts: 135
![]() | So far in my posts on SSDQS we’ve looked at the Data Quality Services Client and building SSDQS Knowledge Bases. Now in practice when handling bulk data a need to reference this in routine loads is needed, and to nobody’s surprise, SSIS is the tool for the job. The DQS Cleansing Component So, in our (shiny, new) SSIS Toolbox we have a new component to connect to DQS – the DQS Cleansing Component: SSIS DQS Cleansing ComponentThe DQS cleansing component pushes a data flow to the DQS Engine for validation. This requires a special Connection Manager, the DQS Cleansing Connection Manager, which as we can see below is a simple creature: SSIS DQS Cleansing Connection ManagerThe sole option at this point is to choose which DQS Server to point at. So, lets look at what we get in the SSIS Component once we use the Connection Manager: SSIS DQS Cleansing Component Connection Manager optionsOnce again – still nice and simple – choosing your Connection Manager allows you to then pick from a list of Published Knowledge Bases. Once a KB is selected, a list of the available Domains is populated, though there is nothing you can do with this list other than review it. So next we move to the Mapping tab: SSIS DQS Cleansing Component Mapping TabThe usual suspects are there – pick your input columns in the top half of the tab and they become available for mapping in the lower half. Each input column can be mapped to a single Domain (I can’t quite see how Composite Domains work in this context). You then get three output streams – the Output, Corrected Output and Status Output. The Output is just the column passed through, Corrected is the column value corrected by the DQS Engine and the Status is the record status (which comes out as Correct, Corrected or Unknown which corresponds to the DQS Data Quality Project statuses. In the Advanced Editor you can also switch on Confidence and and Reason Outputs, which relate to matching projects. Note that there is only a single output for the DQS Cleansing Component – if you want to send OK, Error and Invalid records to different locations, you will need to do so with a downstream Conditional Split component. Summary So we’ve had a quick look at the basics of automating DQS activities using SSIS, and how SSIS plugs in to the DQS Server. Subsequent posts will start digging into some practical implementation including performance. Some further reading can be found here:
Get More from the original blog... |
| | |
| | #2 |
| Guest Location: London
Posts: n/a
| hi,, I am new user this post |
|
| | #3 |
| Guest
Posts: n/a
| |
|
![]() |
| Bookmarks |
| Thread Tools | Search this Thread |
| Display Modes | |
| |
Similar Threads | ||||
| Thread | Thread Starter | Forum | Replies | Last Post |
| SQL Server Data Quality Services ? Composite Domains | James Beresford | BI Monkey | 0 | 23rd August 2011 03:34 PM |
| SQL Server Data Quality Services ? Domain Management | James Beresford | BI Monkey | 0 | 3rd August 2011 05:12 PM |
| SQL Server Data Quality Services – Creating a Knowledge Base | James Beresford | BI Monkey | 0 | 2nd August 2011 10:41 AM |
| SQL Server Data Quality Services ? First Look | James Beresford | BI Monkey | 0 | 28th July 2011 10:57 AM |
| Installing SQL Server Data Quality Services (Denali CTP3) | James Beresford | BI Monkey | 0 | 26th July 2011 06:12 PM |
| | |
| | |