| |
| ||||||
SQL Server Data Quality Services – Creating a Knowledge BaseThis is a discussion on SQL Server Data Quality Services – Creating a Knowledge Base within the BI Monkey forums, part of the CORTEX Blogs category; So far we’ve opened up the client and taken a look at the areas for working with. In this post I’ll look at setting up a Knowledge Base from scratch ... |
![]() |
| | LinkBack | Thread Tools | Search this Thread | Display Modes |
| | #1 |
| Guru Join Date: Jun 2009
Posts: 135
![]() | So far we’ve opened up the client and taken a look at the areas for working with. In this post I’ll look at setting up a Knowledge Base from scratch using some sample data I’ve mocked up based on some Netflix catalogue data. Note all of the images are of near full screen so I’ve shrunk them in the post, just click on the image to see the full screen. Creating a Knowledge Base To create a new KB just select “New Knowledge Base” from the client front screen, and give it a name and description. You can either clone an existing one on already the server or import, but in this case I’m just going to start from scratch. There’s three possible modes in which you can create a new KB:
SSDQS Client: Create new Knowledge BaseAs I’m not attempting to do any matching, I’m using the Knowledge Discovery approach. (Note: because I’m using an Excel source I need Excel installed on the machine). Mapping Columns to Domains Once I’ve picked my data source, I need to map columns: Data Quality Client - Field MappingAt this stage, there is no facility to automatically create domains, so before I map the fields to a Domain, I need to create each of them: Data Quality Client - New DomainNow a (known) defect appears to be fields from Excel are automatically treated as text – so if they contain numbers you cannot map them to a Numeric domain. So I’ll skip over evaluating Year for now. Analysing and Managing the results Once all the mapping is done, click next and you can upload the file for analysis. Now my sample file is 5,000 records, and it took DQS a few minutes just to upload this – so big sets may take a while. Anyway, the output of this is an analysis of the data, displayed in a profiler screen: Data Quality Client - Discovery OutputThis breaks down the records by the following criteria:
Data Quality Client - Managing Domain ValuesHere we are reviewing the values found for Country, and can manage the values that come through, flagging them as Correct, Error or Invalid – and assign a corrected value to incorrect ones. I’m trying to see if there is a functional difference between Error and Invalid. As per documentation: The status of the value, as determined by the discovery process. You can change the type by clicking the down arrow and selecting a different type. A green check indicates that the value is correct or corrected; a red cross indicates that the value is in error; and an orange triangle with an exclamation point indicates that the value is not valid.Skipping over the semantics issues, what we see here is a list of the values that the DQS Client has found in the Knowledge Discovery analysis of the data. We can then flag these values to Invalid or Error as we see fit – or leave them at their default value of Correct. Once we have flagged them as not correct, it is then possible to enter the Correct value in the “Correct To” column. Handily, the client then groups your corrected values under the correct value in the list. The final thing is to click Next and Publish the Knowledge Base (i.e. store the results back on the DQS Server) Summary So in this post we have quickly reviewed the creation of* Knowledge Base through the Knowledge Discovery Mode. This has allowed us to create a set of values in our Knowledge Base using some sample data and then apply some corrections to those values, using a simple GUI to manage the results. In the next post I will look at working in more depth with this created Knowledge Base using the “Domain Management” mode. Get More from the original blog... |
| | |
![]() |
| Bookmarks |
| Thread Tools | Search this Thread |
| Display Modes | |
| |
Similar Threads | ||||
| Thread | Thread Starter | Forum | Replies | Last Post |
| SQL Server Data Quality Services ? First Look | James Beresford | BI Monkey | 0 | 28th July 2011 10:57 AM |
| Installing SQL Server Data Quality Services (Denali CTP3) | James Beresford | BI Monkey | 0 | 26th July 2011 06:12 PM |
| Organising Knowledge>> Blog: Data, Knowledge Organisation and Scientific Knowledge | Straits Knowledge | Green Chameleon | 0 | 20th April 2011 03:48 PM |
| New Data Visualizations in SQL Server Reporting Services 2008 R2 | Latest News Headlines | Microsoft News and Views | 0 | 10th March 2010 09:47 AM |
| Starter Kit for PerformancePoint Planning Server SQL Server Integration Services (SSI | Latest News Headlines | Microsoft News and Views | 0 | 25th November 2009 01:49 PM |
| | |
| | |