Data Quality Services’ Composite Domains in action!

In this post, I’ll show you how composite domains can help you create cross-domain rules in Data Quality Services (DQS).

Scenario:

You have a data set of employee name, employee category and yearly salary. You want to validate the value in the yearly salary column based on the employee category. Here are the business rules:

Note: For the purposes of this demo, all amounts are in dollars.

A rule in the table can be read like this:

If employee category is A then yearly salary should be greater than 100000 and less than 200000.

[Screenshot 1: composite domains in Data Quality Services]

Note: I have kept it simple for demo purposes.

Here is our data set before we set out to validate it:

Employee Name        Employee Category    Yearly Salary
Jon V Yang           A                    127000
Eugene L Huang       B                    90000
Ruben Torres         C                    83000
Christy Zhu          D                    70000
Elizabeth Johnson    A                    90000
Julio Ruiz           C                    65000
Janet G Alvarez      D                    43000
Marco Mehta          B                    81000

*Names are taken from the AdventureWorks database. The values in the name and salary columns are purely fictional.
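To make the idea of a cross-domain rule concrete, here is a minimal Python sketch that runs outside DQS. It encodes only the category A rule spelled out above (salary greater than 100000 and less than 200000). The rules for B, C and D live in the rules table and are not restated in the text, so the sketch reports "no rule" for them rather than guessing. The names `RULES`, `EMPLOYEES` and `check` are my own and purely illustrative.

```python
# Only the category A rule is stated in the text: 100000 < salary < 200000.
# Rules for B, C and D are intentionally omitted (not restated in the post).
RULES = {
    "A": lambda salary: 100000 < salary < 200000,
}

# The demo data set from the post: (name, category, yearly salary).
EMPLOYEES = [
    ("Jon V Yang", "A", 127000),
    ("Eugene L Huang", "B", 90000),
    ("Ruben Torres", "C", 83000),
    ("Christy Zhu", "D", 70000),
    ("Elizabeth Johnson", "A", 90000),
    ("Julio Ruiz", "C", 65000),
    ("Janet G Alvarez", "D", 43000),
    ("Marco Mehta", "B", 81000),
]


def check(category, salary):
    """Return 'valid', 'invalid', or 'no rule' for one record."""
    rule = RULES.get(category)
    if rule is None:
        return "no rule"
    return "valid" if rule(salary) else "invalid"


# Validate salary (one domain) based on category (another domain).
results = {name: check(category, salary) for name, category, salary in EMPLOYEES}

for name, status in results.items():
    print(f"{name}: {status}")
```

The point is that the salary check depends on the value in a different column, which is exactly what a single-domain rule cannot express and a composite domain can.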

Solution:

This is just an overview, not a step-by-step walkthrough:

1. Created a KB > created three domains: Employee Category, Employee Name and Yearly Salary

2. Created a composite domain:

[Screenshot 2: creating a composite domain in Data Quality Services]

3. Under Composite Domain (CD) Rules Tab:

I started out with defining the rules for category A:

[Screenshot 3: creating composite domain rules in Data Quality Services]

Then I finished by specifying the business rules for all four categories:

[Screenshot 4: creating composite domains, SQL Server 2012]

4. Published the KB

5. Created a new DQS project > selected the KB created above

6. Selected the data source > mapped the domains

7. I also selected the composite domain from the list of available composite domains:

[Screenshot 5: view/select composite domains in the data quality project]

8. After seeing the cleansing statistics, I switched to the Invalid tab to see the records that didn’t match the rules:

[Screenshot 6: composite domain results with the Invalid, New, Corrected and Correct tabs]

9. By now, we have identified the records that do not match the rules. A data steward can correct them or leave them as they are. Notice the Approve/Reject check boxes.

Note that you can update not only the yearly salary but also the employee category. So if you think an employee has been wrongly categorized, you can change that.

10. After this, you can export the data set containing the records that match the business rules, and it is ready to be consumed!
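As a rough illustration of what "export only the matching records" means, here is a small Python sketch that runs outside DQS. It is not the DQS export feature. The `Status` column, its values and the `export_matching` helper are hypothetical stand-ins for the outcome of the review in steps 8 and 9.

```python
import csv
import io

# Hypothetical review outcome: each row carries a Status set by the steward.
cleansed = [
    {"Employee Name": "Jon V Yang", "Employee Category": "A",
     "Yearly Salary": 127000, "Status": "Correct"},
    {"Employee Name": "Elizabeth Johnson", "Employee Category": "A",
     "Yearly Salary": 90000, "Status": "Invalid"},
]


def export_matching(rows, out):
    """Write rows whose Status is not 'Invalid' as CSV; return the row count."""
    fields = ["Employee Name", "Employee Category", "Yearly Salary"]
    # extrasaction="ignore" drops the Status column from the exported file.
    writer = csv.DictWriter(out, fieldnames=fields, extrasaction="ignore")
    writer.writeheader()
    kept = [row for row in rows if row["Status"] != "Invalid"]
    writer.writerows(kept)
    return len(kept)


# Export to an in-memory buffer; a real file handle would work the same way.
buffer = io.StringIO()
exported = export_matching(cleansed, buffer)
print(buffer.getvalue())
```

In this sketch, Elizabeth Johnson's record is left out because her salary of 90000 breaks the category A rule and it was left marked as invalid.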

Conclusion:

In this post, we saw how to create cross-domain rules using composite domains, with an example based on Employee Category and Yearly Salary.

 
