Data profiling in Data Quality Services happens at the following stages:
1) While performing the Knowledge Discovery activity:
1a. In the Discover step:
1b. In the Manage Domain Values step:
Profiling gives you statistics at various stages of the data cleansing or matching process, so it is important to understand what you can do with them. Here are the statistics the profiler provides during the Knowledge Discovery activity:
- Newness
- Uniqueness
- Validity
- Completeness
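To make these four statistics concrete, here is a minimal Python sketch that computes rough equivalents for a single column. The formulas are my own assumptions about what each statistic means (share of non-empty values, share of distinct values, share of distinct values not yet in the knowledge base, share of values passing a rule); they are not taken from the DQS implementation, and `profile_column`, `known_values` and `is_valid` are hypothetical names used only for illustration.

```python
def profile_column(values, known_values, is_valid):
    """Return rough profiling ratios (0.0 to 1.0) for one column of source values.

    values       -- list of raw source values (None or blank means missing)
    known_values -- values already in the knowledge base (assumption)
    is_valid     -- callable standing in for a domain rule (assumption)
    """
    total = len(values)
    # Completeness: how many values are actually present.
    present = [v for v in values if v is not None and str(v).strip() != ""]
    # Uniqueness: distinct values relative to the values that are present.
    distinct = set(present)
    # Newness: distinct values the knowledge base has not seen before.
    new = distinct - set(known_values)
    # Validity: present values that pass the stand-in domain rule.
    valid = [v for v in present if is_valid(v)]
    return {
        "completeness": len(present) / total if total else 0.0,
        "uniqueness": len(distinct) / len(present) if present else 0.0,
        "newness": len(new) / len(distinct) if distinct else 0.0,
        "validity": len(valid) / len(present) if present else 0.0,
    }


sample = ["US", "us", "UK", None, "US", "Germany"]
stats = profile_column(sample, known_values={"US", "UK"},
                       is_valid=lambda v: len(v) == 2)
print(stats)
```

A low completeness figure points at missing source data, while a high newness figure suggests the knowledge base has not yet seen much of this data and needs more discovery runs.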
2) While performing the Cleansing activity:
2a. In the Cleanse step:
2b. In the Manage and View Results step:
Here the profiler gives you the following statistics:
- Corrected values
- Suggested values
- Completeness
- Accuracy
Note the invalid records under “Source Statistics” on the left side. In this case, three records did not pass the domain rule.
3) While performing the Matching Policy activity (Knowledge Base Management):
3a. In the Matching Policy step:
3b. In the Matching Results step:
Here the profiler gives you the following statistics:
- Newness
- Uniqueness
- Number of clusters
- Percentage of matched and unmatched records
- Average, minimum and maximum cluster size
4) While performing the Matching activity (Data Quality Project):
4a. In the Matching step:
4b. In the Export step:
Here the profiler gives you the following statistics:
- Newness
- Uniqueness
- Completeness
- Number of clusters
- Percentage of matched and unmatched records
- Average, minimum and maximum cluster size
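The cluster-related statistics in the two matching lists above can be illustrated with a short Python sketch. It assumes each record carries a cluster identifier after matching, with `None` meaning the record matched nothing; that representation and the function name `matching_stats` are hypothetical and are not how DQS stores or exposes its results.

```python
from collections import Counter


def matching_stats(cluster_ids):
    """Summarise matching results.

    cluster_ids -- one entry per source record; None marks an unmatched
                   record (an assumption made for this sketch).
    """
    total = len(cluster_ids)
    # Count how many records fall into each cluster.
    sizes = Counter(c for c in cluster_ids if c is not None)
    matched = sum(sizes.values())
    size_list = list(sizes.values())
    return {
        "clusters": len(sizes),
        "matched_pct": 100.0 * matched / total if total else 0.0,
        "unmatched_pct": 100.0 * (total - matched) / total if total else 0.0,
        "avg_cluster_size": matched / len(sizes) if sizes else 0.0,
        "min_cluster_size": min(size_list) if size_list else 0,
        "max_cluster_size": max(size_list) if size_list else 0,
    }


# Eight records: two clusters (sizes 3 and 2) and three unmatched records.
print(matching_stats([1, 1, 1, 2, 2, None, None, None]))
```

A very large maximum cluster size relative to the average is often a hint that a matching rule is too loose and is pulling unrelated records together.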
Conclusion:
In this post, I listed the statistics provided by the profiler during the Knowledge Discovery, Cleansing, Matching Policy, and Matching activities in SQL Server 2012 Data Quality Services.