How to use TSQL checksum to compare data between two tables?

In any BI project, data validation plays an important part. You want to make sure that the data is right! Usually the business helps with this validation, but as a developer you might also want to do some preliminary data validation yourself. One of the techniques that I’ve learned recently is to use TSQL checksum to compare data between two tables. In this post, I’ll describe the technique and share some pseudocode.

We’ll write pseudocode that compares all columns, but you should be able to tweak it to compare only the columns you need.

1) Run checksum(*) on Tables:

On Table1:

select checksum(*) as chktb1 from table1
go

On Table2:

select checksum(*) as chktb2 from table2
go

At this point, you should get two result sets, each populated with checksum values computed over all columns, since you passed * to the checksum function.
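
If you only care about some of the columns, checksum also accepts an explicit column list instead of *. Here is a minimal sketch of that tweak; columnname1, columnname2 and someid are placeholder names, so substitute your own:

```sql
-- Checksum over selected columns only (placeholder column names).
-- someid is returned next to the checksum so the rows can be matched up later.
select someid, checksum(columnname1, columnname2) as chktb1
from table1;

select someid, checksum(columnname1, columnname2) as chktb2
from table2;
```

Keep the column list and its order the same on both sides, otherwise the two checksums are not comparable.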

2) Now let’s join these tables and look at the rows with different checksums (in other words, this lists all rows that differ between table1 and table2):

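select * from
(
select someid, checksum(*) as chktb1 from table1 /* someid must be selected so the join can use it */
) as tb1
left join
(
select someid, checksum(*) as chktb2 from table2
) as tb2
on tb1.someid=tb2.someid /* you can have more ids */
where tb1.chktb1 != tb2.chktb2 /* rows with no match in table2 have a null chktb2 and are filtered out here */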
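
One thing to watch: because the where clause compares the two checksums, a row that exists in only one of the tables never shows up in the result. If you also want those rows, one possible variation (a sketch, not the only way to do it) is a full outer join with explicit null checks. It assumes someid is a non-nullable key in both tables:

```sql
-- Lists rows that changed, plus rows that exist in only one of the two tables.
select coalesce(tb1.someid, tb2.someid) as someid,
       tb1.chktb1,
       tb2.chktb2
from
(
    select someid, checksum(*) as chktb1 from table1
) as tb1
full outer join
(
    select someid, checksum(*) as chktb2 from table2
) as tb2
on tb1.someid = tb2.someid
where tb1.someid is null            -- row exists only in table2
   or tb2.someid is null            -- row exists only in table1
   or tb1.chktb1 <> tb2.chktb2;     -- row exists in both but the checksums differ
```

Keep in mind that checksum can occasionally return the same value for two different rows, so treat a clean result as a quick preliminary check rather than proof that the tables match.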

3) You can now add individual columns to see what changed:

select * from
(
select someid, checksum(*) as chktb1, columnname1, columnname2 from table1 /* someid is needed for the join */
) as tb1
left join
(
select someid, checksum(*) as chktb2, columnname1, columnname2 from table2
) as tb2
on tb1.someid=tb2.someid
where tb1.chktb1 != tb2.chktb2
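
If you want to try the three steps end to end before pointing them at real tables, here is a small self-contained sketch. The temp tables #table1 and #table2, their columns, and the sample rows are all made up for illustration:

```sql
-- Two throwaway temp tables with the same structure (hypothetical sample data).
create table #table1 (someid int primary key, columnname1 varchar(20), columnname2 int);
create table #table2 (someid int primary key, columnname1 varchar(20), columnname2 int);

insert into #table1 values (1, 'apple', 10), (2, 'banana', 20), (3, 'cherry', 30);
insert into #table2 values (1, 'apple', 10), (2, 'banana', 25), (3, 'cherry', 30);

-- Show both versions of every row whose checksum differs between the tables.
select tb1.someid,
       tb1.columnname1 as tb1_columnname1, tb2.columnname1 as tb2_columnname1,
       tb1.columnname2 as tb1_columnname2, tb2.columnname2 as tb2_columnname2
from
(
    select someid, checksum(*) as chktb1, columnname1, columnname2 from #table1
) as tb1
join
(
    select someid, checksum(*) as chktb2, columnname1, columnname2 from #table2
) as tb2
on tb1.someid = tb2.someid
where tb1.chktb1 <> tb2.chktb2;

-- Clean up.
drop table #table1;
drop table #table2;
```

In this sample data the only row that differs between the two tables is someid = 2 (columnname2 is 20 in one table and 25 in the other), so that is the row the query is meant to surface.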

Conclusion:
I hope this helps, especially if you don’t have rights to install third-party tools on your dev machine.