Problem:
if you’ve a sizable number of rows that need update in SSIS, then you don’t want to do a row based update commands because it won’t be efficient. if you’ve good number of rows that need to be updated then you can use the SET Based updates. it’s a common design pattern for loading dimensions in a data warehouse.
Find the steps below:
Solution:
There are two main steps to achieve this:
1) Populate the “update” table with rows that have been changed. Note that a new table needs to be created.
2) Run the SQL command to do a SET based update
Let’s see each step in detail:
1) Populate the “update” table with rows that have changed.
For this step, first make sure that you have a table that can hold the rows that have been updated.
Then create a Data Flow that take the source data and lookups the data that has changed and puts it in an update/staging table:
Note: I’ve used a small table for demo purpose but you won’t use this method if you don’t have a more rows to update because as you can see this method adds an overhead of putting the data in the update table first.
2) Run the SQL command to do a SET based update
Here’s the sample query:
[code language=”sql”]
— run the update command
Update Dim
Set
[Column1]=Upd.[Column1]
[Column2]=Upd.[Column2]
— [Column3]=Upd.[Column3]
— …
From dbo.DimDestination Dim
Inner Join dbo.Destination_update Upd
on Dim.Destination_sk=Upd.Destination_sk
–Truncate update table
Truncate table dbo.Destination_update
[/code]
Conclusion:
In this post, we saw how to write a package in SSIS that does SET based updates.