In my current work I looking for and matching duplicates in two separate databases on a regular basis, so I need to filter both the source and the reference table so that they only include records that have not been matched previously. One simple way to achieve this in SSIS is to create a new table in the reference database that only includes unmatched records:
SELECT *
INTO DBName.dbo.UnmatchedRecords
FROM DBName.dbo.AllRecords
WHERE AllRecords.MatchIdField IS NULL
Then when I select the reference table in the fuzzy lookup component, I can choose the UnmatchedRecords table, instead of the original base table.
Of course, you need to ensure that the Unmatched table is kept up to date when you add new Match Ids to the original table, but this work-around ensures that the result set output by SSIS does not include already matched records.
No comments:
Post a Comment