Submitted by
Category
Review Cycle
.
Public
Joachim Mutter/sysarc
on
01/02/2009
at
03:29 PM
SSiS\Code
Lookup Task replacement
Problem:
There may be cases, where you cannot use the lookup task.
I.e. during the package runtime, there where insert or deletes to the lookup table, but the caching mechanism
of the Lookup task could not be switched of, so there is no chance to use the quick caching stuff in conjunction
with dynamic changes of the lookup table. But with the following script task, you will be able to do this with a good
performance.
The following approach has been inspired by an article from Sashin Dedhia
SSIS Lookup using a Script Component
.
Control Flow
The two dataflow tasks and the Recordset variable, which contains the filled recordset from the dataflow task "Fill Recordset"
Data Flow Task "Fill Recordset"
Generation of the recordset for the lookup.
Here we use another trick. We donut use a Table or a SQL statement, instead we use a Stored Procedure with a
temporary table. The trick for bypass the missing meta data of the OLE DB Source is, to create a statement with
an appropriate Query for returning the meta data, which could be used by the OLE DB Source.
A really good article is
SSiS: Stored procedures using TempTables
. where Michael Cage pointed out how to build the
Stored Procedure. References for this proposal are Blog Entries from
Jami Thomson
SSIS: Using stored procedures inside an OLE DB Source component
states that
a stored procedure
doesn't have any meta data for the OLE DB provider to retrieve.
Adam Machanic's BLOG entry,
Stored procedures are not parameterized views
states that stored procedures don't provide
an output contract
But Michael found a way to do this. First he explained the usage of
SET
FMTONL
Y
OFF
;
but this is not
a good way, because this statement is responsible for executing the Stored Procedure 5 times by the OLEDB Source
But the second example with a NoOp Select statement shows the real way
Code
Data Flow Task "Lookup Dataflow"
If use the same table from Adventure Works for the input, but append a new row with a non existing ContractID to test that stuff
As you can see, ExistingRow Output contains the 19972 rows, which come from the input source and MissingRows contains only on row,
the one we added as non existing rows to the dataflow at the start of the task.
To do this, we work with 2 additional output buffers in the Script Task "Lookup".
Buffer ExistingRows the input row is copied, if it is an existing row in the Lookup recordset
Buffer MissingRows contains only the missing ContactID and a comment, which could be logged
Additionally we add a missing row into the internal Recordset, so if this ContractID comes again, the lookup will be positive
by creating a new DataRow and add this to the Datatable object.
Code
DTSX pacakge
Zur SysArc Homepage ...