0
votes

I am new to SSIS. My SSIS package contains:

1.Extract Data from Excel file (Excel Source Plus Component)
2.Added new Columns (Derived column Transformation)
3.Store Data into SQL Server (OLE DB Destination)

My Requirement if more than one row exists for same ID and username in the excel file, this SSIS package should retrieve the rows which contains duplicate rows and store the entire row into the another table and stop the package execution before the derived column transformation.
If there is no duplicate values exists for key column , then only the execution should proceed. My table structure is like below:

ID username Name Address EmailID PhoneNo
----------------------------------------
1  ABC      MNO   ASDA   asd     345
1  ABC      XYZ   ASD    DF      678
2  SDF      MNO   sdF    sdf     2323

The above table should be stopped before the derived column transformation and store the 2 rows (ID:1,2) in another table.

  1. I tried with aggregate transformation but in that I don't want to apply group by clause to remaining non-key columns.If I didn't apply any columns to remaining columns ,they are not coming in the output.
  2. I tried Lookup column and multi join, but I am not aware of self join.

Help me to resolve this problem.Thank You.

1

1 Answers

0
votes

Import all the rows into a new staging table. Then issue a SQL command that calls a stored procedure that directs the rows to the appropriate destination tables and then truncates the staging table for the next run.

This will work MUCH faster than anything you can do during the dataflow process.