1
votes

Need to loading a flat file with an SSIS Package executed in a scheduled job in SQL Server 2016 but it's taking TOO MUCH TIME (like 2/3 hours) just to load data in source then it’s need extra (2/3 hours) time for sort and filter then need similar time to load data in target, the file just has like million rows and it’s not less than 3 GB file approximately. This is driving me crazy, because is affecting the performance of my server.

SSIS package: -My package is just a Data Flow Task that has a Flat File Source and an OLE DB Destination, that’s all -The Data Access Mode is set to FAST LOAD. -Just have 1 indexes in the table. My destination table has 32 columns

Input file: Input text file has more than 32 columns, surrogate key data may not unique , referenced columns date may not unique , Need to filter them.

Face two problems one is SSIS FlatFile-Source take huge time to load date another one is sort and filter. What to do?

enter image description here

2
How much memory is on the server with ssis? - Ruslan Tolkachev

2 Answers

1
votes

If you want it to run fast use this pattern:

  1. Load the data exactly as-is into a staging table
  2. Optionally add indexes to the staging table afterwards
  3. Use SQL to perform whatever processing you need (i.e. SELECT DISTINCT, GROUP BY into the final table)

You can do this kind of thing in SSIS but you need to tune it properly etc. it's just easier to do it inside a database which is already well optimised for this

0
votes

Some Suggestions

1. DROP and Recreate indexes

Add 2 Execute SQL Task; before and after the Data Flow Task, the first drop the index and the second recreate it after that the Data Flow Task is executed successfully

2. Adjust the buffer size

You can read more about buffer sizing in the following Technet article

3. Remove duplicates in SQL server instead of Sort Components

Try to remove the Sort components, and add an Execute SQL Task after the Data Flow Task which run a similar query:

;WITH x AS 
(
  SELECT col1, col2, col3, rn = ROW_NUMBER() OVER 
      (PARTITION BY col1, col2, col3 ORDER BY id)
  FROM dbo.tbl
)
DELETE x WHERE rn > 1;

4. Use a script component instead of Sort

You have to implement a similar logic to the following answer:

Some helpful Links