Daily we get the data in excel formats we load the data into staging and then go to SSIS package and take excel as connection manager and perform transformations and move the data to warehouse. since we are taking data from excel only then why to create a stage and truncate it, since we taking excel as source and every manipulation is done with in it? Can someone please explain Real time scenario? I have seen many websites and couldn't understand what the concept is all about like staging, source(excel),lookup target(warehouse) Why to create to stage since everything is being done SSIS package only ?
0
votes
Will your source (Excel) files contain rows of data that have previously been loaded into the warehouse? Basically, will you be tracking changes against rows in the DW?
– Adam Parker
Excel is simply not a reliable enough source to insert directly to the final table. In fact rarely is this ever done - the staging pattern is very common
– Nick.McDermaid
A staging environment also allows you to encapsulate your transformation logic within stored procedures, which can be modified without making changes to your SSIS packages. This also reduces the complexity of your SSIS packages to focus just on importing the data and running stord procedures. On larger projects, utilising a pattern such as this can be a big time saver.
– iamdave
1 Answers
0
votes
The staging area is mainly used to quickly extract data from its data sources, minimizing the impact of the sources. After data has been loaded into the staging area, the staging area is used to combine data from multiple data sources, transformations, validations, data cleansing.
You can use a staging design pattern :
- Incremental load
- Truncate Insert
- Using Delimiters with HashBytes for Change Detection
You can find out about the Package design pattern for loading a data warehouse