ETL = Extract, Transform and Load. Staging database's help with the Transform bit. Personally I always include a staging DB and ETL step.
A Staging database assists in getting your source data into structures equivalent with your data warehouse FACT and DIMENSION destinations. It also decouples your warehouse and warehouse ETL process from your source data.
If your data warehouse destination tables pretty much map your production DB tables with only some additional dimension fields then you could get away with ignoring the Staging Database. This will save you a little development time. I don't recommend this as you:
- Will end up tying your data warehouse solution directly to your
source database
- Will most likely end up with a very complicated ETL step
- May end up with race conditions/orphan records due to
changes in your source database during the ETL process
- Data Warehousey people may make 'hrumph' type sounds at you
Most likely though you will be performing some sort of data manipulation (converting dates to DATE_DIM keys, aggregating values) in which case a staging DB will help you separate your transformation logic and calculations from your data warehouse operations (dimensioning data).
You may have also come across this sort of pattern:
[PROD DB] -(ETL)-> [RAW DB] -(ETL)-> [STAGING DB] -(ETL)-> [DW DB] -(ETL)-> [DM DB]
which if performance considerations are important you may want to look at. In your case the RAW_DB could be an exact 1:1 copy of your production database and the ETL step that creates it might just be a recreate the DB from the most recent nightly backup. (Traditionally RAW_DB was used for getting data from various external sources with each field as pure text, these fields where then converted to their expected data type with exceptions handled as encountered. This is not so much of a problem when you have one source and its a nice strongly typed normalized database)
From this RAW_DB the next ETL process would truncate and populate staging such that the STAGING DB contains all the new/updated records that are going into the warehouse.
Another added benefit of all these steps is that it really assists in debugging weird data as for any given run you can see record values inside each of the difference databases and identify which ETL process is introducing the sadness.