0
votes

I have searched everywhere on the web to find out how I can import data into a star schema data warehouse. A lot of the stuff online explain the design of the star schema and data warehouse but none explain how exactly data is loaded into the DW. Here is what i've done so far:

I am trying to make an application of high school basketball statistics for each player.

I have:

  1. A list of all of the players name, height, position and number
  2. A list of all of the high schools
  3. list of all of the schedules
  4. list of conferences
  5. statistics(points, rebounds, steals, games played, etc) for each player for the current year.

I assume the the stats would be my fact table and the rest are my dim tables.

Now the million dollar question --How in the world do get the data into that format appropriately?

I tried simply importing them to their respective tables but dont know how they connect.

Example: there are 800 players and 400 schools. each schools has a unique id (primary key). I upload the players into dim players and schools into dim schools. Now how do I connect them?

Please help. Thanks in advance. Sorry for the rambling :)

4
If you have your data into dim tables and I hope each dim table will have a Primary column, then in your fact table, refer those dim table primary key (which is a kind of foreign key).Viki888

4 Answers

1
votes

There are many ways for importing data into a database: using builtin loaders, scripts or, what is mostly used for DW environments, an ETL tool.

About your fact table, I think stats are metrics, not the transaction. In other words, you measure a transaction, not a metric itself.

1
votes

Using an ETL tool (E- Extract your data from your soruces , T- transform your data or manipulate it to go as you want, L - Load the data in your DW) you can safely and surely have your data loaded in your DW.

You can use ETL tools like : SSIS , Talend , etc.

0
votes

Yes, "star", "dim", "fact", and "data warehouse" are appropriate terms, but I would rather approach it from "entities" and "relationships"...

You have essentially defined 5 "Entities". Each Entity is (usually) manifested as one database table. Write the CREATE TABLEs. Be sure to include a PRIMARY KEY for each; it will uniquely identify each row in the table.

Now think about relationships. Think about 1:many, such as 1 high school has 'many' players. Think about many:many.

For 1:many, you put, for example, the id of the high school as a column in the player table.

For many:many you need an extra table . Write the CREATE TABLEs for any of those you may need.

Now, read the data, and do INSERTs into the appropriate table.

After that, you can think about the SELECTs to extract interesting data. At the same time, decide what INDEX(es) will be useful. But that is another discussion.

When you are all finished, you will have learned a bunch about SQL, and may realize that some things should have been done a different way. So, be ready to start over. Think of it as a learning exercise.

-1
votes

You can use SQL server data tools for this project. SQL server Data tools consists of a SSIS,SSAS and SSRS.

Use SSIS to create a ETL process for your data in your database. Use SSAS to create dimensions, fact tables and cubes (You can do a lot more in this). Use SSRS to present the data in a user friendly way.

Lot of videos are available youtube.