HR Data Mart Design Advice

Question

I am working on a design for an HR data mart using the Kimball approach outlined in 'The Data Warehouse Toolkit'.

As per the Kimball design, I was planning to have a time-stamped, slowly-changing dimension to track employee profile changes (to support point-in-time analysis of employee state) and a head-count periodic snapshot fact table to support measures of new hires, leavers, leave taken, salary paid etc.

The problem I've encountered is that, in some cases, our employees can be assigned to multiple roles/jobs and each one needs to be tracked separately (i.e. the grain of my facts has to be at job-level, not employee level).

How might the Kimball design be adapted to fit a scenario where employee and role/job form a hierarchy like this? Ideally, I want to avoid duplicating employee profile data (address, demographics etc) for each role/job an employee is assigned to, but does this mean I need to snow-flake the dimension?

Options I've been considering include the below - I'd be interested in any thoughts or suggestions the community has on this so all input is welcome!

1) (see attached, design 1) A snowflake-style approach with an employee table which has a 1-to-Many link role table, which, in turn, has a 1-to-many link with the fact table. The advantage here is a clean employee dimension but I don't want to introduce unnecessary complexity. Is there any reason why I shouldn't link both dimensions directly to the fact table? The snowflake designs I've seen don't seem to do this.

2) (see attached, design 2) A combined Employee/Role dimension where each employee has a record for each assigned role but only one on them is flagged as 'Primary Role'. Point-in-time queries on the dimension can be performed by constraining on the 'Primary Role' flag.

Wes H Wes H · Accepted Answer · 2019-06-12T14:15:53

Anything that occurred is an event and can be a fact. When you look at relationships between data, you need to also ask if the data value describes the entity (dim) or something that happened to/with the entity(fact). Everything can be a dim or a fact.(sometimes both)

A job describes an event that happened to the employee. You should have a fact employeejob that relates to the Dim employee and Dim job (as well as your date dimensions). This will then allow you to break down absences by employee and job. Your dim job would really just be job title, pay grades, etc. The fact would contain effective dates. Research factless fact tables.

Note that your vacancy reference would be part of a separate fact (when/where did you post it, how many applicants are all measurable facts about the vacancy). This may also be an example of a degenerate dimension.

I'm not fond of your monthly fact. I think that should just be some calculated measures built on fact absence and fact employeejob. When those events are put up against your dimensions, you can break them down by date, job type, manager, etc.

HR Data Mart Design Advice

1 Answers