2
votes

So imagine a standard HR database with the following entities: Employee, Evaluation, Training, Role, Department, Location.

If we wanted to find an employee's typical retirement age, what's the best way to design this hypothetically? If we had 'Retirement_Date' as an attribute under the Employee entity, then we'd be left with nulls.

Would we need to create an entity called 'Retirement' just to store data on employees who retired?

2
what's wrong with nulls?Alex Zen
@AlexZen: You better don't ask Chris Date :-)dnoeth

2 Answers

1
votes

It seems likely that there would be other attributes associated with retirees that are not applicable to current employees. Put those attributes (including retirement date) in a separate table for retirees.

One danger of combining the two different types of person into one table (AKA "bundling") is that it might allow an inconsistent state to arise: where an employee has one or more retirement-specific attributes populated but not all. You could add extra constraints to prevent that situation but it's usually simpler not to bundle together different types of data. It also seems counter-intuitive to record people in a table called Employee when they are not employees.

Using nulls will generally make it harder to write queries and derive useful results from data. Nulls practically guarantee that some queries will give results that don't properly fit the reality you are trying to represent.

0
votes

Your data model may be over-simplified. Employees can come and go. You probably want overall information about the employee, but you might also want a history table. After all, employees change departments, history, and so on. Retirement is not the only issue.

As for retirement itself, I see nothing wrong with including a retirement date in one of the entities. And if it is NULL, then there is no known retirement. You might want to put a default date far in the future rather than NULL.

The following two methods are both reasonable ways of defining retired/active customers:

(case when retirement_date < current_date then 'RETIRED' else 'ACTIVE' end)
(case when retirement_date > current_date then 'ACTIVE' else 'RETIRED' end)

However, the first doesn't work when retirement_date is NULL for active customers. For this reason, it might be better to put in a date far, far in the future for active employees. In that case, both formulations works, and you don't have to worry (as much) about application logic.