Star Schema - External Identifier fact or dimension?

Question

Here's a question I'm struggling with in a star schema design.

The outline is that we track packages with embedded globally unique identifiers (tags). Each of those tags relates to a series of chronological events. I consider the events to be the facts and am including the continuously variable values as columns in the fact table. Dimensions are things like the package type.

What I'm not sure about is whether the tag identifier should be in a dimension or directly on the fact table. We've currently got over 5 million unique tags we are tracking.

Is such a large dimension advisable?

If the tag identifier is the main business key to identify a package, it should remain in the fact table. — tobi6

Marek Grzenkowicz · Accepted Answer · 2016-08-11T15:02:46

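It is a degenerate dimension, that is, a dimension key with no descriptive attributes of its own, so it does not need a separate dimension table. You should keep this column in the fact table.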
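To make the layout concrete, here is a minimal sketch using Python's built-in `sqlite3` module. All table and column names (`fact_tag_event`, `dim_package_type`, `tag_id`, `temperature_c`) and the sample rows are assumptions made up for illustration; the question does not name them. The point it shows is that the tag identifier sits directly on the fact table, with no `dim_tag` table, while package type stays an ordinary dimension.

```python
import sqlite3

# Hypothetical schema: every name below is illustrative, not taken from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_package_type (
    package_type_key  INTEGER PRIMARY KEY,
    package_type_name TEXT NOT NULL
);
CREATE TABLE fact_tag_event (
    event_time       TEXT NOT NULL,
    package_type_key INTEGER NOT NULL
        REFERENCES dim_package_type (package_type_key),
    tag_id           TEXT NOT NULL,  -- degenerate dimension: stored on the fact row
    temperature_c    REAL            -- assumed example of a continuously variable value
);
-- An index lets you pull one tag's event history without a dimension table.
CREATE INDEX ix_fact_tag_event_tag ON fact_tag_event (tag_id, event_time);
""")

conn.execute("INSERT INTO dim_package_type VALUES (1, 'crate')")
conn.executemany(
    "INSERT INTO fact_tag_event VALUES (?, ?, ?, ?)",
    [
        ("2016-08-01T10:00:00", 1, "TAG-0001", 4.5),
        ("2016-08-01T11:00:00", 1, "TAG-0001", 5.5),
        ("2016-08-01T10:30:00", 1, "TAG-0002", 3.0),
    ],
)

# Group by the degenerate dimension; join only to the real dimension.
rows = conn.execute("""
    SELECT f.tag_id, COUNT(*) AS events, AVG(f.temperature_c) AS avg_temp
    FROM fact_tag_event AS f
    JOIN dim_package_type AS d
      ON d.package_type_key = f.package_type_key
    WHERE d.package_type_name = 'crate'
    GROUP BY f.tag_id
    ORDER BY f.tag_id
""").fetchall()
print(rows)
```

If the tags later acquire descriptive attributes of their own, that would be the time to reconsider and promote them to a real dimension.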
