My first Publication Agile-Data-Warehouse-Design-eBook | Page 172
Modeling Star Schemas
151
If a dimension contains only CV and FV attributes, effective dating attributes are
unnecessary. Implement them anyway and you will be ready for the day when a
new HV attribute is added, or stakeholders finally realize they needed historically
correct attributes all the time. The cost of redundantly maintaining these attributes
initially is insignificant compared to the cost of refactoring ETL processes.
There are five additional administrative attributes in Figure 5-10 that you should
also consider adding to every dimension:
MISSING
CREATED DATE
CREATED BY
UPDATED DATE
UPDATED BY
Additional ETL
attributes can be
useful…
The MISSING flag “Y” indicates that the row is a special “Missing” dimensional
record (usually with a zero or negative surrogate key). “N” indicates a normal dim-
ension record. This can be useful for filtering out all forms of missing (e.g. “N/A”
and “Error”) without exposing their surrogate key values to BI users. …for identifying
The CREATED and UPDATED attributes provide basic ETL audit information
on the date/time and ETL version used to create and update the dimension. Chap-
ter 9 provides more details on audit techniques. …and providing
special “missing”
rows…
dimension audit
information
When you present existing BEAM✲ dimension tables as spreadsheets to stake-
holders, hide surrogate keys and other ETL-only columns to limit the discussion
to business attributes.
Time Dimensions
If you haven’t already done so you should model when dimensionally—just like
any other W-type. A CALENDAR dimension is essential to the data warehouse
because it provides the conformed time hierarchies (discussed in Chapter 3) and
descriptions that stakeholders need to analyze every business process. You should
also model time of day to discover if stakeholders have custom descriptions for
periods during a day, such as peak/off peak or shift names. If so, these should be
implemented as attributes of a separate minute granularity CLOCK dimension to
avoid a single monster TIME dimension that would contain 2.6 million minutes
for every 5 years of warehouse history. Model time
Figure 5-11 shows how a single time of day granularity when event detail: CALL
TIME is replaced by two surrogate keys: CALL DATE KEY and CALL TIME KEY
in a fact table. Both CALENDAR and CLOCK are role-playing (RP) dimensions
that will be used to replace the when details of every event. Chapter 7 provides full
details on time dimensions and their special property surrogate date keys. Datetime details
dimensionally
as separate
CALENDAR and
CLOCK dimensions
become separate
date and time
surrogate keys