My first Publication Agile-Data-Warehouse-Design-eBook | Page 90
Modeling Business Dimensions
69
Boolean flag attributes that contain "Y" or "N", (e.g., RECYCLABLE FLAG) can
usefully be augmented with matching report display-friendly attributes containing
descriptive values, (e.g., "Recyclable" and "Non-Recyclable").
Mandatory Attributes
While you are filling out example data for an attribute ask whether it is mandatory.
If the stakeholders believe it is, you should add MD to its column type. MD does
not necessarily define attributes as NOT NULL in the data warehouse. MD may
just represent the stakeholders’ idealistic view, while the data warehouse has to
cope with the actual operational system data quality. By documenting allegedly
mandatory attributes you are capturing rules that the ETL process can test for, and
identifying potentially useful attributes for defining dimensional hierarchies.
Use MD to record
that stakeholders
believe an attribute
to be mandatory
Missing Values
One example you must fill in for every attribute is its “Missing” value. If the di-
mension subject has already been identified as possibly missing from an event
story there will already be a missing subject copied from the event. If not you
should add a missing row to the dimension just as you would to an event. You fill
out this row by asking the stakeholders how they want “Missing” to be displayed
for each attribute. Every dimension
Paradoxically you need to ask for missing values even for mandatory attributes. For
example, If CUSTOMER TYPE is a mandatory attribute of CUSTOMER then for
all SALES events involving Customers you can rely on the Customer Type to be
present. But if Customers are missing for certain SALES events (for example,
anonymous cash transactions) then CUSTOMER TYPE will also be missing.
Figure 3-4 shows that when CUSTOMER is missing, the stakeholders want
CUSTOMER TYPE to be displayed as “Unknown.” Even mandatory
If stakeholders need the data warehouse to be able to differentiate between various
types of “Missing” (e.g., “Not Applicable”, “Missing in error”, or “To be assigned
later”) the dimension will need additional special case missing stories with differ-
ent descriptive values and ETL processes will have to work a little harder to assign
the correct “missing” version to the events. The implementation of this is discussed
in Chapter 5. If there are different
Don’t go overboard with examples. Dimensions usually have far more attributes
than events have details, and you want to discover as many dimensional attrib-
utes as possible rather than exhaustively capture examples for only a few.
needs x missing row
to document
missing display
values
attributes need a
missing value
types of missing,
you need multiple
missing stories