My first Publication Agile-Data-Warehouse-Design-eBook | Page 90

Modeling Business Dimensions 69 Boolean flag attributes that contain "Y" or "N", (e.g., RECYCLABLE FLAG) can usefully be augmented with matching report display-friendly attributes containing descriptive values, (e.g., "Recyclable" and "Non-Recyclable"). Mandatory Attributes While you are filling out example data for an attribute ask whether it is mandatory. If the stakeholders believe it is, you should add MD to its column type. MD does not necessarily define attributes as NOT NULL in the data warehouse. MD may just represent the stakeholders’ idealistic view, while the data warehouse has to cope with the actual operational system data quality. By documenting allegedly mandatory attributes you are capturing rules that the ETL process can test for, and identifying potentially useful attributes for defining dimensional hierarchies. Use MD to record that stakeholders believe an attribute to be mandatory Missing Values One example you must fill in for every attribute is its “Missing” value. If the di- mension subject has already been identified as possibly missing from an event story there will already be a missing subject copied from the event. If not you should add a missing row to the dimension just as you would to an event. You fill out this row by asking the stakeholders how they want “Missing” to be displayed for each attribute. Every dimension Paradoxically you need to ask for missing values even for mandatory attributes. For example, If CUSTOMER TYPE is a mandatory attribute of CUSTOMER then for all SALES events involving Customers you can rely on the Customer Type to be present. But if Customers are missing for certain SALES events (for example, anonymous cash transactions) then CUSTOMER TYPE will also be missing. Figure 3-4 shows that when CUSTOMER is missing, the stakeholders want CUSTOMER TYPE to be displayed as “Unknown.” Even mandatory If stakeholders need the data warehouse to be able to differentiate between various types of “Missing” (e.g., “Not Applicable”, “Missing in error”, or “To be assigned later”) the dimension will need additional special case missing stories with differ- ent descriptive values and ETL processes will have to work a little harder to assign the correct “missing” version to the events. The implementation of this is discussed in Chapter 5. If there are different Don’t go overboard with examples. Dimensions usually have far more attributes than events have details, and you want to discover as many dimensional attrib- utes as possible rather than exhaustively capture examples for only a few. needs x missing row to document missing display values attributes need a missing value types of missing, you need multiple missing stories