My first Publication Agile-Data-Warehouse-Design-eBook | Page 297

Dimensional Design Patterns for Cause and Effect 277 Too Many Degenerate Dimensions? Most transaction fact tables will contain at least one degenerate transaction ID that cannot be stored more efficiently in a separate dimension table because its cardi- nality approaches that of the fact table itself and it has no additional descriptive attributes. However if a fact table contains many additional degenerates you should try to prune these to keep the fact table record length under control using the following guidelines: Large sets of degenerate dimensions within a fact table should be remodeled as separate dimensions If degenerates will be used to group or filter or will be browsed in combina- tion, they should be remodeled as attributes in a separate dimension. Any degenerates that contain large unstructured comments should be re- placed by a surrogate key to a COMMENT dimension (as in Figure 9-2). If a degenerate [Y/N] flag will be frequently counted it can be remodeled as a low cardinality additive fact with the values 0, 1 that can be summed. This is especially useful as aggregates can be built that use this fact. Some degenerates can be remodeled as useful additive and non-additive If the degenerate is high cardinality and will be counted distinctly it should remain in the fact table where it will act as a non-additive fact. facts If a degenerate flag describes the type of value in an adjacent fact it may repre- sent data that would be better modeled as separate additive fact columns without the flag. For example, a REVENUE fact and a flag REVENUE TYPE with the values: ‘E’ for estimate and ‘A’ for actual, should instead be modeled as two facts: ESTIMATED REVENUE and ACTUAL REVENUE. Sometimes a degenerate meets more than one of these criteria. For example, a flag may be frequently counted, and used for grouping and constraining. In which case, you can model it as both a fact and a dimensional attribute. Creating How Dimensions If you identify degenerates that should be remodeled as dimensions, check to see if any belong in existing dimensions. For any that do not, define a new dimension with its own surrogate key and relocate the degenerates to it, replacing them in the fact table with the new surrogate key. This new dimension is often called a “junk” dimension, because of its tough-to-classify mix of attributes. But it really is not junk at all. Instead, it is a non-conformed how dimension, specific to just this set of facts, that can often be usefully named after its matching fact table. For example, a CALL DETAILS FACT table may need a CALL DETAIL how dimension, and a SALES FACT table may need a SALE TYPE dimension. If a fact table has multiple small non-conformed dimensions—typically whys and hows— they can often be merged to reduce the number of keys in the fact table. Move degenerates to a physical how dimension named after the fact table and replace them in the fact table with a surrogate key