My first Publication Agile-Data-Warehouse-Design-eBook | Page 296

276 Chapter 9 Why and How from scratch without the bridge and the fact table would still need to be altered to add the OPTION PACK KEY. Column flag dimensions should only be populated with observed combinations, otherwise they can easily grow to be bigger than fact tables. A bit flag dimension with only 20 columns has over a million possible combinations. If bridge table rows contain quantities, a pivoted dimension can contain count columns If the business problem was more complex and varying quantities of each option could be chosen to configure a product, the OPTION PACK bridge table would need to contain an OPTION QUANTITY attribute and the OPTION PACK [PD] pivoted dimension would contain option count columns rather than [Y/N] flags. Similarly, if small numbers of options and options quantities were handled as separate fact rows (to avoid a bridge table) and comparisons or combinations were constantly used then a pivoted fact table might be created with option count facts. Consequences Pivoted dimensions are limited to relatively small and stable value populations Pivoted dimensions are limited by the maximum number of columns available in a database table (usually between 256 and 1024) and the ETL involved in automating the maintenance of volatile combination values is complex. A pivoted dimensions works well for Pomegranate because there are only a few hundred relatively stable options (with several new ones being add manually each year) but it could not cope with a possible 155,000 ICD10 diagnosis codes. How Dimensions How dimensions are often degenerate (DD) IDs that provide useful links to operational source records and unique count measures How dimensions document any additional information about facts that are not captured by other dimensions. The most common how dimensions are degenerate (DD) transaction identifiers stored in fact tables. These dimensions describe how facts come to exist by tying them back to the original source system transactions. They can also be invaluable for providing unique transaction counts. For example, an ORDER ID in an ORDERS FACT table can be used to count how many orders contained at least one laptop product line item. Using COUNT(DISTINCT Order_ID) ensures that individual orders with several line items for different laptops will not be over-counted. As mentioned earlier, a degenerate ID that can be uniquely counted is essential if a star schema has one or more multi-valued dimen- sions. Look out for conformed degenerate dimensions and add them to the event matrix to help you discover event sequences and milestone dependencies that can be modeled as evolving events.