My first Publication Agile-Data-Warehouse-Design-eBook | Page 288

268 Chapter 9 Why and How Multi-Valued Bridge Pattern Problem/Requirement Changing the granularity of a fact table to remove a M:M relationship “hard-codes” fact allocations The multi-valued group of diagnosis for each treatment creates a M:M relationship between the event and a diagnosis dimension. This could be addressed in the star schema design by changing the physical fact table granularity from one row per claim line item to one row per claim line item per diagnosis (CLAIM ID GD, TREATMENT_KEY GD, DIAGNOSIS_KEY GD). However, this causes a signifi- cant allocation problem. If 10M claims have an average of 5 itemized treatments for an average of 3 diagnosis codes each this would immediately triple a 50M row fact table to 150M rows. While the extra 100M rows will adversely affect query performance, the real issue is what facts do you put on these extra rows? Doctors submit 50M “atomic” claim amounts, how do you go about splitting these amongst their multiple diagnoses to create 150M additive CLAIM AMOUNT facts? Looking at the example events you may have some ideas on how Bond’s treatment costs should be allocated to his symptoms but with hundreds of millions of claim facts to process, automating this would be difficult and few stakeholders will agree on how you should “hard-code” these fact allocations. Solution Use a multi-valued bridge table (MV) to resolve a M:M relationship between a fact table and a dimension Fact allocation problems can be avoided by leaving the fact table granularity unaltered, and using a multi-valued bridge table (MV) instead, to resolve the M:M relationship. For example, DIAGNOSIS GROUP [MV], shown in Figure 9-7, can be used to join unaltered claim facts to a DIAGNOSIS dimension. It does this by storing the multiple DIAGNOSIS KEYs of a claim as separate rows of a diagnosis group, each with a now familiar WEIGHTING FACTOR. Diagnosis groups are created and assigned a surrogate key (DIAGNOSIS GROUP KEY) as unique claim diagnosis combinations are observed during ETL. These bridge table keys are added to the facts as they are loaded so that tables can be joined as in Figure 9-8. Figure 9-7 DIAGNOSIS GROUP multi-valued bridge table Bridge tables avoid the political issues of hard-coding fact allocations Not only does the bridge table resolve the technical problem of the M:M relation- ship, it sidesteps the political issues of how to split the atomic facts and provide greater reporting performance and flexibility. By not increasing the number of facts and altering their values, queries that stick to the normal single-value dimen- sions to analyze busiest doctors, sickest patients or most expensive treatments run