My first Publication Agile-Data-Warehouse-Design-eBook | Page 288
268
Chapter 9
Why and How
Multi-Valued Bridge Pattern
Problem/Requirement
Changing the
granularity of a fact
table to remove a
M:M relationship
“hard-codes” fact
allocations
The multi-valued group of diagnosis for each treatment creates a M:M relationship
between the event and a diagnosis dimension. This could be addressed in the star
schema design by changing the physical fact table granularity from one row per
claim line item to one row per claim line item per diagnosis (CLAIM ID GD,
TREATMENT_KEY GD, DIAGNOSIS_KEY GD). However, this causes a signifi-
cant allocation problem. If 10M claims have an average of 5 itemized treatments
for an average of 3 diagnosis codes each this would immediately triple a 50M row
fact table to 150M rows. While the extra 100M rows will adversely affect query
performance, the real issue is what facts do you put on these extra rows? Doctors
submit 50M “atomic” claim amounts, how do you go about splitting these amongst
their multiple diagnoses to create 150M additive CLAIM AMOUNT facts? Looking
at the example events you may have some ideas on how Bond’s treatment costs
should be allocated to his symptoms but with hundreds of millions of claim facts to
process, automating this would be difficult and few stakeholders will agree on how
you should “hard-code” these fact allocations.
Solution
Use a multi-valued
bridge table (MV) to
resolve a M:M
relationship
between a fact table
and a dimension
Fact allocation problems can be avoided by leaving the fact table granularity
unaltered, and using a multi-valued bridge table (MV) instead, to resolve the M:M
relationship. For example, DIAGNOSIS GROUP [MV], shown in Figure 9-7, can
be used to join unaltered claim facts to a DIAGNOSIS dimension. It does this by
storing the multiple DIAGNOSIS KEYs of a claim as separate rows of a diagnosis
group, each with a now familiar WEIGHTING FACTOR. Diagnosis groups are
created and assigned a surrogate key (DIAGNOSIS GROUP KEY) as unique claim
diagnosis combinations are observed during ETL. These bridge table keys are
added to the facts as they are loaded so that tables can be joined as in Figure 9-8.
Figure 9-7
DIAGNOSIS
GROUP
multi-valued
bridge table
Bridge tables avoid
the political issues
of hard-coding fact
allocations
Not only does the bridge table resolve the technical problem of the M:M relation-
ship, it sidesteps the political issues of how to split the atomic facts and provide
greater reporting performance and flexibility. By not increasing the number of
facts and altering their values, queries that stick to the normal single-value dimen-
sions to analyze busiest doctors, sickest patients or most expensive treatments run