My first Publication Agile-Data-Warehouse-Design-eBook | Page 285
Dimensional Design Patterns for Cause and Effect
265
Multi-Valued Dimensions
One of the challenges of causal dimensions—especially with external indirect
causes—is that there may be more than one cause for any given fact. For example,
Figure 9-4 shows an EVENT CALENDAR table that documents several sporting
events that may have influenced product sales in July 2010. This table makes it easy
for BI users to answer the question: “How much did we make during the World
Cup?”, because they don’t have to remember the dates, just pick the single event
from a drop-down list. As such this table is WHERE clause friendly and could be
used to store other event types with dynamic date ranges for which consistent date
range filters would be useful; for example, “Business” events like “Last 60 days,
Current year”, “Last 90 days, Current Year” and the same ranges from the previous
year.
Causal factors are
multi-valued
dimensional
attributes where
there is more than
one cause of the
same type for a fact
Weighting Factor Pattern
Problem/Requirement
If BI users want to group (rather than filter) a report by the multiple sporting
events they can get into trouble because many of these sporting events overlap.
They must be careful how they interpret a report that shows $30M sales during the
World Cup and $10M sales during Wimbledon. They must not add totals that
overstate the sales. The business has not made $40M because Wimbledon took
place during the World Cup. This problem arises because EVENT is a multi-valued
dimensional attribute—it can have more than one value for a single atomic-level
fact like a customer product purchase.
Grouping by a multi-
valued attribute can
cause over-counting
Figure 9-4
Event Calendar
Solution
For BI users who need to group by multiple events, Figure 9-5 shows an alternative
version of the sporting schedule that is more GROUP BY friendly. EVENT DAY
CALENDAR stores each (sporting) event and date combination—a 14 day event
like Wimbledon will be stored as 14 rows. This may appear a little wasteful but it
has two benefits:
Over-counting can
be avoided by
providing a
weighting factor
Each date/event combination can be given a weighting factor to allow facts to
be allocated amongst the multiple events that occur on the same day.