My first Publication Agile-Data-Warehouse-Design-eBook | Page 283
Dimensional Design Patterns for Cause and Effect
263
Figure 9-1
PROMOTION
dimension
While a PROMOTION dimension may be a small dimension, with only of few
hundred promotional condition combinations, it can be challenging to build and
assign to the facts because of its mix of direct and indirect causal factors. Direct
causal factors are usually straightforward to assign because they are captured by the
operational system but many of the interesting indirect causes may not be. For
example, a sales system will not (reliably) record whether discounts are also pro-
moted by TV ads or special (in-store or on-website) product displays because this
information is not needed to complete each sale transaction and print a valid
invoice/receipt (which must show any direct discount details). A richly descriptive
promotion dimension will require this information to be sourced from elsewhere—
typically from less formal data sources, like spreadsheets and word processing
documents, and its ETL processing will need to be sophisticated enough to assign
the full combination of promotional conditions correctly.
Indirect causal
values are often
more difficult to
source than direct
causal values
The DW/BI team may have to build small data entry applications to capture causal
descriptions and timetables when this information is “known to business but not
known to any system.”
If BI users need to analyze promotion return on investment (ROI), the data ware-
house will need an additional Promotion Spend Fact table—using the same con-
formed PROMOTION dimension. BI users can then run drill-across queries
against both PROMOTION SPEND FACT and SALES FACT to compare promo-
tion costs to sales revenue uplift.
PROMOTION may
be conformed
across sales and
promotion cost star
schemas
Unstructured Why Dimensions
In some cases valuable direct causal details are attached to transactions as unstruc-
tured comments. These potentially large textual columns should be removed from
fact tables and placed in separate dimensions, to maximize fact table performance
for the majority of queries that just want to rapidly aggregate the additive facts. The
resulting text dimensions are why dimensions. Figure 9-2 shows an example
COMMENT dimension that contains reasons that salespeople have given for
varying the price for specific customers. This could be turned into a better why
dimension by adding additional attributes that codify the free-format text reasons
based on interesting keywords. Embellishing the table with low cardinality sets of
descriptive tags would provide better report row headers and more consistent
filters.
Direct causal factors
are often captured
as free-format text
reasons. These
non-additive text
facts should be
removed from fact
tables and placed in
text dimensions