My first Publication Agile-Data-Warehouse-Design-eBook | Page 294
274
Chapter 9
Why and How
is also straightforward because the unique SERIAL NUMBER of each customized
product has also been added to ORDERS FACT. This enables customized products
to be counted distinctly so as not to double-count orders where the customer has
chosen both option 2 and option 3 for the same product.
Counting unique events that are filtered on multiple multi-values requires an
appropriate unique degenerate ID in the fact table to be counted uniquely.
Unfortunately, even with the role-playing bridge table and a unique degenerate ID
the proposed design does not easily answer their third question type:
How many products were purchased with options 2, and
3 and 14, but without options 4, and 5 and 190?
Combination
analysis can involve
complex set logic
SQL
The AND logic of the option combinations complicates matters. The users cannot
answer this question with simple SQL that might contain: “ WHERE Option=2
and Option=3…” because OPTION can be equal to both 2 and 3 at the same
time! Instead they must:
1.
2.
3.
4.
5.
Run 3 queries to find products with one of the 3 required options.
INTERSECT the results to find only the products with all 3 options.
Run 3 more queries to find products that have the 3 unwanted options.
INTERSECT the results to find the products with all 3 unwanted options.
Use SQL MINUS to take the second set of products away from the first.
These 9 subqueries can be executed as a single SQL SELECT but users would not
be able to construct them (or other combination questions) using simple ad-hoc
query tools. Even if they could, the queries would not necessarily perform well.
Multi-valued bridge tables often give rise to complex multi-valued combination
constraints. The AND logic becomes complex because the constraint needs to be
placed simultaneously on the multiple rows in the bridge table. It is far easier for
SQL to constrain multiple columns in this way than multiple rows.
Solution
Use a pivoted
dimension to turn
complex row
constraints into
simple column
constraints
If the number of options available across all customizable products is limited (e.g.
200 in total) and relatively static (e.g. new options are only added once a year) this
row problem can be turned into a column solution with a bit of lateral thinking.
Figure 9-14 shows an OPTION PACK FLAG dimension. This is a pivoted dimen-
sion (denoted by the code PD) that stores the same option combinations as the
bridge table, but as columns rather than rows. It requires 200 columns to do so but
these columns are just bit (or single byte) flags and they make combination con-
straints very easy to build in SQL. For example, the filter for the previous user
question becomes: