My first Publication Agile-Data-Warehouse-Design-eBook | Page 265
Design Patterns for High Performance Fact Tables and Flexible Measures
245
Figure 8-10
Periodic snapshot
containing
semi-additive facts
and the stock data is as follows: the product category “Advanced Laptop” contains
two products: POMBook Air and POMBook Pro; The SW region contains 10
stores; Each day last week, every SW store stocked 20 POMBook Airs and 60
POMBook Pros (let’s keep it simple); Last week had 7 days (like every other week).
AVG(Stock_Level) will return 40, which is the wrong answer to the stake–
holders question. 40 is the average of 60 and 20 which is what you get when half
the data has a value of 60 and the other half has a value of 20. The AVG() func-
tion—the equivalent of SUM(Stock_Level)/COUNT(*) —sums up 70 store/day
records with 20 laptops and 70 with 60 (5,600) and divides by the number of
records (140). To get the correct average for a category in a region, you must not
divide by the number products (in the category) or the number of stores (in the
region). Instead you must only divide by the number of non-additive dimensional
values (7 days). The correct SQL for this is: SUM(Stock_Level)/
COUNT(DISTINCT Stock_date) . The correct answer is: 800.
Periodic semi-
additive facts, such
as balances, must
use a time average
which divides the
total by the number
of non-additive time
periods in the query
Averaging a semi-additive balance correctly, requires you to understand the time
granularity of its fact table. For a daily snapshot, an average is calculated by
dividing by the number of distinct days: the number of non-additive time periods.
Counting Issues
ORDER COUNT in Figures 8-2 and 8-10 is yet another example of a semi-additive
fact that you must handle carefully. As long as queries are constrained to a single
product, ORDER COUNT can be summed across days and locations to give a total
number of unique orders. But if a query needs the total number of orders for the
“Advanced Laptop” category it’s in trouble because it will over-count any orders
that contain both POMBook Airs and POMBooks. Unfortunately, there is no way
to get the correct answer from STOCK FACT.
Storing counts, such as ORDER COUNT or CUSTOMER COUNT, in a periodic
snapshot can seem like a great idea for query efficiency (to save re-counting mil-
lions of records), but once they have been calculated to match the granularity of a
Unique counts are
semi-additive or
non-additive facts