My first Publication Agile-Data-Warehouse-Design-eBook | Page 265

Design Patterns for High Performance Fact Tables and Flexible Measures 245 Figure 8-10 Periodic snapshot containing semi-additive facts and the stock data is as follows: the product category “Advanced Laptop” contains two products: POMBook Air and POMBook Pro; The SW region contains 10 stores; Each day last week, every SW store stocked 20 POMBook Airs and 60 POMBook Pros (let’s keep it simple); Last week had 7 days (like every other week). AVG(Stock_Level) will return 40, which is the wrong answer to the stake– holders question. 40 is the average of 60 and 20 which is what you get when half the data has a value of 60 and the other half has a value of 20. The AVG() func- tion—the equivalent of SUM(Stock_Level)/COUNT(*) —sums up 70 store/day records with 20 laptops and 70 with 60 (5,600) and divides by the number of records (140). To get the correct average for a category in a region, you must not divide by the number products (in the category) or the number of stores (in the region). Instead you must only divide by the number of non-additive dimensional values (7 days). The correct SQL for this is: SUM(Stock_Level)/ COUNT(DISTINCT Stock_date) . The correct answer is: 800. Periodic semi- additive facts, such as balances, must use a time average which divides the total by the number of non-additive time periods in the query Averaging a semi-additive balance correctly, requires you to understand the time granularity of its fact table. For a daily snapshot, an average is calculated by dividing by the number of distinct days: the number of non-additive time periods. Counting Issues ORDER COUNT in Figures 8-2 and 8-10 is yet another example of a semi-additive fact that you must handle carefully. As long as queries are constrained to a single product, ORDER COUNT can be summed across days and locations to give a total number of unique orders. But if a query needs the total number of orders for the “Advanced Laptop” category it’s in trouble because it will over-count any orders that contain both POMBook Airs and POMBooks. Unfortunately, there is no way to get the correct answer from STOCK FACT. Storing counts, such as ORDER COUNT or CUSTOMER COUNT, in a periodic snapshot can seem like a great idea for query efficiency (to save re-counting mil- lions of records), but once they have been calculated to match the granularity of a Unique counts are semi-additive or non-additive facts