My first Publication Agile-Data-Warehouse-Design-eBook | Page 275

Design Patterns for High Performance Fact Tables and Flexible Measures 255 Mastering Data Warehouse Aggregates by Christopher Adamson (Wiley, 2006) provides definitive advice on designing, building, and using invisible aggregates within a dimensional data warehouse. Drill-Across Query Pattern Problem/Requirement As new star schemas and business processes are added to the data warehouse, BI users’ questions will inevitably become more sophisticated because they will want to perform cross-process analysis. When they do, it's important to understand how their queries should access multiple fact tables to compare and combine measures. For example, Figure 8-17 shows two HR processes: salary payments and ab- sence/leave tracking that need to be compared to answer the question: “Which employees were highly paid but were frequently absent in 2011?” Cross-process analysis requires queries to access multiple fact tables Figure 8-17 Querying multiple fact tables Because the two fact tables SALARY FACT and ABSENCES FACT share con- formed EMPLOYEE and CALENDAR dimensions it appears straightforward to join them using their common surrogate keys, as in the following query: SELECT Employee_ID, Employee, SUM(Salary), Sum(Absence) FROM Salary_Fact s, Absences_Fact a, Calendar c, Employee e WHERE s.Employee_Key = e.Employee_Key AND a.Employee_Key = e.Employee_Key AND s.Date_Key = c.Date_Key AND a.Date_Key = c.Date_Key AND c.Year = 2011 AND e.Employee_ID = “007” GROUP BY Employee_ID, Employee SORT BY 3 While the above SQL appears perfectly valid, it will not produce the correct totals for James Bond or any other employees even if the “007” constraint was removed. Joining two fact tables – don’t try this at home!