My first Publication Agile-Data-Warehouse-Design-eBook | Page 275
Design Patterns for High Performance Fact Tables and Flexible Measures
255
Mastering Data Warehouse Aggregates by Christopher Adamson (Wiley, 2006)
provides definitive advice on designing, building, and using invisible aggregates
within a dimensional data warehouse.
Drill-Across Query Pattern
Problem/Requirement
As new star schemas and business processes are added to the data warehouse, BI
users’ questions will inevitably become more sophisticated because they will want
to perform cross-process analysis. When they do, it's important to understand how
their queries should access multiple fact tables to compare and combine measures.
For example, Figure 8-17 shows two HR processes: salary payments and ab-
sence/leave tracking that need to be compared to answer the question: “Which
employees were highly paid but were frequently absent in 2011?”
Cross-process
analysis requires
queries to access
multiple fact tables
Figure 8-17
Querying multiple
fact tables
Because the two fact tables SALARY FACT and ABSENCES FACT share con-
formed EMPLOYEE and CALENDAR dimensions it appears straightforward to
join them using their common surrogate keys, as in the following query:
SELECT Employee_ID, Employee, SUM(Salary), Sum(Absence)
FROM
Salary_Fact s, Absences_Fact a, Calendar c, Employee e
WHERE s.Employee_Key = e.Employee_Key
AND
a.Employee_Key = e.Employee_Key
AND
s.Date_Key
= c.Date_Key
AND
a.Date_Key
= c.Date_Key
AND
c.Year
= 2011
AND
e.Employee_ID = “007”
GROUP BY Employee_ID, Employee SORT BY 3
While the above SQL appears perfectly valid, it will not produce the correct totals
for James Bond or any other employees even if the “007” constraint was removed.
Joining two fact
tables – don’t try
this at home!