My first Publication Agile-Data-Warehouse-Design-eBook | Page 277
Design Patterns for High Performance Fact Tables and Flexible Measures
Drill-across or multi-pass query support is a key feature of BI tools. It helps to
manage query performance by keeping individual queries simple. By accessing fact
tables one at a time the queries can be optimized as star joins and take advantage of
aggregate navigation. They may also be run in parallel by the DBMS.
257
Multi-pass SQL
summarizes fact
tables one at a time,
then joins the results
Choose BI tools that have drill-across/multi-pass functionality. As an alternative to
multi-pass, some tools will generate multiple inline views within a single query.
Drill-across also supports distributed data warehousing. You can scale a dimen-
sional data warehouse by placing star schemas and OLAP cubes on multiple
database platforms in multiple locations. Multi-pass queries allow these to be
accessed as a single data warehouse. Distributed data warehouses can use different
hardware, operating systems, and DBMSs for each database server—as long as they
contain stars or cubes with conformed dimensions that can be queried by a com-
mon BI toolset using drill-across techniques.
Drill-across enables
distributed data
warehousing. Stars
can be placed on
different DBMSs
As a general rule, fact tables shouldn’t be directly joined. Most fact tables have a
1:M or M:M relationship, which results in the facts being overstated when meas-
ures are calculated. Instead they should be queried by drilling across.
Drill-across works very well when queries need to combine summarized facts; for
example, when business processes are compared at a monthly or quarterly level,
individual multi-pass queries will access millions of facts but the answer sets will be
aggregated to a conformed row-header levels before they are returned, and BI tools
will only have to merge reports’ worth of data—a few hundred rows.
Drill-across queries
work well for
summary-level
process
comparisons
Consequences
However, drill across doesn’t work for every type of cross-process or multi-event
analysis. For example, Figure 8-19 shows the sad state of a BI user who is trying to
compare orders and shipments. He is trying to ask questions such as “What was
the average delay on shipping an order item over the last six months?” and “How
many unshipped items are there YTD this year vs. last year?” but his queries never
seem to finish, or perhaps even start. The problem is these questions require
individual line items from each transaction fact table to be compared before they
are aggregated. This can result in multi-pass SQL that returns millions of rows that
the BI tools must attempt to merge. Even when smart BI tools can construct the
correct in-database joins, performance can still be poor.
Drill-across queries
become inefficient
when cross-process
analysis involves
atomic-level
comparisons
Figure 8-19
Unhappy BI user:
difficult drill-across
analysis