My first Publication Agile-Data-Warehouse-Design-eBook | Page 272

252 Chapter 8 How Many Some DBMSs allow you to partition on more than one dimension. This can be useful when a particular dimension is frequently used to constrain queries or represents the way source data extracts are organized for ETL processing; for example, by organization, geography, or data provider. Aggregation Aggregates act as group by indexes for existing fact tables DBMS aggregate navigation automates aggregate usage Small high performance aggregates can be designed using the lost, shrunken and collapsed patterns An aggregate (AG) (fact table) is a stored summary of a base fact table. It acts like a group by index on the base facts—speeding up queries that do not need to return detailed figures. They are an essential complement to traditional where clause indexes. A star-join index optimizes highly constrained queries that need to sum- marize smaller quantities of data, whereas aggregates optimize broad, loosely constrained queries that need to summarize large quantities of data. Aggregates are derived fact tables that are very similar to periodic snapshots, dimensionally and in terms of granularity. They differ from periodic snapshots in that they do not provide any new facts. Instead, they simply contain summarized versions of the additive facts from base fact tables. Historically, data warehouse queries were written to use specific aggregates in the form of summary data marts. Today, many DBMSs provide aggregate navigation that automatically redirects queries to the best (smallest) aggregate. When this happens, the aggregates are invisible to the BI users and query tools. Aggregates must be designed so that they match the GROUP BY and WHERE clauses of the most popular queries, or they will be not be used. They also must be designed so that they are many times smaller than existing fact tables—to provide performance improvements that justify the cost of maintaining them. Twenty times smaller is a useful guideline—which can lead to a corresponding query performance boost. The three types of aggregate design in a dimensional data warehouse are lost, shrunken, and collapsed. Lost Dimension Aggregate Pattern Lost dimension aggregates are created by summarizing a fact table using a subset of its dimensions. Figure 8-14 shows a lost dimension aggregate formed by dropping the customer and store dimensions. Figure 8-14 Lost dimension aggregate