My first Publication Agile-Data-Warehouse-Design-eBook

Design Patterns for High Performance Fact Tables and Flexible Measures 229 Financial transaction fact tables often have an additional “book date” or applica- ble financial period dimension to handle late-arriving transactions and adjust- ments. The generic version of this is an audit date dimension, which can be added to any fact table to record when facts are inserted. Transaction fact tables are insert only because all the information about their transactional facts is known at the time they are loaded into the data warehouse, and does not change—unless errors occur. Even then, if the errors are operational rather than ETL, they are often handled as additional adjustment transactions that must be inserted. This helps to keep ETL processing as simple and efficient as possible—an important consideration when loading hundreds of millions of rows per day. Although transaction fact tables can be extremely deep, they are generally narrow—containing only the small numbers of facts captured by operational systems on any one transaction. Transaction fact tables are insert only – which speeds up their ETL processing Consequences Transaction fact tables are the bedrock of dimensional data warehouses. Because they do not summarize operational detail, they provide access to all the dimensions and facts of a business process. In theory, this means they can be used to calculate any business measure. However, in practice—due to their size and the complexity of many business measures—they can’t be used directly to answer every question. For example, transaction fact tables are impractical for repetitively calculating running totals over long periods of time. For efficiency, cumulative facts, such as balances, are best modeled as recurring events and implemented as periodic snapshots. Transaction fact tables often need to be supplemented with snapshots for BI usability and query performance Periodic Snapshot Periodic snapshots (PS) are used to store recurring measurements that are taken at regular intervals. Recurring measurements can be atomic-level facts that are only available on a periodic basis (such as the minute by minute audience figures for a TV channel), or they can be derived from more granular transactional facts. Periodic snapshots Most data warehouses use daily or monthly snapshots to store balances and other measures that would be impractical to calculate at query time from the raw trans- actions. For example, compare the cost of calculating product revenue and product stock level for April 1 st 2011 using atomic-level sales and inventory transactions. Product revenue is calculated by summarizing that one day’s worth of sales trans- actions, whereas the product stock level calculation requires every inventory transaction prior to April 1 st 2011 to be consolidated. To efficiently answer stock questions you need a periodic snapshot, such as STOCK FACT shown in Figure 8- 2. This is a daily snapshot of in-store product inventory that records the net daily effect of inventory transactions, rather than the transactions themselves. Periodic snapshots store regularly recurring facts can contain atomic- level facts but are typically used to hold measures derived from more granular transactions

My first Publication Agile-Data-Warehouse-Design-eBook | Page 249