My first Publication Agile-Data-Warehouse-Design-eBook | Page 249
Design Patterns for High Performance Fact Tables and Flexible Measures
229
Financial transaction fact tables often have an additional “book date” or applica-
ble financial period dimension to handle late-arriving transactions and adjust-
ments. The generic version of this is an audit date dimension, which can be added
to any fact table to record when facts are inserted.
Transaction fact tables are insert only because all the information about their
transactional facts is known at the time they are loaded into the data warehouse,
and does not change—unless errors occur. Even then, if the errors are operational
rather than ETL, they are often handled as additional adjustment transactions that
must be inserted. This helps to keep ETL processing as simple and efficient as
possible—an important consideration when loading hundreds of millions of rows
per day. Although transaction fact tables can be extremely deep, they are generally
narrow—containing only the small numbers of facts captured by operational
systems on any one transaction.
Transaction fact
tables are insert
only – which speeds
up their ETL
processing
Consequences
Transaction fact tables are the bedrock of dimensional data warehouses. Because
they do not summarize operational detail, they provide access to all the dimensions
and facts of a business process. In theory, this means they can be used to calculate
any business measure. However, in practice—due to their size and the complexity
of many business measures—they can’t be used directly to answer every question.
For example, transaction fact tables are impractical for repetitively calculating
running totals over long periods of time. For efficiency, cumulative facts, such as
balances, are best modeled as recurring events and implemented as periodic
snapshots.
Transaction fact
tables often need to
be supplemented
with snapshots for
BI usability and
query performance
Periodic Snapshot
Periodic snapshots (PS) are used to store recurring measurements that are taken at
regular intervals. Recurring measurements can be atomic-level facts that are only
available on a periodic basis (such as the minute by minute audience figures for a
TV channel), or they can be derived from more granular transactional facts. Periodic snapshots
Most data warehouses use daily or monthly snapshots to store balances and other
measures that would be impractical to calculate at query time from the raw trans-
actions. For example, compare the cost of calculating product revenue and product
stock level for April 1 st 2011 using atomic-level sales and inventory transactions.
Product revenue is calculated by summarizing that one day’s worth of sales trans-
actions, whereas the product stock level calculation requires every inventory
transaction prior to April 1 st 2011 to be consolidated. To efficiently answer stock
questions you need a periodic snapshot, such as STOCK FACT shown in Figure 8-
2. This is a daily snapshot of in-store product inventory that records the net daily
effect of inventory transactions, rather than the transactions themselves. Periodic snapshots
store regularly
recurring facts
can contain atomic-
level facts but are
typically used to
hold measures
derived from more
granular
transactions