My first Publication Agile-Data-Warehouse-Design-eBook | Page 253

Design Patterns for High Performance Fact Tables and Flexible Measures 233 Fact Table Granularity A fact table’s granularity is its level of detail: the meaning of each fact row in the table. Granularity can be stated in business terms and/or dimensionally. For exam- ple, the business definition of granularity for an order fact table is “one record per order line item”, while the dimensional granularity is “orders by date, time, cus- tomer, and product”. Transaction fact table and accumulating snapshot granularity tends to be defined in business terms while periodic snapshot granularity is defined dimensionally. Whichever approach you choose (often both, for the benefit of stakeholders and the DW team), stating and clearly documenting the granularity, is an essential step in fact table design. Fact tables that have fuzzy or mixed granu- larity definitions are impossible to build and use correctly. Granularity Granularity is documented in the model by recording the combination of granular- ity dimensions (GD) that uniquely identify each fact. For most transaction fact tables and accumulating snapshots the list of GD columns will include a degener- ate transaction ID dimension; for example, a call detail fact table with a business granularity of “one row per phone call”, can use a degenerate CALL REFERENCE NUMBER [GD] to uniquely identify each row. This succinct granularity definition is very useful for ETL processing, but for BI purposes it can be helpful if the granu- larity can also be defined using dimensions that are more likely to be queried— such as customer and call timestamp (assuming a customer can only make one call at a time). These alternative granularity definitions can be documented using numbered GD codes. For example, CALL REFERENCE NUMBER [GD1] and CUSTOMER KEY [GD2], CALL DATE KEY [GD2], CALL TIME KEY [GD2]. Granularity can be describes a fact table’s level of detail: the meaning of each fact row. It must be clearly documented stated in business terms or dimensionally by listing GD columns For accumulating snapshots and period-to-date snapshots that must be updated, GD columns, especially degenerate IDs, are used to define unique update indexes for fast ETL processing. For advice on fact table indexes see Indexing later in this chapter. Modeling Evolving Events Evolving events represent business processes that are complex enough or take enough time to complete that they are described as sequences of smaller milestone events. You can think of them as multi-verb events because each milestone can represent a discrete event (verb). These multiple verbs can be modeled as a single evolving event in two ways: Related events that represent a process sequence can be modeled as multi- verb evolving events, initially or Initially as evolving: An evolving event can emerge directly in response to a modelstorming “Who does what?” question when stakeholders think of an event as only the beginning of a time-consuming process that needs to be retrospectively