My first Publication Agile-Data-Warehouse-Design-eBook | Page 252
232
Chapter 8
Accumulating
snapshots can
usefully contain
duration and state
count facts that
match their
milestone time
dimensions
How Many
LENDING FACT also contains a duration (OVERDUE DAYS) and a state count
(OVERDUE COUNT). Durations are typical accumulating snapshot facts. If there
are a small number of interesting durations, they can be stored as explicit facts. If
there are many possible durations because there are a number of milestone dates,
the fact table should physically store the milestones as timestamp facts and BI
applications should access it through a view that calculates the durations. State
counts are another characteristic of an accumulating snapshot fact. They typically
match the milestones dates and simply record a 1 if a milestone has been reached
or 0 if it has not. They allow queries to quickly sum the number of events at each
milestone in a single pass without decoding dates or applying complex filters.
LENDING FACT could be extended with additional state counts for returned, lost
and on loan books.
Consequences
Accumulating
snapshots are
difficult to build,
especially when
they merge events
from multiple source
systems
Develop
accumulating
snapshots
incrementally by
modeling evolving
events and
delivering milestone
star schemas
Accumulating snapshots that support end-to-end business process measurement
are some of the most valuable fact tables, and are very popular with stakeholders,
but they can be extremely difficult to build. Many ETL nightmares are caused by
trying to merge multiple operational sources and transaction types in one pass into
the perfect accumulating snapshot. The code involved is complex and difficult to
quality assure, often resulting in delays. And when the snapshot is finally delivered,
while it may answer the initial questions perfectly, all too soon stakeholders can hit
a BI brick wall when they need to drill into missing details. This happens because
accumulating snapshots typically summarize a process from the perspective of the
initial event and only record the current status of the overall event. For example, an
order processing snapshot that summarizes deliveries for each order line would
help to spot problems with fulfillment performance, but would lack the delivery
details needed to explain why the problems are occurring.
The agile approach to successfully delivering an accumulating snapshot is to build
it incrementally. Using BEAM ✲ , snapshot requirements are captured by modeling
an evolving event (described shortly) that is implemented over a number of short
development sprints by remodeling its milestones as simpler discreet events. The
resulting transactional star schemas are far easier to build and test individually and
can provide early BI value ahead of the accumulating snapshot, which is incremen-
tally created by the relatively straightforward merging of facts that already use
conformed dimensions. The added bonus of this approach is reduced technical
debt: the atomic-level transaction stars contain all the details stakeholders will need
for drill down analysis in the future.
The Data Warehouse Toolkit, Second Edition, Ralph Kimball, Margy Ross (Wiley,
2002) contains four interesting accumulating snapshot case studies:
Chapter 5, “Order Processing”
Chapter 12, “Education” (college admissions)
Chapter 13, “Healthcare” (billing tracking)
Chapter 15, “Insurance” (claims processing)