My first Publication Agile-Data-Warehouse-Design-eBook | Page 181
160
Chapter 5
The Data Warehouse Matrix
A data warehouse
matrix documents
the actual
relationships
between fact and
dimension tables
The event matrix is
for planning. The
physical matrix
documents the
current live model
Document
conformance failure
on the matrix by
using dimension
version numbers
A data warehouse matrix is a version of the dimensional matrix that documents the
relationships between physical fact tables (or OLAP cubes) and physical dimension
tables. You can create an initial physical matrix by copying your event matrix and
editing its row and columns to show the actual physical tables. When you do, you
should also add additional technical details that will be useful to the DW/BI team.
Figure 5-17 shows an example matrix with additional columns for data sources,
fact table type, primary time dimension granularity and fact volumetrics.
This physical matrix and the event matrix should be kept in sync as much as
possible but will diverge at times because of their distinct functions. The event
matrix is a modeling and planning tool that reflects the stakeholders’ requirements,
whereas the data warehouse matrix is a management tool that reflects the current
state of the data warehouse—including any conformance failures.
If you have to compromise within a sprint and postpone conforming a dimension,
or you inherit a warehouse that has evolved without conformed dimensions, you
should record these conformance failures on the data warehouse matrix by using
dimension version numbers. Rather than create a separate column for each non-
conformed version of a dimension, continue to use a single column for each
planned conformed dimension but number each different version in use, rather
than just tick usage. Reserve the highest number for the best version of each di-
mension (usually the most recently developed). For example, Figure 5-17 shows
that Pomegranate has failed to conform product across manufacturing, sales and
customer support, instead there are three different versions of PRODUCT (per-
haps it really should be called DIM_PRODUCT). Thankfully, PRODUCT is
partially conformed; the best version is already the most widely used and only two
stars (Customer Orders and Customer Complaints) need to be refactored.
If you need to document conformance failure, hyperlink each dimension version
number to its non-conformed dimension table definition.
Update the event
matrix with any
conformance issues
and address them
with stakeholders
After each sprint, the event matrix should be updated with conformance failures
(planned conformance that did not happen) and non-conformed realities (planned
conformance that could not happen because it was wrong) so that these issues can
be addressed with the stakeholders during the next modelstorm. Use the updated
event matrix to plan the refactoring of older stars with newer more conformed
dimensions as part of your iterative development approach.
A live version of the matrix, showing up-to-date volumetrics and the current ETL
status for each star, is the ideal dashboard for a DW/BI team. You could develop
one by using BI tools to summarize ETL and DBMS catalogue metadata.