My first Publication Agile-Data-Warehouse-Design-eBook | Page 170
Modeling Star Schemas
more sophisticated they may change their minds. With modern DW/BI hardware
you have the luxury of storing and processing dimensional history for most dimen-
sions as standard practice. And just because you track history you don’t have to
give it to BI users who don’t want it (yet). Chapter 6 covers the hybrid SCD pattern
for providing both current value (“as is”) reporting and historic value (“as was”)
reporting without further complicating the model or ETL processes.
149
Store dimensional
history if possible.
Enable CV reporting
by providing a
hybrid SCD
The Data Warehouse ETL Toolkit, Ralph Kimball, Joe Caserta (Wiley, 2004),
Chapter 5, pages 183–196 provides information on the ETL processing needed to
support slowly changing dimensions. Pages 194–196 describe the complexities of
handling late-arriving dimensional history.
Updating the Dimension Definitions
You complete each dimension by adding a surrogate key primary key, and addi-
tional ETL administrative attributes to support SCD processing and audit require-
ments. Because you are only adding columns there is no need to create separate
spreadsheet versions of dimensions. The extra columns can easily be hidden when
you use the tables again for modelstorming with stakeholders.
You complete a
dimension by
adding a primary
key and audit
columns
Adding Surrogate Keys
Add a leading surrogate key column marked SK to each dimension, using a nam-
ing convention of [Dimension] KEY; e.g., PRODUCT KEY. This usually works
well as the suffix “KEY” is seldom used for business keys (ID, CODE, and NUM
are far more common). Fill in the example missing row in each dimension with a
zero surrogate and use simple sequential integers for other examples to make it
obvious that they are surrogate keys. Figure 5-10 shows a surrogate key added to
the PRODUCT dimension.
Add a surrogate key
(SK) with examples
including zero for
the missing row
Figure 5-10
Updated PRODUCT
dimension