My first Publication Agile-Data-Warehouse-Design-eBook | Page 240
220
Chapter 7
When and Where
Date Version Keys
Problem/Requirement
Multiple DATE
KEYs for a date
must still sort in date
order for efficient
partitioning and join
processing
Maintain DATE KEY
sort order by
appending a fixed
length version
number
When creating surrogate key values for multiple versions of a day it is important to
preserve their date order sequence so that, for example, all versions of March 17,
2011 are sorted together, ahead of all versions of March 18, 2011 (as shown in
Figure 7-9). This is vital for efficient fact table partitioning and date range join
processing that uses SQL BETWEEN logic.
Solution
You can maintain surrogate key date order sequence by appending a version
number to the end of the standard sequential date key—effectively scaling it by the
number of version digits. Figure 7-9 shows an epoch date key (generated using a
reference date of January 1, 1900) with a two-digit version number appended.
Two-digits allow the calendar to support up to 100 versions (0-99) of each date.
The same technique can also be applied to ISO format date keys, in which case
YYYYMMDD would become YYYYMMDDVV, where VV is the version number.
Building version numbers into your date keys is a good idea even if your data
warehouse or data mart will never go international. You never know when an extra
version of a date will come in handy.
You can create a
date version for
every country or just
one for each
variation on a date
The number of date versions needed depends on your multinational business
requirements. You can create a date version for every country (200+). This might
be appropriate if there are many geopolitical attributes and the combination of
possible values is greater than the number of countries. Alternatively, if the only
attribute that varies by location is HOLIDAY (Y or N), then you need only two
versions of a day: one for HOLIDAY = ‘Y’ and one for ‘N’. Only one version would
be needed for any date that is globally a holiday or non-holiday. A financial organi-
zation might use a calendar with six versions of each day, one for each of its global
markets.
Needing a date version for each country is unlikely, because many will share
common geopolitical attribute values. Create a single “00” standard version for
each day, and then add versions as needed when you encounter regional or
international variations.
Consequences
Because CALENDAR is the most commonly occurring role-playing dimension, it
is important to keep DATE KEYs small when modeling for multinational versions.
If you really need more than ten versions of a date and you have chosen
YYYYMMDD format date keys, adding a two digit version number will require an
8-byte integer. If you can live with ten versions or less—or use an epoch-based date
key—a 4-byte integer will suffice. Smaller date keys are always a good thing—
especially for larger fact tables!