My first Publication Agile-Data-Warehouse-Design-eBook | Page 228
208
Chapter 7
When and Where
Populating the Calendar
CALENDAR
dimensions often
need to cover wider
date ranges than
you think: to cope
with birth dates and
future maturity dates
Because the CALENDAR dimension is relatively small and static, it is often pre-
populated with all the dates needed for the foreseeable future. For example, loading
the calendar with 20 years of data—enough to cover 10 years of history and 10
years into the future—would create a modest dimension of only 7,308 records.
Having said that, calendars often need to cover a wider date range than first antici-
pated. For example, a financial services data warehouse might only hold 10 years of
transactions, but may need a CALENDAR dimension that can cope with customer
dates of birth up to 120 years ago and policy maturity dates 50 years into the
future. For many of these future dates holiday information will not be available and
will need to be left as null.
Spreadsheets, database functions, stored procedures, and ETL tools are all appro-
priate for populating the calendar—any of these can quickly generate the standard
calendar attributes from any origin date. Search online for “date dimension genera-
tor” to find SQL code and spreadsheets that you can reuse. Table 7-1 includes
additional, less automated, sources for enriching the calendar.
Table 7-1 A TTRIBUTE E XAMPLE S OURCE
Calendar attribute Standard
calendar Day, Month,
Quarter Spreadsheets, SQL functions/stored procedures,
ETL tools, online date dimension generators
Fiscal
calendar Fiscal Period,
Fiscal Year Finance department
Holiday
schedule Holiday Flag HR, manufacturing, national calendar
Seasonal
information Sales Season Sales, marketing, national calendar
sources
BI Tools and Calendar Dimensions
Design the
CALENDAR
dimension to
take advantage of
your BI tool features
You should design your CALENDAR dimension with the features and limitations
of your BI toolset in mind. For example, some BI tools require specific date col-
umns to help calculate time series measures and make efficient time comparisons.
Many BI tools have the ability to define a default display format for each column;
for example, DISCOUNT can be defined to always display as a two-digit percent-
age. You can use this feature, if available, to create a “correctly sorted month”
report item by defining MONTH as a date column that stores the first (or last, just
be consistent) day of each month, with a BI display format of “Mmm YYYY”. Even
though MONTH is represented as a date, it will group the facts correctly because it
contains only 12 distinct values for each year and it will display the month name
correctly thanks to the default display format, and most importantly: it sorts
correctly in calendar order because it is a date. This saves BI users from having to
pick two columns: MONTH NAME to display and MONTH NUMBER to sort by.
Holding the month as a date also enables automatic national language translation
of month names, if your BI toolset supports localization.