My first Publication Agile-Data-Warehouse-Design-eBook | Page 228

208 Chapter 7 When and Where Populating the Calendar CALENDAR dimensions often need to cover wider date ranges than you think: to cope with birth dates and future maturity dates Because the CALENDAR dimension is relatively small and static, it is often pre- populated with all the dates needed for the foreseeable future. For example, loading the calendar with 20 years of data—enough to cover 10 years of history and 10 years into the future—would create a modest dimension of only 7,308 records. Having said that, calendars often need to cover a wider date range than first antici- pated. For example, a financial services data warehouse might only hold 10 years of transactions, but may need a CALENDAR dimension that can cope with customer dates of birth up to 120 years ago and policy maturity dates 50 years into the future. For many of these future dates holiday information will not be available and will need to be left as null. Spreadsheets, database functions, stored procedures, and ETL tools are all appro- priate for populating the calendar—any of these can quickly generate the standard calendar attributes from any origin date. Search online for “date dimension genera- tor” to find SQL code and spreadsheets that you can reuse. Table 7-1 includes additional, less automated, sources for enriching the calendar. Table 7-1 A TTRIBUTE E XAMPLE S OURCE Calendar attribute Standard calendar Day, Month, Quarter Spreadsheets, SQL functions/stored procedures, ETL tools, online date dimension generators Fiscal calendar Fiscal Period, Fiscal Year Finance department Holiday schedule Holiday Flag HR, manufacturing, national calendar Seasonal information Sales Season Sales, marketing, national calendar sources BI Tools and Calendar Dimensions Design the CALENDAR dimension to take advantage of your BI tool features You should design your CALENDAR dimension with the features and limitations of your BI toolset in mind. For example, some BI tools require specific date col- umns to help calculate time series measures and make efficient time comparisons. Many BI tools have the ability to define a default display format for each column; for example, DISCOUNT can be defined to always display as a two-digit percent- age. You can use this feature, if available, to create a “correctly sorted month” report item by defining MONTH as a date column that stores the first (or last, just be consistent) day of each month, with a BI display format of “Mmm YYYY”. Even though MONTH is represented as a date, it will group the facts correctly because it contains only 12 distinct values for each year and it will display the month name correctly thanks to the default display format, and most importantly: it sorts correctly in calendar order because it is a date. This saves BI users from having to pick two columns: MONTH NAME to display and MONTH NUMBER to sort by. Holding the month as a date also enables automatic national language translation of month names, if your BI toolset supports localization.