My first Publication Agile-Data-Warehouse-Design-eBook | Page 224

204 Chapter 7 When and Where Time Dimensions When details are modeled as physical time dimensions Physical date dimensions help to simplify the most common grouping and filtering requirements of BI queries Every event contains at least one when detail, which should always be modeled as a time dimension, rather than left as a timestamp in the fact table. But why do you need a time dimension when you have datetime data types, date functions and date arithmetic built into data warehouse databases and BI tools? Descriptive time attributes, such as day of week, month, quarter and year, are constantly used to group and filter the information on BI reports. Deriving them from raw timestamps in every query is woefully inefficient and puts an unnecessary burden on BI users and BI tools, that cause mistakes and inconsistencies. Why decode the month or day of week every time they are needed, when they could be stored once in a dimension and reused consistently and efficiently, like any other dimensional attribute? Also, many commonly used time attributes—such as fiscal periods, holidays, and seasons of the year—simply cannot be derived from time- stamps alone because they are organization or location specific. You should build a physical time dimension to: Avoid duplication of calendar logic in each report or BI application Remove date arithmetic from constraints to increase index use Insulate queries from DBMS-specific date functions Support organization-specific fiscal periods Define conformed time hierarchies Provide consistent time-related business labels and definitions Date and time of day are modeled as separate dimensions to match their dimensional usage, manage their size and make the time granularity of facts explicit Calendar and clock are role-playing dimensions Time is actually best modeled dimensionally by splitting it into date and time of day. This may seem odd at first, but it does reflect how time is queried. Almost every query will group or filter on sets of days (Years, Quarters, Months or Weeks). Many queries will do the same with periods within a day (AM/PM, work shift, peak periods). But very few queries will use arbitrary periods that span dates and times (e.g., “sales totals by periods of 2 days and 8 hours”). Financial queries are grouped by the date-related fiscal calendar, ignoring time of day all together, while opera- tional and behavior queries can group months of data together by time of day to see peak and average activity levels. In recognition of these query schisms, when details (logical time dimensions) should be implemented as two distinct and manageable physical dimensions: a calendar date dimension, and a clock time of day dimension, each with its own surrogate key. Separating date and time, like this, also makes the time granularity of facts explicit. If time of day is not significant (or not recorded) for a business event, its fact table design simply omits the clock dimension, and includes only the calendar dimension. Figure 7-1 shows typical examples of Calendar and Clock dimensions related to an order fact table. Each of these dimensions play two roles: representing distinct Order and Delivery dates and times. Although only one physical instance of each