My first Publication Agile-Data-Warehouse-Design-eBook | Page 105

84 Chapter 3 Dimensional History You must define how dimensional attributes handle history Slowly changing dimensions dramatically affect how historical events are reported When you have discovered all the attributes of a dimension, described them using examples, and modeled their hierarchical relationships, there is one more piece of information that stakeholders must tell you about each one: how to handle its history. This information is also known as a dimension’s temporal properties or slowly changing dimension (SCD) policy. Stakeholders instinctively feel the need to preserve event history — especially legally binding financial transactions — but may think of dimensions as (relatively static) reference data that simply has to be kept up to date when it does change. While it is true that dimensions are relatively static compared to dynamic business events, slowly changing dimension history, or rather the lack of it, can have a profound effect on a data warehouse’s ability to accurately compare events over time and meet stakeholders’ initial and longer-term needs. For example, Dean Moriarty, a customer who was based in New York, relocates to California at the beginning of this year. Should a BI query for “order totals by state, this year vs. last year” associ- ate all of Moriarty’s orders to his current location: California? Or should Mori- arty’s orders last year be associated with New York (last year’s location), and only this year’s with California? What if BI users want to look at the biggest spenders in California over the last two years, should their queries include Moriarty based on his high spending while he was still in New York last year or exclude him because he hasn’t spent so much since moving to L.A.? Another way of asking these ques- tions is “Should queries use the current or historical values of customer state?” Is there a simple answer? Current Value Attributes Operational systems generally default to current valued reference data Current value (CV) attributes provide “as is” reporting that matches operational reporting results Operational systems generally concentrate on the “here and now” with an under- standable bias towards current values for reference or master data. For example, an order processing system will make sure that only “this year’s model” products are available to be selected for shipping to customers’ current locations. Because of this bias, operational database applications will often overwrite reference data when it changes. If the same updates are applied to dimensional attributes they will contain current values only. Dimensional attributes that only contain the current value descriptions provide “as is” reporting; i.e., they roll up event history using today’s descriptions, making it seem as if everything has always been described as it is now. For the previous example, current value (CV) attributes would roll up all of Moriarty’s orders to California (his current location) regardless of where, on the road, he was living when he placed them. This is typically the style of reporting that stakeholders are used to from their attempts to analyze history directly from operational systems.