My first Publication Agile-Data-Warehouse-Design-eBook | Page 133

112 Chapter 4

Generalization produces data models that are difficult for BI users to understand and query

Agile data warehouse modelers must use generalization carefully. Data models that value flexibility over simplicity are notoriously difficult to understand and use for BI. They can work for transactional software products because their data structures are completely hidden from the users by application interfaces. But “universal data models” that rely on high levels of generalization or abstraction do not work so well for BI users who, despite the semantic layers provided by BI tools, need far simpler data warehouse designs to be able to construct and run ad-hoc queries efficiently.

Modelstorming data requirements specifically rather than generally promotes stakeholder design ownership

One of the great benefits of modelstorming is that stakeholders feel a sense of ownership in the resulting design. If they have abstractions forced upon them they start to lose that feeling: it’s no longer their model, their data; it could be anyone’s. The only Party Roles most stakeholders recognize are Host, Guest, or Gatecrasher, or maybe political ones if that’s their specialist field. In extreme cases where generalization is taken too far, to the point where the data model can be used to represent almost anything, it will actually mean nothing to stakeholders. This defeats the goal of modelstorming, which is not to design data structures that merely store data but to design ones that stakeholders will use and cherish. Modeling each interesting who, what, when, where, why and how as specifically as possible helps to promote the data model understanding needed to construct meaningful queries and interpret their results.
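
The point about generalized models being harder to query can be made concrete with a small sketch. It is illustrative only: every table, column, and value below (party, event_party, sales_fact, "Acme", and so on) is a hypothetical assumption rather than a design from this book. It loads the same three sales into a generalized "party role" model and into a specifically named model, then asks the same question of each: revenue by customer.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database holding both hypothetical designs."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Generalized design (assumed): any party can play any role.
        CREATE TABLE party (party_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE event (event_id INTEGER PRIMARY KEY, amount REAL);
        CREATE TABLE event_party (event_id INTEGER, party_id INTEGER, role TEXT);

        -- Specific design (assumed): tables named the way stakeholders talk.
        CREATE TABLE customer (customer_key INTEGER PRIMARY KEY, customer_name TEXT);
        CREATE TABLE sales_fact (customer_key INTEGER, revenue REAL);

        INSERT INTO party VALUES (1, 'Acme'), (2, 'Globex'), (3, 'Pat');
        INSERT INTO event VALUES (10, 100), (11, 250), (12, 50);
        INSERT INTO event_party VALUES
            (10, 1, 'CUSTOMER'), (10, 3, 'SALESPERSON'),
            (11, 2, 'CUSTOMER'), (11, 3, 'SALESPERSON'),
            (12, 1, 'CUSTOMER'), (12, 3, 'SALESPERSON');

        INSERT INTO customer VALUES (1, 'Acme'), (2, 'Globex');
        INSERT INTO sales_fact VALUES (1, 100), (2, 250), (1, 50);
    """)
    return conn


def revenue_by_customer_generalized(conn: sqlite3.Connection) -> list:
    # The user must know that customers are parties filtered by a role code.
    return conn.execute("""
        SELECT p.name, SUM(e.amount)
        FROM event e
        JOIN event_party ep ON ep.event_id = e.event_id AND ep.role = 'CUSTOMER'
        JOIN party p ON p.party_id = ep.party_id
        GROUP BY p.name
        ORDER BY p.name
    """).fetchall()


def revenue_by_customer_specific(conn: sqlite3.Connection) -> list:
    # The table and column names already say what the data means.
    return conn.execute("""
        SELECT c.customer_name, SUM(f.revenue)
        FROM sales_fact f
        JOIN customer c ON c.customer_key = f.customer_key
        GROUP BY c.customer_name
        ORDER BY c.customer_name
    """).fetchall()
```

Both queries return the same totals, but the generalized version needs an extra join and a role filter that only someone familiar with the abstraction would know to add. Forgetting that filter would silently count each sale once per participating party.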

Postpone ‘technical benefit only’ generalization until star schema design

Stakeholders are happy with “reasonable” levels of generalization if they can see an obvious business benefit such as a better understanding of the commonalities (conformance) between business processes that improves analysis. But if the benefits are purely technical (to cut down database administration or streamline ETL) then you should postpone generalization until you design your star schemas and ETL processes.

Discovering Process Sequences

Conformed why and how dimensions often indicate a process sequence

The last two Ws, why and how, are grouped together on the matrix because of their similarities and close relationships within processes. Whys and hows are the most common types of non-conformed dimension but when they are conformed they can often change type, from how to why and vice versa. This happens when events have a cause and effect relationship that often represents a process sequence. You discover just such a sequence if you ask: Why does a warehouse worker ship a product? and get the answer: Because a customer ordered the product.
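
The idea that a conformed detail can change type between events can be sketched as a simple check over a matrix. This is a minimal illustration under stated assumptions, not a method prescribed by the text: the event names, the detail names (ORDER ID, PROMOTION, SHIPMENT NUMBER), and the rule that a how of the earlier event reappears as a why of the later event are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical matrix rows: each event lists its why and how details.
EVENTS: Dict[str, Dict[str, Set[str]]] = {
    "CUSTOMER ORDERS": {"why": {"PROMOTION"}, "how": {"ORDER ID"}},
    "PRODUCT SHIPMENTS": {"why": {"ORDER ID"}, "how": {"SHIPMENT NUMBER"}},
}


def candidate_sequences(
    events: Dict[str, Dict[str, Set[str]]]
) -> List[Tuple[str, str, str]]:
    """Return (earlier_event, later_event, shared_detail) triples.

    Assumption: a detail that is a how of one event and a why of
    another hints that the first event causes the second.
    """
    links = []
    for cause, cause_details in events.items():
        for effect, effect_details in events.items():
            if cause == effect:
                continue
            shared = cause_details["how"] & effect_details["why"]
            for detail in sorted(shared):
                links.append((cause, effect, detail))
    return links
```

Calling candidate_sequences(EVENTS) on the sample rows yields a single link from CUSTOMER ORDERS to PRODUCT SHIPMENTS through ORDER ID, which mirrors the question and answer above: the worker ships the product because the customer ordered it. Any link found this way is only a candidate to confirm with stakeholders.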