My first Publication Agile-Data-Warehouse-Design-eBook | Page 183

162 Chapter 5 Summary Agile data profiling targets the data sources implicated by the BEAM ✲ model. It is done early as a data-driven modeling activity to validate the stakeholders data requirements before detailed star schemas are designed. When data sources don’t yet exist, proactive DW/BI designs based on the BEAM ✲ model can help define better BI data feeds from new operational systems. Annotated models present data profiling results in a format stakeholders are familiar with. An annotated table contains source names, data types and summary data profiling metrics. Data source issues such as missing data and mismatched definitions are highlighted using strikethrough. Additional data is highlighted using italics. The DW/BI team uses the annotated model and detailed data profiling results to provide initial task estimates for building and loading the proposed facts and dimensions. These ETL estimates are added to the event matrix for use during model review and sprint planning. During a model review, stakeholders use the annotated model and the DW/BI team estimates to agree amendments to the design and reprioritize their requirements in light of the data realities and available development resources. BEAM ✲ models are easily translated into logical dimensional models and star schemas. Dimension tables are updated by adding primary keys and administrative attributes. Event tables are converted into fact tables by replacing dimensional details with foreign keys and changing quantities (how many details) into fully-additive (FA), semi-additive (SA), or non- additive (NA) facts with standardized (conformed) units of measure. (Data warehouse) surrogate keys are used as dimension primary keys to insulate the data warehouse from business keys, provide dimensional flexibility (manage SCDs, missing values, multi-levels, etc.) and improve query efficiency. Enhanced star schemas convey additional dimensional information. Consistent dimensional layout documents dimensions by W-type and increases multi-star schema model readability. BEAM ✲ short codes document table and column level dimensional properties not handled by standard ER notation. In addition to the standard documentation provided by modeling tools, a data warehouse matrix provides an overview of all the star schemas and OLAP cubes in the data warehouse. Similar in layout to an event modeling and planning matrix, this physical matrix provides additional information about the actual warehouse for a technical audience. It is a vital tool for managing the warehouse and must be kept up to date along with the event matrix and star schema diagrams as the data warehouse design evolves.