My first Publication Agile-Data-Warehouse-Design-eBook | Page 30

How to Model a Data Warehouse 9 relationship between the dimensions. A subset of the dimensional foreign keys form a composite primary key for the fact table and defines its granularity, or level of detail. The term dimension in this book refers to a dimension table whereas dimensional attribute refers to a column in a dimension table. Dimensions contain sets of descriptive (dimensional) attributes that are used to filter data and group facts for aggregation. Their role is to provide good report row headers and title/heading/footnote filter descriptions. Dimensional attributes often have a hierarchical relationship that allows BI tools to provide drill-down analysis. For example, drilling down from Quarter to Month, Country to Store, and Cate- gory to Product. Dimensional Not all dimensional attributes are text. Dimensions can contain numbers and dates too, but these are generally used like the textual attributes to filter and group the facts rather than to calculate aggregate measures. Despite their width, dimensions are tiny relative to fact tables. Most dimensions contain considerably less than a million rows. Dimensions are hierarchies support drill-down analysis small, fact tables are large The most useful facts are additive measures that can be aggregated using any combination of the available dimensions. The most useful dimensions provide rich sets of descriptive attributes that are familiar to BI users. Advantages of Dimensional Modeling for Data Warehousing The most obvious advantage of a dimensional model, noticeable in Figure 1-3, is its simplicity. The small number of tables and joins, coupled with the explicit facts in the center of the diagram, makes it easy to think about how sales can be measured and easy to construct the necessary queries. For example, if BI users want to explore product sales by store, only one short join path exists between PRODUCT and STORE: through the SALES FACT table. Limiting the number of tables in- volved and the length of the join paths in this way maximizes query performance by leveraging DBMS features such as star-join optimization (which processes multiple joins to a fact table in a single pass). Dimensional models A deeper, less immediately obvious benefit of dimensional models is that they are process-oriented. They are not just the result of some aggressive physical data model optimization (that has denormalized a logical 3NF ER model into a smaller number of tables) to overcome the limitations of databases to cope with join intensive BI queries. Instead, the best dimensional models are the result of asking questions to discover which business processes need to be measured, how they should be described in business terms and how they should be measured. The resulting dimensions and fact tables are not arbitrary collections of denormalized data but the 7Ws that describe the full details of each individual business event worth measuring. Dimensional models maximize query performance and usability are process- oriented. They represent business processes described using the 7Ws framework