My first Publication Agile-Data-Warehouse-Design-eBook | Page 30
How to Model a Data Warehouse
9
relationship between the dimensions. A subset of the dimensional foreign keys
form a composite primary key for the fact table and defines its granularity, or level
of detail.
The term dimension in this book refers to a dimension table whereas dimensional
attribute refers to a column in a dimension table.
Dimensions contain sets of descriptive (dimensional) attributes that are used to
filter data and group facts for aggregation. Their role is to provide good report row
headers and title/heading/footnote filter descriptions. Dimensional attributes often
have a hierarchical relationship that allows BI tools to provide drill-down analysis.
For example, drilling down from Quarter to Month, Country to Store, and Cate-
gory to Product. Dimensional
Not all dimensional attributes are text. Dimensions can contain numbers and dates
too, but these are generally used like the textual attributes to filter and group the
facts rather than to calculate aggregate measures. Despite their width, dimensions
are tiny relative to fact tables. Most dimensions contain considerably less than a
million rows. Dimensions are
hierarchies support
drill-down analysis
small,
fact tables
are large
The most useful facts are additive measures that can be aggregated using any
combination of the available dimensions. The most useful dimensions provide
rich sets of descriptive attributes that are familiar to BI users.
Advantages of Dimensional Modeling for Data Warehousing
The most obvious advantage of a dimensional model, noticeable in Figure 1-3, is its
simplicity. The small number of tables and joins, coupled with the explicit facts in
the center of the diagram, makes it easy to think about how sales can be measured
and easy to construct the necessary queries. For example, if BI users want to
explore product sales by store, only one short join path exists between PRODUCT
and STORE: through the SALES FACT table. Limiting the number of tables in-
volved and the length of the join paths in this way maximizes query performance
by leveraging DBMS features such as star-join optimization (which processes
multiple joins to a fact table in a single pass). Dimensional models
A deeper, less immediately obvious benefit of dimensional models is that they are
process-oriented. They are not just the result of some aggressive physical data
model optimization (that has denormalized a logical 3NF ER model into a smaller
number of tables) to overcome the limitations of databases to cope with join
intensive BI queries. Instead, the best dimensional models are the result of asking
questions to discover which business processes need to be measured, how they
should be described in business terms and how they should be measured. The
resulting dimensions and fact tables are not arbitrary collections of denormalized
data but the 7Ws that describe the full details of each individual business event
worth measuring. Dimensional models
maximize query
performance and
usability
are process-
oriented. They
represent business
processes
described using the
7Ws framework