My first Publication Agile-Data-Warehouse-Design-eBook | Page 218
198
Chapter 6
Temporary balancing
can be achieved by
filling in a missing
Who and What
Top down balancing: A value is copied down into the missing level from the level
directly above it. For example, the POMServer CATEGORY value “Computing” is
copied into SUBCATEGORY.
level with values
from the level
directly above or
below
Bottom up balancing: A value is copied up into the missing level from the level
directly below it. For example the POMServer PRODUCT TYPE value “Server” is
copied into SUBCATEGORY.
Top down and bottom up balancing: Gaps are filled with new unique values
created by concatenating the values directly above and below the missing level. For
example, “Computing/Server” might be used to fill the SUBCATEGORY gap for
“POMServer”.
When a ragged hierarchy is discovered during data profiling, if only a very small
percentage (1-2%) of it is ragged (skips a level) this usually indicates errors in the
data rather than intentional design. The errors should be corrected and a simple
balanced hierarchy defined.
Multi-Level Dimension Pattern
Problem/Requirement
A business event
needs to be
described using
various levels of a
dimensional
hierarchy
Having different business processes associated with different levels of the Product
hierarchy is common. For example, sales plans are set on a monthly basis at the
brand level, whereas sales transactions are recorded daily for individual products.
These different business processes are handled by separate fact tables, and the
different levels of product detail should be handled by separate dimensions too: a
full PRODUCT dimension for sales facts and a BRAND [RU] rollup dimension for
sales plans. Attaching the appropriate dimension to each fact table clearly docu-
ments its fixed product granularity. However, there are circumstances where a
single business process can be associated with different levels of the Product hier-
archy. For example, a web page event on a Pomegranate website can describe a
visitor viewing a single product, multiple products of the same product type, a
product category description or no product information at all.
Solution
It is possible to attach a product description to the majority of page visits recorded
on the Pomegranate’s websites, especially the online store pages. But not every
page refers to products; some pages describe multiple products: whole product
categories, subcategories, or specific brands. You can easily handle non-product
pages by using the “special” zero surrogate key that represents “missing product”,
as discussed in Chapter 5. In a similar way, you can use other “special” surrogate
key values to help you describe the page visits that relate to the higher levels in the
product hierarchy by designing a multi-level dimension.