My first Publication Agile-Data-Warehouse-Design-eBook | Page 217
Dimensional Design Patterns for People and Organizations, Products and Services
Ragged hierarchies, as described in Chapter 3, can look similar to variable-depth
hierarchies but the important distinction that makes them easier to deal with is
that they have a known maximum number of named levels. This means that they
can be implemented in a dimension by simply defining the missing or unused
levels as nullable. Figure 6-26 shows an example of a product dimension contain-
ing a ragged hierarchy (that matches the hierarchy chart of Figure 3-8). It contains
the product “POMServer”, which does not have a subcategory, perhaps because it
is the only product of its type. This simple “flattening” of the hierarchy, into a fixed
number of columns within the dimension, is in stark contrast to the complexity of
building a separate hierarchy map, but it can result in a “Swiss cheese” dimension,
full of “NULL holes” that show up as gaps on reports. Even if these holes are filled
in with the stakeholders’ preferred label, such as “Not Applicable”, they can cause
problems for drill-down analysis: all the “Not Applicable” values are grouped
together and cannot be further drilled on. Also, it does not inspire stakeholder
confidence in the data warehouse when BI applications use such a common level as
a product subcategory and return “Mobile,” “Desktop,” and “Not Applicable” .
197
Ragged hierarchies
have a fixed number
of uniquely nameable
levels. They can be
implemented in a
dimension by defining
non-mandatory
attributes for the
hierarchy levels that
have missing values
Figure 6-26
Balancing a
ragged
hierarchy
In most cases, the best approach is to balance a ragged product hierarchy by filling
in the missing values with the stakeholders during a modelstorming workshop as
part of conforming these dimensional attributes. Where filling in the gaps with the
stakeholders is not possible—stakeholders cannot agree on the appropriate new or
existing values or there are just too many missing values to tackle in the available
time—there are three methods for automatically generating usable interim values
for the missing levels (that will induce the stakeholders to create their own):
Balance slightly
ragged hierarchies
with the help of
stakeholders: by
asking them to fill in
the missing values