My first Publication Agile-Data-Warehouse-Design-eBook | Page 203
Dimensional Design Patterns for People and Organizations, Products and Services
MegaHard £27M, and so on; i.e., a report showing total revenue for all the top level
clients without listing any of their subsidiaries. This is where the HIGHEST
PARENT flag (see Figure 6-10) is useful. By constraining on Highest_Parent
= ‘Y’ a query will include only the full hierarchy for each top most customer, and
the revenue figures for each of its subsidiaries will be summarized only once.
183
Use the HIGHEST
PARENT flag to
filter out partial
hierarchies and
avoid over-counting
Queries that include multiple parent customers without constraining on highest
parents only, must fetch subsidiaries distinctly to avoid overstating the facts. For
example, a query that asks for the total revenue for all customers with any parents
in California must present a distinct list of SUBSIDIARY KEYs to the fact table
before summing revenue, otherwise the revenue will be double counted for any
subsidiary that has both a parent and grandparent in California.
For example SQL that handles subsidiaries distinctly while querying multiple
parents see: The Data Warehouse Toolkit, Second Edition, Ralph Kimball, Margy
Ross (Wiley, 2002) page 166.
Building Hierarchy Maps
Earlier we stated that proprietary recursive extensions to SQL are not suitable for
ad-hoc BI queries against recursive relationships, but they can be used very suc-
cessfully during ETL processing to unravel recursive relationships and populate
hierarchy maps.
Recursive SQL can
be used by ETL to
load hierarchy maps
You can download an Oracle PL/SQL stored procedure for loading the COMPANY
STRUCTURE hierarchy map from modelstorming.com . You will also find SQL for
creating the map and a CUSTOMER dimension populated with the Pomegranate
example data which you can use to test it or your own ETL hierarchy map loader.
Tracking History for Variable-Depth Hierarchies
When you are designing a hierarchy map to record history, you have to decide
which of three histories you are going to track inside the map:
Parent History: tracking changes to the HV dimensional attributes of a parent
level; for example, when a parent company is reclassified or a manager’s salary
grade changes. This involves populating the hierarchy map with every surrogate
key value for a parent and adding new rows with the new parent key value for
every level in its hierarchy every time a parent HV attribute changes. Parent history is
Hierarchy History: tracking changes to the hierarchical relationships; for example,
a parent company sells a subsidiary or an employee starts reporting to a new
manager. This involves the effective dating of all the rows in the hierarchy map.
New rows are added to the hierarchy map with the appropriate effective date when
new children are added to a hierarchy and end dates are adjusted on existing
hierarchy relationships when they are changed or deleted. A change/move—for Hierarchy history can
tracked by adding
every HV parent key
value to the hierarchy
map
be tracked by adding
effective dates to
each hierarchy map
record