Modeling Star Schemas
Prototyping the DW/BI Design
You cannot know how well your data warehouse design matches the available data
until you try to load it, nor how well it matches the stakeholders’ actual BI
requirements until they use it. That is why the agile principle of early delivery of
working software is vital for reducing DW/BI risk. So, as soon as you have a
physical schema (some working software), don’t postpone the moment of reality any
longer: validate the design by prototyping the reports and dashboards that
stakeholders have wanted to talk about all along.
Validate the design by prototyping with real data, real BI tools, and real stakeholders
Turn end-of-sprint demos into prototyping workshops; have BI developers help the
original modelstormers (real stakeholders) get their “hands dirty” using their
design with real data and real BI tools, as in Figure 5-16. These workshops can be
remarkably productive because the stakeholders, having used the 7Ws to
modelstorm their data requirements, will already be thinking about their business
questions and report layouts in terms of these 7W dimensional interrogatives.
Stakeholders will be ready to define their reports using the 7Ws
Figure 5-16: DW/BI Prototyping
You should value working software over comprehensive documentation and
maximize the work you don’t have to do: don’t waste time mocking up report or
dashboard specifications using spreadsheets or word processors when you
have a database schema, sample data, and the stakeholders’ BI tools of choice.
For prototypes, avoid test data generation; it proves nothing. Instead, validate the
ETL process by sampling small amounts of real data, extracted from the actual
sources documented in the model. A sample of 10,000 recent facts with matching
dimensional descriptions, plus similar samples from one or two previous years, is
usually just enough representative data for stakeholders to get a true feel for what
the final solution will be like. Use data profiling to set realistic expectations of the
prototype before any queries are run. Make sure stakeholders understand that counts
and totals will be low because only a small percentage of the data has been sampled.
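As a rough illustration of profiling a sample before anyone queries it, the sketch below reports how much of the source has been sampled and how complete each column is. The function name profile, the column names, and the source row count of 400 are all hypothetical, chosen only for this example; a real profile would be run against the actual sampled tables and would assume a non-zero source row count.

```python
def profile(rows, columns, source_row_count):
    """Return simple profile statistics for a list of sampled rows (tuples).

    Assumes source_row_count is the (non-zero) number of rows in the full source.
    """
    total = len(rows)
    report = {
        "sample_rows": total,
        # Fraction of the source that was sampled; counts and totals in the
        # prototype will be low by roughly this factor.
        "sample_fraction": total / source_row_count,
        "columns": {},
    }
    for position, name in enumerate(columns):
        non_null = [row[position] for row in rows if row[position] is not None]
        report["columns"][name] = {
            "null_rate": (total - len(non_null)) / total if total else 0.0,
            "distinct": len(set(non_null)),
        }
    return report


# Hypothetical sample: four facts drawn from a source assumed to hold 400 rows.
sample = [
    ("2023-12-01", "Widget", 10.0),
    ("2023-12-02", None, 12.5),
    ("2023-12-02", "Gadget", None),
    ("2023-12-03", "Widget", 7.0),
]
report = profile(sample, ["sale_date", "product_name", "revenue"], source_row_count=400)
print(report["sample_fraction"], report["columns"]["product_name"])
```

Sharing figures like the sample fraction and the null rates with stakeholders before the workshop is one way to explain up front why prototype totals look small and where descriptions are missing.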
Speed up ETL prototyping by not indexing the data. BI prototyping with
unindexed sample data on modest hardware will also help to set realistic
expectations for query performance against complete data, fully-indexed on specialist
DW/BI hardware.
Load prototype stars with 10,000 recent facts and similar samples from previous time periods
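The sampling approach described above can be sketched with Python's built-in sqlite3 module. Everything here is an assumption made for illustration: the tables sales_fact and product_dim, the proto_ prefix, the years, and a sample size of 100 standing in for the 10,000 suggested in the text. "Similar samples" is interpreted as the most recent facts of each earlier year, which is only one possible reading. The prototype tables are deliberately created without indexes.

```python
import sqlite3

SAMPLE_SIZE = 100           # the text suggests about 10,000; kept small for the demo
YEARS = (2023, 2022, 2021)  # hypothetical: current year plus two previous years


def build_source(conn):
    """Create a tiny stand-in for the real source system (hypothetical schema)."""
    conn.executescript("""
        CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY, product_name TEXT);
        CREATE TABLE sales_fact (
            sale_id INTEGER PRIMARY KEY, sale_date TEXT,
            product_key INTEGER, revenue REAL);
    """)
    conn.executemany(
        "INSERT INTO product_dim VALUES (?, ?)",
        [(key, f"Product {key}") for key in range(1, 51)],
    )
    facts = [
        (f"{year}-{month:02d}-{day:02d}", (month * day) % 50 + 1, 10.0)
        for year in YEARS
        for month in range(1, 13)
        for day in range(1, 29)
    ]
    conn.executemany(
        "INSERT INTO sales_fact (sale_date, product_key, revenue) VALUES (?, ?, ?)",
        facts,
    )


def load_prototype(conn):
    """Copy a sample of facts per year, then only the dimension rows they use."""
    # Empty copies of the source tables; no keys or indexes are created on them.
    conn.execute("CREATE TABLE proto_sales_fact AS SELECT * FROM sales_fact WHERE 0")
    conn.execute("CREATE TABLE proto_product_dim AS SELECT * FROM product_dim WHERE 0")
    for year in YEARS:
        # Most recent SAMPLE_SIZE facts of the given year.
        conn.execute(
            """
            INSERT INTO proto_sales_fact
            SELECT * FROM sales_fact
            WHERE substr(sale_date, 1, 4) = ?
            ORDER BY sale_date DESC, sale_id DESC
            LIMIT ?
            """,
            (str(year), SAMPLE_SIZE),
        )
    # Matching dimensional descriptions: only products referenced by sampled facts.
    conn.execute(
        """
        INSERT INTO proto_product_dim
        SELECT * FROM product_dim
        WHERE product_key IN (SELECT product_key FROM proto_sales_fact)
        """
    )


conn = sqlite3.connect(":memory:")
build_source(conn)
load_prototype(conn)
fact_count = conn.execute("SELECT COUNT(*) FROM proto_sales_fact").fetchone()[0]
# Sampled facts whose product has no row in the prototype dimension.
orphan_count = conn.execute(
    """
    SELECT COUNT(*) FROM proto_sales_fact f
    LEFT JOIN proto_product_dim d ON d.product_key = f.product_key
    WHERE d.product_key IS NULL
    """
).fetchone()[0]
print(fact_count, orphan_count)
```

Loading the dimension rows from the sampled facts, rather than copying whole dimensions, keeps the prototype small while ensuring every sampled fact still joins to a description.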