My first Publication Agile-Data-Warehouse-Design-eBook | Page 300

280 Chapter 9 Why and How Solution A step dimension numbers each event in a sequence The humble looking STEP dimension, in Figure 9-17, helps BI users understand sequential behavior. It allows ETL processes to explicitly label events with their position in a sequence (from its beginning and from its end), along with the length of the sequence. For example, a web browsing session of four page views by the same visitor (IP address) within an agreed timeframe would be represented as four rows in a PAGE VIEWS FACT table. The first page view event would be labeled as step 1 of 4 by assigning it a STEP KEY of 7 (see Figure 9-17). The next page view would be labeled as step 2 of 4 using STEP KEY 8, and so on. Figure 9-17 STEP dimension A STEP dimension enables positional analysis (better story telling) using simple single-pass queries Step dimensions can play multiple roles to describe sequences within sequences Figure 9-18 Using the STEP dimension to describe web page visits BI users can use the STEP dimension to easily identify page views belonging to sessions of any length, rank pages by position within sessions, and answer ques- tions about the beginning, midpoint and ending of sessions for any interesting subset of customers, time, and products. They can quickly find the good and bad (“session killer”) last page visits of a session (LAST STEP = “Y”), or those that precede session killers (STEPS UNTIL LAST = 1) using simple, single-pass SQL. Answering questions like these without a STEP dimension would be too difficult for all but the most SQL-savvy BI users. A STEP dimension can also play multiple roles for an event; for example, Figure 9- 18 shows a PAGE VIEWS FACT table with two STEP dimension roles: STEP IN SESSION which describes page position within the overall session, and STEP IN PURCHASE which describes how close each page is to a purchase decision. Each time a visitor clicks on a link to place a product in a shopping cart, STEP IN PURCHASE would be reset and the next mini-sequence length calculated.