In Figure 5, the predictor space is now divided in a way that puts either more "Punctuation" or more "Incremental" values in each region. For classification, we simply look for the dominant category in each region. For example, if Year is less than 1959, it is more likely that there was a punctuation in budgets. If we split the remaining area again, we see that frequent mentions of a topic in the State of the Union speeches of the previous years made punctuations less likely.
This division of the predictor space into regions can be visualized as a decision tree; Figure 6 shows the corresponding tree.
A great advantage of splitting the predictor space into regions is that the direction of a predictor's influence can change: if sou is greater than 1.5, the chance of punctuations increases, but if it is above 54, punctuations become less likely.
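To make this concrete, the following is a minimal sketch in Python, assuming scikit-learn and synthetic stand-ins for the paper's Year and sou predictors (the data and the generating rule below are invented for illustration, not the authors' dataset):

```python
# Minimal sketch: fit a shallow classification tree and inspect its regions.
# The data are synthetic stand-ins for the paper's predictors, not the real dataset.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(42)
n = 200
year = rng.integers(1948, 2011, size=n)   # hypothetical Year values
sou = rng.uniform(0, 60, size=n)          # hypothetical State of the Union mention counts

# Invented rule loosely mimicking the text: early years, or moderate sou
# values, make "Punctuation" (1) more likely than "Incremental" (0).
punctuation = ((year < 1959) | ((sou > 1.5) & (sou <= 54))).astype(int)

X = np.column_stack([year, sou])
tree = DecisionTreeClassifier(max_depth=2, criterion="gini").fit(X, punctuation)

# Each leaf is one region of the predictor space; the prediction in a
# region is simply its dominant class.
print(export_text(tree, feature_names=["Year", "sou"]))
```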
To divide the predictor space into regions, two things are necessary: a criterion to evaluate the quality of a possible split and an algorithm to find splits that optimize this criterion.
In classification trees, the classification error rate for each region would be a compelling evaluation criterion:

$$E = 1 - \max_k(\hat{p}_{mk}).$$

"Here, $\hat{p}_{mk}$ represents the proportion of training observations in the mth region that are from the kth class." (James et al. 2013, 311–312)
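As a quick illustration (a hand-rolled sketch, not code from the paper), the error rate of a single region can be computed directly from its class proportions:

```python
import numpy as np

def classification_error(labels):
    """Classification error rate E = 1 - max_k(p_mk) for one region."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()      # class proportions p_mk
    return 1.0 - p.max()

# Hypothetical region: 8 "Incremental" vs. 2 "Punctuation" observations.
print(classification_error(["Inc"] * 8 + ["Punc"] * 2))  # 0.2
```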
Because the classification error is not sensitive enough for tree growing, in practice the Gini index is often preferred. "The Gini index is defined by

$$G = \sum_{k=1}^{K} \hat{p}_{mk}\,(1 - \hat{p}_{mk}),$$

a measure of total variance across the K classes" (James et al. 2013, 312). The Gini index can be interpreted as a measure of node purity: the fewer observations from different classes a region contains, the lower the Gini index will be. For a region with only one class, the Gini index is zero.
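The purity interpretation is easy to verify numerically; the following sketch (again hand-rolled for illustration) shows the Gini index of an impure and a pure region:

```python
import numpy as np

def gini_index(labels):
    """Gini index G = sum_k p_mk * (1 - p_mk) for one region."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()      # class proportions p_mk
    return float(np.sum(p * (1.0 - p)))

print(gini_index(["Inc"] * 8 + ["Punc"] * 2))  # 0.32 -> impure region
print(gini_index(["Inc"] * 10))                # 0.0  -> pure region
```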
To find the best partition of the predictor space, recursive binary splitting is used, which is a top-down, greedy approach:
The approach is top-down because it begins at the top of the tree (at which point all observations belong to a single region) and then successively splits the predictor space; each split is indicated via two new branches further down on the tree. It is greedy because at each step of the tree-building process, the best split is made at that particular step, rather than looking ahead and picking a split that will lead to a better tree in some future step. (James et al. 2013, 306)
It is recursive because every branch of the tree can be seen as a tree in its own right, to which the same procedure is applied again (Siroky 2009).
The algorithm can be described as follows:

1. Try all possible cutpoints for the first predictor and calculate the resulting evaluation criterion.
2. Choose the cutpoint that reduces the criterion most.
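A minimal sketch of this greedy step, assuming the gini_index helper defined above as the evaluation criterion (this is a generic illustration of cutpoint search, not the authors' implementation): for one predictor, every midpoint between adjacent sorted values is tried as a cutpoint, and the one with the lowest weighted Gini index of the two resulting regions is kept.

```python
import numpy as np

def gini_index(labels):
    """Gini index of one region (as defined above)."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(np.sum(p * (1.0 - p)))

def best_cutpoint(x, labels):
    """Greedy search for the cutpoint on one predictor that minimizes
    the weighted Gini index of the two resulting regions."""
    order = np.argsort(x)
    x, labels = np.asarray(x)[order], np.asarray(labels)[order]
    best = (None, np.inf)
    for i in range(1, len(x)):
        if x[i] == x[i - 1]:
            continue                    # no valid cut between equal values
        cut = (x[i] + x[i - 1]) / 2.0   # candidate cutpoint (midpoint)
        left, right = labels[:i], labels[i:]
        score = (len(left) * gini_index(left)
                 + len(right) * gini_index(right)) / len(x)
        if score < best[1]:
            best = (cut, score)
    return best

# Hypothetical data: the class changes around x = 5.
x = [1, 2, 3, 4, 6, 7, 8, 9]
y = ["Punc", "Punc", "Punc", "Punc", "Inc", "Inc", "Inc", "Inc"]
print(best_cutpoint(x, y))  # (5.0, 0.0): a perfect split
```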