…aggregate GPA, shows a decline with accuracy and an increase in uncertainty.
In response to the first research question, it is not entirely surprising that training error measures and application error measures are different.

However, what is surprising is that, in terms of our framework, the difference between the errors results from the compounding of model prediction errors. More prediction error leads to more incorrect class assignments and more disruptors at the school. If observations are treated independently, as is common, then this information is inaccessible during model training. This suggests a need for adaptive feedback mechanisms to ensure model stability when placed in situ, and not just acceptable error performance.
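As a rough illustration of this compounding, the following minimal sketch simulates a school that uses predicted GPA to flag likely disruptors. It is not the stylized model used in this paper: the number of students and classrooms, the bottom-decile definition of a disruptor, the per-disruptor penalty, and the function name school_outcome are all hypothetical choices made only to show the mechanism.

import numpy as np

rng = np.random.default_rng(0)

def school_outcome(pred_error_sd, n_students=600, n_classes=20, penalty=0.1):
    # Hypothetical parameters and mechanism, for illustration only.
    true_gpa = np.clip(rng.normal(3.0, 0.6, n_students), 0.0, 4.0)
    disruptor = true_gpa < np.quantile(true_gpa, 0.10)  # bottom decile are potential disruptors

    # The school flags students using predicted GPA (true GPA plus prediction error).
    pred_gpa = true_gpa + rng.normal(0.0, pred_error_sd, n_students)
    flagged = pred_gpa < np.quantile(pred_gpa, 0.10)

    # Disruptors the model misses remain in regular classrooms; each one lowers
    # classmates' realized GPA. For simplicity the effect is spread evenly as an
    # average number of missed disruptors per classroom, so per-student prediction
    # error compounds into a school-level decline in aggregate GPA.
    missed = int(np.sum(disruptor & ~flagged))
    realized = true_gpa - penalty * (missed / n_classes)
    return missed, float(np.clip(realized, 0.0, 4.0).mean())

for sd in (0.1, 0.3, 0.6):  # increasing per-student prediction error
    missed, gpa = school_outcome(sd)
    print(f"prediction error sd={sd:.1f}: missed disruptors={missed}, aggregate GPA={gpa:.2f}")

Under these assumptions, larger per-student prediction error leaves more disruptors unflagged, and the aggregate GPA falls by more than the per-student error alone would suggest, which is the interaction a model trained on independent observations cannot see.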

For the second research question, we can also evaluate baseline refinement as an acceptance criterion for model training. In Figure 4, the vertical dashed line represents the expected accuracy of a mean baseline model. Given the distribution of y, the mean baseline model has an MSE of 0.62, which translates to an expected accuracy of about 38%. Choosing this as a baseline for model evaluation does not define a useful point of reference if our goal is defined as l = 2.25, because models performing better than the baseline can still have a high probability of falling below this mark. Instead, a better baseline would be an accuracy of 70%, which translates to an MSE of 0.2. Therefore, the target training error of the model should be an MSE of 0.2 or better in order to reach the application goals.
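To make the MSE-to-accuracy translation concrete, the following minimal sketch computes the MSE and accuracy of a mean baseline model and then sweeps prediction noise to see roughly which MSE corresponds to a 70% accuracy target. The class codes, their proportions, the Gaussian noise model, and the nearest-class rounding rule are all assumptions for illustration; they are not taken from this paper, so the numbers will not reproduce the 0.62 MSE and 38% accuracy figures exactly.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical class codes and proportions standing in for the distribution of y.
y = rng.choice([0, 1, 2, 3], size=20_000, p=[0.1, 0.3, 0.4, 0.2])

# Mean baseline: predict the mean of y for every observation,
# then round to the nearest class to score accuracy.
mean_pred = np.full(y.size, y.mean())
print(f"mean baseline: MSE={np.mean((y - mean_pred) ** 2):.2f}, "
      f"accuracy={np.mean(np.round(mean_pred) == y):.2f}")

# Sweep prediction noise to see roughly which MSE corresponds to ~70% accuracy.
for sd in (0.2, 0.45, 0.6, 0.8):
    pred = y + rng.normal(0.0, sd, y.size)
    mse = np.mean((pred - y) ** 2)
    acc = np.mean(np.clip(np.round(pred), 0, 3) == y)
    print(f"noise sd={sd:.2f}: MSE={mse:.2f}, accuracy={acc:.2f}")

Under these assumptions, an MSE near 0.2 lands in the vicinity of 70% accuracy, consistent with the refined baseline suggested above; the exact correspondence depends on the distribution of y and on how continuous predictions are mapped to classes.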
In total, this paper presents a framework for analyzing predictive models in the context of complex systems. The need for such a framework is illustrated using a simple stylized model of classroom assignments, and we begin to explore the differences and relations between model training error and application error. There is evidence that the two can be related more rigorously, but the relationship will be application dependent. Our future efforts will be twofold: (1) to make this framework more mathematically rigorous and establish bounds on expected application error given errors in predictive model training, and (2) to use more realistic scenarios and datasets. Of particular interest will be including multiple operationalized models of theories from the social sciences to better anticipate the outcomes of implementing predictive models in complex systems.