Scoring-Training feb 2014 | Page 8

Binning

What is binning

Binning is the process of transforming a numeric characteristic into a categorical one, as well as regrouping and consolidating the values of categorical characteristics.

Why binning is required

• Increases scorecard stability: some characteristic values occur rarely and cause instability unless they are grouped together.
• Improves quality: grouping attributes with similar predictive strength increases scorecard accuracy.
• Makes the logical trend of “Good/Bad” deviations visible for each characteristic.
• Prevents scorecard impairment that could otherwise result from rare reversal patterns and extreme values.
• Prevents the overfitting (overtraining) that is possible with numeric variables.

Automatic binning

The most widely used automatic binning algorithm is Chi-merge. Chi-merge divides the range of a characteristic into intervals (bins) so that neighboring bins differ from each other as much as possible in their ratio of “Good” to “Bad” records.

Analysis and manual correction of automatic binning

Sometimes, because of particularities in the data distribution, automatic binning needs to be corrected manually. The example below shows the range divided into 5 bins by automatic binning (Fig. 1); now only the bands need manual adjustment. For example, manually adjusts the second boundary of the rang