Scoring-Training feb 2014 | Page 7

Historical Data Preparation

1. Historical data should include a set of characteristics and a target variable. All scorecard development methods quantify the relationship between the characteristics (input columns) and "Good/Bad" performance (target column).

2. Example of borrower characteristics. Scorecard characteristics are similar to those used in subjective expert judgment.

Credit Product
• Amount
• Term
• Document Goal

Financial
• Assets
• Debts
• Monthly Income
• Monthly Expenses

Credit History
• In Current Bank
• In Other Banks
• Credit Bureau Data

Social
• Work experience
• Time of residence at current address
• Marital status

3. Characteristics whose use is not justified are excluded. For example: on the picture you can see that the "Good/Bad" distribution does not depend on the Home Ownership characteristic.

4. All borrowers should be marked in the target column as "Good" or "Bad" by a certain rule. For example: all borrowers who pay within 30 days are "Good", while borrowers with a delay of more than 90 days are marked as "Bad".

[Figure: delinquency timeline in days (0, 30, 60, 90, 120, 180+) with segments labelled Good, Intermediate, and Bad.]

Exclusions

Certain types of accounts need to be excluded from the dataset. For example: records of bank employees or VIP clients could be excluded from the data set.

Data Cleansing

Borrower portfolio data can contain the following anomalies, which should be replaced or deleted:
• Outliers - values that lie far outside the main volume
• Data entry errors
• Missing values

[Figure: sample data with callouts marking an outlier, a missing value, and a data entry error.]

www.plug-n-score.com
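The labelling rule, exclusions, and cleansing steps described above can be illustrated with a minimal Python sketch. It is not the method of any particular scoring tool. The records, field names (`segment`, `monthly_income`, `max_days_past_due`), the income cap used to flag outliers, and the choice to replace anomalies with the median are all assumptions made for illustration. Borrowers between 30 and 90 days past due are labelled "Intermediate" here, which is also an assumption, because the text defines only the "Good" and "Bad" ends of the rule.

```python
from statistics import median

# Hypothetical borrower records; field names and values are invented for illustration.
borrowers = [
    {"id": 1, "segment": "retail", "monthly_income": 2500.0, "max_days_past_due": 0},
    {"id": 2, "segment": "retail", "monthly_income": 3100.0, "max_days_past_due": 45},
    {"id": 3, "segment": "staff", "monthly_income": 2800.0, "max_days_past_due": 0},
    {"id": 4, "segment": "retail", "monthly_income": None, "max_days_past_due": 120},
    {"id": 5, "segment": "retail", "monthly_income": 950000.0, "max_days_past_due": 10},
    {"id": 6, "segment": "vip", "monthly_income": 40000.0, "max_days_past_due": 0},
    {"id": 7, "segment": "retail", "monthly_income": -1800.0, "max_days_past_due": 95},
]

EXCLUDED_SEGMENTS = {"staff", "vip"}  # bank employees and VIP clients
GOOD_MAX_DAYS = 30                    # paid within 30 days -> "Good"
BAD_MIN_DAYS = 90                     # delay of more than 90 days -> "Bad"
INCOME_CAP = 100000.0                 # assumed threshold for flagging outliers


def label_target(days_past_due):
    """Apply the Good/Bad rule from the text; the middle band is an assumption."""
    if days_past_due <= GOOD_MAX_DAYS:
        return "Good"
    if days_past_due > BAD_MIN_DAYS:
        return "Bad"
    return "Intermediate"


def is_valid_income(value):
    """False for missing values, negative entries (entry errors), and outliers."""
    return value is not None and 0 <= value <= INCOME_CAP


def prepare(records):
    """Exclude special accounts, replace anomalous incomes, add the target column."""
    kept = [r for r in records if r["segment"] not in EXCLUDED_SEGMENTS]
    # Replacement value: median of the incomes that pass the validity check.
    fallback = median(r["monthly_income"] for r in kept
                      if is_valid_income(r["monthly_income"]))
    prepared = []
    for r in kept:
        row = dict(r)  # copy so the input records are left untouched
        if not is_valid_income(row["monthly_income"]):
            row["monthly_income"] = fallback
        row["target"] = label_target(row["max_days_past_due"])
        prepared.append(row)
    return prepared


prepared = prepare(borrowers)
for row in prepared:
    print(row["id"], row["monthly_income"], row["target"])
```

Replacing anomalies with the median is only one option. Deleting the affected records, as the text also allows, may be preferable when anomalies are rare.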