Evaluating credit rating models and rethinking K-S indicators
2015-12-05, KPMG Big Data team, KPMG Big Data Mining
The concept of a "credit rating" can sound simple. Suppose the phone rings early one morning: an acquaintance wants to borrow money, and, half asleep, you have to decide on the spot whether to lend or not. In that split second you may weigh the person's temperament, financial strength, home address, and every good or bad thing you know about their past... In the end, though, you face a single choice between two options and the consequences of that choice, which is the simplest possible "rating." Commercial banks are in a similar position with customers who apply for loans. To control bad loans and avoid losses, a bank needs to assess each customer's credit in advance. Subjective assessment of customers is hard to put into practice, so the bank needs a credit rating model that uses data to classify customers as "good customers" or "bad customers," that is, customers who repay and customers who default.
Credit rating models have been in use for some fifty to sixty years, and a fairly comprehensive evaluation system has grown up around them. The key measure of a credit rating model's strength is its ability to distinguish good customers from bad ones and to rank them correctly. Following industry practice, we can judge a model's accuracy by checking how well its risk ranking of customers agrees with the defaults that actually occur. An effective model gives low scores to customers who are prone to default and high scores to customers who are unlikely to default, and this reflects its discriminating power: the stronger the discrimination, the better the model, and the weaker the discrimination, the worse the model.
Among the evaluation criteria for credit scoring models that follow this principle, the K-S statistic is one of the few that are widely used, because it is simple to calculate and easy to understand. This article introduces the K-S statistic and its shortcomings and proposes the "AUKS statistic" as a new evaluation standard, in the hope of offering new ideas for banks' credit rating business and related practice.
The K-S statistic is derived from the two-sample Kolmogorov-Smirnov test, a non-parametric test used to check whether two probability distributions are the same. The K-S statistic measures the maximum vertical distance between the two distributions, i.e.
$$KS = \max_x \left| F_1(x) - F_2(x) \right|,$$
where $F_1$ and $F_2$ denote the empirical distribution functions of the two samples.
The two-sample K-S test examines whether two samples follow the same distribution, and this is what makes it usable as an evaluation criterion for credit rating models. The output of a credit rating model can be regarded as the probability that an event occurs. If the empirical distribution of the predictions for bad customers differs significantly from the empirical distribution of the predictions for good customers, then the model is assigning significantly different estimates to good customers and bad customers. The K-S statistic equals the maximum distance between the empirical distributions of good customers and bad customers. If the two distributions differ significantly, the model can be considered able to distinguish whether an applicant will become a bad customer, as the accompanying figure illustrates.
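As a minimal sketch of the definition, the maximum distance between the two empirical distributions can be computed directly from model scores. The function names and the score values here are hypothetical and chosen only for illustration; the article does not prescribe an implementation.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return a function x -> fraction of sample values that are <= x."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n


def ks_statistic(bad_scores, good_scores):
    """Maximum vertical gap between the two empirical distributions."""
    f_bad, f_good = ecdf(bad_scores), ecdf(good_scores)
    # Both curves are step functions that only jump at observed scores,
    # so it is enough to check the gap at those points.
    points = sorted(set(bad_scores) | set(good_scores))
    return max(abs(f_bad(x) - f_good(x)) for x in points)


# Hypothetical scores: bad customers tend to score lower than good ones.
bad = [310, 350, 420, 480, 530]
good = [450, 520, 580, 610, 640, 700]
print(round(ks_statistic(bad, good), 3))
```

A value near 0 means the model scores good and bad customers alike; a value near 1 means the two groups are almost perfectly separated.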
How do we evaluate the effectiveness of a credit rating model? We must select a validation sample that is different from the modeling sample used to build the model. As in the modeling sample, each observation in the validation sample represents one customer, for whom the dependent variable y and the values of the input variables x are known. To validate a model, we use it to predict the probability or credit score of each customer in the validation sample. If the K-S statistic is the criterion for judging the model, it can be calculated from these predicted probabilities or scores. Sort the probabilities or scores from low to high and divide the customers into groups (usually 10 or 20). Every group will contain both good customers and bad customers, because misclassification cannot be avoided: no scoring model can give every bad customer a lower score than every good customer. A good model can nevertheless ensure that bad customers score relatively low and good customers score relatively high, that is, that the ranking is more consistent with actual outcomes.
In the figure, the dashed line shows the empirical distribution of good customers and the solid line shows the empirical distribution of bad customers. The maximum distance between the two empirical distributions is the K-S statistic. The larger the K-S statistic, the more pronounced the difference between the two distributions and the more reasonable the scores the model assigns.
The K-S statistic can therefore serve as a criterion for credit scoring models, and it is convenient in practice: the NPAR1WAY procedure and the EM module in SAS, as well as the base package stats in R, can be used to calculate it.
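The grouped procedure described above (sort by score, split into groups, compare the cumulative shares of good and bad customers) can be sketched as follows. This is an illustration under assumptions, not the procedure used by SAS or R: the function name, the equal-count grouping, and the 20 labeled customers are all hypothetical.

```python
def ks_by_groups(scores, is_bad, n_groups=10):
    """Read K-S off a grouped table.

    scores : model scores, one per customer
    is_bad : 1 for a defaulting customer, 0 for a good customer
    Returns the largest gap and one row per group:
    (group number, cumulative bad share, cumulative good share, gap).
    """
    ranked = sorted(zip(scores, is_bad))  # lowest score first
    total_bad = sum(is_bad)
    total_good = len(is_bad) - total_bad
    cum_bad = cum_good = 0
    rows = []
    for g in range(n_groups):
        # Equal-count groups (an assumed grouping rule).
        start = g * len(ranked) // n_groups
        end = (g + 1) * len(ranked) // n_groups
        for _, bad in ranked[start:end]:
            cum_bad += bad
            cum_good += 1 - bad
        bad_share = cum_bad / total_bad
        good_share = cum_good / total_good
        rows.append((g + 1, bad_share, good_share, abs(bad_share - good_share)))
    return max(row[3] for row in rows), rows


# Hypothetical validation sample: 20 customers, 6 of whom defaulted.
scores = list(range(1, 21))
is_bad = [1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
ks, table = ks_by_groups(scores, is_bad)
for group, bad_share, good_share, gap in table:
    print(f"{group:2d}  bad={bad_share:.3f}  good={good_share:.3f}  gap={gap:.3f}")
print("K-S:", round(ks, 3))
```

Because the gap is read only at group boundaries, the result depends on the number of groups chosen.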
However, the K-S statistic has a significant weakness. It measures the difference between the two distributions at a single point only, so its stability is inevitably limited. To check this, we designed a validation scheme that uses another commonly used indicator, the AUC statistic, as a reference: we repeatedly drew samples from a validation sample of 5,960 observations, ran the model validation on each drawn sample, and calculated both the K-S statistic and the AUC statistic to compare their stability. We found that the coefficient of variation of the K-S statistic was far greater than that of the AUC statistic.
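A stability check of this kind can be sketched as below. The article does not give its sampling scheme or data, so everything here is an assumption: the scores are synthetic, the resampling is a bootstrap within each class, and the sample sizes are kept small so the sketch runs quickly. It shows how the two coefficients of variation can be compared and does not reproduce the article's result.

```python
import random
import statistics
from bisect import bisect_left, bisect_right


def ks_statistic(bad, good):
    """Maximum gap between the empirical distributions of the two groups."""
    bad, good = sorted(bad), sorted(good)
    return max(
        abs(bisect_right(bad, x) / len(bad) - bisect_right(good, x) / len(good))
        for x in set(bad) | set(good)
    )


def auc(bad, good):
    """Probability that a random good customer outscores a random bad one
    (ties count one half)."""
    bad = sorted(bad)
    wins = 0.0
    for g in good:
        below = bisect_left(bad, g)
        ties = bisect_right(bad, g) - below
        wins += below + 0.5 * ties
    return wins / (len(bad) * len(good))


def coefficient_of_variation(values):
    return statistics.stdev(values) / statistics.mean(values)


def stability(bad, good, n_rounds=100, seed=0):
    """Resample repeatedly and return (CV of K-S, CV of AUC)."""
    rng = random.Random(seed)
    ks_values, auc_values = [], []
    for _ in range(n_rounds):
        # Assumed scheme: draw with replacement within each class.
        b = rng.choices(bad, k=len(bad))
        g = rng.choices(good, k=len(good))
        ks_values.append(ks_statistic(b, g))
        auc_values.append(auc(b, g))
    return coefficient_of_variation(ks_values), coefficient_of_variation(auc_values)


# Synthetic scores (hypothetical): bad customers score lower on average.
data_rng = random.Random(42)
bad = [data_rng.gauss(500, 80) for _ in range(200)]
good = [data_rng.gauss(600, 80) for _ in range(800)]
cv_ks, cv_auc = stability(bad, good)
print(f"CV of K-S: {cv_ks:.4f}   CV of AUC: {cv_auc:.4f}")
```

The coefficient of variation (standard deviation divided by mean) puts the two statistics on a comparable footing even though their typical magnitudes differ.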
The best way to increase stability is to replace a distance with an area, generalizing a local measure into a global one. To this end we designed a new statistic: the area under the K-S curve, abbreviated AUKS.
When [...], it can be assumed that [...]; then [...].
Compared with the K-S statistic, the AUKS statistic has the advantage of judging a model over the whole score range rather than at a single point, and it depends relatively little on sample size. We used both statistics to validate the scoring model. In the simulation experiments, the AUKS statistic consistently showed a more stable mean, a smaller standard deviation, and a smaller coefficient of variation than the K-S statistic, so it is the more stable evaluation index for credit scoring models.
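The exact formula for AUKS is not recoverable from the text, so the following is only one plausible reading, stated as an assumption: take the K-S curve to be the gap between the two empirical distributions as a function of the score, and take AUKS to be the area under that curve over the score range, divided by the width of the range so that the result lies between 0 and 1. The function name, the normalization, and the example scores are hypothetical.

```python
from bisect import bisect_right


def auks(bad_scores, good_scores, lo=0.0, hi=1.0):
    """Area under the curve |F_bad(s) - F_good(s)| for s in [lo, hi],
    divided by (hi - lo). Assumes every score lies within [lo, hi]."""
    bad, good = sorted(bad_scores), sorted(good_scores)
    # The gap is constant between consecutive breakpoints, so the
    # integral is an exact sum of rectangles.
    points = sorted({lo, hi, *bad, *good})
    area = 0.0
    for left, right in zip(points, points[1:]):
        gap = abs(
            bisect_right(bad, left) / len(bad)
            - bisect_right(good, left) / len(good)
        )
        area += gap * (right - left)
    return area / (hi - lo)


# Hypothetical model outputs on a 0-1 scale (higher means more creditworthy).
bad = [0.1, 0.2, 0.4]
good = [0.3, 0.6, 0.8]
print(round(auks(bad, good), 4))
```

Under this reading every part of the score range contributes to the statistic, whereas the K-S statistic is determined by the single score at which the gap is largest.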
Over many years of credit scoring, the industry has built up a fairly comprehensive set of complementary evaluation standards that, by and large, ensure the practical value of credit scoring models. These standards, indicators, and statistics still have flaws, however, and need continual revision and improvement so that the evaluation system keeps being refined. We believe the AUKS statistic will become a valuable new indicator.