Validation of Variables before Loading the Data

Validating Data in SAS

This transformation validates the data so that no records are lost. See the attachment. In a probability-of-default (bank defaulters) model, validation checks whether the credit risk model is able to distinguish between good and bad customers.

Rank ordering means that the model predicts the highest number of events in the first decile, with the number going progressively down in each later decile. This rule can be tough to satisfy if you are working with a large sample and a small event rate.

The chart is a simple line graph of the percentage of events against the decile scoring bins. You can check the rank ordering in the image below. As a follow-up, the warehousing staff would periodically review the kicked-out records to determine their disposition.

Hosmer Lemeshow Test

It measures calibration and shows how close the predicted probabilities are to the actual rate of events.
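The calibration check can be sketched in Python as follows. This is a minimal illustration, not the SAS procedure itself: the function name `hosmer_lemeshow_statistic` and its parameters are assumptions made for this example. It groups observations by predicted probability and sums (observed - expected)^2 / (n * p * (1 - p)) over the groups, where p is the mean predicted probability in the group.

```python
def hosmer_lemeshow_statistic(probs, events, n_groups=10):
    """Hosmer-Lemeshow chi-square statistic (hypothetical helper for illustration).

    probs  : predicted probabilities, one per observation
    events : 0/1 outcomes, one per observation
    """
    if len(probs) != len(events):
        raise ValueError("probs and events must have the same length")
    if len(probs) < n_groups:
        raise ValueError("need at least one observation per group")

    # Sort by predicted probability so each group holds similar scores.
    ranked = sorted(zip(probs, events), key=lambda pair: pair[0])
    n = len(ranked)
    statistic = 0.0
    for g in range(n_groups):
        # Integer arithmetic spreads observations as evenly as possible.
        chunk = ranked[g * n // n_groups:(g + 1) * n // n_groups]
        n_g = len(chunk)
        observed = sum(e for _, e in chunk)   # actual events in the group
        expected = sum(p for p, _ in chunk)   # events the model predicts
        mean_p = expected / n_g
        variance = n_g * mean_p * (1 - mean_p)
        if variance == 0:
            # Simplification for this sketch: skip groups whose mean
            # predicted probability is exactly 0 or 1.
            continue
        statistic += (observed - expected) ** 2 / variance
    return statistic
```

A value near zero indicates that predicted and observed event counts agree within each group. The statistic is conventionally compared against a chi-square distribution with `n_groups - 2` degrees of freedom; a large value suggests poor calibration.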

The rank ordering is maintained in this example. When a record is badly flawed, it would be difficult to process its information at all, so it may be wise to set these types of records aside for later hand processing.

Rank Ordering

To see rank ordering, calculate the percentage of events (defaults) in each decile group and check that the event rate is monotonically decreasing.
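The decile calculation described above can be sketched in Python. The function names `decile_event_rates` and `is_rank_ordered` are hypothetical, chosen for this example only. The sketch treats "monotonically decreasing" as "never increasing", so ties between neighbouring deciles are allowed.

```python
def decile_event_rates(scores, events, n_bins=10):
    """Event rate per bin, highest-scored bin first (illustrative helper)."""
    if len(scores) != len(events):
        raise ValueError("scores and events must have the same length")
    if len(scores) < n_bins:
        raise ValueError("need at least one observation per bin")

    # Rank observations by predicted probability, highest first,
    # so that bin 0 is the "first decile".
    ranked = sorted(zip(scores, events), key=lambda pair: pair[0], reverse=True)
    n = len(ranked)
    rates = []
    for b in range(n_bins):
        chunk = ranked[b * n // n_bins:(b + 1) * n // n_bins]
        rates.append(sum(e for _, e in chunk) / len(chunk))
    return rates


def is_rank_ordered(rates):
    """True when the event rate never increases from one bin to the next."""
    return all(a >= b for a, b in zip(rates, rates[1:]))


# Toy data (made up for illustration): higher scores carry more events.
scores = [i / 100 for i in range(100)]
events = [1 if (i % 10) < (i // 10) else 0 for i in range(100)]
rates = decile_event_rates(scores, events)
print(rates, is_rank_ordered(rates))
```

If `is_rank_ordered` returns `False`, the deciles where the event rate rises again show where the rank ordering breaks.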

Model Validation Techniques

The figure in the attachment also shows that two other columns are processed to look for values that are incorrect, but that are themselves only indicative of problems with those particular data records.

Lift Chart

It measures how much better one can expect to do with the predictive model than without a model.
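A common way to put a number on lift is the cumulative lift per decile: the event rate among the top-scored observations divided by the overall event rate, which is what one would get by selecting customers at random. The sketch below assumes that definition; the function name `cumulative_lift` is hypothetical.

```python
def cumulative_lift(scores, events, n_bins=10):
    """Cumulative lift after each bin, highest scores first (illustrative)."""
    if len(scores) != len(events):
        raise ValueError("scores and events must have the same length")
    if len(scores) < n_bins:
        raise ValueError("need at least one observation per bin")
    total_events = sum(events)
    if total_events == 0:
        raise ValueError("lift is undefined when there are no events")

    ranked = sorted(zip(scores, events), key=lambda pair: pair[0], reverse=True)
    n = len(ranked)
    overall_rate = total_events / n  # baseline: no model, random selection
    lifts = []
    for b in range(1, n_bins + 1):
        top = ranked[:b * n // n_bins]  # everything down to the b-th bin
        top_rate = sum(e for _, e in top) / len(top)
        lifts.append(top_rate / overall_rate)
    return lifts
```

A lift of 2 in the first decile means the top 10% of scored customers contain twice as many events as a random 10% would. The last value is always 1, because the whole sample has the overall event rate by definition.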

Score the validation sample (that is, compute its predicted probabilities) using the response model under consideration. This will allow subsequent steps to perform sorting, grouping, or other operations as needed for useful analysis. When a large share of records fails validation, it might suggest that some major failure has occurred, for example, an inability to read the current version of the invoices database table. So, the ability to specify a processing flow change based on critical data flaws can help prevent data problems from turning into more serious reporting problems.
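The idea of setting flawed records aside and stopping the flow on a critical failure can be sketched outside SAS as well. Everything named below is an assumption made for illustration: the field names `invoice_id` and `amount`, the `CriticalDataError` exception, and the 50% threshold do not come from the original job.

```python
class CriticalDataError(Exception):
    """Raised when so many records fail that the load should stop."""


def validate_records(records, max_reject_rate=0.5):
    """Split records into accepted and rejected lists (illustrative sketch).

    Rejected records are returned with the reasons they failed, so they
    can be set aside for later hand processing.
    """
    accepted, rejected = [], []
    for record in records:
        problems = []
        if not record.get("invoice_id"):        # hypothetical key field
            problems.append("missing invoice_id")
        amount = record.get("amount")           # hypothetical numeric field
        if not isinstance(amount, (int, float)) or amount < 0:
            problems.append("invalid amount")
        if problems:
            rejected.append((record, problems))
        else:
            accepted.append(record)

    # A very high reject rate hints at a systemic failure (for example,
    # an unreadable source table), so change the flow instead of loading.
    if records and len(rejected) / len(records) > max_reject_rate:
        raise CriticalDataError(
            f"{len(rejected)} of {len(records)} records failed validation"
        )
    return accepted, rejected
```

Raising an exception is one way to express the flow change; a real job might instead branch to a notification step or write the rejects to an error table for periodic review.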