Manuel Robalinho: Credit Score using Machine Learning

Tuesday, March 24, 2020

Credit Score using Machine Learning

The goal is to use machine learning to create a credit score for customers. The score expresses how confident we can be that a customer will meet the agreed payments. As defined here, the higher the score, the greater the probability of non-payment.

Multiple Linear Regression in Python with Scikit-Learn

Simple linear regression involves just two variables: one input and one output. Almost all the real-world problems that you are going to encounter will have more than two variables.

Linear regression with several input variables is called “multiple linear regression” (it is also loosely called multivariate linear regression). The steps to perform multiple linear regression are almost the same as those for simple linear regression.
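As a minimal sketch of what multiple linear regression looks like in Scikit-Learn, the example below fits a model with two input variables. The data is synthetic and invented for illustration; it is not the customer data set used in this post.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (an assumption for illustration): y = 2*x1 + 3*x2 + 5,
# with no noise, so the fitted coefficients can be checked by eye.
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]])
y = 2.0 * X[:, 0] + 3.0 * X[:, 1] + 5.0

# Fitting works exactly as in simple linear regression;
# X just has more than one column.
model = LinearRegression()
model.fit(X, y)

print(model.coef_)       # one coefficient per input column
print(model.intercept_)  # the constant term

# Predict the target for a new observation with x1=6, x2=7.
prediction = model.predict(np.array([[6.0, 7.0]]))
print(prediction)
```

The only difference from the single-variable case is the shape of `X`: one column per input variable.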

We will use customer information to generate a ‘trust’ score for each customer. The scoring formula can be adapted by each company according to its credit context. In this example, we use the customer's average number of days late and average billing amount over the past 2 years to calculate a score that combines the two pieces of information.

After calculating the score, we feed the data to a machine learning model built with Scikit-Learn, so that the system can predict scores for new customers from what it has learned.

Our score formula is described in Score calculation.xlsx.

Customer information is in the Excel file Customers_CODE.XLSX.
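To make the idea of combining the two inputs concrete, here is a purely hypothetical sketch. The real formula lives in the spreadsheet and is not reproduced here; the weights, the caps, the 0 to 100 scale, and the assumption that lower billing is riskier are all invented placeholders. The only thing taken from the text is that more delay should push the score (and so the non-payment risk) up.

```python
def min_max(value, low, high):
    """Scale value into [0, 1], clamping anything outside [low, high]."""
    if high <= low:
        raise ValueError("high must be greater than low")
    return min(max((value - low) / (high - low), 0.0), 1.0)


def new_score(avg_days_late, avg_billing,
              max_days_late=90.0, max_billing=100_000.0,
              weight_delay=0.7, weight_billing=0.3):
    """Hypothetical score on a 0-100 scale; higher means riskier.

    All default caps and weights are illustrative assumptions.
    """
    # More days late -> larger delay component.
    delay_part = min_max(avg_days_late, 0.0, max_days_late)
    # Assumption: customers with lower billing are treated as riskier.
    billing_part = 1.0 - min_max(avg_billing, 0.0, max_billing)
    return 100.0 * (weight_delay * delay_part + weight_billing * billing_part)


print(new_score(avg_days_late=45, avg_billing=50_000))
```

Each company would replace the weights and caps with values that reflect its own credit policy.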

Customer company information:

Customer from date

State, Region, Postcode, Salesman, Main CNAE (type of company classification in Brazil)

Highest Billing Date

Maximum billing amount

Last Date invoice issued

Largest credit exposure date

Highest credit exposure

Average historical delay

Average revenue last 48 months

Amount payable Overdue

Amount payable due

Customer Last Order Date

Date of this information

SERASA Information

(Serasa is a company that sells credit information about other companies.)

Serasa Score

Probability of not paying

Last date non payment

Amount of unpaid documents

Value of unpaid documents

Last date bad checks

Amount of bad checks

Last date protests

Value protests

Last date judicial actions

Value judicial actions

Last date overdue debts

Value overdue debts

I wrote some Python code to read and clean the data in my data set.

Next, a routine reads all the records and calculates my score (newscore) for each one, using the definitions I keep in an Excel file.
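A reading and cleaning step of this kind might look like the sketch below. The column names and the cleaning rules (coercing bad values, filling missing numbers with 0, dropping rows without a start date) are assumptions for illustration, and a small in-memory frame stands in for the spreadsheet.

```python
import pandas as pd

# In the real workflow the frame would come from the spreadsheet, e.g.
#   raw = pd.read_excel("Customers_CODE.XLSX")
# Here a tiny hand-made frame with hypothetical column names is used instead.
raw = pd.DataFrame({
    "customer_since": ["2015-03-01", "2018-07-15", None],
    "avg_delay_days": ["12", "n/a", "3"],
    "avg_revenue": [15000.0, None, 4200.0],
})


def clean(df):
    """Return a cleaned copy of the customer frame."""
    out = df.copy()
    # Unparseable dates and numbers become NaT / NaN instead of raising.
    out["customer_since"] = pd.to_datetime(out["customer_since"], errors="coerce")
    out["avg_delay_days"] = pd.to_numeric(out["avg_delay_days"], errors="coerce")
    # Assumption: a missing numeric value is treated as 0.
    numeric_cols = ["avg_delay_days", "avg_revenue"]
    out[numeric_cols] = out[numeric_cols].fillna(0)
    # Assumption: rows without a customer start date are unusable.
    return out.dropna(subset=["customer_since"]).reset_index(drop=True)


customers = clean(raw)
print(customers)
```

Whether a missing value should become 0, a median, or cause the row to be dropped depends on the field; the choices above are only one possibility.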

A Keras model is then trained and tested. The data is split into train and test sets, 80% for training and 20% for testing. Plotting the results lets us compare train and test performance.
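The 80/20 split and the train versus test comparison can be sketched as follows. Note the substitutions: this example uses Scikit-Learn's `LinearRegression` as a stand-in for the Keras model, synthetic data instead of the customer records, and mean absolute error as the comparison metric. All three are assumptions made to keep the example small.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: two features and a noisy linear target.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 2))
y = 70 * X[:, 0] + 30 * X[:, 1] + rng.normal(0, 1, size=200)

# Hold out 20% of the records for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit on the training set only.
model = LinearRegression().fit(X_train, y_train)

# Compare the error on data the model has seen with data it has not.
train_mae = mean_absolute_error(y_train, model.predict(X_train))
test_mae = mean_absolute_error(y_test, model.predict(X_test))
print(f"train MAE={train_mae:.2f}, test MAE={test_mae:.2f}")
```

A test error much larger than the train error would suggest the model is overfitting the training records.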

I also need to predict the score for individual records, so I wrote a Python function that does this.

Then I tested it by predicting the score for one record.
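A single-record prediction function could be sketched like this. The feature names, their order, and the toy model are hypothetical; the function simply turns one record into a one-row array in the order the model was trained on.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical feature names, in the order the model was trained on.
FEATURES = ["avg_delay_days", "avg_revenue"]


def predict_score(model, record):
    """Predict the score for one customer record given as a dict."""
    missing = [name for name in FEATURES if name not in record]
    if missing:
        raise KeyError(f"record is missing fields: {missing}")
    # Models expect a 2-D array: one row, one column per feature.
    row = np.array([[float(record[name]) for name in FEATURES]])
    return float(model.predict(row)[0])


# Toy model trained on invented data: score = 2*delay + 0.001*revenue.
X = np.array([[0.0, 1000.0], [10.0, 2000.0], [20.0, 500.0], [30.0, 3000.0]])
y = 2.0 * X[:, 0] + 0.001 * X[:, 1]
model = LinearRegression().fit(X, y)

score = predict_score(model, {"avg_delay_days": 15, "avg_revenue": 1500})
print(score)
```

Fixing the feature order in one place avoids silently feeding columns to the model in the wrong order.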

Using Python's tkinter library, we can build a window that presents the prediction more clearly.
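A minimal tkinter form along these lines is sketched below. The layout, the field labels, and the `predict` callback signature are assumptions; the original screen is not reproduced here. The window is only built when `build_window` is called, so the code can be imported on a machine without a display.

```python
def format_result(score):
    """Text shown to the user after a prediction."""
    return f"Predicted score: {score:.1f}"


def build_window(predict):
    """Build a small form; predict(delay, billing) must return a number."""
    import tkinter as tk  # imported lazily so headless machines can load this file

    root = tk.Tk()
    root.title("Credit score")

    tk.Label(root, text="Average days late").grid(row=0, column=0, sticky="w")
    delay_entry = tk.Entry(root)
    delay_entry.grid(row=0, column=1)

    tk.Label(root, text="Average billing").grid(row=1, column=0, sticky="w")
    billing_entry = tk.Entry(root)
    billing_entry.grid(row=1, column=1)

    result = tk.StringVar(value="Enter values and press Predict")
    tk.Label(root, textvariable=result).grid(row=3, column=0, columnspan=2)

    def on_click():
        # Invalid (non-numeric) input is reported instead of crashing the GUI.
        try:
            score = predict(float(delay_entry.get()), float(billing_entry.get()))
            result.set(format_result(score))
        except ValueError:
            result.set("Please enter numeric values")

    tk.Button(root, text="Predict", command=on_click).grid(
        row=2, column=0, columnspan=2
    )
    return root


# To show the window with a placeholder predictor:
# build_window(lambda delay, billing: delay * 2.0).mainloop()
```

In practice the `predict` argument would wrap the trained model, so the form stays independent of how the score is computed.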

Conclusion:

Creating a score from the information known about a customer can automate a credit system and make it more reliable. This approach contributes to a more dispassionate and dependable risk analysis, one based only on market data and information, removing from the system both the effect of personal closeness to the client and the emotions that negotiation can generate.

