Intelligent defaulter Prediction using Data Science Process
Abstract
The machine learning classifiers currently employed are not good enough to classify loan defaulters clearly. To alleviate this problem, the data science process is adopted. During the machine learning phase, a random forest classifier with tuned hyperparameters is trained on the loan lending dataset to obtain the bank loan defaulter model. The classifier is trained on a reduced set of features, selected as predictor variables after the Exploratory Data Analysis phase. The model's ability to correctly classify unseen loan applicant data is evaluated in terms of classifier accuracy and, during model diagnosis, by attending to the false positives in the confusion matrix, which, if left unattended, could prove fatal for loan lending banks. The conclusion drawn from this analysis is that the Random Forest classifier outperforms state-of-the-art statistical classifiers in terms of classifier accuracy.
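The evaluation step described in the abstract, reading accuracy and false positives off a confusion matrix, can be illustrated with a minimal sketch in plain Python. It is not the authors' code. The function names, the label encoding, and the sample labels are hypothetical, and because the abstract does not state which class is treated as positive, the sketch takes the positive label as a parameter.

```python
def confusion_counts(y_true, y_pred, positive):
    """Count (tp, fp, fn, tn) for a binary task, treating `positive` as the positive class."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if actual == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that are correct."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def false_positive_rate(fp, tn):
    """Fraction of actual negatives that were wrongly predicted positive."""
    negatives = fp + tn
    return fp / negatives if negatives else 0.0


# Made-up labels for illustration only (assumed encoding: 1 = repaid, 0 = defaulted).
y_true = [1, 1, 0, 0, 1, 0, 1, 1]
y_pred = [1, 0, 0, 1, 1, 0, 1, 1]

tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive=1)
print("accuracy:", accuracy(tp, fp, fn, tn))
print("false positive rate:", false_positive_rate(fp, tn))
```

Under the assumed encoding, a false positive is an applicant predicted to repay who actually defaulted, which is the kind of error a lender would want to inspect alongside overall accuracy.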
