I need to predict the likelihood of a customer becoming delinquent on a loan
and defaulting on it. I’m struggling to fulfil the job requirements with RapidMiner and need someone’s expertise to clean and model the data as required.
The following tasks are required:
1) Conduct an exploratory data analysis of the training data set using the
RapidMiner Studio data mining tool.
2) Build a Decision Tree model for predicting loan delinquency based on the data set
using RapidMiner and an appropriate set of data mining operators and a
reduced set of variables determined by the exploratory data analysis in Task 1.
3) Build a Logistic Regression model for predicting loan delinquency based on the
data set using RapidMiner and an appropriate set of data mining operators
and a reduced set of variables determined by the exploratory data analysis in Task 1.
4) Conduct a comparative performance evaluation of the final Decision Tree
model against the final Logistic Regression model for predicting loan delinquency.
*Note* you will need to use the Cross Validation operator, the Apply Model operator and
the Performance (Binominal Classification) operator in your final data mining process
models (Decision Tree, Logistic Regression) to generate the model performance
metrics (Accuracy, Misclassification Rate, True Positive Rate, False Positive Rate, Area
Under the ROC Curve (AUC), Precision, Recall, Lift, Sensitivity, F Measure) required for Task
4.
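To sanity-check the RapidMiner output, the metrics listed in the note can be recomputed by hand from the confusion matrix. Below is a minimal Python sketch of that arithmetic, run outside RapidMiner. The names `Confusion`, `classification_metrics` and `auc` are made up for illustration, the counts are placeholder numbers rather than results from the data set, and lift is assumed here to mean precision divided by the share of positive (delinquent) cases, which may differ from how RapidMiner reports it.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Confusion-matrix counts, with 'delinquent' as the positive class."""
    tp: int  # delinquent, predicted delinquent
    fp: int  # not delinquent, predicted delinquent
    tn: int  # not delinquent, predicted not delinquent
    fn: int  # delinquent, predicted not delinquent


def safe_div(num: float, den: float) -> float:
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def classification_metrics(c: Confusion) -> dict:
    """Compute the threshold-based metrics from the four counts."""
    total = c.tp + c.fp + c.tn + c.fn
    accuracy = safe_div(c.tp + c.tn, total)
    recall = safe_div(c.tp, c.tp + c.fn)  # same as TPR and sensitivity
    precision = safe_div(c.tp, c.tp + c.fp)
    base_rate = safe_div(c.tp + c.fn, total)  # share of positives in the data
    return {
        "accuracy": accuracy,
        "misclassification_rate": 1 - accuracy,
        "true_positive_rate": recall,
        "false_positive_rate": safe_div(c.fp, c.fp + c.tn),
        "precision": precision,
        "recall": recall,
        "sensitivity": recall,
        "f_measure": safe_div(2 * precision * recall, precision + recall),
        "lift": safe_div(precision, base_rate),  # assumed definition
    }


def auc(scores: list, labels: list) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return safe_div(wins, len(pos) * len(neg))


if __name__ == "__main__":
    # Placeholder counts, not taken from the loan data set.
    for name, value in classification_metrics(Confusion(40, 10, 130, 20)).items():
        print(f"{name}: {value:.4f}")
```

Running the same function on the confusion matrices of both final models gives a like-for-like table for the comparison in Task 4. True Positive Rate, Recall and Sensitivity are the same quantity, so those three values should always match.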
I require the RapidMiner file with explanatory notes.