
Customer Churn at a Bank
Business Objective:
For the bank to better understand its customers using customer segmentation analysis.
For the bank to be more accurate and efficient in predicting customer churn.
I also assessed the implications of using the “Gender” variable and the bias it can introduce in banking data.
Dataset: 23 variables and 10,127 observations, covering demographic, financial and behavioral variables.
Approach:
Used K-means clustering for customer segmentation
Training/testing split of the data: 80/20
Ran about 7 models, including logistic (logit) regression, decision trees, gradient boosting, random forest, support vector machine and neural networks (ANN)
Compared the specificity and sensitivity of each of these models
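To make the 80/20 split and the sensitivity/specificity comparison concrete, here is a minimal, dependency-free Python sketch. It is an illustration only, not the code behind this analysis: the function names, the labels, the predictions and the model names (model_a, model_b) are all hypothetical, and I assume that 1 means "churned" (the positive class) and 0 means "stayed".

```python
import random


def split_80_20(rows, seed=42):
    """Shuffle a copy of the rows and cut it into 80% train / 20% test."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity); assumes 1 = churned, 0 = stayed.

    Both classes must be present in y_true, otherwise a division by zero occurs.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # churners caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # churners missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # stayers recognised
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    return tp / (tp + fn), tn / (tn + fp)


# Row indices stand in for the 10,127 observations.
train_rows, test_rows = split_80_20(range(10127))
print(len(train_rows), len(test_rows))

# Hypothetical test labels and predictions from two placeholder models.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predictions = {
    "model_a": [1, 1, 1, 0, 0, 0, 0, 0, 0, 1],
    "model_b": [1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
}

results = {
    name: sensitivity_specificity(y_true, y_pred)
    for name, y_pred in predictions.items()
}
for name, (sens, spec) in results.items():
    print(f"{name}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Sensitivity is the share of actual churners a model catches, and specificity is the share of staying customers it correctly leaves alone. Looking at both matters for churn because an accuracy figure alone can hide a model that rarely flags any churner.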
Result: K-means clustering gave 6 as the optimal number of clusters (using the elbow method).
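The elbow method mentioned above can be sketched as follows. This is a small, dependency-free Python illustration on made-up 2-D points, not the bank data and not the code used in this analysis. The helper names are hypothetical, and the deterministic "farthest-first" starting centers are a simplification I chose so the example is reproducible; a library implementation such as scikit-learn's KMeans would normally be used instead.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(pts):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(pts) for c in zip(*pts))


def farthest_first_centers(points, k):
    """Deterministic start: each new center is the point farthest from those chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers


def kmeans_inertia(points, k, n_iter=100):
    """Run Lloyd's algorithm and return the within-cluster sum of squares (inertia)."""
    centers = farthest_first_centers(points, k)
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centers[j]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        new_centers = [mean_point(c) if c else centers[j] for j, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


# Toy data: three well-separated groups of nine points each (hypothetical).
offsets = (-0.5, 0.0, 0.5)
points = [
    (cx + dx, cy + dy)
    for cx, cy in ((0.0, 0.0), (10.0, 0.0), (5.0, 10.0))
    for dx in offsets
    for dy in offsets
]

# Inertia for k = 1..6; the elbow is where the curve stops dropping sharply.
inertias = {k: kmeans_inertia(points, k) for k in range(1, 7)}
for k, value in inertias.items():
    print(f"k={k}: inertia={value:.1f}")
```

On this toy data the inertia drops sharply up to k = 3, which is where the elbow would be read off. On real customer data the bend is usually less clean, so the choice of k involves some judgment.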
I assessed the banking data with and without the ‘Gender’ variable. The differences between the models’ accuracies, sensitivities and specificities can seem small when compared on about 10,000 rows of data, but when the data set is larger, as it usually is for a bank, they DO matter.
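A quick piece of arithmetic shows why a small gap grows with the size of the data. The numbers below are purely hypothetical and are not results from this analysis: I assume a 15% churn rate and two model variants (for example, one trained with and one without ‘Gender’) whose sensitivities differ by one percentage point.

```python
def extra_churners_missed(n_customers, churn_rate, sens_a, sens_b):
    """Number of additional churners missed by the less sensitive of two models."""
    churners = n_customers * churn_rate
    return round(churners * abs(sens_a - sens_b))


# All values are hypothetical and only illustrate the scaling effect.
CHURN_RATE = 0.15
SENS_A, SENS_B = 0.86, 0.85

small = extra_churners_missed(10_000, CHURN_RATE, SENS_A, SENS_B)
large = extra_churners_missed(5_000_000, CHURN_RATE, SENS_A, SENS_B)

print(f"10,000 customers: {small} more churners missed")
print(f"5,000,000 customers: {large} more churners missed")
```

Under these assumptions the same one-point gap means 15 more missed churners on 10,000 customers but 7,500 on 5,000,000, so a difference that looks negligible on a sample can be material at a bank's scale.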

INSIGHTS
Behavioral segmentation seems more appropriate, as it tells the bank what customers are doing with its current services and how that can be leveraged. Looking at demographics alone gives little understanding of how customers differ. Another downside of demographic segmentation is the assumption that everyone within the same demographic behaves identically.
Based on my predictive modeling analysis, some of the variables that contribute to the likelihood of a customer churning or not are:
The number of transactions & total transaction amount
Revolving balance
Customer's relationship with the bank
The data used by creditors in developing and testing a model can perpetuate unintended biased outcomes when variables like ‘Gender’ and ‘Race’ are used, as these variables can reinforce existing, harmful stereotypes and prejudices.
Having such variables is not harmful when, for example, the business is specifically looking at reducing attrition among female customers.
Therefore, I feel WHAT YOU DO WITH YOUR ALGORITHM IS MORE IMPORTANT THAN WHAT THE ALGORITHM DOES FOR YOU.