Customer Retention and Profitability Analysis for the Banking Sector Using Machine Learning

Authors

  • Sumit Pariyar, UG Scholar, Dept. of Computer Science and Engineering (AI & ML), KPR Institute of Engineering and Technology, Coimbatore-641407, Tamil Nadu, India
  • Pandiya Rajan G, Assistant Professor II, Dept. of CSE (Artificial Intelligence and Machine Learning), KPR Institute of Engineering and Technology, Coimbatore-641407, Tamil Nadu, India

DOI:

https://doi.org/10.47392/IRJAEH.2026.0393

Keywords:

Customer churn prediction, customer lifetime value (CLV), banking analytics, XGBoost, SHAP explainability, anomaly detection, Isolation Forest, machine learning, business intelligence, Microsoft Azure

Abstract

Customer churn represents one of the most consequential operational challenges facing modern banking institutions. As fintech alternatives and neobanks continue to erode traditional customer loyalty, banks require predictive systems that go beyond historical rule-based triggers. This paper proposes an end-to-end, AI-driven framework that integrates supervised churn prediction, Customer Lifetime Value (CLV) estimation, and behavioral anomaly detection within a unified operational pipeline. The system employs XGBoost as its core classification engine, trained on a dataset of 88,167 anonymized customer records spanning demographic, transactional, and behavioral attributes. Data preprocessing incorporates SMOTE-based class balancing, z-score normalization, and a suite of engineered features including complaint ratio, activity drop index, and profitability index. The trained model achieves an accuracy of 88.3%, precision of 0.896, recall of 0.850, F1-score of 0.872, and an ROC-AUC of 0.923 on held-out test data. Model decisions are made interpretable through SHAP (SHapley Additive Explanations) analysis, while a Google Gemini integration translates SQL-driven query results into plain-English narratives for non-technical stakeholders. The system is deployed on Microsoft Azure via a FastAPI backend and Streamlit frontend, achieving a mean API response latency of 200 ms and 99.8% uptime over a one-week evaluation period. Experimental results demonstrate that the proposed framework substantially outperforms conventional churn detection approaches and offers a scalable, privacy-conscious foundation for data-driven retention strategy in banking.
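As a quick consistency check on the reported metrics, the F1-score can be recomputed from the stated precision and recall, since F1 is their harmonic mean. The sketch below is illustrative only: the function name `f1_score` and the variable names are assumptions introduced here, not part of the authors' code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Values as reported in the abstract.
reported_precision = 0.896
reported_recall = 0.850
reported_f1 = 0.872

computed_f1 = f1_score(reported_precision, reported_recall)
print(f"computed F1 = {computed_f1:.3f}")  # computed F1 = 0.872
```

The recomputed value agrees with the reported F1-score to three decimal places.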

Published

2026-05-09

How to Cite

Customer Retention and Profitability Analysis for the Banking Sector Using Machine Learning. (2026). International Research Journal on Advanced Engineering Hub (IRJAEH), 4(05), 3102-3108. https://doi.org/10.47392/IRJAEH.2026.0393