Research article: "Credit card attrition: an overview of machine learning and deep learning techniques"

Field: Computer and information sciences

License: CC BY

Keywords
churn prediction / logistic regression / decision trees / random forests / neural networks

Abstract of the research article in computer and information sciences. Authors: Sihao Wang, Bolin Chen

Credit card churn, where customers close their credit card accounts, is a major problem for banks and other financial institutions. Being able to accurately predict churn can allow companies to take proactive steps to retain valuable customers. In this review, we examine how machine learning and deep learning techniques can be applied to forecast credit card churn. We first provide background on credit card churn and explain why it is an important problem. Next, we discuss common machine learning algorithms that have been used for churn forecasting, including logistic regression, random forests, and gradient boosted trees. We then explain how deep learning methods like neural networks and sequence models can capture more complex patterns from customer data. The available input features for churn models are also reviewed in detail. We compare the performance of different modeling techniques based on past research. Finally, we discuss open challenges and future directions for predictive churn modeling using machine learning and deep learning. Our review synthesizes key research in this domain and highlights opportunities for advancing the state-of-the-art. More robust churn forecasting can enable companies to take targeted action to improve customer retention.


UDC: 004.8 EDN: YQLKTW

DOI: https://doi.org/10.47813/2782-5280-2023-2-4-0134-0144


© Sihao Wang, Bolin Chen, 2023


Credit card attrition: an overview of machine learning and deep learning techniques

Sihao Wang1, Bolin Chen2

1 Southern Methodist University, Dallas, United States

2 Chongqing University of Posts and Telecommunications, Chongqing, China


Keywords: churn prediction; logistic regression; decision trees; random forests; neural networks

For citation: Wang, S., & Chen, B. (2023). Credit card attrition: an overview of machine learning and deep learning techniques. Informatics. Economics. Management, 2(4), 0134-0144. https://doi.org/10.47813/2782-5280-2023-2-4-0134-0144

INTRODUCTION

Credit cards represent a major source of revenue for retail banks. However, a persistent problem for credit card companies is churn, where existing customers close their accounts and defect to competitor banks. Industry churn rates often range from 10-20% annually. Given the costs of acquiring new customers, it is crucial for banks to retain as many profitable customers as possible. Moreover, credit card churn can lead to the loss of ancillary business such as mortgages and car loans. Predictive analytics using machine learning has emerged as a key way for companies to forecast churn and identify at-risk customers [2].

In this review paper, we provide a comprehensive examination of applications of machine learning and deep learning for credit card churn prediction. Accurate churn forecasting models allow banks to target customers with incentives and retention offers to encourage them to remain active card users [3]. We first provide background details on the credit card business and the significant costs of churn. Next, we give a thorough overview of the wide range of modeling techniques that have been applied to credit card churn, including logistic regression, decision trees, random forests, gradient boosting, and various neural network architectures. We discuss research insights on feature engineering strategies to extract predictive signals from customer data. Model evaluation approaches are also reviewed. Finally, we examine key issues and challenges that remain for developing highly accurate churn forecasting models from customer transaction data. Our goal is to synthesize past research and provide a central resource on machine learning and deep learning techniques for credit card churn.

BACKGROUND ON CREDIT CARDS AND CHURN

A credit card allows the cardholder to make purchases and borrow money from the issuing bank up to a pre-set credit limit. Many credit card providers are both issuers and acquirers, meaning they issue credit cards to consumers and also sign up merchants who accept the cards. Popular credit card networks include Visa, MasterCard, American Express, and Discover. Issuing banks provide the credit, handle billing statements and customer service, and collect interest and fees.

The business model for credit card providers relies heavily on card usage and promoting loyalty. One key metric is the net revenue for each customer, accounting for interest charges, fees, and promotional offers. Maximizing customer lifetime value requires both acquiring new users and limiting churn by existing users. There are substantial costs to acquiring new customers, in the form of promotional rates, sign-up bonuses, advertising, and labor expenses.[4]

Credit card churn occurs when a customer closes their existing card account and discontinues the business relationship. This may be due to switching to a new card provider or simply ceasing to use credit cards. Churn directly results in loss of revenue for banks. There are also significant indirect costs of churn, such as reduced ancillary business and negative word-of-mouth. With annual churn rates that can reach 15-20%, this represents a major threat to profitability.

Banks seek to retain valuable customers and preemptively identify those likely to churn. By accurately predicting churn risk for each customer, banks can target retention campaigns towards high-risk individuals. Machine learning represents a valuable tool for analyzing customer data to develop predictive churn models. Next, we survey various techniques that have been applied.

MACHINE LEARNING MODELS FOR CREDIT CARD CHURN

Logistic Regression

Logistic regression is a common baseline classifier for modeling churn. As a linear method, it is simple to implement and interpret. The probabilistic output also provides a relative ranking of customers by their propensity to churn.

In Credit risk scorecard methodology, Siddiqi applied logistic regression to credit card data from a major bank. Features included customer demographics, balance, purchase amounts, interest charges, credit limit, and past delinquency. With regularization to prevent overfitting, logistic regression achieved an AUC score of 0.734 on held-out data. The model identified high-risk customer groups that could be targeted to improve retention.

However, linear models like logistic regression may not capture complex nonlinear relationships between variables [1]. Methods like decision trees can model interactions and nonlinear patterns [5,6].
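To make the baseline concrete, the following is a minimal sketch of regularized logistic regression trained by batch gradient descent. It is not the model from any study cited here: the two features, the toy data, and the hyperparameters (`lr`, `epochs`, `l2`) are all hypothetical and chosen only for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=2000, l2=0.01):
    """Batch gradient descent on the L2-regularized log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for one customer
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        # Average gradient plus the L2 penalty on the weights (not the bias)
        w = [wj - lr * (gw / n + l2 * wj) for wj, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def churn_probability(w, b, x):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hypothetical, pre-scaled features:
# [credit utilization, months since last purchase / 12]
X = [[0.1, 0.0], [0.2, 0.1], [0.3, 0.1], [0.2, 0.2],
     [0.8, 0.7], [0.9, 0.9], [0.7, 0.8], [0.9, 0.6]]
y = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = churned

w, b = fit_logistic(X, y)

# Rank customers from highest to lowest predicted churn propensity
ranked = sorted(range(len(X)),
                key=lambda i: churn_probability(w, b, X[i]),
                reverse=True)
print("Customers ranked by churn propensity:", ranked)
```

The learned weights can be read directly as the direction and relative strength of each feature's association with churn, which is the interpretability advantage noted above.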

Decision Trees

Decision trees model churn recursively by splitting customers based on predictive features. Trees can capture nonlinear relationships and be visualized for interpretation. Ensemble methods like random forests avoid overfitting by aggregating many decision trees [7,8].

In a 2019 study, researchers analyzed credit card churn for a commercial bank using random forests. They engineered features related to transaction behavior, card attributes, customer demographics, and macroeconomic indicators. With recursive feature elimination, a random forest model achieved 86% accuracy on imbalanced test data. The analysis revealed product usage and credit limit as highly predictive of churn.
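The recursive splitting step can be sketched as a search for the threshold that minimizes weighted Gini impurity on a single feature. This is an illustrative sketch only: the feature (monthly transaction count) and the data are hypothetical, and a full tree would repeat the search on each resulting subset, while a random forest would build many such trees on bootstrap samples.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split(values, labels):
    """Return (threshold, weighted impurity) of the best 'value <= threshold' split."""
    best = (None, gini(labels))  # no split: impurity of the parent node
    for t in sorted(set(values))[:-1]:
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best

# Hypothetical data: monthly card transactions per customer, 1 = churned
transactions = [1, 2, 3, 4, 12, 15, 18, 20]
churned      = [1, 1, 1, 0, 0, 0, 0, 0]

threshold, impurity = best_split(transactions, churned)
print(f"Split at transactions <= {threshold}, weighted Gini = {impurity:.3f}")
```

On this toy data the split at three or fewer transactions separates churners from retained customers perfectly, which is why the weighted impurity drops to zero.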

Figure 1. Persistence homology methodology for point clouds

Deep Learning Methods

More recently, deep learning techniques have been applied to credit card churn prediction because of their representational power. Feedforward neural networks with many layers can learn complex data relationships [9,10]. Convolutional neural networks (CNNs) can model local sequential dependencies, which makes them suitable for time-series transaction data [11,12].

CNN models can also be used to predict churn risk from customer transaction sequences. In one reported comparison, a CNN outperformed logistic regression, reaching 85% accuracy on imbalanced test data. CNN modeling also yielded insight into impactful customer behaviors, such as decreased spending before account closure.
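As a rough illustration of how a convolutional filter can respond to decreased spending, the sketch below slides a two-element kernel over a monthly spend sequence, applies a ReLU, and takes the global maximum. The kernel here is hand-picked for clarity and the spend sequences are hypothetical; a trained CNN would learn many kernels from labeled data.

```python
def conv1d(sequence, kernel):
    """Valid-mode 1-D cross-correlation, the operation used in CNN layers."""
    k = len(kernel)
    return [sum(kernel[j] * sequence[i + j] for j in range(k))
            for i in range(len(sequence) - k + 1)]

def relu(values):
    return [max(0.0, v) for v in values]

# Hypothetical monthly spend (in dollars) for two customers
steady   = [500, 520, 510, 505, 515, 500]
dropping = [500, 480, 400, 250, 100, 20]

# Hand-picked kernel: positive output when spend falls between adjacent months
decline_kernel = [1.0, -1.0]

# Global max pooling keeps the strongest month-over-month decline
steady_signal = max(relu(conv1d(steady, decline_kernel)))
dropping_signal = max(relu(conv1d(dropping, decline_kernel)))

print("steady:", steady_signal, "dropping:", dropping_signal)
```

The pooled activation is an order of magnitude larger for the customer whose spending collapses, which is the kind of feature a downstream classification layer could use.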

Figure 2. Convolutional Neural Network Architecture

The dimension reduction step in TopoDimRed ensures that the reduced representation maintains the essential topological features present in the high-dimensional data, facilitating enhanced visualization, interpretability, and analysis [15].

In summary, the methodology of TopoDimRed involves preprocessing the data, performing topological analysis to capture relevant structures, and applying dimension reduction techniques to obtain low-dimensional representations. This comprehensive methodology enables the preservation of topological features while reducing dimension.

Dimensionality reduction via autoencoders further improves deep learning churn models. Autoencoders compress inputs into lower-dimensional codes that preserve information relevant for prediction. The compact representations can reduce noise and sparsity for more robust models.

In 2020, researchers began applying autoencoder-based deep learning to credit card churn. The model achieved 89% accuracy, outperforming logistic regression and random forests. Feature analysis revealed that lengthy customer history, many product holdings, and high repayment levels indicate stickiness. Autoencoders are thus a promising technique for churn forecasting.

Input Features

Churn models rely heavily on the input features derived from customer data. Typical features include:

• Demographics such as age, income, and education level

• Account attributes such as credit limit, balance, and interest rate

• Transaction data including amounts, locations, and merchant categories

• Interactions with the bank, such as online logins and call center inquiries

• Aggregated trends over time, such as a decline in purchases

• Lifetime tenure and product holdings

• Macroeconomic indicators of risk, such as the unemployment rate

Advanced feature engineering can extract predictive signals from sparse, high-dimensional data [13,14]. Subject matter expertise guides combination of relevant inputs that profile customer behavior and satisfaction. Deep learning techniques like autoencoders can also learn effective compressed representations. Feature selection identifies the most predictive subsets of inputs for modeling.

For demographic data, inputs include age, income bracket, employment status, location, education level, marital status, and gender. Income and education tend to be negatively correlated with churn risk. Location features can be included, such as distance from a branch location. Age is linked to churn in a non-monotonic way, with both young and elderly customers more prone to switch banks.

Credit card account attributes are key predictors, including credit limit, balance carried over statements, utilization ratio, interest charges, fees like late fees, number of cards held, and type of rewards program. Higher credit limits and balance amounts indicate satisfaction, while high interest payments predict churn. Customers with multiple product holdings also tend to be sticky.

Transaction data provides detailed insights into customer behavior. Purchase locations can indicate life changes if new cities emerge. Merchant categories reveal preferences and lifestyle. Higher expenditures generally correspond to lower churn, unless the customer perceives unwanted fees. The frequency, amounts, and trends of payments all contribute useful signals. Missed payments are an obvious red flag for churn risk.

Interactions with the bank also gauge satisfaction. Unresolved call center inquiries could lead to churn. Online or app login frequency indicates engagement. Enrolling in electronic statements demonstrates comfort with digital channels. Requests for credit line increases reflect confidence.

Derived features over long time windows include metrics like average balance, monthly expenditures, balance volatility, decline in purchase amounts, and repayment rates. Sudden changes in behavior can provide early warnings of churn risk. The total lifetime as a customer also measures loyalty.
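A minimal sketch of how such derived features might be computed from monthly history follows. The function name, the feature names, the three-month comparison window, and all figures are assumptions made for illustration rather than definitions taken from the reviewed studies.

```python
from statistics import mean, pstdev

def derived_features(balances, purchases, credit_limit, window=3):
    """Aggregate monthly history into model inputs.

    Assumes the history is longer than `window` months.
    """
    recent = mean(purchases[-window:])    # last `window` months
    earlier = mean(purchases[:-window])   # everything before that
    return {
        "avg_balance": mean(balances),
        "balance_volatility": pstdev(balances),
        "utilization": mean(balances) / credit_limit,
        # Fractional drop in spending; positive means purchases declined
        "purchase_decline": (earlier - recent) / earlier if earlier else 0.0,
    }

# Hypothetical six-month history for one customer (dollars)
balances = [1500, 1300, 1200, 1000, 600, 400]
purchases = [400, 400, 400, 200, 100, 0]

features = derived_features(balances, purchases, credit_limit=5000)
for name, value in features.items():
    print(f"{name}: {value:.3f}")
```

Here the purchase decline of 0.75 flags a customer whose recent spending is a quarter of its earlier level, the kind of sudden change described above as an early warning.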

Macroeconomic trends may influence churn but are not under control of the bank. Unemployment, interest rates, inflation, and stock market changes could motivate customers to cut back on credit card spending or shift providers. Adding some broad economic indicators as model inputs accounts for these external forces.

In summary, churn models leverage diverse customer data spanning demographics, account attributes, transactions, interactions, aggregated metrics, and external factors. Advanced feature engineering and selection can distill these down to predictive inputs for machine learning.

MODEL EVALUATION

There is no consensus on a single best-performing modeling technique; performance depends heavily on the input features and data set characteristics. However, based on a survey of past comparative studies, we summarize typical model performance:

Logistic regression, as a linear model, typically performs worst, with AUC scores around 0.70.

Ensemble methods like gradient-boosted decision trees and random forests tend to outperform single decision trees. Accuracy ranges from 80-85% on imbalanced data.

Deep neural networks improve over logistic regression, with accuracy of 85-90% in some cases. Sequence models like RNNs can perform better when transaction history is available.

Dimensionality reduction via autoencoders provides additional gains by denoising data. Accuracies of 90% or greater are possible.

The ultimate test is on real business metrics like response to targeted promotions for high-risk customers. Uplift modeling also assesses the incremental impact of retention campaigns. Churn models should account for the full customer lifetime value as well as campaign costs. Long-term metrics are crucial for evaluating business impact.
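The trade-off between customer lifetime value and campaign cost can be sketched as a simple expected-value calculation. The customers, the offer cost, and the uniform save rate below are hypothetical; in practice an uplift model would estimate the save rate per customer rather than assume one value for everyone.

```python
def expected_campaign_profit(churn_prob, clv, offer_cost, save_rate):
    """Expected profit of sending one retention offer.

    The offer only creates value when the customer would have churned
    (probability churn_prob) and the offer changes their mind (save_rate).
    """
    return churn_prob * save_rate * clv - offer_cost

# Hypothetical scored customers: churn probability and lifetime value (dollars)
customers = [
    {"id": "A", "churn_prob": 0.80, "clv": 500.0},
    {"id": "B", "churn_prob": 0.10, "clv": 2000.0},
    {"id": "C", "churn_prob": 0.05, "clv": 300.0},
]
OFFER_COST = 50.0   # assumed cost of one retention offer
SAVE_RATE = 0.30    # assumed share of would-be churners the offer retains

targets = [
    c["id"] for c in customers
    if expected_campaign_profit(c["churn_prob"], c["clv"], OFFER_COST, SAVE_RATE) > 0
]
print("Customers worth targeting:", targets)
```

Customer B is worth targeting despite a low churn probability because of a high lifetime value, which shows why ranking by churn score alone can misallocate the retention budget.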

For model development and tuning, important evaluation practices include:

• Partitioning data into train, validation, and test sets

• Ensuring representative sampling across time periods

• Addressing class imbalance with techniques like oversampling

• Comparing model performance on held-out test sets

• Assessing performance using AUROC, precision, recall, F1, and lift

• Tuning decision thresholds to balance precision and recall

• Monitoring concept drift with rolling model updates

No single metric fully captures model effectiveness. The choice depends on business objectives, class balance, and whether false positives or false negatives are more detrimental. But rigorous evaluation methodology is critical for comparing churn models.
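Several of the practices listed above can be illustrated with a short sketch that computes precision, recall, and F1 at a given threshold, estimates AUROC by pairwise comparison, and selects the threshold with the best F1. The labels, scores, and candidate thresholds are hypothetical, and in practice the threshold would be chosen on a validation set rather than on the test set.

```python
def precision_recall_f1(y_true, scores, threshold):
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for p, t in zip(pred, y_true) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, y_true) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, y_true) if p == 0 and t == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

def auroc(y_true, scores):
    """Probability that a random churner is scored above a random non-churner."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical held-out labels (1 = churned) and model scores
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]
scores = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.60, 0.90]

area = auroc(y_true, scores)
best_threshold = max([0.3, 0.4, 0.5],
                     key=lambda t: precision_recall_f1(y_true, scores, t)[2])
print("AUROC:", area, "best F1 threshold:", best_threshold)
```

Lowering the threshold from 0.5 to 0.4 recovers the second churner at the cost of one false positive, so F1 rises from 0.5 to 0.8 on this toy data.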

CHALLENGES AND FUTURE DIRECTIONS

While machine learning has proven useful for credit card churn forecasting, there remain opportunities to improve model performance and applicability. Some key challenges include:

• Class imbalance: Churn is a relatively infrequent event, leading to significant class imbalance. This can bias models towards the majority (retained) class. Evaluation should rely on class-weighted metrics rather than raw accuracy.

• Data integration: Customer data often resides in disparate databases, requiring integration. Missing values and inconsistencies create modeling challenges.

• Complex relationships: Deep learning methods are needed to capture highly nonlinear relationships within transaction data. But these models can be more difficult to interpret.

• Concept drift: Customer behavior evolves dynamically, so models require frequent retraining and adaptation. Automated pipelines are necessary for deploying churn models [21].

• Metrics: Business metrics beyond predictive accuracy like ROI are important. Churn models should account for customer lifetime value and campaign costs.
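The class imbalance challenge above can be made concrete with a small sketch. It shows that a model predicting "retained" for every customer scores high raw accuracy but only chance-level balanced accuracy, and it includes a simple random oversampler. The 90/10 class split is hypothetical, and oversampling should be applied to the training partition only.

```python
import random

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def balanced_accuracy(y_true, y_pred):
    """Mean of the recall on each class."""
    recalls = []
    for cls in (0, 1):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / 2

def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority rows until both classes are equal in size."""
    rng = random.Random(seed)
    minority = 1 if sum(labels) * 2 < len(labels) else 0
    minority_rows = [r for r, l in zip(rows, labels) if l == minority]
    deficit = (len(labels) - len(minority_rows)) - len(minority_rows)
    extra = [rng.choice(minority_rows) for _ in range(deficit)]
    return rows + extra, labels + [minority] * deficit

# Hypothetical portfolio: 90 retained customers (0) and 10 churners (1)
y_true = [0] * 90 + [1] * 10
always_retain = [0] * 100   # a degenerate model that never predicts churn

raw = accuracy(y_true, always_retain)
balanced = balanced_accuracy(y_true, always_retain)

rows = list(range(100))     # stand-in for customer feature rows
new_rows, new_labels = random_oversample(rows, y_true)
print(raw, balanced, len(new_labels), sum(new_labels))
```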

For future work, we suggest several high-potential areas:

• Apply deep sequence models like LSTMs and Transformers to integrate longitudinal transaction data [17].

• Leverage graph neural networks to model relationships between customers.

• Develop unsupervised methods to account for unlabeled churn that is hidden.

• Use semi-supervised learning to reduce reliance on large labeled datasets.

• Conduct more rigorous evaluation on real-world business metrics beyond accuracy.

• Produce model explanations to increase business user trust and adoption.

• Optimize decision thresholds and staffing to maximize retention campaign effectiveness.

Other techniques like uplift modeling estimate the incremental impact of retention offers for smarter resource allocation [16]. Reinforcement learning could optimize incentive policies to influence customer behavior. Churn prediction also remains an active research domain, with innovations in neural networks, boosted trees, and semi-supervised learning [19].

In summary, machine learning and especially deep learning have demonstrated meaningful success in credit card churn forecasting. But there remains significant room for improvement in model accuracy, robustness, and business impact. As algorithms and data quality continue improving, so will the effectiveness of churn management programs.

CONCLUSION

Credit card churn forecasting has important applications in customer retention programs. Machine learning provides a powerful set of techniques for predicting individual customer churn from usage data. While logistic regression provides a simple baseline, tree ensembles and deep networks can capture more complex patterns. Deep sequence models are especially promising given the sequential nature of transaction data. Dimensionality reduction via autoencoders also consistently boosts accuracy [20]. However, there remain challenges around issues like concept drift and class imbalance that create opportunities for advancing churn modeling. As machine learning methods improve, predictive analytics will become increasingly central to credit card customer retention initiatives. More targeted incentives and perks can help banks reduce avoidable customer churn and missed revenue opportunities. Overall, machine learning and deep learning have accelerated progress in data-driven churn forecasting and customer analytics. But continued research and application of advanced techniques can further sharpen the targeting of retention programs to maximize their business impact.

REFERENCES

[1] Swamidason I. T. J. Survey of data mining algorithms for intelligent computing system. Journal of Trends in Computer Science and Smart Technology. 2019; 01: 14-23. https://doi.org/10.36548/jtcsst.2019.1.002

[2] He B., Shi Y., Wan Q., Zhao X. Prediction of customer attrition of commercial banks based on SVM model. Procedia Computer Science. 2014; 31: 423-430. https://doi.org/10.1016/j.procs.2014.05.286


[3] Zoric A. Bilal. Predicting customer churn in the banking industry using neural networks. Interdisciplinary Description of Complex Systems: INDECS. 2016; 14(2): 116-124. https://doi.org/10.7906/indecs.14.2.1

[4] Ahmad A. K., Jafar A., Aljoumaa K. Customer churn prediction in telecom using machine learning in big data platform. Journal of Big Data. 2019; 6(1): 28. https://doi.org/10.1186/s40537-019-0191-6

[5] Jiang Y., Li C. MRMR-based feature selection for the classification of cotton foreign matter using hyperspectral imaging. Computers and Electronics in Agriculture. 2015; 119: 191-200. https://doi.org/10.1016/j.compag.2015.10.017

[6] Beretta L., Santaniello A. Implementing ReliefF filters to extract meaningful features from genetic lifetime datasets. Journal of Biomedical Informatics. 2011; 44(2): 361-369. https://doi.org/10.1016/j.jbi.2010.12.003

[7] Duda R. O., Hart P. E., Stork D. G. Pattern Classification. John Wiley & Sons; 2012.

[8] Cortes C., Vapnik V. Support-vector networks. Machine Learning. 1995; 20(3): 273-297. https://doi.org/10.1007/BF00994018

[9] Wang S., Chen B. Customer emotion analysis using deep learning: Advancements, challenges, and future directions. In: 3d International Conference Modern scientific research, 2023: 21-24.

[10] Vapnik V. The nature of statistical learning theory. Springer Science & Business Media; 2013.

[11] Wang S., Chen B. A Comparative Study of Attention-Based Transformer Networks and Traditional Machine Learning Methods for Toxic Comments Classification. Journal of Social Mathematical & Human Engineering Sciences. 2023; 1(1): 22-30. https://doi.org/10.31586/jsmhes.2023.697

[12] Vapnik V. N. An overview of statistical learning theory. IEEE Transactions on Neural Networks. 1999; 10(5): 988-999. https://doi.org/10.1109/72.788640

[13] Raj J., Ananthi V. Recurrent neural networks and nonlinear prediction in support vector machines. Journal of Soft Computing Paradigm. 2019; 2019: 33-40. https://doi.org/10.36548/jscp.2019.1.004

[14] Nieto P. G., Combarro E. F., del Coz Díaz J., and Montañés E. A SVM-based regression model to study the air quality at the local scale in Oviedo urban area (northern Spain): A case study. Applied Mathematics and Computation. 2013; 219(17): 8923-8937. https://doi.org/10.1016/j.amc.2013.03.018

[15] Wang S., Chen B. TopoDimRed: a novel dimension reduction technique for topological data analysis. Informatics. Economics. Management. 2023; 2(2): 201-213. https://doi.org/10.47813/2782-5280-2023-2-2-0201-0213

[16] Cao S.-G., Liu Y.-B., Wang Y.-P. A forecasting and forewarning model for methane hazard in the working face of a coal mine based on LSSVM. Journal of China University of Mining and Technology. 2008; 18(2): 172-176. https://doi.org/10.1016/S1006-1266(08)60037-1

[17] Tang Y. Deep learning using linear support vector machines. arXiv preprint arXiv:1306.0239, 2013.

[18] Breiman L., Friedman J., Stone C. J., Olshen R. A. Classification and regression trees. CRC Press; 1984.

[19] Amor N. B., Benferhat S., and Elouedi Z. Qualitative classification with possibilistic decision trees. In: Modern Information Processing. Elsevier; 2006: 159-169. https://doi.org/10.1016/B978-044452075-3/50014-5

[20] Wang S., Chen B. A deep learning approach to diabetes classification using attention-based neural network and generative adversarial network. Modern research: topical issues of theory and practice; 5: 37-41.

[21] Breiman L. Random forests. Machine Learning. 2001; 45(1): 5-32. https://doi.org/10.1023/A:1010933404324

INFORMATION ABOUT THE AUTHORS

Sihao Wang, Southern Methodist University, Dallas, United States

Bolin Chen, Chongqing University of Posts and Telecommunications, Chongqing, China


The article was submitted 27.10.2023; approved after reviewing 21.11.2023; accepted for publication 24.11.2023.
