EMPIRICAL STUDY ON REGULATORY SANDBOX APPLICATION BASED ON SIMULATION AND REINFORCEMENT LEARNING

Yang Xuan

ISSN 2310-5577


Premier Publishing (ppublishing.org)

DOI:10.29013/ESR-24-5.6-24-29


Yang Xuan 1

1 National University of Uzbekistan

Cite: Yang Xuan. (2024). Empirical Study on Regulatory Sandbox Application Based on Simulation and Reinforcement Learning. European Science Review 2024, No 5-6. https://doi.org/10.29013/ESR-24-5.6-24-29

Abstract

This paper addresses the challenge of risk pricing in commercial banks amid increasing systemic risks influenced by global economic fluctuations, policy adjustments, and major global events. We introduce a novel framework combining digital twin technology and deep reinforcement learning to aid in more effective interest rate pricing decisions. By constructing a digital twin environment that simulates the operational conditions of commercial banks under various scenarios, and employing deep reinforcement learning models, the framework aims to devise optimal interest rate strategies that align with the banks' objectives. Our empirical analyses demonstrate the superiority of this AI-driven approach over traditional expert pricing methods, offering a robust decision support system for managing risk pricing in commercial banks.

Keywords: Commercial Banks, Risk Pricing, Digital Twin, Deep Reinforcement Learning, Interest Rate Pricing, Systemic Risks, Simulation Environment, Financial Technology, Decision Support System

Introduction

Commercial banks are specialized institutions whose business is operating with risk, and risk pricing is one of the core problems they must face. Over their long-term operations, banks have gradually developed a variety of risk pricing strategies and schemes for dealing with non-systematic risks. In the current environment, however, increasingly complex factors such as global economic cycle fluctuations, monetary policy adjustments, the Russia-Ukraine war, and the COVID-19 pandemic have driven a continuous increase in systemic risk worldwide. How to price interest rates has become a major challenge for commercial banks, and how banks in this new era can price risk more scientifically is a problem worthy of in-depth research. This paper proposes a decision support framework for commercial bank risk pricing that fully considers the impact of systemic risk factors: first, through business analysis, a digital twin environment is constructed from macro and micro environmental factors; second, deep reinforcement learning models are trained in this environment to derive interest rate strategies that align with the bank's objectives.

I. Research Review of Simulation and Reinforcement Learning Techniques

The digital twin environment referred to in this paper is mainly constructed with simulation technology. The basic idea is to establish an experimental model that captures the main characteristics of the system under study, and to obtain the required information by running this model. In fields related to commercial banking, many scholars have studied simulation technology: In February 2017, Cui Yu et al. used the SIRS simulation model to study the mechanism of risk cross-contagion between financial markets, which helped enhance the national financial market's risk prevention and control capabilities.

In February 2020, Grundke P and Kühn A built a simulation environment containing credit risk, interest rate risk and liquidity risk, using classified balance sheets to measure the impact on the Liquidity Coverage Ratio (LCR) and Net Stable Funding Ratio (NSFR). In March 2020, Zhang Shanshan established a systemic financial risk simulation model through system dynamics and conducted sensitivity analysis to detect sensitive risk factors, which has certain guiding value for preventing systemic risks in finance. In May 2021, Seyed Mohammad Sina Seyfi et al. proposed a Monte Carlo simulation algorithm based on a Gaussian mixture model, which can achieve fast and accurate calculation of Value-at-Risk (VaR) and Expected Shortfall (ES), having a positive effect on the financial risk market. In September 2021, Wang Dingxiang et al. focused on the accounts receivable financing model, and based on the analysis of credit risk transmission factors and transmission mechanisms in supply chain finance, they built intensity models and SIR models and conducted simulations to provide preventive measures for credit risk management in supply chain finance. In October 2022, Wu Yongfei et al. innovatively introduced digital twin technology and built a "digital twin environment on the bank's asset side" and a "digital twin environment on the bank's liability side" respectively, simulating the future operating conditions of commercial banks under different risk scenarios and different pricing strategies, providing a reference for bank risk pricing decision support.

The reinforcement learning used in this paper is an important branch of machine learning. Unlike supervised and unsupervised learning, reinforcement learning is a self-supervised learning approach: on the one hand, the agent is trained on action and reward data and optimizes its action strategy; on the other hand, it interacts autonomously with the environment, observing it and obtaining its feedback. Many researchers at home and abroad have studied reinforcement learning technology: In 2015, Mnih et al. published a paper in "Nature" proposing the Deep Q Network (DQN) model, which combines deep learning and reinforcement learning and, after a period of training, can exceed the level of human players in Atari 2600 games. The DeepMind team developed the AlphaGo and AlphaGo Zero programs based on deep neural networks, using the Monte Carlo Tree Search (MCTS) algorithm and combining supervised learning and reinforcement learning training methods; the programs learned Go strategies that surpass human level, defeating the top human players Lee Sedol in 2016 and Ke Jie in 2017. In 2019, Tencent developed the "Juewu" ("Absolute Enlightenment") agent based on deep reinforcement learning technology, which can surpass professional players in the game Honor of Kings (King of Glory). In 2022, Wang Yan-bo et al. applied deep reinforcement learning technology to the medical insurance fund allocation decision problem; their empirical results showed that the allocations given by reinforcement learning can greatly save time and labor costs, providing decision support for medical insurance fund allocation work.

II. Empirical Research on Commercial Bank Risk Pricing Based on Digital Twins and Reinforcement Learning

2.1 Business Understanding

In their long-term operations, commercial banks have gradually developed a variety of risk pricing schemes, represented by the cost-plus method and the benchmark interest rate method. The cost-plus method sets the rate by comprehensively examining the financing costs, operating costs and customer default costs of the loan, and then adding a certain expected profit. This method pays little attention to factors such as peer competition and market pricing levels, and is therefore strongly inward-looking. The benchmark interest rate method first selects an appropriate benchmark interest rate, and then adds the default risk premium and the term risk premium; this pricing method has a strong market orientation, but often pays less attention to the various costs incurred by the loan, which may result in a certain risk exposure.
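The two traditional schemes described above can be written as simple additive formulas. The following is a minimal sketch under that additive reading; the function names and all numeric inputs are illustrative assumptions, not values or code from the study.

```python
def cost_plus_rate(funding_cost, operating_cost, default_cost, target_profit):
    """Cost-plus pricing: sum the loan's cost components, then add the expected profit."""
    return funding_cost + operating_cost + default_cost + target_profit


def benchmark_rate(benchmark, default_premium, term_premium):
    """Benchmark pricing: start from a market reference rate, then add the risk premiums."""
    return benchmark + default_premium + term_premium


# Illustrative inputs only (annualised rates as decimals), not data from the study.
cp = cost_plus_rate(funding_cost=0.020, operating_cost=0.008,
                    default_cost=0.010, target_profit=0.012)
bm = benchmark_rate(benchmark=0.0365, default_premium=0.010, term_premium=0.004)
print(f"cost-plus rate: {cp:.4f}")
print(f"benchmark rate: {bm:.4f}")
```

Note that neither formula takes the market state or competitors' pricing as an input, which is the limitation the simulation-based approach is meant to address.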

In this context, this paper innovatively proposes the use of a combined framework of digital twins and reinforcement learning for risk pricing of commercial bank loans. The study selected real data from a branch of a national joint-stock commercial bank to construct a digital twin simulation environment for three different loan types: mortgage loans, consumer loans and personal loans, and used the Deep Deterministic Policy Gradient (DDPG) method to determine the best interest rate in order to help commercial banks improve their loan risk pricing capabilities.

2.2 Data Understanding

For specific loan business scenarios and corresponding market conditions, it is first necessary to conduct an in-depth analysis of the influencing factors, and then construct a simulation environment for commercial banks under different conditions. Through research, it can be seen that the factors that affect the simulation environment of commercial bank loan pricing mainly include capital cost, degree of loan risk, loan term, loan amount, degree of competition in the lending market, macroeconomic factors, etc. This paper extracts statistical data and factor data including branch interest rates from January 2010 to March 2022, loan conditions and non-performing conditions at different interest rate levels, one-year LPR, five-year LPR, GDP, CPI index, CSI 1000 index, three-month SHIBOR, one-year SHIBOR, M2 money supply, cumulative city-wide GDP, cumulative city-wide GDP growth, electricity consumption (financial, real estate, business and resident services industries), cumulative total retail sales of consumer goods, cumulative personal income tax, and cumulative rural and urban per capita disposable income, covering three major categories of economic indicators including macro, industry and region.

2.3 Construction of Digital Twin Simulation Environment Model

The construction of the digital twin simulation environment mainly involves two modules: the loan supply model and the credit risk model. For the loan supply model, this paper uses XGBoost to build models for three specific loan businesses: mortgage loans, consumer loans and personal loans. Because risk shocks affect the overall supply of commercial bank loan projects in the market, the model simulates the relationship between the interest rate pricing and the total loan amount at that interest rate, i.e., the customer supply of each business in different time periods and under different interest rate pricing, thus forming the supply model of the simulation environment. The model constructed in this paper can automatically explore the relationships between factors. The results show that although the factor weights differ across the supply models of the different loan types, the most important influencing factor in each is the lending interest rate.

In terms of the credit risk model, since external macro and micro factors will also affect the credit risk of commercial bank loan customers, macro and micro factors are also selected as explanatory variables when modeling. The data includes bank-related statistics and macroeconomic factors such as CPI, CSI 1000 Index, one-year and five-year LPR, and combined with the bank's own loan policy and pricing situation, the model simulates the relationship between interest rate pricing and the non-performing rate at that interest rate, i.e., the credit risks of customers of different businesses under different time periods and different interest rate pricing, thereby building a credit risk model in the simulation environment. At the same time, the model optimizes some abnormal situations, so that when the interest rate is very low or very high, the non-performing rate can be consistent with the highest non-performing rate in history.
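To make the structure of the simulation environment concrete, the sketch below wires a supply model and a credit risk model into one environment that maps a lending rate to loan volume, non-performing rate and income. It is a toy stand-in, not the paper's implementation: the closed-form curves replace the fitted XGBoost models, and the class name `LoanTwinEnv`, the rate band, the coefficients and the income formula (which assumes the full principal of a non-performing loan is lost) are all hypothetical.

```python
import math


class LoanTwinEnv:
    """Toy stand-in for the digital twin: lending rate -> volume, NPL rate, income."""

    def __init__(self, rate_bounds=(0.03, 0.08), historical_max_npl=0.05):
        self.low, self.high = rate_bounds
        self.historical_max_npl = historical_max_npl

    def supply(self, rate):
        # Hypothetical supply curve: loan volume decays as the rate rises.
        return 1000.0 * math.exp(-25.0 * rate)

    def npl_rate(self, rate):
        # Abnormal-rate handling: outside the plausible band, return the
        # historical maximum non-performing rate.
        if rate < self.low or rate > self.high:
            return self.historical_max_npl
        # Hypothetical risk curve: higher rates attract riskier borrowers.
        raw = 0.005 + 0.4 * (rate - self.low)
        return min(raw, self.historical_max_npl)

    def step(self, rate):
        volume = self.supply(rate)
        npl = self.npl_rate(rate)
        # Assumed income: interest on performing loans minus principal lost on bad loans.
        income = volume * ((1.0 - npl) * rate - npl)
        return {"volume": volume, "npl_rate": npl, "income": income}


env = LoanTwinEnv()
for r in (0.01, 0.04, 0.06, 0.20):
    print(r, env.step(r))
```

In a fuller version, `supply` and `npl_rate` would call the fitted models with the macro, industry and regional factors for the current period as additional inputs.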

2.4 Deep Reinforcement Learning Model Construction

This paper makes adaptive improvements to the DDPG model in deep reinforcement learning, transforming the part of the original algorithm aimed at a discrete decision space into one for a continuous decision space, and using the Monte Carlo simulation method to replace the original "temporal difference learning" mechanism. Based on the actual situation of commercial bank mortgage loans, consumer loans and personal loans, this paper focuses on the total default rate and total income of the different loan businesses and takes these as the optimization objective of reinforcement learning, i.e., the bank should optimize its indicators as far as possible while meeting the regulatory requirements on its operating indicators, so as to reduce the default rate while increasing total income.
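The difference between the two learning targets can be illustrated briefly. A temporal difference target bootstraps from the critic's own estimate of the next state, whereas a Monte Carlo target uses the discounted sum of rewards actually observed until the end of the simulated episode. The sketch below computes the latter, together with one possible reward that trades income against the non-performing rate; the reward shape, the cap `npl_cap` and the weight `penalty` are assumptions for illustration, since the paper does not give its reward formula.

```python
def reward(income, npl_rate, npl_cap=0.02, penalty=100.0):
    """Income, penalised when the non-performing rate breaches a hypothetical cap."""
    breach = max(0.0, npl_rate - npl_cap)
    return income - penalty * breach


def monte_carlo_returns(rewards, gamma=0.99):
    """Discounted return G_t = r_t + gamma * G_{t+1}, computed backwards over an episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Illustrative episode of three decision rounds (made-up rewards).
print(monte_carlo_returns([1.0, 1.0, 1.0], gamma=0.5))
```

Monte Carlo targets require complete episodes before the critic can be updated, which is practical here because each simulated round in the digital twin is short and cheap to run.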

The algorithm flow chart of the reinforcement learning model used in this paper is as follows:

1. Randomly initialize the parameters of the actor network, the critic network, the target actor network and the target critic network; initialize the experience replay pool.

2. In each simulation round, loop:

2.1 Initialize the environment and obtain the initial state.

In each decision round, loop:

2.2 Calculate the decision, $a = \mu(s) + \mathcal{N}$, where $\mu$ is the actor network, $s$ is the current state, and $\mathcal{N}$ is random noise following a normal distribution;

2.3 Interact with the environment to obtain the current round's reward and the environmental state of the next round;

2.4 Store the quadruple $(s, a, r, s')$ (state, decision, reward, next state) in the experience replay pool, and when the experience pool is full, clear it and train the network.

3. Sample a batch of training samples uniformly from the experience replay pool, update the parameters of the actor-critic (AC) networks according to the parameter update formula, and periodically perform soft updates on the AC target networks.

4. When the non-performing rate and income output by the simulation model are better than those under the bank's own pricing (a lower non-performing rate and a higher income), the training can be stopped.

Figure 1. DDPG Algorithm Model Framework
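Three building blocks of the procedure above (the experience replay pool of step 2.4, the noisy decision of step 2.2 and the soft target update of step 3) can be sketched without any deep learning library. This is a minimal illustration under assumptions, not the paper's code: parameters are plain lists of floats rather than network weights, the pool overwrites its oldest entries instead of being cleared when full, and clipping the decision to a valid rate range is an added assumption.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state) quadruples."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)  # oldest entries drop out when full

    def push(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Uniform sampling without replacement, as in step 3.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def noisy_action(policy, state, sigma, low, high):
    """Step 2.2: deterministic policy output plus Gaussian noise, clipped to [low, high]."""
    action = policy(state) + random.gauss(0.0, sigma)
    return min(max(action, low), high)


def soft_update(target_params, online_params, tau):
    """Step 3: move each target parameter a fraction tau towards its online counterpart."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target_params, online_params)]


# Usage with a hypothetical constant policy that always proposes a 5% rate.
buf = ReplayBuffer(capacity=100)
state = 0
for _ in range(10):
    a = noisy_action(lambda s: 0.05, state, sigma=0.005, low=0.03, high=0.08)
    buf.push(state, a, 0.0, state + 1)
    state += 1
batch = buf.sample(4)
print(len(buf), len(batch), soft_update([0.0, 1.0], [1.0, 0.0], tau=0.5))
```

A small `tau` makes the target networks trail the online networks slowly, which is what stabilises the critic's learning target in actor-critic methods of this kind.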

2.5 Experimental Results Analysis

This paper selects the data from December 2020 to February 2022 as the test set. Based on the constructed commercial bank digital twin simulation environment, the deep reinforcement learning framework lets the agent give the optimal risk pricing for the different loan businesses at each time stage, and outputs the total loan income and the total non-performing rate produced by the agent's pricing strategy and by the commercial bank's expert strategy. In the figures, the red line represents the strategy results output by reinforcement learning, and the blue line represents the actual pricing of the commercial bank.

As shown in Figure 2 and Figure 3:

Figure 2. Agent Strategy vs. Commercial Bank Strategy Results (Total Loan Income)

Figure 3. Agent Strategy vs. Commercial Bank Strategy Results (Non-performing Rate)

As shown in Figure 2, as the bank's loan business develops, both the agent and the expert strategy give corresponding pricing for the different loan products in each risk scenario. Because bank income accumulates, the total income generated by both the agent's decisions and the expert strategy increases over time. At every time point, however, the total accumulated loan income generated by the agent's strategy is significantly higher than that of the expert decisions, and the gap between the two strategies keeps widening, which indicates that the interest rate pricing strategy given by the agent is significantly better than the current human expert strategy. It also indicates that, within the observed time window, the agent can continuously optimize the pricing strategy according to market conditions, so that the allocation of loan resources in the market is more reasonable and effective.

III. Conclusion

This paper has constructed a simulation environment for the mortgage loan, consumer loan and personal loan businesses of a regional branch of a commercial bank based on digital twin technology, and has applied deep reinforcement learning in this scenario to give the bank a risk pricing strategy that maximizes income under a given non-performing rate threshold, achieving better results than the current human expert strategy. The technology is not only applicable to commercial bank risk pricing, but also has broad potential for transfer to other fields in the financial industry.


Contact: mehrivoxidova@gmail.com
