Hedging Interest Rate Options with Reinforcement Learning: an investigation of a heavy-tailed distribution

Allan Jonathan da Silva, Jack Baczynski, Leonardo Fagundes de Mello


Purpose: The study models an interest rate index option using a heavy-tailed distribution. The goal is to calculate interest rate path-dependent option prices that are consistent with market data and to develop a reinforcement learning strategy that hedges the position discretely while accounting for transaction costs.

Methodology: The paper presents a mathematical framework for pricing interest rate path-dependent options. It adapts a Fourier cosine series formula to use the characteristic function of the present value of the forward index, which is modeled as a variance-gamma process, and applies deep Q-learning to hedge such options.

Findings: Market evidence shows that the implied volatility curve is not flat. The study demonstrates that the variance-gamma process generates an increasing volatility smile, which is consistent with market observations. In addition, the hedging results show that path-dependent options priced under the variance-gamma process can be hedged efficiently with advanced Q-learning techniques.

Research limitations/implications: The study considers only the variance-gamma process. Other probability distributions, such as the Normal Inverse Gaussian, should be investigated.

Practical implications: The study indicates which type of probability distribution a pricing engine needs in order to be consistent with implied volatilities. The approach can help managers evaluate and understand market pricing behavior and implement discrete hedging under transaction costs.

Originality: The paper combines a fast pricing method for interest rate options under a heavy-tailed distribution with discrete hedging of interest rate derivatives through reinforcement learning.
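To illustrate the Fourier cosine series idea mentioned in the methodology, the following is a minimal Python sketch that recovers a probability density from a characteristic function. It makes several assumptions that are not taken from the paper: it uses the standard variance-gamma characteristic function in the form given by Madan, Carr, and Chang (1998) rather than the characteristic function of the present value of the forward index, the parameter values (t, sigma, nu, theta) are purely illustrative, and the truncation interval of ten standard deviations around the mean is an arbitrary choice. It is not the pricing formula developed in the paper.

```python
import numpy as np


def vg_char_fn(u, t, sigma, nu, theta):
    """Standard variance-gamma characteristic function (assumed form, not the paper's)."""
    return (1.0 - 1j * theta * nu * u + 0.5 * sigma**2 * nu * u**2) ** (-t / nu)


def cos_density(char_fn, x, a, b, n_terms=512):
    """Approximate a density on [a, b] by a Fourier cosine series built from char_fn."""
    k = np.arange(n_terms)
    u = k * np.pi / (b - a)
    # Cosine coefficients obtained from the characteristic function.
    coeff = (2.0 / (b - a)) * np.real(char_fn(u) * np.exp(-1j * u * a))
    coeff[0] *= 0.5  # the first term of the series is weighted by one half
    return np.cos(np.outer(x - a, u)) @ coeff


def trapezoid(y, x):
    """Plain trapezoidal rule, written out to avoid depending on a NumPy version."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


# Illustrative (hypothetical) parameters, not calibrated to any market data.
t, sigma, nu, theta = 1.0, 0.2, 0.5, -0.1

# Known mean and standard deviation of the variance-gamma variable.
mean_exact = theta * t
std_exact = np.sqrt((sigma**2 + nu * theta**2) * t)

# Truncation interval: mean +/- 10 standard deviations (an assumption).
a = mean_exact - 10.0 * std_exact
b = mean_exact + 10.0 * std_exact

grid = np.linspace(a, b, 2001)
density = cos_density(lambda u: vg_char_fn(u, t, sigma, nu, theta), grid, a, b)

total_mass = trapezoid(density, grid)
mean_numeric = trapezoid(grid * density, grid)
print("total mass:", total_mass)
print("numerical mean:", mean_numeric, "exact mean:", mean_exact)
```

The two printed checks (total probability close to one and a numerical mean close to theta * t) are a quick sanity test of the recovered density. Pricing an option with this approach would additionally require the cosine coefficients of the payoff, which this sketch leaves out.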


DOI: https://doi.org/10.11114/bms.v9i2.6515


Business and Management Studies     ISSN 2374-5916 (Print)     ISSN 2374-5924 (Online)

Copyright © Redfame Publishing Inc.
