TY - GEN
T1 - Optimal Action Space Search
T2 - 31st ACM International Conference on Information and Knowledge Management, CIKM 2022
AU - Duan, Zhongjie
AU - Chen, Cen
AU - Cheng, Dawei
AU - Liang, Yuqi
AU - Qian, Weining
N1 - Publisher Copyright:
© 2022 ACM.
PY - 2022/10/17
Y1 - 2022/10/17
N2 - Algorithmic trading is a crucial yet challenging task in the financial domain, where trading decisions are made sequentially, on horizons from milliseconds to days, based on historical price movements and trading frequency. To model such a sequential decision-making process in dynamic financial markets, Deep Reinforcement Learning (DRL) based methods have been applied and have demonstrated success in finding trading strategies that achieve profitable returns. However, financial markets are complex imperfect-information games with a high level of noise and uncertainty, which usually makes the exploration policy of DRL less effective. In this paper, we propose an end-to-end DRL method that explores solutions over the whole graph via a probabilistic dynamic programming algorithm. Specifically, we separate the state into an environment state and a position state, and model the position state transitions as a directed acyclic graph. To obtain reliable gradients for model training, we adopt a probabilistic dynamic programming algorithm that explores solutions over the whole graph instead of sampling a single path. By avoiding the sampling procedure, we obtain an efficient training algorithm and overcome the efficiency problem of most existing DRL methods. Furthermore, our method is compatible with most recurrent neural network architectures, which makes it easy to implement and very effective in practice. Extensive experiments have been conducted on two real-world stock datasets. Experimental results demonstrate that our method generates stable trading strategies for both high-frequency and low-frequency trading, significantly outperforming baseline DRL methods on annualized return and Sharpe ratio.
AB - Algorithmic trading is a crucial yet challenging task in the financial domain, where trading decisions are made sequentially, on horizons from milliseconds to days, based on historical price movements and trading frequency. To model such a sequential decision-making process in dynamic financial markets, Deep Reinforcement Learning (DRL) based methods have been applied and have demonstrated success in finding trading strategies that achieve profitable returns. However, financial markets are complex imperfect-information games with a high level of noise and uncertainty, which usually makes the exploration policy of DRL less effective. In this paper, we propose an end-to-end DRL method that explores solutions over the whole graph via a probabilistic dynamic programming algorithm. Specifically, we separate the state into an environment state and a position state, and model the position state transitions as a directed acyclic graph. To obtain reliable gradients for model training, we adopt a probabilistic dynamic programming algorithm that explores solutions over the whole graph instead of sampling a single path. By avoiding the sampling procedure, we obtain an efficient training algorithm and overcome the efficiency problem of most existing DRL methods. Furthermore, our method is compatible with most recurrent neural network architectures, which makes it easy to implement and very effective in practice. Extensive experiments have been conducted on two real-world stock datasets. Experimental results demonstrate that our method generates stable trading strategies for both high-frequency and low-frequency trading, significantly outperforming baseline DRL methods on annualized return and Sharpe ratio.
KW - algorithmic trading
KW - reinforcement learning
UR - https://www.scopus.com/pages/publications/85140841993
U2 - 10.1145/3511808.3557412
DO - 10.1145/3511808.3557412
M3 - Conference contribution
AN - SCOPUS:85140841993
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 406
EP - 415
BT - CIKM 2022 - Proceedings of the 31st ACM International Conference on Information and Knowledge Management
PB - Association for Computing Machinery
Y2 - 17 October 2022 through 21 October 2022
ER -