Abstract
This paper studies optimal investment and reinsurance strategies for insurers facing parameter uncertainty. It addresses three objectives: maximising expected terminal utility, minimising the ultimate bankruptcy probability, and maximising expected terminal utility under constraints. When no constraints are imposed and the utility is exponential, we derive an approximate analytical solution and the associated optimal strategy. For general utility functions and for the bankruptcy minimisation problem, explicit solutions are unavailable, so we propose a policy improvement algorithm that approximates the value function. The algorithm exploits the identity between the value function of the entropy-regularised reinforcement learning problem and the viscosity solution of the exploratory Hamilton-Jacobi-Bellman equation; it expresses the optimal feedback strategy through the derivative of the value function, which yields the optimal distributional control. Numerical examples validate the effectiveness of the proposed methods.
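To illustrate how a distributional control can be obtained from the derivatives of a value function, the following is a minimal sketch and not the paper's algorithm. It assumes the Gibbs form that is common in entropy-regularised control, where the improved policy has density proportional to exp(H(u)/λ) for a Hamiltonian H built from the first and second derivatives of the value function. The toy wealth dynamics, the parameter values, the derivative values `v_x` and `v_xx`, and the helper name `gibbs_policy` are all hypothetical; the abstract does not specify the model.

```python
import math

def gibbs_policy(hamiltonian, actions, temperature):
    """Discretised density proportional to exp(H(u) / temperature) on an action grid."""
    values = [hamiltonian(u) / temperature for u in actions]
    shift = max(values)  # subtract the maximum for numerical stability
    weights = [math.exp(v - shift) for v in values]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical toy model (an assumption, not the paper's):
# wealth follows dX = mu * u dt + sigma * u dW, with u the risky exposure.
mu, sigma = 0.08, 0.2
temperature = 0.05          # weight on the entropy term (exploration level)
v_x, v_xx = 1.0, -2.0       # assumed derivatives of a concave value function at one state

def hamiltonian(u):
    # Drift term times V_x plus half the squared diffusion times V_xx.
    return mu * u * v_x + 0.5 * sigma**2 * u**2 * v_xx

actions = [-5.0 + 0.01 * i for i in range(1501)]  # uniform grid on [-5, 10]
probs = gibbs_policy(hamiltonian, actions, temperature)

mean = sum(p * u for p, u in zip(probs, actions))
var = sum(p * (u - mean) ** 2 for p, u in zip(probs, actions))

# For this quadratic Hamiltonian the Gibbs density is Gaussian, so the grid
# estimates can be compared with the closed-form mean and variance.
closed_form_mean = -mu * v_x / (sigma**2 * v_xx)
closed_form_var = -temperature / (sigma**2 * v_xx)
print(mean, closed_form_mean)
print(var, closed_form_var)
```

In this toy setting the mean of the distribution coincides with the classical (non-exploratory) maximiser of the Hamiltonian, while the temperature controls only the spread. Whether the same separation holds in the paper's model is not stated in the abstract.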
| Original language | English |
|---|---|
| Journal | International Journal of Control |
| State | Accepted/In press - 2026 |
Keywords
- entropy regularised
- HJB equation
- optimal investment
- reinforcement learning
- reinsurance
- viscosity solution