Quantitative controller synthesis for consumption Markov decision processes

  • Jianling Fu
  • , Cheng Chao Huang
  • , Yong Li
  • , Jingyi Mei
  • , Ming Xu*
  • , Lijun Zhang
  • *Corresponding author for this work

Research output: Contribution to journalArticlepeer-review

Abstract

Consumption Markov decision processes (CMDPs) are a kind of resource-constrained probabilistic decision-making systems allowing the consumption and replenishment to capacity of resource. Making each decision in CMDPs consumes some amount of resource, which can be reloaded to the full capacity in a specified set of states. We consider here the quantitative reachability probability controller synthesis problems for CMDPs that extract a policy preventing resource exhaustion and achieve (repeated) reachability goals with the maximal probability, in contrast to the original work focusing on the qualitative controller synthesis for CMDPs, e.g., fulfilling the (repeated) reachability goals with positive probability or probability one. We first prove that the exact quantitative reachability probability problems are NP-hard by a reduction from the 0–1 Knapsack problem. We solve them by converting a given CMDP to a flattened Markov decision process (MDP) and then constructing an optimal policy for that MDP. Unfortunately, the naive flattened MDPs are prohibitively large and thus make the subsequent synthesis not scale well. To overcome this drawback, we propose a pruned construction and a quotient construction of the flattened MDP to reduce the sizes of MDPs. The empirical evaluation shows that our optimizations can significantly improve the scalability of the purely flattening-based synthesis method for CMDPs on benchmarks from real-world scenarios — autonomous electric vehicle routing and helicopter planning in Mars Perseverance Rover project.

Original languageEnglish
Article number106342
JournalInformation Processing Letters
Volume180
DOIs
StatePublished - Feb 2023

Keywords

  • Markov decision process
  • NP-hardness
  • Quantitative analysis
  • Reachability
  • Scheduling

Fingerprint

Dive into the research topics of 'Quantitative controller synthesis for consumption Markov decision processes'. Together they form a unique fingerprint.

Cite this