UAV autonomous target search based on deep reinforcement learning in complex disaster scene

  • Chunxue Wu
  • Bobo Ju
  • Yan Wu
  • Xiao Lin
  • Naixue Xiong
  • Guangquan Xu
  • Hongyan Li*
  • Xuefeng Liang*
  • *Corresponding author for this work

Research output: Contribution to journal > Article > peer-review

132 Scopus citations

Abstract

In recent years, artificial intelligence has played an increasingly important role in the automated control of drones. After AlphaGo used reinforcement learning to defeat the world Go champion, reinforcement learning gained widespread attention. However, most existing reinforcement learning work is applied to games with only two or three movement directions. This paper shows that, after further processing, deep reinforcement learning can be successfully applied to the classic Nokia puzzle game Snake, which has four directions of movement. Through deep reinforcement learning and training, the Snake agent (a self-learning Snake) learns to find the path to the target autonomously, and its average score in the Snake game exceeds the average human-level score. A Snake algorithm that finds the target path autonomously has broad prospects in industry, for example in UAV inspection of oil and gas fields and in using drones to search for and rescue injured people after a complex disaster. Post-disaster relief requires careful staffing and material dispatch, and manual planning of disaster relief must take many factors into account. We therefore aim to design a drone that can search for and rescue people and dispatch materials. Drone automation control is quite mature, but current drones still require manual control. The Snake algorithm proposed here, which finds the target path autonomously, is therefore an attempt at, and a key technology for, the design of autonomous drones for search and rescue and material dispatch.
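
As a rough illustration of the idea of learning a target path with four movement directions, the sketch below uses tabular Q-learning (one of the listed keywords) on a small grid. It is not the paper's method: the paper uses deep reinforcement learning on Snake, while this example swaps in a lookup table and a single agent cell. The grid size, reward values, hyperparameters, and the names `step`, `train`, and `greedy_path` are all assumptions made for illustration.

```python
import random

# Four movement directions, as in Snake: up, down, left, right.
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def step(state, action, size, target):
    """Apply one move. Returns (next_state, reward, done).

    Assumed rewards: -1 for hitting a wall (episode ends), +1 for reaching
    the target (episode ends), and a small -0.01 cost for any other move.
    """
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < size and 0 <= c < size):
        return state, -1.0, True
    if (r, c) == target:
        return (r, c), 1.0, True
    return (r, c), -0.01, False


def train(size=5, target=(4, 4), episodes=3000,
          alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with random start cells and epsilon-greedy moves."""
    rng = random.Random(seed)
    cells = [(r, c) for r in range(size) for c in range(size)]
    starts = [cell for cell in cells if cell != target]
    q = {(cell, a): 0.0 for cell in cells for a in range(4)}
    for _ in range(episodes):
        state = rng.choice(starts)
        for _ in range(100):  # cap the episode length
            if rng.random() < epsilon:
                a = rng.randrange(4)  # explore
            else:
                a = max(range(4), key=lambda i: q[(state, i)])  # exploit
            nxt, reward, done = step(state, ACTIONS[a], size, target)
            best_next = 0.0 if done else max(q[(nxt, i)] for i in range(4))
            # Standard Q-learning update toward the bootstrapped target.
            q[(state, a)] += alpha * (reward + gamma * best_next - q[(state, a)])
            state = nxt
            if done:
                break
    return q


def greedy_path(q, size=5, target=(4, 4), start=(0, 0), max_steps=50):
    """Follow the highest-valued action from start until the episode ends."""
    state, path = start, [start]
    for _ in range(max_steps):
        a = max(range(4), key=lambda i: q[(state, i)])
        state, _, done = step(state, ACTIONS[a], size, target)
        path.append(state)
        if done:
            break
    return path


if __name__ == "__main__":
    print(greedy_path(train()))
```

Calling `greedy_path(train())` returns the sequence of cells the trained agent visits from the top-left corner toward the target. A deep variant would replace the table `q` with a neural network that maps an observation of the scene to the four action values.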

Original language: English
Article number: 2933002
Pages (from-to): 117227-117245
Number of pages: 19
Journal: IEEE Access
Volume: 7
State: Published - 2019
Externally published: Yes

Keywords

  • Deep reinforcement learning
  • Markov decision
  • Monte Carlo
  • Q-learning
