ID Source Translation
56968 随着人工智能技术的快速发展,强化学习技术在解决连续决策问题上展现出了较强的潜力. Thus, in such conditions, UAVs must be able to complete tasks autonomously and intelligently, without receiving real-time commands from the operators. With the rapid advances in artificial intelligence, reinforcement learning has shown strong potential for solving continuous decision problems.
56969 无人机搜索问题作为一种典型的连续决策问题,属于强化学习技术的适用范围. The target searching problem studied in this paper falls into this category and is suitable for adopting reinforcement learning technologies.
56970 但对于目前的强化学习及人工智能技术能否适用于无人机从而自主决策完成现实场景中的任务这一问题尚存争议,仍有待进一步探索. However, the feasibility of reinforcement learning in UAV-based target searching in communication-denied environments is not clear and thus requires in-depth investigation.
56971 为此,本文以现实战场环境为背景,对通信拒止及包含两方对抗的战场环境中的目标搜寻问题进行了建模,依据模型构建了对抗仿真平台,并通过实验研究的方式针对以下3个问题展开了探索:(1)强化学习在通信拒止环境下多无人机搜索问题的适用性; As a pilot study in this direction, this paper models the target searching problem in communication-denied and confrontation situations and proposes a simulation environment based on this model. Extensive experiments are conducted to answer the following questions. (1) Can reinforcement learning be applied in target searching by multiple UAVs in communication-denied environments?
56972 (2)各强化学习算法在该问题上的优劣; (2) What are the advantages and disadvantages of different reinforcement learning algorithms in solving this problem?
56973 (3)通信拒止程度对强化学习算法效果的影响.通过运用当前主流的强化学习技术开展仿真实验并定量评估实验结果. (3) How does the degree of communication denial influence the performance of these algorithms?
56974 本文总结发现:(1)强化学习在解决通信拒止环境下多无人机搜索问题上具备有效性;(2)在与其他算法对抗时,运用基于Deep Q-Network (DQN)强化学习技术的自主决策无人机集群体现出了较强的问题解决能力;(3)通信拒止程度对强化学习算法效果有影响,但在不同的通信拒止程度下,强化学习算法表现相对稳定. The current mainstream reinforcement learning technologies are adopted to perform simulations, whose results are analyzed quantitatively, leading to the following observations. (1) Reinforcement learning can effectively solve target searching problems for multiple UAVs in communication-denied environments. (2) Compared with other algorithms, an autonomous decision-making UAV cluster based on a deep Q-network (DQN) exhibits the best problem-solving ability. (3) The degree of communication denial affects algorithm performance, but performance remains relatively stable across different degrees of denial.
56975 无人机通常是指由无线电遥控或者由自主控制算法控制的不载人飞行器. Unmanned aerial vehicles (UAVs) are usually controlled by radio or by autonomous control algorithms.
56976 相比于有人机,无人机在执行危险任务等方面有着很大的优势,但是目前还没有能够应对高强度空战的无人机系统. Compared with manned aerial vehicles, they have great advantages in performing dangerous tasks, but at present no UAV system can cope with high-intensity air combat.
56977 此外,在执行空战任务时,单一无人机的鲁棒性往往得不到保证,而多无人机系统不仅能保证鲁棒性,还能通过饱和攻击的方式提高任务的成功率. In addition, the robustness of a single UAV cannot be guaranteed in air combat missions; on the other hand, a multi-UAV system not only ensures this robustness but also improves the mission success rate by using saturation attacks.