Multi-aircraft cooperative beyond-visual-range air combat decision-making algorithm based on reinforcement learning

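Affiliations: 1. College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 210016; 2. School of Automation, Northwestern Polytechnical University, Xi'an 710072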

Multi-aircraft collaborative beyond-visual-range air combat decision-making algorithm based on reinforcement learning

Affiliation:

1. College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China; 2. School of Automation, Northwestern Polytechnical University, Xi’an 710129, China

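Abstract (translated from the Chinese):

Air combat situations in modern warfare are complex and change rapidly, so a fast and effective decision-making method is important. This paper studies cooperative confrontation among multiple unmanned aerial vehicles (UAVs) and proposes a multi-aircraft cooperative beyond-visual-range air combat decision-making algorithm based on LSTM-MADDPG. First, a UAV motion model, a radar detection zone model, and a missile attack zone model are established. Then the multi-aircraft cooperative beyond-visual-range air combat decision-making algorithm is proposed: a centralized-training, distributed-execution architecture and a state space for the cooperative air combat system are designed to handle synchronous decision-making among multiple UAVs; a learning-rate decay mechanism is designed to improve the convergence speed and stability of the network; a long short-term memory (LSTM) network is used to improve the network structure and strengthen its ability to extract tactical features; and a reward function mechanism based on a decay factor is used to strengthen the UAVs' cooperative confrontation capability. Simulation results show that the proposed algorithm gives the UAVs cooperative attack and defense capability, and that the algorithm has good stability and convergence.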

Abstract:

The modern air combat environment is becoming increasingly complex and the combat situation changes rapidly, so it is important to explore a fast and effective decision-making method. To study the problem of multi-aircraft collaborative confrontation, this paper proposes an LSTM-MADDPG-based multi-aircraft collaborative beyond-visual-range air combat decision-making algorithm. First, a beyond-visual-range air combat environment is established, including a UAV motion model, a radar detection zone model, and a missile attack zone model. Then, the multi-aircraft collaborative beyond-visual-range air combat decision-making algorithm is proposed: a centralized-training, distributed-execution architecture and the state space of the collaborative air combat system are designed to deal with the synchronous decision-making problem among multiple UAVs; a learning rate decay mechanism is designed to improve the convergence speed and stability of the network; a long short-term memory (LSTM) network is used to improve the network structure and enhance the network's ability to extract tactical features; and a reward function mechanism based on a decay factor is used to enhance the cooperative confrontation capability of the UAVs. Finally, the simulation results show that the proposed algorithm gives UAVs the ability to attack and defend collaboratively, and that the algorithm has good stability and convergence.
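
The abstract names a learning rate decay mechanism but does not give its form. As a minimal sketch only, assuming a stepwise exponential schedule (a common choice, not necessarily the one used by the authors), the idea can be illustrated as follows. The function name `decayed_learning_rate` and all of its parameters are hypothetical and do not come from the paper.

```python
def decayed_learning_rate(
    initial_lr: float,
    decay_rate: float,
    decay_steps: int,
    step: int,
    min_lr: float = 0.0,
) -> float:
    """Stepwise exponential decay (assumed form, not taken from the paper).

    The rate is multiplied by `decay_rate` once every `decay_steps`
    training steps and is never allowed to fall below `min_lr`.
    """
    if decay_steps <= 0:
        raise ValueError("decay_steps must be positive")
    if step < 0:
        raise ValueError("step must be non-negative")
    # Number of completed decay intervals so far.
    intervals = step // decay_steps
    lr = initial_lr * decay_rate ** intervals
    return max(lr, min_lr)
```

With `initial_lr=1e-3`, `decay_rate=0.5`, and `decay_steps=100`, the rate stays at 1e-3 for the first 100 steps and is 2.5e-4 at step 250. A larger early rate speeds up initial learning, while the smaller later rate damps oscillation near convergence, which is the kind of effect the abstract attributes to the mechanism.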

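Cite this article

Wang Zhigang (王志刚), Gong Huajun (龚华军), Yin Yi (尹逸), Liu Xiaoxiong (刘小雄). Multi-aircraft cooperative beyond-visual-range air combat decision-making algorithm based on reinforcement learning[J]. Journal of Nanjing University of Aeronautics and Astronautics,,():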

Online publication date: 2025-07-05