A multi-UAV trajectory planning algorithm based on federated deep reinforcement learning
WANG Jianwei, LI Xuehua, CHEN Shuo
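Abstract:
For mobile edge computing (MEC) services in which multiple UAVs cooperatively serve ground users, a model is built whose objective is to maximize the weighted sum of the fairness with which the UAVs serve ground users and the computation delay, jointly optimizing the UAV trajectories and the scheduling of task offloading ratios. A multi-UAV trajectory planning algorithm based on federated deep reinforcement learning is proposed for MEC scenarios. The algorithm first deploys an independent deep reinforcement learning model on each UAV, so that each UAV learns from the information it collects itself and obtains a locally optimal model. It then introduces a federated learning framework in which information aggregation lets the UAVs serve ground users cooperatively, so that service performance reaches the global optimum. Simulation results show that, compared with multi-agent deep reinforcement learning without information exchange, the proposed scheme effectively improves both fairness and delay.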
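The two-stage structure described in the abstract (local training on each UAV, followed by aggregation) can be illustrated with a minimal sketch. The abstract does not state the aggregation rule, so equal-weight parameter averaging in the style of FedAvg is an assumption here, and the names `Params`, `federated_average`, and `broadcast`, as well as the toy "actor"/"critic" parameter layout, are hypothetical.

```python
from typing import Dict, List

# Hypothetical flat layout: parameter-group name -> list of weights.
Params = Dict[str, List[float]]


def federated_average(local_params: List[Params]) -> Params:
    """Element-wise mean of the local models (assumed equal-weight averaging)."""
    if not local_params:
        raise ValueError("need at least one local model")
    n = len(local_params)
    global_params: Params = {}
    for name in local_params[0]:
        groups = [p[name] for p in local_params]
        # zip(*groups) walks the same weight position across all UAVs.
        global_params[name] = [sum(ws) / n for ws in zip(*groups)]
    return global_params


def broadcast(global_params: Params, n_uavs: int) -> List[Params]:
    """Give every UAV its own copy of the aggregated model."""
    return [{k: list(v) for k, v in global_params.items()} for _ in range(n_uavs)]


# Three UAVs, each holding a locally trained toy model.
uav_models = [
    {"actor": [1.0, 2.0], "critic": [0.0]},
    {"actor": [3.0, 4.0], "critic": [3.0]},
    {"actor": [5.0, 6.0], "critic": [6.0]},
]
global_model = federated_average(uav_models)
uav_models = broadcast(global_model, len(uav_models))
```

In a full training loop each UAV would presumably continue local reinforcement learning from the broadcast parameters before the next aggregation round; how often the rounds occur and what is exchanged are not specified in the abstract.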
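Keywords: UAV communication; mobile edge computing; deep reinforcement learning; federated learning; trajectory planning; fairness
Foundation: National Natural Science Foundation of China (61901043); "Qin Xin Talents" Cultivation Program of Beijing Information Science and Technology University (QXTCPB202101); General Science and Technology Project of the Scientific Research Program of the Beijing Municipal Education Commission (KM202211232010)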
DOI: 10.16508/j.cnki.11-5866/n.2023.06.001