南京大学学报(自然科学版) ›› 2023, Vol. 59 ›› Issue (6): 10231033.doi: 10.13232/j.cnki.jnju.2023.06.012
• • 上一篇
摘要:
邻域搜索算法的关键是邻域结构的选择,但每次迭代搜索的时间较长,缺少在解空间内自主搜索的能力.利用深度强化学习(DRL)模型对邻域搜索算法进行改进,设计了一个新的深度混合型邻域搜索(DHNS)模型来求解带容量的车辆路径问题(CVRP).首先,利用贪婪算法为DRL模型提供初始解;其次,采用指针网络以及Transformer混合编码,利用不同网络的优势,深层次地提取节点特征信息;最后,将修复算子的修复过程转至DHNS模型,自动完成邻域搜索修复解的过程,扩大解空间的自主搜索能力.同时,针对混合编码中复杂传输机制以及解码输出误导性信息的问题,进一步在编码和解码过程中添加AOA (Attention on Attention)机制.AOA负责筛选有价值的信息,过滤不相关或误导性信息,有效刻画了注意力结果和查询之间的相关性,并对节点间的关系进行建模.实验结果表明,DHNS模型在100规模CVRP的优化效果上,优于现有DRL模型和部分传统算法.采用CVRPlib数据集中的算例对该算法的效能进行验证,结果表明,采用DHNS模型能够极大地提升路径问题的优化效能.
中图分类号:
1 | Cook W J, Cunningham W H, Pulleyblank W R, et al. Combinatorial optimization. New York, USA: Wiley?Interscience,2010:11-22. |
2 | Augerat P, Belenguer J M, Benavent E,et al. Separating capacity constraints in the CVRP using tabu search. European Journal of Operational Research,1998,106(2-3):546-557. |
3 | 代婉玉,张丽娟,吴佳峰,等. 改进TEB算法的局部路径规划算法研究. 计算机工程与应用,2022,58(8):283-288. |
Dai W Y, Zhang L J, Wu J F,et al. Research on local path planning algorithm based on improved TEB algorithm. Computer Engineering and Applications,2022,58(8):283-288. | |
4 | Yogatama D, Blunsom P, Dyer C,et al. Learning to compose words into sentences with reinforcement learning. 2016,arXiv:. |
5 | Kool W, Van Hoof H, Attention Welling M.,learn to solve routing problems. 2019,arXiv:. |
6 | 王扬,陈智斌,吴兆蕊,等. 强化学习求解组合最优化问题的研究综述. 计算机科学与探索,2022,16(2):261-279. |
Wang Y, Chen Z B, Wu Z R,et al. Review of reinforcement learning for combinatorial optimization problem. Journal of Frontiers of Computer Science and Technology,2022,16(2):261-279. | |
7 | Huang L, Wang W M, Chen J,et al. Attention on attention for image captioning∥Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul,Korea (South):IEEE,2019:4633-4642. |
8 | Vaswani A, Shazeer N, Parmar N,et al. Attention is all you need∥Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach,CA,USA:Curran Associates Inc.,2017:6000-6010. |
9 | Vinalys O, Fortunato M, Jaitly N. Pointer networks∥Proceedings of the 28th International Conference on Neural Information Processing Systems. Montreal,Canada:MIT Press,2015:2692-2700. |
10 | Nazari M, Oroojlooy A, Taká? A,et al. Reinforcement learning for solving the vehicle routing problem∥Proceedings of the 32nd International Conference on Neural Information Processing Systems. Montréal,Canada:Curran Associates Inc.,2018:9861-9871. |
11 | Bresson X, Laurent T. The transformer network for the traveling salesman problem. 2021,arXiv:2103. 03012. |
12 | 王扬,陈智斌. 一种求解CVRP的动态图转换模型. 计算机工程与科学,2023,45(5):859-868. |
Wang Y, Chen Z B. A dynamic graph transformer model for solving CVRP. Computer Engineering and Science,2023,45(5):859-868. | |
13 | Kwon Y D, Choo J, Kim B,et al. Pomo:Policy optimization with multiple optima for reinforcement learning∥Proceedings of the 34th International Conference on Neural Information Processing Systems. Vancouver,Canada:Curran Associates Inc.,2020:21188-21198. |
14 | Wu Y X, Song W, Cao Z G,et al. Learning improvement heuristics for solving routing problems. IEEE Transactions on Neural Networks and Learning Systems,2022,33(9):5057-5069. |
15 | 王原,陈名,邢立宁,等. 用于求解旅行商问题的深度智慧型蚁群优化算法. 计算机研究与发展,2021,58(8):1586-1598. |
Wang Y, Chen M, Xing L N,et al. Deep intelligent ant colony optimization for solving travelling salesman problem. Journal of Computer Research and Development,2021,58(8):1586-1598. | |
16 | Ma Y N, Li J W, Cao Z G, et al. Learning to iteratively solve routing problems with dual?aspect collaborative transformer. Advances in Neural Information Processing Systems,2021(34):11096-11107. |
17 | Hottung A, Tierney K. Neural large neighborhood search for the capacitated vehicle routing problem. 2020,arXiv:. |
18 | Qin Z, Sun W X, Deng H,et al. cosFormer:Rethinking softmax in attention. 2022,arXiv:2202. 08791. |
19 | Chen X Y, Tian Y D. Learning to perform local rewriting for combinatorial optimization∥Proceedings of the 33rd International Conference on Neural Information Processing Systems. Vancouver,Canada:Curran Associates Inc.,2019:6281-6292. |
20 | Xin L, Song W, Cao Z G,et al. Multi?decoder attention model with embedding glimpse for solving vehicle routing problems∥Proceedings of the 35th AAAI Conference on Artificial Intelligence. Palo Alto,CA,USA:AAAI Press,2021:12042-12049. |
21 | Kool W, Van Hoof H, Gromicho J,et al. Deep policy dynamic programming for vehicle routing problems. 2021,arXiv:. |
[1] | 方明月, 冯早, 朱雪峰. 基于半监督聚类方法的管道运行状态识别研究[J]. 南京大学学报(自然科学版), 2023, 59(3): 435-445. |
[2] | 常芳芳, 陈祺航, 刘云龙. 局部可观测环境下未来信息辅助的无模型深度强化学习[J]. 南京大学学报(自然科学版), 2022, 58(5): 796-804. |
[3] | 王扬, 陈智斌, 杨笑笑, 吴兆蕊. 深度强化学习结合图注意力模型求解TSP问题[J]. 南京大学学报(自然科学版), 2022, 58(3): 420-429. |
[4] | 刘玲珊, 熊轲, 张煜, 张锐晨, 樊平毅. 信息年龄受限下最小化无人机辅助无线供能网络的能耗:一种基于DQN的方法[J]. 南京大学学报(自然科学版), 2021, 57(5): 847-856. |
[5] | 吴礼福, 徐行. 融合韵律与动态倒谱特征的语音疲劳度检测[J]. 南京大学学报(自然科学版), 2021, 57(4): 709-714. |
[6] | 王旻,林志斌,卢晶. 适用于虚拟低音音质的客观评价方法研究[J]. 南京大学学报(自然科学版), 2019, 55(5): 796-803. |
[7] | 朱 尧,毛晓蛟,杨育彬* . 基于多特征混合模型的视觉目标跟踪[J]. 南京大学学报(自然科学版), 2016, 52(4): 762-. |
|