南京大学学报(自然科学版) ›› 2023, Vol. 59 ›› Issue (6): 1023–1033.doi: 10.13232/j.cnki.jnju.2023.06.012

• • 上一篇    

深度混合型邻域搜索模型求解CVRP问题

杨笑笑, 陈智斌()   

  1. 昆明理工大学理学院,昆明,650000
  • 收稿日期:2023-09-14 出版日期:2023-11-30 发布日期:2023-12-06
  • 通讯作者: 陈智斌 E-mail:chenzhibin311@126.com
  • 基金资助:
    国家自然科学基金(11761042)

Deep hybrid neighborhood search model solves the CVRP

Xiaoxiao Yang, Zhibin Chen()   

  1. Faculty of Science,Kunming University of Science and Technology,Kunming,650000,China
  • Received:2023-09-14 Online:2023-11-30 Published:2023-12-06
  • Contact: Zhibin Chen E-mail:chenzhibin311@126.com

摘要:

邻域搜索算法的关键是邻域结构的选择,但每次迭代搜索的时间较长,缺少在解空间内自主搜索的能力.利用深度强化学习(DRL)模型对邻域搜索算法进行改进,设计了一个新的深度混合型邻域搜索(DHNS)模型来求解带容量的车辆路径问题(CVRP).首先,利用贪婪算法为DRL模型提供初始解;其次,采用指针网络以及Transformer混合编码,利用不同网络的优势,深层次地提取节点特征信息;最后,将修复算子的修复过程转至DHNS模型,自动完成邻域搜索修复解的过程,扩大解空间的自主搜索能力.同时,针对混合编码中复杂传输机制以及解码输出误导性信息的问题,进一步在编码和解码过程中添加AOA (Attention on Attention)机制.AOA负责筛选有价值的信息,过滤不相关或误导性信息,有效刻画了注意力结果和查询之间的相关性,并对节点间的关系进行建模.实验结果表明,DHNS模型在100规模CVRP的优化效果上,优于现有DRL模型和部分传统算法.采用CVRPlib数据集中的算例对该算法的效能进行验证,结果表明,采用DHNS模型能够极大地提升路径问题的优化效能.

关键词: 深度混合型邻域搜索模型, 深度强化学习, 混合模型, AOA机制

Abstract:

The key to the neighborhood search algorithm is the selection of neighborhood structure,but the search time of each iteration is long,and the ability to search autonomously in the solution space is lacking. In this paper,the deep reinforcement learning (DRL) model is used to improve the neighborhood search algorithm,and a new deep hybrid neighborhood search (DHNS) model is designed to solve the capacitated vehicle routing problem (CVRP). Firstly,the greedy algorithm is used to provide the initial solution for the DRL model. Secondly,the pointer network and Transformer hybrid encoder are used to take advantage of different networks to extract node feature information at a deep level. Finally,the repair process of the repair operator is transferred to the DHNS model,and the process of neighborhood search repair solution is automatically completed,expanding the ability to solution space for autonomous search. At the same time,aiming at the complex transmission mechanism in hybrid encoder and the problem of decoding misleading information,the AOA (Attention on Attention) mechanism is further added in the encoding and decoding process. AOA is responsible for screening valuable information,filtering out irrelevant or misleading information,effectively characterizing the correlation between attention results and queries,and modeling the relationship among nodes. Experimental results show that the DHNS model is superior to the existing DRL model and some traditional algorithms in the optimization effect of 100?scale CVRP. The efficiency of the algorithm is verified by using an example in the CVRPlib dataset,and the results show that the DHNS model greatly improve the optimization efficiency of the routing problem.

Key words: deep hybrid neighborhood search, deep reinforcement learning, hybrid model, Attention on Attention

中图分类号: 

  • O22

图1

模型架构及求解思路"

图2

破坏算子和修复算子的操作过程"

图3

DHNS模型总架构"

表1

DHNS模型在CVRP问题上的优化结果比较"

模型CVRP20CVRP50CVRP100
ObjGapTimeObjGapTimeObjGapTime
Random CW6.8111.64%-12.2518.07%-18.9621.18%-
LKH36.120.00%2 h10.380.00%7 h15.650.00%13 h
ALNS6.699.31%1 s11.248.28%2 s17.3310.7%5 s
OR⁃Tools*6.424.84%2 min11.228.12%12 min17.149.34%1 h
RL(BS)*6.404.39%27 min11.157.46%39 min16.968.39%74 min
AM (sampling)*6.251.91%6 min10.622.40%28 min16.233.72%2 h
AM⁃D (greedy)*6.282.95%1 s10.783.85%1 s16.404.79%1 s
NeuRewriter*6.160.48%22 min10.511.25%35 min16.102.88%66 min
POMO6.353.42%1 s10.743.52%1 s16.153.00%3 s
NLNS6.140.61%1 h10.551.65%2 h16.112.99%3 h
DACT6.130.24%11 min10.390.18%32 min15.710.38%1.5 h
MDAM(BS)6.140.26%3 min10.501.18%9 min16.032.49%31 min
DPDP*------15.690.26%6 h
DGTM6.130.13%2 s10.390.15%5 s15.680.19%20 s
DHNS6.120.06%3 min10.390.09%9 min15.670.12%20 min

图4

CVRP泛化性能的比较"

图5

CVRP20模型收敛图"

图6

CVRP50模型收敛图"

表2

模型在CVRPlib基准数据集上的实验结果"

实例最优OR⁃ToolsAMWu et alNLNSPOMODACTDHNS
平均间隙0.00%8.06%31.62%14.27%11.67%6.10%3.41%3.07%
X⁃n101⁃k252759129405377022971629845285952799627999
X⁃n106⁃k142636227343284732764227688268502685526809
X⁃n110⁃k131479116149154431592715247150941481014099
X⁃n115⁃k101274713320137451444514256131911296112875
X⁃n120⁃k61333214242139371548613986136151364913602
X⁃n125⁃k305553958665750676042357896595045856056912
X⁃n129⁃k182894031361301763212631045292212967829665
X⁃n134⁃k131091613275136191266912430113771120311188
X⁃n139⁃k101359015223142511562714652139001387313886
X⁃n143⁃k71570017470173971887218689161661625716106
X⁃n148⁃k464344846836795145056349692520854441344104
X⁃n153⁃k222122022919379382608827103238002260622394
X⁃n157⁃k131687617309213301977119862173471740317289
X⁃n162⁃k111413815030150851684715426148121450814520
X⁃n167⁃k102055722477222852436522359213902127021412
X⁃n172⁃k514560750505878095110852968556364716247366
X⁃n176⁃k264781252111581785713158023527225064750654
X⁃n181⁃k232556926321275202717327179261012620126055
X⁃n186⁃k152414526017257572842226896246642534524452
X⁃n190⁃k81698018088363832014520356185511812318102
X⁃n195⁃k514422550311792765176348562483074615346012
X⁃n200⁃k365857861009764776420062495615136201161280

表3

AOA机制在编码和解码阶段的消融实验"

模型CVRP20CVRP50CVRP100
ObjGapTimeObjGapTimeObjGapTime
DHNS(编码无AOA)6.130.16%2 min10.400.19%9 min15.700.31%21 min
DHNS(解码无AOA)6.130.24%2 min10.400.24%8 min15.700.35%20 min
DHNS6.120.06%3 min10.390.09%10 min15.670.12%22 min
1 Cook W J, Cunningham W H, Pulleyblank W R, et al. Combinatorial optimization. New York, USA: Wiley?Interscience,2010:11-22.
2 Augerat P, Belenguer J M, Benavent E,et al. Separating capacity constraints in the CVRP using tabu search. European Journal of Operational Research1998106(2-3):546-557.
3 代婉玉,张丽娟,吴佳峰,等. 改进TEB算法的局部路径规划算法研究. 计算机工程与应用202258(8):283-288.
Dai W Y, Zhang L J, Wu J F,et al. Research on local path planning algorithm based on improved TEB algorithm. Computer Engineering and Applications202258(8):283-288.
4 Yogatama D, Blunsom P, Dyer C,et al. Learning to compose words into sentences with reinforcement learning. 2016,arXiv:.
5 Kool W, Van Hoof H, Attention Welling M.,learn to solve routing problems. 2019,arXiv:.
6 王扬,陈智斌,吴兆蕊,等. 强化学习求解组合最优化问题的研究综述. 计算机科学与探索202216(2):261-279.
Wang Y, Chen Z B, Wu Z R,et al. Review of reinforcement learning for combinatorial optimization problem. Journal of Frontiers of Computer Science and Technology202216(2):261-279.
7 Huang L, Wang W M, Chen J,et al. Attention on attention for image captioning∥Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul,Korea (South):IEEE,2019:4633-4642.
8 Vaswani A, Shazeer N, Parmar N,et al. Attention is all you need∥Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach,CA,USA:Curran Associates Inc.,2017:6000-6010.
9 Vinalys O, Fortunato M, Jaitly N. Pointer networks∥Proceedings of the 28th International Conference on Neural Information Processing Systems. Montreal,Canada:MIT Press,2015:2692-2700.
10 Nazari M, Oroojlooy A, Taká? A,et al. Reinforcement learning for solving the vehicle routing problem∥Proceedings of the 32nd International Conference on Neural Information Processing Systems. Montréal,Canada:Curran Associates Inc.,2018:9861-9871.
11 Bresson X, Laurent T. The transformer network for the traveling salesman problem. 2021,arXiv:2103. 03012.
12 王扬,陈智斌. 一种求解CVRP的动态图转换模型. 计算机工程与科学202345(5):859-868.
Wang Y, Chen Z B. A dynamic graph transformer model for solving CVRP. Computer Engineering and Science202345(5):859-868.
13 Kwon Y D, Choo J, Kim B,et al. Pomo:Policy optimization with multiple optima for reinforcement learning∥Proceedings of the 34th International Conference on Neural Information Processing Systems. Vancouver,Canada:Curran Associates Inc.,2020:21188-21198.
14 Wu Y X, Song W, Cao Z G,et al. Learning improvement heuristics for solving routing problems. IEEE Transactions on Neural Networks and Learning Systems202233(9):5057-5069.
15 王原,陈名,邢立宁,等. 用于求解旅行商问题的深度智慧型蚁群优化算法. 计算机研究与发展202158(8):1586-1598.
Wang Y, Chen M, Xing L N,et al. Deep intelligent ant colony optimization for solving travelling salesman problem. Journal of Computer Research and Development202158(8):1586-1598.
16 Ma Y N, Li J W, Cao Z G, et al. Learning to iteratively solve routing problems with dual?aspect collaborative transformer. Advances in Neural Information Processing Systems,2021(34):11096-11107.
17 Hottung A, Tierney K. Neural large neighborhood search for the capacitated vehicle routing problem. 2020,arXiv:.
18 Qin Z, Sun W X, Deng H,et al. cosFormer:Rethinking softmax in attention. 2022,arXiv:2202. 08791.
19 Chen X Y, Tian Y D. Learning to perform local rewriting for combinatorial optimization∥Proceedings of the 33rd International Conference on Neural Information Processing Systems. Vancouver,Canada:Curran Associates Inc.,2019:6281-6292.
20 Xin L, Song W, Cao Z G,et al. Multi?decoder attention model with embedding glimpse for solving vehicle routing problems∥Proceedings of the 35th AAAI Conference on Artificial Intelligence. Palo Alto,CA,USA:AAAI Press,2021:12042-12049.
21 Kool W, Van Hoof H, Gromicho J,et al. Deep policy dynamic programming for vehicle routing problems. 2021,arXiv:.
[1] 方明月, 冯早, 朱雪峰. 基于半监督聚类方法的管道运行状态识别研究[J]. 南京大学学报(自然科学版), 2023, 59(3): 435-445.
[2] 常芳芳, 陈祺航, 刘云龙. 局部可观测环境下未来信息辅助的无模型深度强化学习[J]. 南京大学学报(自然科学版), 2022, 58(5): 796-804.
[3] 王扬, 陈智斌, 杨笑笑, 吴兆蕊. 深度强化学习结合图注意力模型求解TSP问题[J]. 南京大学学报(自然科学版), 2022, 58(3): 420-429.
[4] 刘玲珊, 熊轲, 张煜, 张锐晨, 樊平毅. 信息年龄受限下最小化无人机辅助无线供能网络的能耗:一种基于DQN的方法[J]. 南京大学学报(自然科学版), 2021, 57(5): 847-856.
[5] 吴礼福, 徐行. 融合韵律与动态倒谱特征的语音疲劳度检测[J]. 南京大学学报(自然科学版), 2021, 57(4): 709-714.
[6] 王旻,林志斌,卢晶. 适用于虚拟低音音质的客观评价方法研究[J]. 南京大学学报(自然科学版), 2019, 55(5): 796-803.
[7] 朱 尧,毛晓蛟,杨育彬* . 基于多特征混合模型的视觉目标跟踪[J]. 南京大学学报(自然科学版), 2016, 52(4): 762-.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!