深度强化学习结合图注意力模型求解TSP问题
|
|
王扬, 陈智斌, 杨笑笑, 吴兆蕊
|
Deep reinforcement learning combined with graph attention model to solve TSP
|
|
Yang Wang, Zhibin Chen, Xiaoxiao Yang, Zhaorui Wu
|
|
表1 不同模型在TSP问题上的优化结果比较
|
Table 1 Optimization results for TSP problem by different models
|
|
| 模型 | 20⁃TSP | 50⁃TSP | 100⁃TSP |
|---|
| 花费 | 间隙 | 时间 | 花费 | 间隙 | 时间 | 花费 | 间隙 | 时间 |
|---|
| Concorde[9] | 3.83 | 0.00% | 5 min | 5.69 | 0.00% | 13 min | 7.76 | 0.00% | 1 h | | Gurobi[8] | 3.83 | 0.00% | 7 s | 5.69 | 0.00% | 2 min | 7.76 | 0.00% | 17 min | | OR⁃Tools | 3.86 | 0.94% | 1 min | 5.85 | 2.87% | 5 min | 8.06 | 3.86% | 23 min | | LKH3[12] | 3.83 | 0.00% | 42 s | 5.69 | 0.00% | 6 min | 7.76 | 0.00% | 25 min | | 2⁃opt [1] | 3.95 | 3.13% | 1 s | 6.11 | 7.38% | 7 s | 8.50 | 9.53% | 33 s | | Farthest Insertion[11] | 3.89 | 1.56% | 1 s | 5.97 | 4.92% | 2 s | 8.34 | 7.47% | 10 s | | Nearest Neighbor[1] | 4.48 | 16.9% | 1 s | 6.94 | 21.9% | 3 s | 9.68 | 24.7% | 7 s | | Kool et al(Greedy)[19] | 3.85 | 0.34% | ≪1 s | 5.80 | 1.76% | 2 s | 8.12 | 4.53% | 6 s | | Kool et al(Sampling)[19] | 3.84 | 0.08% | 5 min | 5.73 | 0.52% | 24 min | 7.94 | 2.26% | 1 h | | Costa et al[17] | 3.83 | 0.00% | 15 min | 5.71 | 0.12% | 29 min | 7.83 | 0.87% | 41 min | | Wu et al[22] | 3.83 | 0.00% | 1 h | 5.70 | 0.20% | 1.5 h | 7.87 | 1.42% | 2 h | | Kwon et al(single trajec)[5] | 3.83 | 0.12% | ≪1 s | 5.74 | 1.03% | 3 s | 7.84 | 1.12% | 8 s | | Kwon et al(no augment)[5] | 3.83 | 0.04% | ≪1 s | 5.71 | 0.35% | 10 s | 7.79 | 0.50% | 54 s | | Kwon et al(8×augment)[5] | 3.83 | 0.00% | 16 s | 5.69 | 0.05% | 1 min | 7.77 | 0.14% | 7 min | | Ours (single trajec) | 3.83 | 0.13% | ≪1 s | 5.73 | 0.70% | 5 s | 7.84 | 1.08% | 8 s | | Ours (no augment) | 3.83 | 0.05% | ≪1 s | 5.70 | 0.28% | 10 s | 7.79 | 0.47% | 55 s | | Ours (8×augment) | 3.83 | 0.00% | 17 s | 5.69 | 0.03% | 1 min | 7.77 | 0.12% | 7 min | | Ours (4×augment) | 3.83 | 0.00% | 9 s | 5.69 | 0.01% | 42 s | 7.76 | 0.09% | 3 min |
|
|
|