Journal of Nanjing University (Natural Science) ›› 2024, Vol. 60 ›› Issue (1): 18-25. doi: 10.13232/j.cnki.jnju.2024.01.003
资文杰1, 贾庆仁1, 陈浩1,2, 李军1,2, 景宁1
Wenjie Zi1, Qingren Jia1, Hao Chen1,2, Jun Li1,2, Ning Jing1
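Abstract:
Semantic segmentation of urban triangle mesh data, which identifies objects of different categories, is an important method for understanding and analyzing 3D urban scenes. An urban triangle mesh is a form of 3D spatial geometric data with rich spatial topological relationships, and it contains a large amount of geometric information. Existing methods, however, extract features from each type of geometric information separately and then simply fuse them before segmentation. As a result, they struggle to exploit the correlations among the different kinds of geometric information and segment certain objects poorly. To address these problems, this paper proposes UMeT (Urban Mesh Transformer), a model based on the Transformer self-attention mechanism. UMeT consists of multilayer perceptrons and a MeshiT (Mesh in Transformer) module: the multilayer perceptrons extract high-dimensional features, while the MeshiT module computes the correlations among the various kinds of geometric information, effectively mining the correlations implicit in urban triangle mesh data. Experiments show that UMeT extracts high-dimensional features while preserving the spatial invariance of urban triangle mesh data, thereby improving semantic segmentation accuracy.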
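To make concrete the idea of computing correlations among different kinds of geometric information with self-attention, the following is a minimal sketch only, not the authors' MeshiT implementation. It assumes, hypothetically, that each triangle face is described by a few geometric tokens (the centroid, normal, and area labels are illustrative guesses) that have already been embedded to a common dimension, and it uses identity query/key/value projections in place of learned ones.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Scaled dot-product self-attention over a list of equal-length tokens.

    Assumption: identity Q/K/V projections (a real model would learn them).
    Returns (outputs, weights), where weights[i][j] is how strongly
    token i attends to token j.
    """
    d = len(tokens[0])
    scale = math.sqrt(d)
    outputs, weights = [], []
    for q in tokens:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in tokens]
        w = softmax(scores)
        weights.append(w)
        outputs.append(
            [sum(w[j] * tokens[j][i] for j in range(len(tokens))) for i in range(d)]
        )
    return outputs, weights

# Hypothetical embedded geometric tokens for one face (values are made up).
face_tokens = [
    [0.2, 0.1, 0.9, 0.0],  # e.g. embedded centroid (assumed feature)
    [0.0, 0.0, 1.0, 0.0],  # e.g. embedded normal (assumed feature)
    [0.5, 0.5, 0.0, 0.3],  # e.g. embedded area (assumed feature)
]
mixed, attn = self_attention(face_tokens)
```

Each row of `attn` is a probability distribution over the tokens, so every output token is a weighted mixture of all geometric tokens rather than a feature extracted in isolation. Without positional encodings the operation is also permutation-equivariant: reordering the input tokens reorders the outputs in the same way.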