Journal of Nanjing University (Natural Science) ›› 2019, Vol. 55 ›› Issue (1): 73-84. doi: 10.13232/j.cnki.jnju.2019.01.007
Hu Tai (胡太), Yang Ming (杨明)*
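Abstract: Convolutional neural networks (CNNs) provide more powerful classifiers than traditional classification algorithms and learn deep features on their own, which has effectively improved the accuracy of image semantic segmentation. CNN-based semantic segmentation still faces challenges, however: in complex scenes, existing state-of-the-art methods struggle to segment small objects. To address this problem, we propose a small-object semantic segmentation algorithm that incorporates object detection. Unlike existing state-of-the-art methods, the approach does not use a single neural network model to segment the small and larger objects of an image at once; instead, it separates small-object segmentation from the segmentation of the full image. The algorithm first trains an object detection model to obtain image patches containing small objects, then designs a small-object segmentation network to segment those patches, and finally uses the patch results to correct the segmentation map of the whole image. The algorithm improves overall performance on semantic segmentation datasets and effectively addresses the difficulty of segmenting small objects.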
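The final correction step of the pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not give the fusion rule, so the rule used here (a patch pixel overwrites the full-image label only when the patch network predicts a non-background class) is an assumption, as are the names `refine_with_small_objects`, `segment_patch`, and `BACKGROUND`. The detector and the small-object segmentation network are not implemented; they are represented by a list of boxes and a hypothetical callable.

```python
from typing import Callable, List, Sequence, Tuple

LabelMap = List[List[int]]       # H x W grid of class ids
Box = Tuple[int, int, int, int]  # (top, left, bottom, right); bottom/right exclusive

BACKGROUND = 0  # assumed id of the background class


def refine_with_small_objects(
    full_seg: LabelMap,
    boxes: Sequence[Box],
    segment_patch: Callable[[Box], LabelMap],
) -> LabelMap:
    """Return a copy of full_seg corrected by per-patch predictions.

    full_seg      : segmentation map predicted for the whole image
    boxes         : small-object boxes from a detector (hypothetical input)
    segment_patch : stand-in for the small-object segmentation network;
                    given a box, returns a label grid of the box's size
    """
    refined = [row[:] for row in full_seg]  # leave the input untouched
    if not refined or not refined[0]:
        return refined
    height, width = len(refined), len(refined[0])

    for top, left, bottom, right in boxes:
        # Clip the box to the image so the indices stay valid.
        top, left = max(top, 0), max(left, 0)
        bottom, right = min(bottom, height), min(right, width)
        if bottom <= top or right <= left:
            continue  # box lies entirely outside the image

        patch = segment_patch((top, left, bottom, right))
        for r in range(top, bottom):
            for c in range(left, right):
                label = patch[r - top][c - left]
                # Assumed fusion rule: only foreground patch pixels
                # override the full-image prediction.
                if label != BACKGROUND:
                    refined[r][c] = label
    return refined
```

Other fusion rules are equally plausible, for example overwriting every pixel inside the box or comparing class confidences; the sketch only shows where such a rule would sit in the pipeline.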