Journal of Nanjing University (Natural Science), 2021, Vol. 57, Issue (2): 208–216. doi: 10.13232/j.cnki.jnju.2021.02.005

Multi-label aerial image classification method based on pixel-object level co-occurrent relation learning

Zhiwen Fang, Qingshan Liu, Feng Zhou

  1. Jiangsu Key Laboratory of Big Data Analysis Technology, College of Automation, Nanjing University of Information Science and Technology, Nanjing, 210044, China
  • Received: 2020-10-09 Online: 2021-03-23 Published: 2021-03-23
  • Contact: Qingshan Liu E-mail: qsliu@nuist.edu.cn
  • Funding:
    National Natural Science Foundation of China (61532009)

Abstract:

The co-occurrent relations between objects of different categories play an important role in multi-label aerial image classification. This paper proposes a multi-label aerial image classification method based on a pixel-object level co-occurrent relation learning network, which consists mainly of a pixel-level co-occurrent relation learning module and an object-level co-occurrent relation learning module. The pixel-level module uses the feature similarity between pixels at different spatial positions to measure the co-occurrent relation indirectly. However, because a single pixel cannot fully represent a whole object, this pixel-level relation may not effectively help a target pixel determine its category. The object-level module considers the relations between objects from a holistic view and can therefore compensate for this shortcoming. Experimental results show that the proposed method achieves good classification performance on two public benchmark datasets, UCM and DFC15.

Key words: aerial image, multi-label classification, convolutional neural network, recurrent neural network, co-occurrent relation
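The abstract says only that the pixel-level module measures co-occurrence indirectly through feature similarity between pixels at different spatial positions. The following is a minimal sketch of one common way to do that, assuming a dot-product similarity followed by a softmax over positions. The function name `pixel_cooccurrence` and the choice of similarity are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def pixel_cooccurrence(features):
    """Similarity-weighted aggregation over all spatial positions.

    features: array of shape (H, W, C), e.g. a CNN feature map.
    Returns (weights, aggregated), where weights has shape (H*W, H*W)
    and aggregated has shape (H, W, C).
    Assumption: dot-product similarity + softmax; the paper may differ.
    """
    h, w, c = features.shape
    flat = features.reshape(h * w, c)              # one row per pixel position
    logits = flat @ flat.T                         # pairwise feature similarity
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)  # rows sum to 1
    aggregated = (weights @ flat).reshape(h, w, c)  # context-enhanced features
    return weights, aggregated
```

Each row of `weights` describes how strongly one position relates to every other position. Because a row is tied to a single pixel, it illustrates the limitation the abstract mentions: one pixel may not represent a whole object.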

CLC number: TP391

Figure 1 Framework of the proposed model

Figure 2 Sample images from the UCM dataset

Figure 3 Sample images from the DFC15 dataset

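Table 1 Ablation results on the UCM dataset (%)

| Model | OP | OR | OF1 | CP | CR | CF1 |
|---|---|---|---|---|---|---|
| Base network + pixel-level co-occurrent relation module | 85.91 | 84.27 | 85.08 | 89.78 | 89.73 | 89.76 |
| Base network + object-level co-occurrent relation module | 88.13 | 86.95 | 87.54 | 93.04 | 91.20 | 92.11 |
| Proposed model | 92.07 | 90.48 | 91.27 | 95.50 | 93.49 | 94.48 |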
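The column names OP, OR, OF1, CP, CR and CF1 are not defined on this page. The sketch below assumes their usual meaning in multi-label classification: overall (micro-averaged) and per-class (macro-averaged) precision, recall and F1, with each F1 taken as the harmonic mean of the corresponding precision and recall. The helper names and the treatment of empty classes are assumptions made for illustration.

```python
import numpy as np


def harmonic_mean(p, r):
    """F1 as the harmonic mean of precision p and recall r."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0


def _safe_div(num, den):
    return num / den if den > 0 else 0.0


def multilabel_metrics(y_true, y_pred):
    """y_true, y_pred: binary arrays of shape (num_images, num_classes)."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    correct = (y_true & y_pred).sum(axis=0).astype(float)  # per-class hits
    predicted = y_pred.sum(axis=0).astype(float)           # per-class predictions
    actual = y_true.sum(axis=0).astype(float)              # per-class ground truth

    # Overall metrics pool counts over all classes before dividing.
    op = _safe_div(correct.sum(), predicted.sum())
    or_ = _safe_div(correct.sum(), actual.sum())
    # Per-class metrics average the class-wise ratios.
    # Assumption: a class that is never predicted (or never present) scores 0.
    cp = np.divide(correct, predicted, out=np.zeros_like(correct),
                   where=predicted > 0).mean()
    cr = np.divide(correct, actual, out=np.zeros_like(correct),
                   where=actual > 0).mean()
    return {"OP": op, "OR": or_, "OF1": harmonic_mean(op, or_),
            "CP": cp, "CR": cr, "CF1": harmonic_mean(cp, cr)}
```

As a consistency check on the harmonic-mean assumption, `harmonic_mean(92.07, 90.48)` evaluates to about 91.27 and `harmonic_mean(95.50, 93.49)` to about 94.48, which agree with the OF1 and CF1 values reported for the proposed model on UCM.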

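Table 2 Ablation results on the DFC15 dataset (%)

| Model | OP | OR | OF1 | CP | CR | CF1 |
|---|---|---|---|---|---|---|
| Base network + pixel-level co-occurrent relation module | 94.82 | 93.87 | 94.34 | 92.54 | 91.32 | 91.93 |
| Base network + object-level co-occurrent relation module | 97.76 | 94.50 | 96.10 | 96.85 | 91.45 | 94.07 |
| Proposed model | 98.48 | 96.14 | 97.30 | 98.42 | 93.61 | 95.95 |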

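Table 3 Test examples

| Test image | Base network + pixel-level co-occurrent relation module | Proposed model | Ground truth |
|---|---|---|---|
| (image not preserved) | bare soil, grass, mobile home, road, trees | bare soil, mobile home, road, trees | bare soil, mobile home, road, trees |
| (image not preserved) | airplane, buildings, street | airplane, buildings, vehicles, road | airplane, buildings, vehicles, road |
| (image not preserved) | airplane, road | airplane, vehicles, road | airplane, vehicles, road |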

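Table 4 Visualization of intermediate results

| Input | Heat map | Heat map | Ground truth |
|---|---|---|---|
| (image not preserved) | (image not preserved) | (image not preserved) | sand, sea |
| (image not preserved) | (image not preserved) | (image not preserved) | trees, grass |
| (image not preserved) | (image not preserved) | (image not preserved) | water, ship |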

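Table 5 Test results with different label orders on the UCM dataset (%)

| Label order | OP | OR | OF1 | CP | CR | CF1 |
|---|---|---|---|---|---|---|
| Alphabetical | 92.85 | 86.74 | 89.69 | 96.11 | 91.86 | 93.94 |
| Rare-first | 91.68 | 90.27 | 90.97 | 94.84 | 93.91 | 94.37 |
| Frequent-first | 92.07 | 90.48 | 91.27 | 95.50 | 93.49 | 94.48 |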

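Table 6 Test results with different label orders on the DFC15 dataset (%)

| Label order | OP | OR | OF1 | CP | CR | CF1 |
|---|---|---|---|---|---|---|
| Alphabetical | 98.00 | 95.14 | 96.55 | 96.95 | 92.79 | 94.82 |
| Rare-first | 98.11 | 95.93 | 97.01 | 98.08 | 93.28 | 95.62 |
| Frequent-first | 98.48 | 96.14 | 97.30 | 98.42 | 93.61 | 95.95 |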
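The page does not say how the three label orders are built. Since the keywords mention a recurrent neural network, a plausible reading is that each image's label set is turned into a target sequence sorted by one of three rules. The sketch below illustrates that reading only: the function names, the tie-breaking by label name, and computing frequencies from the training annotations are all assumptions.

```python
from collections import Counter


def label_frequency(annotations):
    """Count how often each label occurs across a list of label sets."""
    return Counter(label for labels in annotations for label in labels)


def order_labels(labels, strategy, frequency):
    """Return the labels of one image as an ordered target sequence.

    strategy: "alphabetical", "rare_first" or "frequent_first".
    frequency: mapping from label to its count in the training set.
    Ties are broken alphabetically (an assumption) so the order is stable.
    """
    if strategy == "alphabetical":
        return sorted(labels)
    if strategy == "rare_first":
        return sorted(labels, key=lambda name: (frequency[name], name))
    if strategy == "frequent_first":
        return sorted(labels, key=lambda name: (-frequency[name], name))
    raise ValueError(f"unknown strategy: {strategy}")


# Toy annotations, invented for illustration only.
train = [{"trees", "grass"}, {"trees", "road"}, {"trees", "grass", "airplane"}]
freq = label_frequency(train)
sequence = order_labels({"grass", "trees", "airplane"}, "frequent_first", freq)
```

With these toy counts, the frequent-first rule places `trees` before `grass` and `airplane`, and the rare-first rule reverses that order.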

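Table 7 Experimental results on the UCM dataset (%)

| Method | OP | OR | OF1 | CP | CR | CF1 |
|---|---|---|---|---|---|---|
| CNN-RNN[9] | 83.44 | 85.26 | 84.34 | 86.67 | 87.97 | 87.32 |
| RNN-Attention[10] | 85.57 | 81.52 | 83.50 | 90.54 | 84.96 | 87.66 |
| SRN[21] | 87.51 | 80.54 | 83.88 | 90.51 | 89.24 | 89.87 |
| ResNet[13] | 85.91 | 84.27 | 85.08 | 89.78 | 89.73 | 89.76 |
| CADM[22] | 88.25 | 87.94 | 88.10 | 93.07 | 91.25 | 92.15 |
| GCN[12] | 89.99 | 85.90 | 87.90 | 93.83 | 91.64 | 92.7 |
| CA-Conv[5] | 90.18 | 85.77 | 87.92 | 93.90 | 91.26 | 92.56 |
| Proposed model | 92.07 | 90.48 | 91.27 | 95.50 | 93.49 | 94.48 |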

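Table 8 Experimental results on the DFC15 dataset (%)

| Method | OP | OR | OF1 | CP | CR | CF1 |
|---|---|---|---|---|---|---|
| CNN-RNN[9] | 91.09 | 90.35 | 90.72 | 88.93 | 89.88 | 89.40 |
| RNN-Attention[10] | 94.55 | 92.13 | 93.32 | 91.70 | 90.62 | 91.16 |
| SRN[21] | 95.02 | 93.08 | 94.04 | 91.97 | 91.14 | 91.55 |
| ResNet[13] | 94.82 | 93.87 | 94.34 | 92.54 | 91.32 | 91.93 |
| CADM[22] | 96.17 | 94.05 | 95.10 | 94.43 | 91.34 | 92.86 |
| GCN[13] | 96.89 | 94.58 | 95.72 | 95.45 | 91.41 | 93.39 |
| CA-Conv[5] | 97.38 | 94.49 | 95.91 | 95.89 | 91.52 | 93.65 |
| Proposed model | 98.48 | 96.14 | 97.30 | 98.42 | 93.61 | 95.95 |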

References:

1 Marmanis D, Schindler K, Wegner J D, et al. Classification with an edge: improving semantic image segmentation with boundary detection. ISPRS Journal of Photogrammetry and Remote Sensing, 2018, 135: 158-172.
2 Wen D W, Huang X, Liu H, et al. Semantic classification of urban trees using very high resolution satellite imagery. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2017, 10(4): 1413-1424.
3 Weng Q, Mao Z Y, Lin J W, et al. Land-use scene classification based on a CNN using a constrained extreme learning machine. International Journal of Remote Sensing, 2018, 39(19): 6281-6299.
4 Mou L C, Zhu X X. Spatiotemporal scene interpretation of space videos via deep neural network and tracklet analysis∥IEEE International Geoscience and Remote Sensing Symposium. Beijing, China: IEEE, 2016: 1823-1826.
5 Hua Y S, Mou L C, Zhu X X. Recurrently exploring class-wise attention in a hybrid convolutional and bidirectional LSTM network for multi-label aerial image classification. ISPRS Journal of Photogrammetry and Remote Sensing, 2019, 149: 188-199.
6 Karalas K, Tsagkatakis G, Zervakis M, et al. Land classification using remotely sensed data: Going multilabel. IEEE Transactions on Geoscience and Remote Sensing, 2016, 54(6): 3548-3563.
7 Yang H, Zhou J T, Zhang Y, et al. Exploit bounding box annotations for multi-label object recognition∥Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, USA: IEEE, 2016: 280-288.
8 Li Q, Qiao M Y, Wei B, et al. Conditional graphical lasso for multi-label image classification∥Proceedings of IEEE 2016 Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, USA: IEEE, 2016: 2977-2986.
9 Wang J, Yang Y, Mao J H, et al. CNN-RNN: a unified framework for multi-label image classification∥Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, USA: IEEE, 2016: 2285-2294.
10 Wang Z X, Chen T S, Li G B, et al. Multi-label image recognition by recurrently discovering attentional regions∥Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE, 2017: 464-472.
11 Chen Z M, Wei X S, Wang P, et al. Multi-label image recognition with graph convolutional networks∥Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, CA, USA: IEEE, 2019.
12 Mou L C, Hua Y S, Zhu X X. A relation-augmented fully convolutional network for semantic segmentation in aerial scenes∥Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, CA, USA: IEEE, 2019: 5172-5181.
13 Yuan Y H, Chen X L, Wang J D. Object-contextual representations for semantic segmentation. 2019, arXiv:1909.11065.
14 Ghamrawi N, McCallum A. Collective multi-label classification∥Proceedings of the 14th ACM International Conference on Information and Knowledge Management. New York, NY, USA: ACM, 2005: 548-557.
15 Guo Y H, Gu S C. Multi-label classification using conditional dependency networks∥Proceedings of the 22nd International Joint Conference on Artificial Intelligence. Menlo Park, CA, USA: AAAI Press, 2011.
16 Chen L C, Schwing A G, Yuille A L, et al. Learning deep structured models. 2015, arXiv:1407.2538.
17 Chen T S, Xu M X, Hui X L, et al. Learning semantic-specific graph representation for multi-label image recognition∥IEEE/CVF International Conference on Computer Vision (ICCV). Seoul, Korea (South): IEEE, 2019.
18 Jin J R, Nakayama H. Annotation order matters: Recurrent image annotator for arbitrary length image tagging∥2016 23rd International Conference on Pattern Recognition. Cancun, Mexico: IEEE, 2016: 2452-2457.
19 Liu F, Tao X, Hospedales T M, et al. Semantic regularisation for recurrent image annotation. 2017, arXiv:1611.05490.
20 Chen S F, Chen Y C, Yeh C, et al. Order-free RNN with visual attention for multi-label classification. 2018, arXiv:1707.05495.
21 Chen T S, Wang Z X, Li G B, et al. Recurrent attentional reinforcement learning for multi-label image recognition. 2018, arXiv:1712.07465.
22 Zhu F, Li H S, Ouyang W L, et al. Learning spatial regularization with image-level supervisions for multi-label image classification∥2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, HI, USA: IEEE, 2017: 5513-5522.
23 Chen Z M, Wei X S, Jin X, et al. Multi-label image recognition with joint class-aware map disentangling and label correlation embedding∥2019 IEEE International Conference on Multimedia and Expo. Shanghai, China: IEEE, 2019: 622-627.
24 He K M, Zhang X Y, Ren S Q, et al. Deep residual learning for image recognition∥Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, USA: IEEE, 2016: 770-778.
25 Tao X, Zhang D P, Wang Z H, et al. Detection of power line insulator defects using aerial images analyzed with convolutional neural networks. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2020, 50(4): 195-200.
26 Zhang H, Zhang H, Wang C G, et al. Co-occurrent features in semantic segmentation∥Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, CA, USA: IEEE, 2019.
27 Yazici V O, Gonzalez-Garcia A, Ramisa A, et al. Orderless recurrent models for multi-label classification. 2020, arXiv:1911.09996.
28 Mou L C, Ghamisi P, Zhu X X. Deep recurrent neural networks for hyperspectral image classification. IEEE Transactions on Geoscience and Remote Sensing, 2017, 55(7): 3639-3655.
29 Bengio Y, Simard P, Frasconi P. Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks, 1994, 5(2): 157-166.
30 Hochreiter S, Schmidhuber J. Long short-term memory. Neural Computation, 1997, 9(8): 1735-1780.
31 Yang Y, Newsam S. Bag-of-visual-words and spatial extensions for land-use classification∥Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems. New York, NY, USA: ACM, 2010: 522-531.