Journal of Nanjing University (Natural Science), 2023, Vol. 59, Issue (4): 580-589. doi: 10.13232/j.cnki.jnju.2023.04.005
Yuan Meng, Yizhe Zhang, Gongxuan Zhang, Hui Song
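Abstract:
In recent years, deep-learning algorithms and models have achieved remarkable success across a wide range of image analysis tasks. Compared with common natural-image datasets, however, medical image datasets still suffer from severe class imbalance. Imbalanced data pushes the decision boundary in feature space toward the classes with more samples, which degrades classification performance. To address this problem, this paper proposes a classification method for imbalanced medical images that builds on a convolutional neural network and accounts for the intra-class compactness of features (Z-Score Compactness-based Convolutional Neural Network, ZC3NC). First, feature maps of the training-set and test-set samples are extracted from the last convolutional layer of a convolutional neural network. A new Z-score is then introduced to measure how far the feature map of each test sample deviates, in feature space, from each class of the training set. This deviation measure is based on intra-class compactness: it focuses on how the samples are distributed and is insensitive to imbalance in the number of samples per class. Finally, the test samples are classified according to the computed deviations. Experiments on the DermaMNIST dataset show that, without any additional augmentation of the data or the neural network model, the method raises balanced accuracy over the original convolutional neural network models by 11.15% on average and by up to 14.08%, which indicates that it can effectively improve the performance of several convolutional neural networks on imbalanced medical image data. In addition, compared with the state-of-the-art imbalanced classification method Under-Bagging KNN, the method improves performance by 2.36% on average.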
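The pipeline described in the abstract (extract features, score each test sample's deviation from every training class, pick the class with the smallest deviation) can be illustrated with a small NumPy sketch. The abstract does not give the exact Z-score formula, so the definition below is an assumption made for illustration only: for each class, the distance of a test feature to the class centroid is standardized by the mean and standard deviation of the distances of that class's own training features to the centroid. The function names `fit_class_stats`, `predict_by_deviation`, and `balanced_accuracy` are hypothetical, and the synthetic 2-D points stand in for flattened CNN feature maps.

```python
import numpy as np

def fit_class_stats(train_feats, train_labels):
    """Per class: centroid, plus mean and std of member-to-centroid distances.

    Assumed compactness statistics; the paper's exact definition may differ.
    """
    stats = {}
    for c in np.unique(train_labels):
        feats_c = train_feats[train_labels == c]
        centroid = feats_c.mean(axis=0)
        dists = np.linalg.norm(feats_c - centroid, axis=1)
        # Small constant guards against division by zero for degenerate classes.
        stats[int(c)] = (centroid, dists.mean(), dists.std() + 1e-12)
    return stats

def predict_by_deviation(test_feats, stats):
    """Assign each test feature to the class with the smallest z-score deviation."""
    classes = sorted(stats)
    z = np.stack(
        [
            (np.linalg.norm(test_feats - stats[c][0], axis=1) - stats[c][1]) / stats[c][2]
            for c in classes
        ],
        axis=1,
    )
    return np.asarray(classes)[z.argmin(axis=1)]

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recalls, so every class counts equally."""
    recalls = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.mean(recalls))

# Synthetic stand-in for flattened feature maps: 500 majority vs 20 minority samples.
rng = np.random.default_rng(0)
train_feats = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(500, 2)),  # class 0, loose cluster
    rng.normal(loc=6.0, scale=0.5, size=(20, 2)),   # class 1, tight cluster
])
train_labels = np.array([0] * 500 + [1] * 20)

stats = fit_class_stats(train_feats, train_labels)
test_feats = np.array([[0.2, -0.1], [5.9, 6.2]])
pred = predict_by_deviation(test_feats, stats)
print(pred)
```

Because each class is standardized by its own spread rather than by its sample count, the 20-sample class is scored on the same footing as the 500-sample class, which is the property the abstract attributes to the compactness-based measure. In practice the features would come from the last convolutional layer of a trained network instead of a random generator.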