Journal of Nanjing University (Natural Science), 2020, Vol. 56, Issue 2: 186–196. doi: 10.13232/j.cnki.jnju.2020.02.004


Takagi-Sugeno fuzzy modeling based on improved fuzzy clustering for health care data

Lu Wang1, Shitong Wang2

  1. School of Digital Media, Jiangnan University, Wuxi, 214122, China
  2. Key Laboratory of Media Design and Software Technology of Jiangsu Province, Jiangnan University, Wuxi, 214122, China
  • Received: 2019-01-13 Online: 2020-03-30 Published: 2020-04-02
  • Contact: Shitong Wang E-mail: 18862005832@163.com
  • Supported by: Natural Science Foundation of Jiangsu Province (BK20160187)

Abstract:

Traditional data analysis methods do not make full use of the available information when mining medical data. To address this, a novel Takagi-Sugeno (T-S) fuzzy system based on an improved fuzzy clustering algorithm is proposed. Adaptive parameters in the distance measure, in the form of coefficient regulation and exponential regulation, are combined with the classical Fuzzy C-Means (FCM) algorithm, and the resulting clustering replaces the logic component of the classical T-S fuzzy system while keeping its basic structure. The approach retains the strengths of T-S fuzzy systems in prediction and regression and, through flexible adjustment of the exponent or the coefficient, mines the associations among different attributes of medical data more deeply, which improves prediction accuracy across medical data analysis tasks. Experiments on real medical datasets indicate that the proposed fuzzy system achieves higher prediction accuracy and is feasible in practice.

Key words: exponential regulation, coefficient regulation, fuzzy clustering, T-S fuzzy model, health care applications
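The abstract builds on the classical FCM algorithm. As a point of reference, the following is a minimal sketch of classical FCM only, written in plain Python; it does not implement the exponential or coefficient regulation proposed in the paper, whose exact form is not given on this page. The function name `fcm` and its parameters (`m`, `n_iter`, `tol`, `seed`) are illustrative choices, not names from the paper.

```python
import random


def fcm(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Classical fuzzy c-means; returns (centers, memberships)."""
    rng = random.Random(seed)
    dim = len(points[0])

    # Random initial memberships, normalised so each row sums to 1.
    U = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        U.append([u / total for u in row])

    def update_centers(U):
        # Each center is the mean of the samples weighted by u_ik ** m.
        centers = []
        for k in range(n_clusters):
            w = [row[k] ** m for row in U]
            total = sum(w)
            centers.append([sum(wi * p[d] for wi, p in zip(w, points)) / total
                            for d in range(dim)])
        return centers

    for _ in range(n_iter):
        centers = update_centers(U)
        U_new = []
        for p in points:
            # Squared Euclidean distances, floored to avoid division by zero.
            d2 = [max(sum((a - b) ** 2 for a, b in zip(p, c)), 1e-12)
                  for c in centers]
            # u_ik is proportional to d_ik ** (-2 / (m - 1)).
            inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
            total = sum(inv)
            U_new.append([v / total for v in inv])
        # Largest change in any membership value since the last iteration.
        change = max(abs(a - b) for r_new, r_old in zip(U_new, U)
                     for a, b in zip(r_new, r_old))
        U = U_new
        if change < tol:
            break
    return update_centers(U), U
```

Calling `fcm(points, 2)` on a list of feature vectors returns two cluster centers and, for each sample, a row of two memberships that sums to 1. In a clustering-based T-S system, such centers and memberships are typically used to define the rule antecedents; how the paper's regulated distance alters this baseline is beyond what the abstract states.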

CLC Number:

  • TP391

Figure 1

Schematic of the improved mixed fuzzy clustering algorithm

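Table 1

Description of the datasets

| Dataset | Samples | Features |
|---|---|---|
| Breast-cancer-wisconsin | 669 | 9 |
| Heart | 270 | 13 |
| Pima | 768 | 8 |
| Saheart | 462 | 9 |
| Indian Liver Patient | 583 | 10 |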

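Figure 2

Comparison across numbers of rules for exponential (left) and coefficient (right) regulation on the Breast-cancer-wisconsin dataset

Figure 3

Comparison across numbers of rules for exponential (left) and coefficient (right) regulation on the Heart dataset

Figure 4

Comparison across numbers of rules for exponential (left) and coefficient (right) regulation on the Pima dataset

Figure 5

Comparison across numbers of rules for exponential (left) and coefficient (right) regulation on the Saheart dataset

Figure 6

Comparison across numbers of rules for exponential (left) and coefficient (right) regulation on the Indian Liver Patient dataset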

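Table 2

Comparison of accuracy and RMSE on the different datasets

| Dataset | Method | Accuracy | RMSE | Accuracy | RMSE | Accuracy | RMSE |
|---|---|---|---|---|---|---|---|
| Breast-cancer-wisconsin | Number of rules | 20 | | 24 | | 30 | |
| | Exponential regulation | 97.61% | 0.1547 | 99.52% | 0.0937 | 97.13% | 0.1694 |
| | Coefficient regulation | 98.09% | 0.1383 | 99.52% | 0.0692 | 98.09% | 0.1383 |
| | FCM | 97.13% | 0.1694 | 98.09% | 0.1383 | 97.61% | 0.1547 |
| Heart | Number of rules | 10 | | 12 | | 20 | |
| | Exponential regulation | 83.95% | 0.4006 | 92.59% | 0.2722 | 87.65% | 0.3514 |
| | FCM | 81.48% | 0.4303 | 86.42% | 0.3685 | 85.19% | 0.3849 |
| | Number of rules | 10 | | 11 | | 20 | |
| | Coefficient regulation | 83.95% | 0.4006 | 91.36% | 0.2940 | 80.25% | 0.4444 |
| | FCM | 81.48% | 0.4303 | 82.72% | 0.4175 | 85.19% | 0.3849 |
| Pima | Number of rules | 20 | | 29 | | 30 | |
| | Exponential regulation | 69.13% | 0.5556 | 81.30% | 0.4324 | 76.52% | 0.4845 |
| | FCM | 72.17% | 0.5275 | 74.35% | 0.5065 | 73.04% | 0.5192 |
| | Number of rules | 20 | | 24 | | 30 | |
| | Coefficient regulation | 70.87% | 0.5397 | 77.83% | 0.4709 | 74.35% | 0.5065 |
| | FCM | 72.17% | 0.5275 | 74.35% | 0.5065 | 73.04% | 0.5192 |
| Saheart | Number of rules | 15 | | 19 | | 25 | |
| | Exponential regulation | 60.14% | 0.6370 | 74.57% | 0.5043 | 68.12% | 0.5647 |
| | FCM | 61.59% | 0.6197 | 65.22% | 0.5898 | 62.32% | 0.6138 |
| | Number of rules | 15 | | 21 | | 25 | |
| | Coefficient regulation | 73.70% | 0.5132 | 75.36% | 0.5036 | 74.06% | 0.5092 |
| | FCM | 61.59% | 0.6197 | 67.39% | 0.5774 | 62.32% | 0.6138 |
| Indian Liver Patient | Number of rules | 20 | | 27 | | 30 | |
| | Exponential regulation | 69.54% | 0.5519 | 75.29% | 0.4585 | 69.54% | 0.5519 |
| | FCM | 67.82% | 0.5673 | 70.11% | 0.5673 | 66.67% | 0.5774 |
| | Number of rules | 20 | | 25 | | 30 | |
| | Coefficient regulation | 68.39% | 0.5622 | 74.14% | 0.5307 | 65.52% | 0.6065 |
| | FCM | 67.82% | 0.5673 | 71.84% | 0.5307 | 66.67% | 0.5774 |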
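Many of the accuracy and RMSE pairs in the table are consistent with the RMSE being computed on hard 0/1 predictions against 0/1 labels, in which case each misclassified sample contributes a squared error of 1 and the RMSE equals the square root of the error rate. This is an observation about the reported numbers, not a procedure stated on this page, and a few entries do not follow it exactly. A short sketch, with the hypothetical helper names `rmse` and `rmse_from_accuracy`:

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def rmse_from_accuracy(accuracy):
    """RMSE implied by an accuracy, assuming hard 0/1 predictions and 0/1 labels.

    Under that assumption the mean squared error equals the error rate.
    """
    return math.sqrt(1.0 - accuracy)


# Heart dataset, 83.95% accuracy, reported RMSE 0.4006.
print(round(rmse_from_accuracy(0.8395), 4))  # 0.4006
```

Under the same assumption, an accuracy of 92.59% gives approximately 0.2722, which also agrees with the corresponding Heart entry.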