Journal of Nanjing University (Natural Science), 2019, Vol. 55, Issue (4): 633-643. doi: 10.13232/j.cnki.jnju.2019.04.013
Teng Li, Tian Yang, Jianhua Dai, Ling Chen
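Abstract:
Poorly differentiated tumors are difficult to detect through conventional histopathological diagnosis, whereas gene testing can accurately screen out the pathogenic genes for a specific tumor; gene selection is therefore a key problem in tumor classification and clinical treatment. Tumor gene expression data are characterized by small sample sizes and high dimensionality, and existing gene selection algorithms still leave room for improvement in classification accuracy and computational efficiency. Building on fuzzy rough set theory, the discernibility matrix is fuzzified, and a fuzzy discernibility matrix attribute reduction algorithm is designed accordingly. Compared with the classical discernibility matrix, the fuzzified discernibility matrix reflects how strongly each attribute discerns a given pair of objects, so attributes with a higher degree of discernibility can be selected to obtain better classification performance. Numerical experiments show that the method improves classification accuracy on tumor gene data and reduces computation time. In a feature-gene selection experiment for colorectal cancer (Colon Microarray) classification using a kNN classifier, five key genes related to the onset of colorectal cancer were selected from 2000 feature genes, with a classification accuracy of 88.06%.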
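As a rough illustration of the idea in the abstract, the following minimal Python sketch builds a fuzzy discernibility matrix and picks attributes greedily. It is not the authors' algorithm, whose details are not given on this page. The discernibility degree |u - v| on attributes scaled to [0, 1], the greedy gain criterion, the names `fuzzy_discernibility` and `greedy_reduct`, the `max_attrs` parameter, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations


def fuzzy_discernibility(X, y):
    """Map each pair of differently labelled samples to per-attribute degrees.

    Assumption: attributes are scaled to [0, 1] and the fuzzy similarity is
    1 - |u - v|, so the discernibility degree of an attribute is |u - v|.
    """
    matrix = {}
    for i, j in combinations(range(len(X)), 2):
        if y[i] != y[j]:  # only pairs from different classes need discerning
            matrix[(i, j)] = [abs(u - v) for u, v in zip(X[i], X[j])]
    return matrix


def greedy_reduct(X, y, max_attrs):
    """Greedily pick attributes that most raise the discerned degree of pairs.

    Hypothetical selection rule: an attribute's gain is the total amount by
    which it increases the best degree reached so far over all pairs.
    """
    matrix = fuzzy_discernibility(X, y)
    covered = {pair: 0.0 for pair in matrix}  # best degree reached per pair
    selected = []
    n_attrs = len(X[0])
    while len(selected) < max_attrs:
        best_attr, best_gain = None, 0.0
        for a in range(n_attrs):
            if a in selected:
                continue
            gain = sum(max(deg[a] - covered[p], 0.0) for p, deg in matrix.items())
            if gain > best_gain:
                best_attr, best_gain = a, gain
        if best_attr is None:  # no remaining attribute improves any pair
            break
        selected.append(best_attr)
        for p, deg in matrix.items():
            covered[p] = max(covered[p], deg[best_attr])
    return selected


# Toy data (hypothetical): 4 samples, 3 attributes, 2 classes.
X = [[0.0, 0.5, 0.0],
     [0.5, 0.5, 0.0],
     [1.0, 0.5, 0.0],
     [0.75, 0.5, 1.0]]
y = [0, 0, 1, 1]
print(greedy_reduct(X, y, max_attrs=3))  # prints [0, 2]
```

On this toy data the constant attribute 1 is never selected because it discerns no pair. A classical discernibility matrix would only record which attributes differ for a pair; the degrees used here let the selection prefer the attribute that separates the pair more strongly, which is the contrast the abstract draws.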