Journal of Nanjing University (Natural Science) ›› 2022, Vol. 58 ›› Issue (3): 506-518. doi: 10.13232/j.cnki.jnju.2022.03.014
曾艺祥1,2, 林耀进1,2, 范凯钧1,2, 曾伯儒1,2
Yixiang Zeng1,2, Yaojin Lin1,2, Kaijun Fan1,2, Boru Zeng1,2
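Abstract:
In open, dynamic environments, online streaming feature selection is an effective way to reduce the dimensionality of the feature space. Existing online streaming feature selection algorithms can effectively select a good feature subset; however, they ignore the hierarchical structure that may exist among the classes. To address this, an online streaming feature selection algorithm based on a hierarchical-class neighborhood rough set is proposed. First, a neighborhood relation based on the hierarchical nearest miss (the nearest sample of a different class) is introduced into the neighborhood rough set, which avoids having to choose a neighborhood granularity; the hierarchical dependency between features and labels is computed with the help of the class hierarchy, extending the neighborhood rough set model to hierarchical-class data. Second, three online feature evaluation functions are defined on the basis of hierarchical dependency, and a hierarchical feature selection framework comprising online relevance selection, online significance computation, and online redundancy update is designed. Finally, experiments on six hierarchical-class datasets and eight flat single-label datasets show that the proposed algorithm outperforms existing state-of-the-art online streaming feature selection algorithms.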
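The three-stage loop named in the abstract (relevance, significance, redundancy) can be illustrated with a minimal sketch. It is not the paper's algorithm: it swaps in the classical neighborhood rough set dependency with a fixed radius `delta` (an assumed parameter) in place of the hierarchical nearest-miss relation, and it ignores the class hierarchy entirely. The function names, thresholds, and toy data are all hypothetical.

```python
import numpy as np

def dependency(X, y, features, delta=0.15):
    """Classical fixed-radius neighborhood dependency (assumption, not the
    paper's hierarchical dependency): the fraction of samples whose
    delta-neighborhood contains only samples of the same class."""
    if not features:
        return 0.0
    Z = X[:, features]
    # Pairwise Euclidean distances in the chosen feature subspace.
    dist = np.sqrt(((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2))
    neighbors = dist <= delta
    same_class = y[:, None] == y[None, :]
    # A sample is consistent when every neighbor shares its label.
    consistent = np.all(~neighbors | same_class, axis=1)
    return float(consistent.mean())

def online_select(X, y, delta=0.15):
    """Process features one at a time, as if they arrived in a stream."""
    selected = []
    for f in range(X.shape[1]):
        # 1) Online relevance selection: drop features that carry no
        #    dependency on their own.
        if dependency(X, y, [f], delta) == 0.0:
            continue
        # 2) Online significance: keep the feature only if it raises the
        #    dependency of the currently selected subset.
        base = dependency(X, y, selected, delta)
        if dependency(X, y, selected + [f], delta) - base <= 0.0:
            continue
        selected.append(f)
        # 3) Online redundancy update: remove earlier features whose
        #    removal does not lower the dependency.
        full = dependency(X, y, selected, delta)
        for g in list(selected[:-1]):
            rest = [h for h in selected if h != g]
            if dependency(X, y, rest, delta) >= full:
                selected = rest
    return selected

# Toy data (hypothetical): feature 0 is constant, feature 1 separates the
# two classes, feature 2 duplicates feature 1.
X = np.array([
    [0.5, 0.00, 0.00],
    [0.5, 0.05, 0.05],
    [0.5, 0.10, 0.10],
    [0.5, 1.00, 1.00],
    [0.5, 1.05, 1.05],
    [0.5, 1.10, 1.10],
])
y = np.array([0, 0, 0, 1, 1, 1])
result = online_select(X, y)
print(result)  # [1]
```

On this toy data the constant feature fails the relevance check and the duplicated feature fails the significance check, so only feature 1 is kept. A hierarchy-aware version would replace `dependency` with a measure defined over the class tree, which the abstract describes but does not specify in enough detail to reproduce here.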