Journal of Nanjing University (Natural Science) ›› 2020, Vol. 56 ›› Issue (4): 549-560. doi: 10.13232/j.cnki.jnju.2020.04.013
Yachong Li, Youlong Yang, Haiquan Qiu
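Abstract:
Label embedding has attracted wide attention for classification problems with high-dimensional label spaces. Most existing embedding methods require complete label information and do not take the feature space into account. Moreover, because manual annotation is costly and data are subject to noise, often only part of the label information is available, which makes high-dimensional label classification with missing labels more complicated. To address this problem, a weak-label embedding algorithm, Label Embedding for Weak Label Classification (LEWL), is proposed. The algorithm uses a low-rank matrix factorization model, combined with the manifold structure of the samples, to recover the missing labels. It also applies the Hilbert-Schmidt Independence Criterion (HSIC) so that features and labels interact, jointly learning a low-dimensional embedding space, which can effectively reduce the training time of the model. Comparative experiments against other algorithms on seven multi-label datasets demonstrate the effectiveness of the proposed algorithm.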
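As a rough illustration of the HSIC dependence measure named in the abstract, the sketch below computes the standard biased empirical estimator, tr(KHLH)/(n-1)^2, where K and L are kernel matrices over features and labels and H is the centering matrix. The linear kernel and the function names (`centering_matrix`, `linear_kernel`, `hsic`) are assumptions made for illustration only; the abstract does not state which kernels or exact objective LEWL uses.

```python
import numpy as np


def centering_matrix(n):
    """Return H = I - (1/n) * 1 1^T, which removes the mean from kernel matrices."""
    return np.eye(n) - np.ones((n, n)) / n


def linear_kernel(A):
    """Linear kernel matrix A A^T (an illustrative choice, not taken from the paper)."""
    return A @ A.T


def hsic(K, L):
    """Biased empirical HSIC estimate: tr(K H L H) / (n - 1)^2."""
    n = K.shape[0]
    H = centering_matrix(n)
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 5))             # hypothetical feature matrix
    Y_dep = X @ rng.normal(size=(5, 3))      # label scores that depend on X
    Y_const = np.ones((50, 3))               # labels carrying no information

    K = linear_kernel(X)
    print("HSIC with dependent labels:", hsic(K, linear_kernel(Y_dep)))
    print("HSIC with constant labels: ", hsic(K, linear_kernel(Y_const)))
```

A larger value indicates stronger statistical dependence between the two representations. With constant labels the centered label kernel is zero, so the estimate is zero up to floating-point error.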