Journal of Nanjing University (Natural Sciences), 2016, Vol. 52, Issue (4): 693–.


A multi-label feature selection algorithm by exploiting label correlations locally

Cai Yaping, Yang Ming*

  • Online: 2016-07-24 Published: 2016-07-24
  • About author: College of Computer Science and Technology, Nanjing Normal University, Nanjing, 210023, China
  • Received: 2016-03-24
  • Funding: Key Program of the National Natural Science Foundation of China (61432008), General Program of the National Natural Science Foundation of China (61272222)
  • *Corresponding author, E-mail: 05354@njnu.edu.cn

Abstract: Unlike the traditional supervised learning framework, in which each object is assigned a single label, multi-label learning allows an object to be associated with multiple labels simultaneously, which models many real-world problems more faithfully. Multi-label learning has attracted a great deal of attention in machine learning in recent years and has spread quickly into many application areas. Because the labels of an object are related to one another, how to discover and exploit label correlations effectively is a core research issue in multi-label learning, and a series of multi-label learning algorithms that exploit label correlations have been proposed and applied successfully. However, high-dimensional data contain many redundant and irrelevant features that degrade classifier performance, and few multi-label feature selection algorithms consider label correlations. Moreover, most existing algorithms exploit label correlations globally, assuming that the same correlations are shared by all instances, whereas in real-world tasks different instances may share different label correlations. This paper therefore focuses on exploiting label correlations locally to improve multi-label feature selection and, in turn, multi-label classification. It proposes a multi-label feature selection algorithm by exploiting label correlations locally (Loc-MLFS). To use local label correlations, that is, correlations that are not shared by all instances, Loc-MLFS divides the instances into groups by clustering in the label space and applies multi-label feature selection to each group. The algorithm can also be extended to a unified framework. Experimental results on several datasets show that exploiting local label correlations improves the effectiveness of multi-label feature selection and that Loc-MLFS achieves superior performance.
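The grouping idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the clustering method, the per-group selector, or how group results are combined. The sketch assumes plain k-means on the binary label vectors, a stand-in relevance score (the absolute difference between a feature's mean over positive and negative instances of each label), and a size-weighted average across groups. All function names and the toy data are hypothetical.

```python
from collections import defaultdict


def kmeans_labels(Y, k, iters=20):
    """Cluster binary label vectors with plain k-means (assumed clustering step)."""
    centers = []
    for y in Y:  # deterministic seeds: the first k distinct label vectors
        if list(y) not in centers:
            centers.append(list(y))
        if len(centers) == k:
            break
    assign = [0] * len(Y)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        assign = [
            min(range(len(centers)),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(y, centers[c])))
            for y in Y
        ]
        # Update step: each center becomes the mean label vector of its cluster.
        groups = defaultdict(list)
        for i, c in enumerate(assign):
            groups[c].append(Y[i])
        new_centers = [
            [sum(col) / len(groups[c]) for col in zip(*groups[c])]
            if groups[c] else centers[c]
            for c in range(len(centers))
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return assign


def group_scores(X, Y):
    """Stand-in score: per feature, sum over labels of |mean(pos) - mean(neg)|."""
    scores = [0.0] * len(X[0])
    for l in range(len(Y[0])):
        pos = [x for x, y in zip(X, Y) if y[l] == 1]
        neg = [x for x, y in zip(X, Y) if y[l] == 0]
        if not pos or not neg:
            continue  # label is constant inside this group, so it gives no evidence
        for f in range(len(scores)):
            mean_pos = sum(x[f] for x in pos) / len(pos)
            mean_neg = sum(x[f] for x in neg) / len(neg)
            scores[f] += abs(mean_pos - mean_neg)
    return scores


def local_feature_scores(X, Y, k):
    """Score features inside each label-space cluster, then average by group size."""
    assign = kmeans_labels(Y, k)
    total = [0.0] * len(X[0])
    for c in set(assign):
        idx = [i for i, a in enumerate(assign) if a == c]
        local = group_scores([X[i] for i in idx], [Y[i] for i in idx])
        for f, v in enumerate(local):
            total[f] += v * len(idx) / len(X)
    return assign, total


# Toy data: feature 0 tracks label 1 only in the first group, feature 1 tracks
# label 1 only in the second group, and feature 2 is constant everywhere.
Y = [[1, 1, 0], [0, 0, 1], [1, 0, 0], [0, 1, 1], [1, 1, 0], [0, 0, 1]]
X = [[1.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.0, 0.5, 0.5],
     [0.5, 1.0, 0.5], [1.0, 0.5, 0.5], [0.5, 0.0, 0.5]]
assign, scores = local_feature_scores(X, Y, k=2)
ranking = sorted(range(len(scores)), key=lambda f: -scores[f])
```

On the toy data, each of the first two features is informative in only one group, so scoring inside the groups credits both, while the constant feature receives no credit. Any multi-label feature selector could replace `group_scores` in this sketch.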
