Journal of Nanjing University (Natural Science) ›› 2016, Vol. 52 ›› Issue (2): 244–252.


Social tag refinement model based on feature fusion and multi-correlation consistency

Li Yunyi1,2, Miao Duoqian1,2, Wei Zhihua1,2*

  • Online: 2016-03-16 Published: 2016-03-16
  • About author:
    (1. Department of Computer Science and Technology, Tongji University, Shanghai, 201804, China;
    2. Key Laboratory of Embedded System and Service Computing, Ministry of Education, Tongji University, Shanghai, 201804, China)
  • Supported by:
    National Natural Science Foundation of China (61273304, 61202170), Specialized Research Fund for the Doctoral Program of Higher Education (20130072130004), Natural Science Foundation of Shanghai (14ZR1442600)


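Abstract (Chinese version): Many social networking and photo-sharing websites now let users freely choose tags to annotate the images they upload, which greatly facilitates multimedia applications such as image retrieval, image ranking and tag recommendation. However, the tags that users provide for web images are often irrelevant, imprecise and incomplete, and therefore need to be refined. This paper proposes a social tag refinement model based on multi-feature fusion and multi-correlation consistency. The optimization model jointly considers image visual features, user tags, image author information and the relationships among them, and takes the consistency among these three elements as the optimization objective. In addition, for visual feature processing, multiple features are fused within the optimization model, and an iterative algorithm automatically determines the weight of each feature. Compared with the traditional practice of concatenating several low-level visual features into one long feature vector, or of using only a single feature, this approach not only avoids the curse of dimensionality but also makes the fullest use of different kinds of visual information to discriminate between images. Experiments show that the method is comparable to some of the best algorithms proposed to date.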
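The automatic feature weighting mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the paper's objective function, so everything below is an assumption made for illustration: each visual feature k is given a graph-smoothness cost c_k (similar images should receive similar tag scores), and the weights minimise sum_k w_k^r * c_k subject to sum_k w_k = 1, which has the closed form w_k ∝ (1/c_k)^(1/(r-1)). The names `smoothness_cost`, `update_weights`, `color_sim` and `texture_sim` are hypothetical.

```python
def smoothness_cost(similarity, scores):
    """Assumed per-feature cost: sum over image pairs of
    similarity[i][j] * squared distance between their tag-score vectors."""
    n = len(scores)
    cost = 0.0
    for i in range(n):
        for j in range(n):
            diff = sum((a - b) ** 2 for a, b in zip(scores[i], scores[j]))
            cost += similarity[i][j] * diff
    return cost


def update_weights(costs, r=2.0, eps=1e-12):
    """Closed-form minimiser of sum_k w_k**r * c_k with sum_k w_k = 1, w_k >= 0.
    Requires r > 1; eps guards against division by a zero cost."""
    inv = [(1.0 / max(c, eps)) ** (1.0 / (r - 1.0)) for c in costs]
    total = sum(inv)
    return [v / total for v in inv]


# Toy data (made up): 3 images, 2 tags; images 0 and 1 share the same tag.
scores = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
# A feature that agrees with the tags (images 0 and 1 look alike) ...
color_sim = [[0.0, 0.9, 0.1], [0.9, 0.0, 0.1], [0.1, 0.1, 0.0]]
# ... and one that disagrees (image 2 looks like both others).
texture_sim = [[0.0, 0.2, 0.8], [0.2, 0.0, 0.8], [0.8, 0.8, 0.0]]

costs = [smoothness_cost(s, scores) for s in (color_sim, texture_sim)]
weights = update_weights(costs)
print(costs, weights)
```

In an alternating scheme of this kind, the tag scores would be re-estimated with the weights held fixed, and the weights then recomputed from the new costs, until both stabilise. The feature whose similarity graph is more consistent with the current tag scores receives the larger weight, so no feature has to be concatenated into a long vector or chosen by hand.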

Abstract: User-provided tags on photo-sharing websites facilitate tag-based multimedia applications such as image ranking, image retrieval and tag recommendation. However, user-supplied tags for web images are often irrelevant, imprecise and incomplete, which lowers the performance of image management tasks, and many efforts have been made to solve this problem. The image, the user tags and the author are the three basic elements of a web image, yet many image tag refinement algorithms consider only one or two of these elements and few of the correlation consistencies among them. In this paper, an optimization model based on feature fusion and multi-correlation consistency is proposed for social tag refinement. Image visual features, user-supplied tags and authors' information are all considered in the proposed model. Multiple correlation consistencies are exploited in the framework: visual content-semantics consistency between image pairs, tag-tag correlation consistency and user-user correlation consistency. This is intended to yield better refinement performance than approaches that consider only one or two elements and few correlation consistencies. Traditionally, several visual features are concatenated into one long vector, or only a single feature is chosen for the image tag refinement task. The former suffers from the curse of dimensionality, and the latter cannot capture sufficient visual information for the task. A feature fusion scheme is therefore introduced in the framework: multiple image visual features are considered, and a weight for each feature is calculated automatically by an iterative process to estimate the importance of the different features. As in many other works, the macro-averaged F-score is used as the evaluation criterion. Comparative experiments on the MIR-Flickr dataset show that the performance of the proposed method is comparable with that of state-of-the-art methods. The advantages of feature fusion and multi-correlation consistency are also demonstrated by dedicated experiments.
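The evaluation criterion named in the abstract, the macro-averaged F-score, can be sketched as follows. This is a minimal illustration of the standard definition (per-tag F1, then an unweighted mean over tags), not the authors' evaluation script. The function name `macro_f1`, the toy tag sets and the convention of scoring a tag with no true or predicted occurrences as 0.0 are assumptions.

```python
def macro_f1(true_tags, pred_tags, vocabulary):
    """Unweighted mean of per-tag F1 over the vocabulary.

    true_tags / pred_tags: one set of tags per image, in the same order.
    A tag that never appears in either list gets F1 = 0.0 (assumed convention).
    """
    per_tag = []
    for tag in vocabulary:
        tp = fp = fn = 0
        for truth, pred in zip(true_tags, pred_tags):
            if tag in truth and tag in pred:
                tp += 1          # correctly kept or added tag
            elif tag in pred:
                fp += 1          # tag assigned but not relevant
            elif tag in truth:
                fn += 1          # relevant tag that was missed
        denom = 2 * tp + fp + fn
        per_tag.append(2 * tp / denom if denom else 0.0)
    return sum(per_tag) / len(per_tag)


# Toy example (made-up data): three images, three tags.
truth = [{"sky", "sea"}, {"dog"}, {"sky"}]
refined = [{"sky"}, {"dog", "sky"}, {"sky", "sea"}]
score = macro_f1(truth, refined, ["sky", "sea", "dog"])
print(score)
```

Because every tag contributes equally to the mean, rare tags count as much as frequent ones, which matters for noisy social tags with a long-tailed frequency distribution.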

