Journal of Nanjing University (Natural Science) ›› 2015, Vol. 51 ›› Issue (2): 285–289.

The “constrained semi-supervised” algorithm for tile vision classification

Qian Yafeng*, Ben Shenglan, Li Bo, Chen Qimei

  • Online: 2015-03-04 Published: 2015-03-04
  • About author: (School of Electronic Science and Engineering, Nanjing University, Nanjing, 210023, China)
  • Supported by:
    National Natural Science Foundation of China (61105015), Provincial Natural Science Foundation (BK201121582)

Abstract: Manual classification on tile production lines takes place in a harsh working environment, and dedicated industrial vision classification equipment from abroad is expensive, so it is necessary to develop an automatic tile classification system independently. The quality of the classification algorithm is the key factor in the performance of such a system. Traditional classification algorithms suffer from low confidence estimates for unlabeled samples and from strong interference between classifiers. To address these problems, this paper proposes a semi-supervised classification algorithm based on constrained Tri-training: samples that satisfy the constraints are selected from a large pool of unlabeled samples to enlarge the labeled set, two strong classifiers are generated, and these are combined into an ensemble classifier that serves as the final classifier. In tests on field data sets, compared with traditional algorithms, the confidence of unlabeled samples increased by 3% on average and classification accuracy improved by 1.8%-3.3%.
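The abstract does not state the constraint itself or how the two strong classifiers are built, so the following is only a minimal sketch of generic Tri-training with one assumed constraint: an unlabeled sample is pseudo-labeled for a classifier only when the other two classifiers agree on its label and both exceed a confidence threshold. The `NearestCentroid` base learner, the `threshold` value, the three-way majority vote, and all function names are hypothetical choices for illustration, not the authors' method.

```python
import math
import random
from collections import Counter


class NearestCentroid:
    """Toy base learner: predicts the class whose feature mean is closest."""

    def fit(self, xs, ys):
        sums, counts = {}, Counter(ys)
        for x, y in zip(xs, ys):
            acc = sums.setdefault(y, [0.0] * len(x))
            for j, v in enumerate(x):
                acc[j] += v
        self.centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}
        return self

    def predict_with_confidence(self, x):
        # Confidence: softmax over negative distances to the class centroids
        # (shifted by the smallest distance for numerical stability).
        dists = {y: math.dist(x, c) for y, c in self.centroids.items()}
        d_min = min(dists.values())
        weights = {y: math.exp(-(d - d_min)) for y, d in dists.items()}
        label = max(weights, key=weights.get)
        return label, weights[label] / sum(weights.values())


def bootstrap(xs, ys, rng):
    """Sample with replacement; fall back to the full set if a class is lost."""
    idx = [rng.randrange(len(xs)) for _ in xs]
    if len({ys[i] for i in idx}) < len(set(ys)):
        idx = list(range(len(xs)))
    return [xs[i] for i in idx], [ys[i] for i in idx]


def constrained_tri_training(lab_x, lab_y, unlab_x, threshold=0.8, rounds=5, seed=0):
    """Train three classifiers that pseudo-label samples for one another."""
    rng = random.Random(seed)
    train_sets = [bootstrap(lab_x, lab_y, rng) for _ in range(3)]
    models = [NearestCentroid().fit(xs, ys) for xs, ys in train_sets]
    for _ in range(rounds):
        new_models = []
        for i in range(3):
            j, k = [m for m in range(3) if m != i]
            extra_x, extra_y = [], []
            for x in unlab_x:
                yj, cj = models[j].predict_with_confidence(x)
                yk, ck = models[k].predict_with_confidence(x)
                # Assumed constraint: both peers agree and both are confident.
                if yj == yk and min(cj, ck) >= threshold:
                    extra_x.append(x)
                    extra_y.append(yj)
            xs, ys = train_sets[i]
            new_models.append(NearestCentroid().fit(xs + extra_x, ys + extra_y))
        models = new_models  # all three are updated together after each round
    return models


def predict(models, x):
    """Final ensemble decision: majority vote of the three classifiers."""
    votes = Counter(m.predict_with_confidence(x)[0] for m in models)
    return votes.most_common(1)[0][0]


# Made-up 2-D features standing in for tile descriptors.
labeled_x = [(0.0, 0.1), (0.2, 0.0), (3.0, 3.1), (3.2, 2.9)]
labeled_y = ["grade_A", "grade_A", "grade_B", "grade_B"]
unlabeled_x = [(0.1, 0.2), (0.3, 0.1), (2.9, 3.0), (3.1, 3.3), (1.6, 1.5)]
models = constrained_tri_training(labeled_x, labeled_y, unlabeled_x)
print(predict(models, (0.2, 0.2)), predict(models, (3.0, 3.0)))
```

In this toy data the sample `(1.6, 1.5)` lies almost midway between the two clusters, so its confidence stays below the threshold and it is never added to a training set; raising or lowering `threshold` trades the number of pseudo-labeled samples against the risk of label noise.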
