Journal of Nanjing University (Natural Science) ›› 2018, Vol. 54 ›› Issue (1): 116–.


 HSEC: Clustering-based heuristic selective ensemble

 Zheng Lirong1, Hong Zhiling2*   

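  • Publication date: 2018-01-31 Release date: 2018-01-31
  • About the authors: 1. Department of Computer Science, Xiamen University, Xiamen, 361005; 2. Software School, Xiamen University, Xiamen, 361005
  • Funding:
     Supported by: National Natural Science Foundation of China (31200769)
    Received: 2017-12-20
    *Corresponding author, E-mail: hongzl@xmu.edu.cn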

 HSEC: Clustering-based heuristic selective ensemble learning algorithm

 Zheng Lirong1,Hong Zhiling2*   

  • Online:2018-01-31 Published:2018-01-31
  • About author:1.School of Information Science and Engineering,Xiamen University,Xiamen,361005,China;
    2.Software School of Xiamen University,Xiamen,361005,China

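Summary:  A clustering-based heuristic selective ensemble learning algorithm is proposed. Ensemble learning combines multiple weak classifiers to achieve better learning performance than any single classifier, boosting the weak classifiers into one strong classifier. In theory, the more weak classifiers there are, the better the combined model performs, but training time and model complexity grow as the number of weak classifiers increases. Clustering is used to remove similar weak classifiers, which on the one hand effectively reduces model complexity and on the other hand selects weak classifiers with high diversity as the candidate set. A heuristic selective ensemble algorithm then combines the weak classifiers effectively, improving the classification performance of the model. A parallel ensemble strategy is also adopted to make the search for the optimal classifier subset more efficient, which effectively reduces training time. Experimental results show that the algorithm achieves some improvement over traditional methods on several indicators.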

Abstract:  This paper introduces a clustering-based heuristic selective ensemble learning algorithm (HSEC). Ensemble learning obtains better predictive performance by combining multiple weak models. However, as the number of weak models increases, so do the complexity and training time of the model. We use a clustering-based selection method to eliminate similar classifiers and reduce the number of weak models. We then select the best subset of classifiers with a heuristic selective ensemble algorithm. We train the models on multiple processors to address the inefficiency of classifier selection in previous ensemble learning studies, which greatly improves efficiency. The experimental results show that the HSEC algorithm achieves some improvement on several indicators compared with traditional classification algorithms.
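
The two stages described in the abstract (remove similar classifiers by clustering, then heuristically pick a subset to combine) can be illustrated with a small sketch. The abstract does not specify the clustering method, the similarity measure, or the heuristic, so everything below is an assumption made for illustration: a greedy "leader" clustering on pairwise disagreement rate, followed by ordered forward selection scored by majority-vote accuracy on a validation set. The function names, the `threshold` parameter, and the toy prediction vectors are hypothetical, and the parallel evaluation mentioned in the abstract is omitted.

```python
from collections import Counter

def accuracy(pred, y):
    """Fraction of samples on which a prediction vector matches the labels."""
    return sum(p == t for p, t in zip(pred, y)) / len(y)

def disagreement(a, b):
    """Fraction of samples on which two classifiers predict differently."""
    return sum(x != z for x, z in zip(a, b)) / len(a)

def majority_vote(preds):
    """Combine prediction vectors column by column; ties go to the label seen first."""
    return [Counter(col).most_common(1)[0][0] for col in zip(*preds)]

def cluster_representatives(preds, y, threshold):
    """Assumed clustering step: visit classifiers from most to least accurate and
    keep one only if it disagrees with every kept classifier on more than
    `threshold` of the samples, so each group of similar classifiers is
    represented by its most accurate member."""
    order = sorted(range(len(preds)), key=lambda i: accuracy(preds[i], y), reverse=True)
    leaders = []
    for i in order:
        if all(disagreement(preds[i], preds[j]) > threshold for j in leaders):
            leaders.append(i)
    return leaders

def ordered_selection(preds, y, candidates):
    """Assumed heuristic step: repeatedly add the candidate that gives the best
    majority-vote accuracy, then return the best-scoring prefix of that order."""
    chosen, remaining = [], list(candidates)
    best_size, best_score = 0, -1.0
    while remaining:
        score, pick = max(
            ((accuracy(majority_vote([preds[k] for k in chosen + [c]]), y), c)
             for c in remaining),
            key=lambda pair: pair[0],
        )
        chosen.append(pick)
        remaining.remove(pick)
        if score > best_score:
            best_size, best_score = len(chosen), score
    return chosen[:best_size], best_score

# Toy validation labels and weak-classifier predictions (hypothetical data).
y = [0, 1, 0, 1, 1, 0, 1, 0]
preds = [
    [0, 1, 0, 1, 1, 0, 0, 1],  # classifier 0: wrong on the last two samples
    [0, 1, 0, 1, 1, 0, 0, 1],  # classifier 1: exact duplicate of classifier 0
    [1, 0, 0, 1, 1, 0, 1, 0],  # classifier 2: wrong on the first two samples
    [0, 1, 1, 0, 1, 0, 1, 0],  # classifier 3: wrong on samples 2 and 3
    [1, 0, 1, 0, 1, 0, 0, 1],  # classifier 4: wrong on six samples
]

leaders = cluster_representatives(preds, y, threshold=0.1)
chosen, best = ordered_selection(preds, y, leaders)
print(leaders, chosen, best)
```

In this toy run the duplicate classifier is dropped in the clustering step, and the selection step keeps three classifiers whose errors fall on different samples while leaving out the weak fourth candidate. Returning the best prefix, rather than stopping at the first step that fails to improve, is a deliberate choice in this sketch: with hard majority voting a two-member ensemble ties on every disagreement, so a stop-on-no-improvement rule would halt after one classifier.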
