Journal of Nanjing University (Natural Science), 2011, Vol. 47, Issue 4: 391–397.


 Granular computing based concept drift features selection for business data streams*

 Ju Chun-Hua, Shuai Zhao-Qian**, Feng Yi

  • Online: 2015-04-14 Published: 2015-04-14
  • About the authors: (School of Computer Science and Information Engineering, Research Center of Modern Business, Zhejiang Gongshang University, Hangzhou 310018, China)
  • Funding: National Natural Science Foundation of China (71071141, 60905026), Natural Science Foundation of Zhejiang Province (ZL091224, Y1091164), Zhejiang Province Graduate Innovative Research Project

Abstract:  Business data streams are dynamic and prone to drift, which makes concept drift feature selection an important task in data stream mining. Starting from the characteristics of data streams and of concept drift, this paper proposes a workflow for the formal concept analysis of data streams and a formal concept description model of streaming data built on granular computing. Concept drift in a business data stream is in fact determined by changes in the extension of its concepts. The similarity between concepts is described by a concept coincidence degree, which comprises an extent coincidence degree and an intent coincidence degree. Because a granulated data stream can be represented by concept lattices, the features of the stream are analyzed through the relaxation coincidence degree of concept lattice pairs rather than on the continuous, non-formal streaming data itself. Combining concept coincidence analysis with the changing characteristics of data streams, the paper presents a relaxation-matching coincidence degree algorithm based on concept lattice pairs and uses it to select the drift features of a data stream. Finally, experiments and analyses on examples are presented to explain and evaluate the feature selection method and to demonstrate its effectiveness.
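
The abstract does not give the formulas behind the coincidence degrees, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes that a concept is an (extent, intent) pair of sets, that each coincidence degree is a Jaccard overlap, that the concept coincidence degree is a weighted mix of the two, and that "relaxation matching" means pairing each concept with its best partial match in the other lattice. All names (`Concept`, `jaccard`, `concept_coincidence`, `lattice_pair_coincidence`, `drift_features`), the weight `w_extent`, and the `threshold` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    extent: frozenset  # objects covered by the concept (e.g. transaction ids)
    intent: frozenset  # attributes shared by those objects


def jaccard(a, b):
    """Set overlap in [0, 1]; two empty sets are treated as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def concept_coincidence(c1, c2, w_extent=0.5):
    """Assumed form: weighted mix of extent overlap and intent overlap."""
    return (w_extent * jaccard(c1.extent, c2.extent)
            + (1 - w_extent) * jaccard(c1.intent, c2.intent))


def lattice_pair_coincidence(old, new):
    """Assumed relaxed matching: score each old concept against its best
    partial match in the new lattice, then average the scores."""
    if not old or not new:
        return 0.0
    return sum(max(concept_coincidence(c, d) for d in new) for c in old) / len(old)


def drift_features(old, new, threshold=0.5):
    """Attributes that differ between an old concept and its best match
    whenever that match scores below the (assumed) threshold."""
    drifting = set()
    if not new:
        return drifting
    for c in old:
        best = max(new, key=lambda d: concept_coincidence(c, d))
        if concept_coincidence(c, best) < threshold:
            drifting |= c.intent ^ best.intent
    return drifting


# Two toy "lattices" (flat lists of concepts) from consecutive stream windows.
old = [Concept(frozenset({1, 2, 3}), frozenset({"discount"})),
       Concept(frozenset({4, 5}), frozenset({"online", "card"}))]
new = [Concept(frozenset({1, 2, 3}), frozenset({"discount"})),
       Concept(frozenset({6, 7}), frozenset({"online", "cash"}))]

print(lattice_pair_coincidence(old, new))
print(sorted(drift_features(old, new)))
```

In this toy data the first concept is unchanged between windows, while the second keeps only one attribute in common with its best match, so the attributes on which the two differ are the ones reported as drifting. A real concept lattice also carries an order relation between concepts, which this flat-list sketch ignores.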
