Journal of Nanjing University (Natural Science), 2017, Vol. 53, Issue (6): 1063–.


Mining high-voting decision rules based on neighborhood granulation in multiple data sources

Chen Huihuang1,2, Lin Yaojin1,2*, Lin Guoping3, Tang Li1,2

  • Online: 2017-11-27  Published: 2017-11-27
  • Received: 2017-10-12
  • Author affiliations: 1. School of Computer Science, Minnan Normal University, Zhangzhou, 363000, China;
    2. Key Laboratory of Data Science and Intelligence Application, Fujian Province University, Zhangzhou, 363000, China;
    3. School of Mathematics and Statistics, Minnan Normal University, Zhangzhou, 363000, China
  • Funding: National Natural Science Foundation of China (61672272, 61303131, 61603173); Program for New Century Excellent Talents in Fujian Province University
  • *Corresponding author, E-mail: yjlin@mnnu.edu.cn

Abstract: High-voting decision rule mining over multiple data sources is the task of mining decision rules that hold in most of the data sources and are significant. Such rules can guide real-world applications such as the marketing of bank financial products, marketing management, and disease diagnosis. The purpose of this work is therefore to discover non-trivial, interesting, and interpretable high-voting decision rules from multiple data sources. First, a formal representation of a decision rule is constructed from the neighborhood granulation of samples. Then, several metrics for decision rules are defined, such as cover degree and vote rating; these metrics reflect the significance and interestingness of a decision rule from different views. Finally, the concept of a high-voting decision rule is defined in terms of these metrics. Experimental results demonstrate the effectiveness of the proposed algorithm for mining high-voting decision rules from multi-source decision information systems.
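The abstract does not state the formal definitions, so the following Python sketch only illustrates the general idea under stated assumptions and should not be read as the authors' algorithm. It assumes that a rule is "samples within distance delta of a center point have a given label", that cover degree is the fraction of a source's samples falling inside that neighborhood, that a confidence threshold is also applied, and that a source votes for a rule when both thresholds are met. The names `rule_metrics`, `vote_count`, `min_cover`, and `min_conf` are hypothetical.

```python
import numpy as np

def rule_metrics(X, y, center, label, delta):
    """Assumed metrics for the rule 'within delta of center -> label' in one source."""
    # Neighborhood granule: samples whose Euclidean distance to center is at most delta.
    near = np.linalg.norm(X - center, axis=1) <= delta
    if not near.any():
        return 0.0, 0.0
    cover = float(near.mean())                      # share of samples covered by the granule
    confidence = float(np.mean(y[near] == label))   # share of covered samples with the label
    return cover, confidence

def vote_count(sources, center, label, delta, min_cover, min_conf):
    """Number of sources in which the rule meets both (assumed) thresholds."""
    votes = 0
    for X, y in sources:
        cover, confidence = rule_metrics(X, y, center, label, delta)
        if cover >= min_cover and confidence >= min_conf:
            votes += 1
    return votes

# Three toy data sources, each with one numeric attribute and a binary decision.
sources = [
    (np.array([[0.0], [0.1], [0.2], [1.0]]), np.array([1, 1, 1, 0])),
    (np.array([[0.05], [0.1], [1.0], [1.1]]), np.array([1, 1, 0, 0])),
    (np.array([[0.0], [0.1], [0.2], [0.15]]), np.array([0, 0, 1, 0])),
]
center = np.array([0.1])
votes = vote_count(sources, center, label=1, delta=0.15, min_cover=0.5, min_conf=0.8)
# Treat the rule as high-voting if most sources vote for it (assumed majority criterion).
is_high_voting = votes / len(sources) > 0.5
```

In this toy setup the third source covers the neighborhood but mostly disagrees on the label, so it does not vote for the rule. The thresholds and the majority criterion are illustrative choices, not values taken from the paper.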
