南京大学学报(自然科学版) ›› 2020, Vol. 56 ›› Issue (4): 469–479.doi: 10.13232/j.cnki.jnju.2020.04.005

• • 上一篇    下一篇

属性组序下基于代价敏感的约简方法

刘鑫,胡军(),张清华   

  1. 计算智能重庆市重点实验室,重庆邮电大学,重庆,400065
  • 收稿日期:2020-06-20 出版日期:2020-07-30 发布日期:2020-08-06
  • 通讯作者: 胡军 E-mail:hujun@cqupt.edu.cn
  • 基金资助:
    国家自然科学基金(61876201);重庆市人工智能创新重大主题专项(cstc2017rgzn?zdyfX0040)

Attribute reduction based on cost sensitive under attribute group order

Xin Liu,Jun Hu(),Qinghua Zhang   

  1. Chongqing Key Laboratory of Computational Intelligence,Chongqing University of Posts and Telecommunications,Chongqing,400065,China
  • Received:2020-06-20 Online:2020-07-30 Published:2020-08-06
  • Contact: Jun Hu E-mail:hujun@cqupt.edu.cn

摘要:

属性约简是粗糙集理论中的重要问题.为了满足用户对属性的偏好,人们研究了属性序下的属性约简,然而对一些问题却很难给出完整的属性序.针对该问题,比较分析了属性组序下的约简子集的优劣,并提出代价敏感下的属性组序约简的算法.该算法通过属性组序的特点考虑用户偏好并结合属性代价以及属性重要度加权的方式选择局部属性,可以得到更符合用户偏好的约简.理论分析和实验结果验证了该算法的可行性和有效性,并且能在一般情形下找到满足用户偏好的约简.

关键词: 属性约简, 用户偏好, 属性组序, 代价敏感

Abstract:

Attribute reduction is an important issue in rough set theory. In order to satisfy users' preferences for attributes,people have studied attribute reduction under attribute order. In some problems,however,it is difficult to give a complete attribute order. Aiming at this problem,this paper compares and analyzes the pros and cons of reduction subsets under attribute group order,and an algorithm based on cost sensitive under attribute group order is proposed. This algorithm considers user preferences based on characteristics of the attribute group order,and selects local attributes by combining the attribute cost and attribute importance weight,which can obtain a reduction more in line with users' preferences. The theoretical analysis and experimental results verify the feasibility and effectiveness of the proposed algorithm,and we can find the reduction that satisfies the users' preferences in the general case.

Key words: attribute reduction, users' preference, attribute group order, cost sensitive

中图分类号: 

  • TP18

图1

算法流程图"

表1

决策表"

a1a2a3a4a5D
211111
111211
120121
121112
012112
211322
221122

表2

Lymphography的分组方式"

ID属性组序S=G3>G2>G1
1

a16,a12,a6,a8,a10>a3,a11,a14,a18,

a1,a2>a5,a17,a4,a13,a7,a15,a9

2

a2,a11,a12,a3,a16,a9,a6>a13,a17,

a10,a7,a14,a5>a8,a1,a4,a15,a18

3

a15,a14,a1,a10,a12,a11>a5,a18,a9,

a17,a2>a8,a3,a4,a13,a7,a16,a6

4

a9,a3,a12,a18,a1,a8>a5,a15,a2,

a14,a4,a13>a6,a10,a7,a17,a11,a16

5

a9,a13,a18,a5,a11,a7>a15,a12,a14,

a2,a6,a17>a8,a3,a16,a10,a4,a1

6

a17,a16,a10,a12,a13>a7,a2,a5,a3,

a4,a18>a11,a14,a9,a1,a8,a15,a6

7

a6,a2,a17,a3,a8>a11,a16,a15,a7,

a10,a18,a4>a5,a14,a1,a13,a9,a12

8

a18,a3,a11,a10,a15>a12,a9,a2,a17,

a7,a5,a6>a1,a13,a8,a16,a14,a4

9

a3,a8,a4,a11,a10,a18,a9>a15,a17,

a12,a7>a16,a14,a2,a6,a5,a13,a1

10

a15,a16,a3,a1,a4,a5>a13,a17,a9,

a12,a8>a6,a12,a2,a7,a18,a10,a14

表3

Lymphography数据集在10种不同分组方式下的约简结果"

IDARCSGO(λ=0.5)ARCSGO(λ=0)
ReductCOSTSAPReductCOSTSAP
1a1,a3,a11,a14,a2,a10,a12,a8,a61702.44a14,a18,a3,a1,a10,a12,a16,a181802.5
2a14,a13,a6,a12,a11,a3,a9,a16,a22202.77a14,a13,a2,a12,a11,a6,a3,a16,a92202.77
3a8,a5,a2,a14,a15,a10,a12,a11802.5a8,a5,a2,a14,a1,a12,a10,a151802.5
4a6,a14,a15,a13,a12,a1,a182202.28a6,a13,a2,a15,a14,a18,a122702.14
5a12,a14,a6,a15,a2,a11,a5,a132202.37a14,a15,a12,a2,a18,a13,a112702.42
6a8,a6,a2,a18,a12,a10,a16,a132402.25a14,a2,a18,a13,a10,a17,a162602.42
7a5,a10,a15,a16,a18,a6,a8,a17,a22302.33a5,a18,a15,a10,a16,a2,a8,a6,a172302.33
8a8,a12,a17,a6,a7,a5,a2,a10,a15,a182302.2a13,a2,a6,a17,a18,a10,a15,a112502.37
9a6,a1,a12,a17,a7,a15,a10,a8,a3,a11,a181902.27a14,a15,a12,a18,a3,a8,a10,a111902.5
10a10,a6,a12,a17,a8,a13,a1,a5,a16,a151702.22a14,a6,a13,a17,a1,a5,a15,a161802.25

表4

Lung?Cancer数据集在10种不同分组方式下的约简结果"

IDARCSGO(λ=0.5)ARCSGO(λ=0)
ReductCOSTSAPReductCOSTSAP
1b33,b35,b32,b23,b7,b36,b14803b14,b12,b23,b36,b33,b71003
2b41,b53,b19,b3,b121003b40,b6,b12,b3,b341903
3b37,b35,b29,b7,b20,b4,b53703b37,b35,b29,b7,b20,b4,b53703
4b33,b10,b8,b28,b29,b32603b28,b8,b10,b25,b32,b5,b38703
5b23,b26,b14,b54,b13603b43,b6,b54,b4,b26903
6b2,b7,b20,b34,b35503b43,b2,b53,b3,b13,b7803
7b37,b35,b13,b3,b33,b2,b49903b40,b35,b2,b49,b3,b241303
8b37,b51,b16,b34,b7,b33,b38703b37,b51,b6,b34,b7903
9b37,b16,b29,b3,b53,b24,b18903b40,b6,b20,b2,b24,b291503
10b41,b56,b13,b34,b3703b41,b3,b14,b6,b191203

表5

Dermatology数据集在10种不同分组方式下的约简结果"

IDARCSGO(λ=0.5)ARCSGO(λ=0)
ReductCOSTSAPReductCOSTSAP
1c21,c5,c28,c32,c17,c18,c20,c31,c8,c1,c41502.9c22,c8,c5,c31,c28,c2,c4,c17,c32,c1,c182102.9
2c3,c5,c9,c14,c20,c29,c17,c16,c281102.11c4,c16,c9,c8,c19,c32,c28,c18,c13,c101502.7
3c5,c29,c15,c10,c16,c32,c7,c18,c2,c1,c41702.9c5,c22,c29,c15,c4,c7,c32,c16,c1,c182202.9
4c4,c19,c17,c2,c6,c24,c10,c341502.87c4,c34,c19,c2,c6,c22,c17,c242102.87
5c32,c21,c9,c14,c26,c5,c3,c18,c8,c41302.8c34,c22,c4,c3,c14,c5,c62002.85
6c26,c28,c14,c19,c18,c1,c33,c341303c25,c34,c19,c28,c1,c18,c14,c261303
7c34,c5,c19,c3,c32,c18,c10,c71202.62c34,c5,c4,c32,c3,c101302.5
8c2,c17,c21,c18,c16,c33,c5,c3,c28,c91202.5c34,c22,c4,c3,c28,c33,c52002.71
9c19,c17,c28,c34,c18,c33,c41402.57c16,c19,c34,c4,c18,c10,c291602.71
10c5,c21,c17,c25,c19,c9,c3,c18,c7,c161202.7c34,c28,c3,c16,c19,c18,c11,c71402.75

表6

BCW数据集在10种不同分组方式下的约简结果"

IDARCSGO(λ=0.5)ARCSGO(λ=0)
ReductCOSTSAPReductCOSTSAP
1d6,d8,d1,d51102.25d6,d5,d8,d7,d91102.6
2d6,d3,d8,d41102.5d6,d3,d1,d81202.5
3d6,d8,d4,d31102d6,d3,d4,d7,d91202.4
4d6,d7,d2,d9,d51102.2d6,d7,d2,d5,d41102.2
5d6,d2,d7,d4,d51102.2d6,d3,d7,d51302
6d6,d7,d9,d8,d51102.2d6,d3,d4,d81102.25
7d6,d7,d9,d8,d51102.2d6,d1,d5,d81102.25
8d6,d7,d1,d4902.5d6,d7,d1,d4902.5
9d6,d2,d7,d1902.25d6,d5,d1,d81102.25
10d6,d2,d7,d1902d6,d3,d1,d81202.25

表7

Connect?4数据集在10种不同分组方式下的约简结果"

IDARCSGO(λ=0.5)ARCSGO(λ=0)
ReductCOSTSAPReductCOSTSAP
1

e26,e32,e14,e7,e8,e21,e25,e1,

e2,e31,e22,e28,e37

6701.76

e14,e21,e15,e26,e1,e31,e25,e2,

e38,e22,e37,e13,e24,e16

7602.07
2

e14,e1,e17,e7,e21,e15,e26,e22,e16,

e3,e31,e8,e38,e28,e25,e32,e2

7201.94

e14,e21,e1,e37,e15,e22,e31,e8,

e26,e3,e25,e32,e13,e2

8501.92
3

e16,e2,e31,e22,e21,e38,e32,e7,

e8,e37,e1,e25,e15

7002

e31,e2,e26,e22,e21,e37,e8,e7,

e32,e38,e1,e15,e25

7002
4

e22,e1,e17,e8,e15,e25,e38,e16,e28,

e14,e7,e26,e31,e3,e32,e2,e21

7202.05

e1,e37,e13,e15,e25,e14,e7,e16,

e21,e31,e23,e2,e32

9402.07
5

e38,e17,e7,e21,e1,e25,e16,e32,e2,

e8,e22,e26,e3,e28,e14,e31,e15

7202.17

e37,e21,e1,e8,e25,e2,e13,e32,

e22,e15,e31,e26,e3,e14

8502.28
6

e39,e32,e1,e7,e25,e31,e3,e16,e27,

e17,e2,e8,e22,e14,e15,e37,e13

8702.05

e1,e21,e26,e38,e31,e25,e2,e23,

e8,e14,e15,e37,e22,e13

9602.07
7

e22,e38,e32,e25,e31,e1,e21,e7,

e8,e16,e2,e37,e15

7002.07

e22,e38,e14,e21,e31,e1,e25,e26,

e7,e37,e23,e15,e2,e16

8102.14
8

e25,e28,e32,e14,e8,e22,e38,e3,e2,

e17,e15,e31,e1,e26,e16,e7,e21

7202.05

e37,e14,e25,e23,e15,e2,e22,e38,

e31,e21,e7,e1,e16,e26

8102.21
9

e22,e16,e1,e26,e38,e24,e14,e15,

e37,e25,e31,e9,e2,e7,e21

7002.2

e1,e22,e16,e37,e15,e14,e26,e38,

e21,e7,e31,e23,e2,e25

8102.21
10e38,e1,e32,e2,e17,e14,e8,e22,e25,e31,e15,e26,e16,e3,e28,e7,e217201.94

e14,e37,e1,e2,e38,e31,e25,e15,e13,

e21,e7,e26,e16,e9,e28,e33

8402.12

图2

Lymphography数据集的SAP值"

图3

Lung?Cancer数据集的SAP值"

图4

Dermatology数据集的SAP值"

图5

BCW数据集的SAP值"

图6

Connect?4数据集的SAP值"

图7

Lymphography数据集在不同组数下的代价"

图8

Lung?Cancer数据集在不同组数下的代价"

图9

Dermatology数据集在不同组数下的代价"

图10

BCW数据集在不同组数下的代价"

图11

Connect?4数据集在不同组数下的代价"

1 Pawlak Z. Rough sets. International Journal of Computer and Information Sciences,1982,11(5):341-356.
2 王国胤,姚一豫,于洪. 粗糙集理论与应用研究综述. 计算机学报,2009,32(7):1229-1246.
Wang G Y,Yao Y Y,Yu H. A survey on rough set theory and applications. Chinese Journal of Computers,2009,32(7):1229-1246.
3 于洪,王国胤,姚一豫. 决策粗糙集理论研究现状与展望. 计算机学报,2015,38(8):1628-1639.
Yu H,Wang G Y,Yao Y Y. Current research and future perspectives on decision:theoretic rough sets. Chinese Journal of Computers,2015,38(8):1628-1639.
4 陈昊,杨俊安,庄镇泉. 变精度粗糙集的属性核和最小属性约简算法. 计算机学报,2012,35(5):1011-1017.
Chen H,Yang J A,Zhuang Z Q. The core of attributes and minimal attributes reduction in variable precision rough set. Chinese Journal of Computers,2012,35(5):1011-1017.
5 Yao Y Y,Lin T Y. Generalization of rough sets using modal logics. Intelligent Automation & Soft Computing,1996,2(2):103-119.
6 Yao Y Y. Decision?theoretic rough set models∥Proceedings of 2nd International Conference on Rough Sets and Knowledge Technology. Springer Berlin Heidelberg,2007.
7 杨传健,葛浩,汪志圣. 基于粗糙集的属性约简方法研究综述. 计算机应用研究,2012,29(1):16-20.
Yang C J,Ge H,Wang Z S. Overview of attribute reduction based on rough set. Application Research of Computers,2012,29(1):16-20.
8 Qian Y H,Liang J Y,Pedrycz W,et al. Positive approximation:an accelerator for attribute reduction in rough set theory. Artificial Intelligence,2010,174(9-10):597-618.
9 Lazo?Cortés M S,Martínez?Trinidad J F,Carrasco?Ochoa J A,et al. A new algorithm for computing reducts based on the binary discernibility matrix. Intelligent Data Analysis,2016,20(2):317-337.
10 Gao C,Lai Z H,Zhou J,et al. Maximum decision entropy?based attribute reduction in decision?theoretic rough set model. Knowledge?Based Systems,2018,143:179-191.
11 Wang Y B,Chen X J,Dong K. Attribute reduction via local conditional entropy. International Journal of Machine Learning and Cybernetics,2019,10(12):3619-3634.
12 Min F,He H P,Qian Y H,et al. Test?cost?sensitive attribute reduction. Information Sciences,2011,181(22):4928-4942.
13 Fang Y,Min F. Cost?sensitive approximate attribute reduction with three?way decisions. International Journal of Approximate Reasoning,2019,104:148-165.
14 Ma X A,Zhao X R. Cost?sensitive three?way class?specific attribute reduction. International Journal of Approximate Reasoning,2019,105:153-174.
15 Jia X Y,Liao W H,Tang Z M,et al. Minimum cost attribute reduction in decision?theoretic rough set models. Information Sciences,2013,219:151-167.
16 Wang J,Wang J. Reduction algorithms based on discernibility matrix:the ordered attributes method. Journal of Computer Science and Technology,2001,16(6):489-504.
17 Zhao K,Wang J. A reduction algorithm meeting users' requirements. Journal of Computer Science and Technology,2002,17(5):578-593.
18 Yao Y Y,Zhao Y,Wang J,et al. A model of machine learning based on user preference of attributes∥International Conference on Rough Sets and Current Trends in Computing. Springer Berlin Heidelberg,2006:587-596.
19 Yao Y Y,Zhao Y,Wang J,et al. A model of user?oriented reduct construction for machine learning. Transactions on Rough Sets VIII. Springer Berlin Heidelberg,2008:332-351.
20 Han S Q,Wang J. Reduct and attribute order. Journal of Computer Science and Technology,2004,19(4):429-449.
21 官礼和,王国胤,胡峰. 一种基于属性序的决策规则挖掘算法. 控制与决策,2012,27(2):313-316.
Guan L H,Wang G Y,Hu F. A decision rules mining algorithm based on attribute order. Control and Decision,2012,27(2):313-316.
22 韩素青,阴桂梅. 一种面向用户需求的属性约简算法. 模式识别与人工智能,2014,27(3):281-288.
Han S Q,Yin G M. An user?oriented attribute reduct construction algorithm. Pattern Recognition and Artificial Intelligence,2014,27(3):281-288.
23 胡峰,王国胤. 属性序下的快速约简算法. 计算机学报,2007,30(8):1429-1435.
Hu F,Wang G Y. Quick reduction algorithm based on attribute order. Chinese Journal of Computers,2007,30(8):1429-1435.
24 王国胤. Rough集理论与知识获取. 西安:西安交通大学出版社,2001:23-26,133-136.
25 Zhang Q H,Shen W. Research on attribute reduction algorithm with weights. Journal of Intelligent & Fuzzy Systems,2014,27(2):1011-1019.
26 Hu K Y,Lu Y C,Shi C Y. Advances in rough set theory and its applicatinons. Journal of Tsinghua University (Science and Technology),2001,41(1):64-68.
[1] 汪敏,赵飞,闵帆. 储层预测的代价敏感主动学习算法[J]. 南京大学学报(自然科学版), 2020, 56(4): 561-569.
[2] 张龙波, 李智远, 杨习贝, 王怡博. 决策代价约简求解中的交叉验证策略[J]. 南京大学学报(自然科学版), 2019, 55(4): 601-608.
[3] 程永林, 李德玉, 王素格. 基于极大相容块的邻域粗糙集模型[J]. 南京大学学报(自然科学版), 2019, 55(4): 529-536.
[4] 陶玉枝1,2,赵仕梅1,2,谭安辉1,2*. 一种基于决策表约简的集覆盖问题的近似解法[J]. 南京大学学报(自然科学版), 2018, 54(4): 821-.
[5]  方 宇1,闵 帆1*,刘忠慧1,杨 新2.  序贯三支决策的代价敏感分类方法[J]. 南京大学学报(自然科学版), 2018, 54(1): 148-.
[6] 李俊余1,2,王 霞1,2*,刘庆凤3. 属性定向概念格的协调近似表示空间[J]. 南京大学学报(自然科学版), 2017, 53(2): 333-.
[7] 黄伟婷1*,赵 红2. 基于误差数据的最小代价属性选择分治算法[J]. 南京大学学报(自然科学版), 2016, 52(5): 890-.
[8] 施玉杰1*,杨宏志2,徐久成3. α-先验概率优势关系下的粗糙集模型研究[J]. 南京大学学报(自然科学版), 2016, 52(5): 899-.
[9] 梁新彦1,2,钱宇华1,2*,郭 倩2,成红红1,2. 面向多标记学习的局部粗糙集[J]. 南京大学学报(自然科学版), 2016, 52(2): 270-.
[10] 刘莹莹1,吕跃进2*. 基于相似度的集值信息系统属性约简算法基于相似度的集值信息系统属性约简算法[J]. 南京大学学报(自然科学版), 2015, 51(2): 384-389.
[11] 张燕平1,2, 邹慧锦1,2,赵姝1,2. 基于CCA的代价敏感三支决策模型[J]. 南京大学学报(自然科学版), 2015, 51(2): 447-452.
[12] 贾洪杰1,2丁世飞1,2. 基于邻域粗糙集约减的谱聚类算法[J]. 南京大学学报(自然科学版), 2013, 49(5): 619-627.
[13]  于洪**,姚园,赵军
.  一种有效的基于风险最小化的属性约简算法*[J]. 南京大学学报(自然科学版), 2013, 49(2): 133-141.
[14]  陈玉明**,吴克寿,孙金华.  基于幂树的决策表最小属性约简*
[J]. 南京大学学报(自然科学版), 2012, 48(2): 164-171.
[15]  赵荣泳 1 , 李翠玲 2 ** , 高晓康 3 , 王昭云 4
.  电子镇流器故障诊断的变精度粗糙集模型*
[J]. 南京大学学报(自然科学版), 2010, 46(5): 494-500.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
[1] 缪长健, 施斌, 郑兴, 王湛, 魏广庆. 海上超长PHC管桩BOFDA内力测试[J]. 南京大学学报(自然科学版), 2018, 54(6): 1057 -1063 .
[2] 林 銮,陆武萍,唐朝生,赵红崴,冷 挺,李胜杰. 基于计算机图像处理技术的松散砂性土微观结构定量分析方法[J]. 南京大学学报(自然科学版), 2018, 54(6): 1064 -1074 .
[3] 梅世嘉,施 斌,曹鼎峰,魏广庆,张 岩,郝 瑞. 基于AHFO方法的Green-Ampt模型K0取值试验研究[J]. 南京大学学报(自然科学版), 2018, 54(6): 1085 -1094 .
[4] 周星星,张海平,吉根林. 具有时空特性的区域移动模式挖掘算法[J]. 南京大学学报(自然科学版), 2018, 54(6): 1171 -1182 .
[5] 刘 素, 刘惊雷. 基于特征选择的CP-nets结构学习[J]. 南京大学学报(自然科学版), 2019, 55(1): 14 -28 .
[6] 马宏亮, 万建武, 王洪元. 一种嵌入样本流形结构与标记相关性的多标记降维算法[J]. 南京大学学报(自然科学版), 2019, 55(1): 92 -101 .
[7] 杨 薇, 王洪元, 张 继, 张中宝. 一种基于Faster-RCNN的车辆实时检测改进算法[J]. 南京大学学报(自然科学版), 2019, 55(2): 231 -237 .
[8] 徐秀芳, 徐 森, 花小朋, 徐 静, 皋 军, 安 晶. 一种基于t-分布随机近邻嵌入的文本聚类方法[J]. 南京大学学报(自然科学版), 2019, 55(2): 264 -271 .
[9] 李嘉明, 邹 勇, 郑 浩, 魏钟波, 杨柳燕, 缪爱军. 养殖塘生态系统重金属污染状况与风险评价[J]. 南京大学学报(自然科学版), 2019, 55(2): 272 -281 .
[10] 王飞永, 彭建兵, 卢全中, 黄强兵, 孟振江, 乔建伟. 渭河盆地地裂缝同生机制研究[J]. 南京大学学报(自然科学版), 2019, 55(3): 339 -348 .