南京大学学报(自然科学版), 2020, Vol. 56, Issue (1): 41–50. doi: 10.13232/j.cnki.jnju.2020.01.005


基于改进蝗虫优化算法的特征选择方法

刘亮1,2,何庆1,2

  1. 贵州大学大数据与信息工程学院,贵阳,550025
    2. 贵州省公共大数据重点实验室,贵州大学,贵阳,550025
  • 收稿日期:2019-08-20 出版日期:2020-01-30 发布日期:2020-01-10
  • 通讯作者: 何庆 E-mail:qhe@gzu.edu.cn
  • 基金资助:
    贵州省科技计划重大专项(黔科合重大专项字[2018]3002,黔科合重大专项字[2016]3022),贵州省公共大数据重点实验室开放课题(2017BDKFJJ004);贵州省教育厅青年科技人才成长项目(黔科合KY字[2016]124),贵州大学培育项目(黔科合平台人才[2017]5788)

A feature selection method based on an improved grasshopper optimization algorithm

Liang Liu1,2,Qing He1,2

  1. College of Big Data and Information Engineering, Guizhou University, Guiyang, 550025, China
    2. Guizhou Provincial Key Laboratory of Public Big Data, Guizhou University, Guiyang, 550025, China
  • Received:2019-08-20 Online:2020-01-30 Published:2020-01-10
  • Contact: Qing He E-mail:qhe@gzu.edu.cn

摘要:

针对传统蝗虫优化算法寻优精度低和收敛速度慢的问题,提出一种基于非线性调整策略的改进蝗虫优化算法.首先,利用非线性参数代替传统蝗虫算法中的递减系数,协调算法全局探索和局部开发能力,加快算法收敛速度;其次,引入自适应权重系数改变蝗虫位置更新方式,提高算法寻优精度;然后,结合limit阈值思想,利用非线性参数对种群中部分个体进行扰动,避免算法陷入局部最优.六个基准测试函数上的仿真结果表明,改进算法的收敛速度和寻优精度均有明显提高.最后将改进算法应用于特征选择问题,在七个数据集上的实验结果表明,基于改进算法的特征选择方法能够有效地进行特征选择,提高分类准确率.

关键词: 蝗虫优化算法, 非线性参数, 自适应权重, limit阈值, 特征选择

Abstract:

To address the low search precision and slow convergence of the traditional grasshopper optimization algorithm (GOA), an improved grasshopper optimization algorithm (IGOA) based on a non-linear adjustment strategy is proposed. First, a non-linear parameter replaces the decreasing coefficient of the traditional GOA, which balances global exploration against local exploitation and accelerates convergence. Second, an adaptive weight coefficient is introduced into the grasshopper position update rule to improve search precision. Then, to avoid premature convergence, the algorithm combines the limit-threshold idea with the non-linear parameter to perturb part of the individuals in the population. Simulation results on six benchmark functions show that the improved algorithm clearly outperforms the original in both convergence speed and search precision. Finally, the improved algorithm is applied to the feature selection problem. Experimental results on seven datasets show that the feature selection method based on the improved algorithm selects features effectively and improves classification accuracy.

Key words: grasshopper optimization algorithm, non-linear parameter, adaptive weight, limit threshold, feature selection

CLC number: TP301.6

Fig. 1 Flowchart of the IGOA algorithm
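The flowchart itself is not reproduced on this page, so the Python sketch below only illustrates the three modifications summarized in the abstract: a non-linear (here exponential) decay in place of the decreasing coefficient, an adaptive weight applied in the position update, and a limit-style counter that perturbs individuals that stop improving. The decay form, the weight formula, and parameter names such as c_max, c_min and limit are assumptions for illustration, not the paper's exact definitions.

```python
import numpy as np

def igoa_minimize(f, dim, bounds, pop=30, iters=500,
                  c_max=1.0, c_min=1e-5, limit=50, seed=0):
    """Illustrative IGOA-style optimizer (a sketch, not the paper's method).

    Assumptions: the non-linear parameter is an exponential decay, the
    adaptive weight scales each individual's move by its relative fitness,
    and individuals that fail to improve for `limit` iterations are
    re-seeded around the current best solution.
    """
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    X = rng.uniform(lo, hi, (pop, dim))
    fit = np.apply_along_axis(f, 1, X)
    best = X[fit.argmin()].copy()
    best_fit = fit.min()
    stall = np.zeros(pop, dtype=int)           # limit-threshold counters

    def s(r):                                  # social force of the basic GOA
        return 0.5 * np.exp(-r / 1.5) - np.exp(-r)

    for t in range(iters):
        # non-linear (exponential) decay replacing the decreasing coefficient
        c = c_min + (c_max - c_min) * np.exp(-4.0 * t / iters)
        # adaptive weight: worse individuals take larger steps
        w = 0.4 + 0.5 * (fit - fit.min()) / (np.ptp(fit) + 1e-12)
        X_new = np.empty_like(X)
        for i in range(pop):
            social = np.zeros(dim)
            for j in range(pop):
                if j == i:
                    continue
                d = X[j] - X[i]
                r = np.linalg.norm(d) + 1e-12
                social += c * (hi - lo) / 2.0 * s(r) * d / r
            X_new[i] = np.clip(w[i] * c * social + best, lo, hi)
        X = X_new
        new_fit = np.apply_along_axis(f, 1, X)
        stall = np.where(new_fit < fit, 0, stall + 1)
        fit = new_fit
        # limit-threshold idea: perturb stagnant individuals around the best
        for i in np.flatnonzero(stall >= limit):
            X[i] = np.clip(best + c * rng.normal(0.0, (hi - lo) / 10.0, dim), lo, hi)
            fit[i] = f(X[i])
            stall[i] = 0
        if fit.min() < best_fit:
            best_fit = fit.min()
            best = X[fit.argmin()].copy()
    return best, best_fit
```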

Fig. 2 Flowchart of feature selection based on IGOA
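Only the caption of the feature-selection flowchart is available here, so the snippet below is a hedged sketch of the kind of wrapper evaluation such a flowchart implies: a grasshopper position in [0,1] per feature is binarized into a feature mask and scored by a classifier. The 0.5 threshold, the k-nearest-neighbour classifier, the hold-out split and the weight alpha are illustrative assumptions, not the paper's settings; igoa_minimize refers to the sketch under Fig. 1.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

def feature_selection_fitness(X, y, alpha=0.99, seed=0):
    """Return a fitness function for wrapper feature selection.

    Assumptions (illustrative, not from the paper): positions in [0, 1] are
    binarized at 0.5, a 5-NN classifier supplies the accuracy term, and the
    fitness to minimize is alpha*(1 - accuracy) plus a small penalty on the
    fraction of selected features. X and y are NumPy arrays.
    """
    Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=seed)

    def fitness(position):
        mask = position > 0.5                  # binarize the continuous position
        if not mask.any():                     # keep at least one feature
            return 1.0
        clf = KNeighborsClassifier(n_neighbors=5)
        clf.fit(Xtr[:, mask], ytr)
        acc = clf.score(Xte[:, mask], yte)
        return alpha * (1.0 - acc) + (1.0 - alpha) * mask.mean()

    return fitness

# Usage sketch:
#   fit = feature_selection_fitness(X, y)
#   best_pos, _ = igoa_minimize(fit, dim=X.shape[1], bounds=(0.0, 1.0))
#   selected = np.flatnonzero(best_pos > 0.5)
```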

Table 1 Benchmark functions

Function | Expression | Dim | Search space | Theoretical optimum
Sphere | $F_1=\sum_{i=1}^{Dim}x_i^2$ | 5/30 | [-100, 100] | 0
Schwefel 2.22 | $F_2=\sum_{i=1}^{Dim}\lvert x_i\rvert+\prod_{i=1}^{Dim}\lvert x_i\rvert$ | 5/30 | [-10, 10] | 0
Schwefel 1.2 | $F_3=\sum_{i=1}^{Dim}\bigl(\sum_{j=1}^{i}x_j\bigr)^2$ | 5/30 | [-100, 100] | 0
Schwefel 2.21 | $F_4=\max_i\{\lvert x_i\rvert,\,1\le i\le Dim\}$ | 5/30 | [-100, 100] | 0
Rastrigin | $F_5=\sum_{i=1}^{Dim}\bigl[x_i^2-10\cos(2\pi x_i)+10\bigr]$ | 5/30 | [-5.12, 5.12] | 0
Ackley | $F_6=-20\exp\bigl(-0.2\sqrt{\tfrac{1}{Dim}\sum_{i=1}^{Dim}x_i^2}\bigr)-\exp\bigl(\tfrac{1}{Dim}\sum_{i=1}^{Dim}\cos(2\pi x_i)\bigr)+20+e$ | 5/30 | [-32, 32] | 0
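For convenience, the six benchmarks in Table 1 translate directly into code; the NumPy definitions below follow the table's formulas and can be passed to an optimizer such as the sketch under Fig. 1.

```python
import numpy as np

# Benchmark functions from Table 1; each has a theoretical optimum of 0.
def sphere(x):        return np.sum(x ** 2)                                    # F1
def schwefel_222(x):  return np.sum(np.abs(x)) + np.prod(np.abs(x))            # F2
def schwefel_12(x):   return np.sum(np.cumsum(x) ** 2)                         # F3
def schwefel_221(x):  return np.max(np.abs(x))                                 # F4
def rastrigin(x):     return np.sum(x ** 2 - 10 * np.cos(2 * np.pi * x) + 10)  # F5

def ackley(x):                                                                 # F6
    d = x.size
    return (-20 * np.exp(-0.2 * np.sqrt(np.sum(x ** 2) / d))
            - np.exp(np.sum(np.cos(2 * np.pi * x)) / d) + 20 + np.e)
```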

Table 2 Comparison of optimization performance

Function | Dim | Metric | GOA | Ref. [12] | IGOA
F1 | 5 | Mean | 1.74E-008 | 2.55E-013 | 2.03E-035
F1 | 5 | Std.Dev | 1.97E-008 | 5.68E-013 | 8.82E-036
F1 | 30 | Mean | 3.86E+001 | 6.62E-019 | 1.21E-034
F1 | 30 | Std.Dev | 2.97E+001 | 8.53E-019 | 2.86E-035
F2 | 5 | Mean | 2.36E+000 | 1.49E+000 | 3.85E-019
F2 | 5 | Std.Dev | 2.88E+000 | 2.06E+000 | 5.83E-020
F2 | 30 | Mean | 1.68E+001 | 3.64E-010 | 2.70E-018
F2 | 30 | Std.Dev | 1.91E+001 | 3.63E-010 | 4.66E-019
F3 | 5 | Mean | 8.27E-006 | 7.17E-008 | 4.96E-035
F3 | 5 | Std.Dev | 2.51E-005 | 2.38E-007 | 4.58E-035
F3 | 30 | Mean | 2.60E+003 | 6.78E-016 | 1.02E-033
F3 | 30 | Std.Dev | 1.67E+003 | 9.99E-016 | 1.17E-033
F4 | 5 | Mean | 1.71E-004 | 1.17E-006 | 2.82E-018
F4 | 5 | Std.Dev | 2.71E-004 | 3.92E-006 | 8.14E-019
F4 | 30 | Mean | 1.50E+001 | 1.93E-010 | 3.89E-018
F4 | 30 | Std.Dev | 4.05E+000 | 2.15E-010 | 4.29E-019
F5 | 5 | Mean | 1.11E+001 | 7.85E+000 | 0.00E+000
F5 | 5 | Std.Dev | 7.57E+000 | 5.19E+000 | 0.00E+000
F5 | 30 | Mean | 9.45E+001 | 0.00E+000 | 0.00E+000
F5 | 30 | Std.Dev | 3.30E+001 | 0.00E+000 | 0.00E+000
F6 | 5 | Mean | 1.04E+000 | 7.42E-001 | 8.88E-016
F6 | 5 | Std.Dev | 2.52E+000 | 1.08E+000 | 0.00E+000
F6 | 30 | Mean | 5.50E+000 | 2.06E-010 | 8.88E-016
F6 | 30 | Std.Dev | 1.76E+000 | 2.03E-010 | 0.00E+000

Fig. 3 Convergence curves of IGOA and GOA (Dim = 5)

Fig. 4 Convergence curves of IGOA and GOA (Dim = 30)

Table 3 Experimental datasets

No. | Dataset | Number of features | Number of instances
D1 | BreastCancerEW | 30 | 569
D2 | Zoo | 16 | 101
D3 | Heart | 12 | 270
D4 | Parkinson | 22 | 197
D5 | Congress | 16 | 435
D6 | Wine | 13 | 178
D7 | Colon | 2000 | 62

Table 4 Comparison of feature selection performance on the seven datasets

Dataset | Metric | FULL | GOA-FS | IGOA-FS
D1 | Accuracy | 0.951 | 0.959 | 0.976
D1 | Features | 30 | 11.2 | 13.5
D2 | Accuracy | 0.961 | 0.931 | 0.963
D2 | Features | 16 | 6.6 | 7.1
D3 | Accuracy | 0.763 | 0.768 | 0.801
D3 | Features | 12 | 6.6 | 6.4
D4 | Accuracy | 0.908 | 0.949 | 0.949
D4 | Features | 22 | 8.9 | 8.4
D5 | Accuracy | 0.940 | 0.945 | 0.970
D5 | Features | 16 | 5.5 | 3.3
D6 | Accuracy | 0.944 | 0.951 | 0.960
D6 | Features | 13 | 6.2 | 5.8
D7 | Accuracy | 0.677 | 0.745 | 0.833
D7 | Features | 2000 | 675.2 | 691.9

Table 5 Performance comparison of IGOA-FS with other algorithms (classification accuracy)

Dataset | ALO [15] | CCSA [16] | WOA-CM [14] | IGOA-FS
D1 | 0.930 | 0.903 | 0.971 | 0.976
D2 | 0.909 | 0.937 | 0.980 | 0.963
D3 | 0.826 | 0.788 | 0.807 | 0.801
D4 | 0.908* |  |  | 0.949
D5 | 0.929*, 0.956* |  |  | 0.970
D6 | 0.911*, 0.959* |  |  | 0.960
D7 | 0.909* |  |  | 0.833
* For D4–D7 only part of the comparison results are reported; the starred values are listed in their original order, and which of the three comparison methods each value belongs to is not recoverable from the source table.

Fig. 5 Comparison of the average feature selection rate of IGOA-FS and the other algorithms on the seven datasets

1 李炜,巢秀琴. 改进的粒子群算法优化的特征选择方法. 计算机科学与探索,2019,13(6):990-1004.
Li W,Chao X Q.Improved particle swarm optimization method for feature selection. Journal of Frontiers of Computer Science and Technology,2019,13(6):990-1004.
2 张震,魏鹏,李玉峰等. 改进粒子群联合禁忌搜索的特征选择算法. 通信学报,2018,39(12):60-68.
Zhang Z,Wei P,Li Y F,et al. Feature selection algorithm based on improved particle swarm joint taboo search. Journal on Communications,2018,39(12):60-68.
3 Gao W F,Hu L,Zhang P,et al. Feature selection by integrating two groups of feature evaluation criteria. Expert Systems with Applications,2018,110:11-19.
4 Mafarja M M,Mirjalili S. Hybrid whale optimization algorithm with simulated annealing for feature selection. Neurocomputing,2017,260:302-312.
5 Kennedy J,Eberhart R. Particle swarm optimization∥Proceedings of ICNN'95-International Conference on Neural Networks. Perth,Australia:IEEE,1995:1942-1948.
6 Mirjalili S. The ant lion optimizer. Advances in Engineering Software,2015,83:80-98.
7 Mirjalili S,Lewis A. The whale optimization algorithm. Advances in Engineering Software,2016,95:51-67.
8 Saremi S,Mirjalili S,Lewis A. Grasshopper optimisation algorithm:theory and application. Advances in Engineering Software,2017,105:30-47.
9 Ewees A A,Elaziz M A,Houssein E H. Improved grasshopper optimization algorithm using opposition-based learning. Expert Systems with Applications,2018,112:156-172.
10 Luo J,Chen H L,Zhang Q,et al. An improved grasshopper optimization algorithm with application to financial stress prediction. Applied Mathematical Modelling,2018,64:654-668.
11 Arora S,Anand P. Chaotic grasshopper optimization algorithm for global optimization. Neural Computing and Applications,2019. doi:10.1007/s00521-018-3343-2.
12 李洋州,顾磊. 一种基于曲线自适应和模拟退火的蝗虫优化算法. 计算机应用研究,2019. doi:10.19734/j.issn.1001-3695.2018.07.0580.
Li Y Z,Gu L. Grasshopper optimization algorithm based on curve adaptive and simulated annealing. Application Research of Computers,2019. doi:10.19734/j.issn.1001-3695.2018.07.0580.
13 杨菊蜻,张达敏,何锐亮等. 基于Powell搜索的混沌鸡群优化算法. 微电子学与计算机,2018,35(7):78-82.
Yang J Q,Zhang D M,He R L,et al. A chaotic chicken swarm optimization algorithm based on Powell search. Microelectronics & Computer,2018,35(7):78-82.
14 Mafarja M,Mirjalili S. Whale optimization approaches for wrapper feature selection. Applied Soft Computing,2018,62:441-453.
15 Emary E,Zawbaa H M,Parv B. Feature selection based on antlion optimization algorithm∥2015 3rd World Conference on Complex Systems (WCCS). Marrakech,Morocco:IEEE,2015:1-7.
16 Sayed G I,Hassanien A E,Azar A T. Feature selection via a novel chaotic crow search algorithm. Neural Computing and Applications,2019,31(1):171-188.
17 Yu L,Liu H. Feature selection for high-dimensional data:a fast correlation-based filter solution∥Proceedings of the 20th International Conference on Machine Learning. Washington DC,USA:AAAI Press,2003:856-863.