Journal of Nanjing University (Natural Science) ›› 2020, Vol. 56 ›› Issue (4): 561-569. doi: 10.13232/j.cnki.jnju.2020.04.014
Min Wang, Fei Zhao, Fan Min
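Abstract:
Traditional reservoir prediction is time-consuming and demands a very high level of expertise from researchers; using artificial intelligence methods for reservoir prediction can effectively improve prediction efficiency. However, environmental and equipment factors leave a large number of attribute values missing in oil and gas well data, which greatly reduces reservoir identification accuracy. To address the classification difficulty caused by missing attribute values, this paper proposes a cost-sensitive Active Learning Algorithm with Unified Evaluation and Dynamic Selection (ALES): (1) it considers the setting and calculation of various costs, including misclassification cost, attribute cost, label cost, and sample cost; (2) it uses softmax regression to evaluate the value of attribute values and labels in a unified way; (3) it proposes an optimal acquisition scheme that combines permutations and combinations with a greedy strategy to select attribute values and labels dynamically. Experiments on three real well-logging datasets, together with a significance analysis, demonstrate the effectiveness of ALES and its superiority over supervised cost-sensitive classification algorithms and missing-value imputation algorithms.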
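The abstract does not give the formulas behind ALES, so the following is only a minimal Python sketch of the general ideas it names, not the authors' method. It shows a softmax over class scores, an expected misclassification cost under a cost matrix, and a greedy choice among candidate acquisitions (attribute values or a label) by benefit-to-cost ratio. All function names, the `(name, benefit, cost)` candidate format, the budget parameter, and the numbers in the usage lines are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def expected_misclassification_cost(probs, cost_matrix):
    """Lowest expected cost over all possible predictions.

    cost_matrix[t][p] is the (assumed) cost of predicting class p
    when the true class is t.
    """
    n = len(probs)
    return min(
        sum(probs[t] * cost_matrix[t][p] for t in range(n))
        for p in range(n)
    )


def greedy_acquire(candidates, budget):
    """Greedily pick acquisitions by benefit/cost ratio within a budget.

    candidates: list of (name, benefit, cost) tuples, where benefit is an
    assumed estimate of the reduction in misclassification cost and cost
    is a positive attribute or label cost (hypothetical format).
    """
    chosen, spent = [], 0.0
    ranked = sorted(candidates, key=lambda c: c[1] / c[2], reverse=True)
    for name, benefit, cost in ranked:
        # Acquire only if it pays for itself and still fits the budget.
        if benefit > cost and spent + cost <= budget:
            chosen.append(name)
            spent += cost
    return chosen, spent


# Hypothetical usage: two missing attribute values and one label.
probs = softmax([2.0, 1.0, 0.5])
risk = expected_misclassification_cost(
    probs, [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
)
picked, spent = greedy_acquire(
    [("attribute_a", 6.0, 2.0), ("attribute_b", 1.0, 2.0), ("label", 8.0, 5.0)],
    budget=7.0,
)
```

A ratio-based greedy rule is one simple way to trade value against cost; the paper's actual acquisition scheme also involves permutations and combinations of candidates, which this sketch does not attempt to reproduce.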