Journal of Nanjing University (Natural Science), 2021, Vol. 57, Issue (1): 90–100. doi: 10.13232/j.cnki.jnju.2021.01.010


Financial time series forecasting method based on industry background differences

Yulian Wen, Peiguang Lin

  1. School of Computer Science and Technology, Shandong University of Finance and Economics, Ji'nan, 250014, China
  • Received: 2020-09-16 Online: 2021-01-21 Published: 2021-01-21
  • Contact: Peiguang Lin E-mail: llpwgh@163.com
  • Funding:
    National Natural Science Foundation of China (61802230)

Abstract:

Stock market forecasting can provide an important basis for investment decisions. However, in the current field of quantitative investment, most researchers forecast the financial time series of stocks within a single industry and ignore the stock characteristics that arise from differences in industry background. In addition, it remains difficult to extract features from stock time series data and stock sentiment indicators effectively, and stock trend predictions are often inaccurate. To address these problems, the WBED (Word2vec-BiLSTM and Encoder-Decoder) hybrid model is proposed to forecast the time series of stocks from different industry backgrounds. First, the method uses the WB model to classify sentiment, compute sentiment values, and obtain a sentiment indicator. Then a dual attention mechanism is introduced. In the Encoder model, a feature attention mechanism assigns different weights to the features of the stock time series data to distinguish their importance. In the Decoder model, a temporal attention mechanism assigns different weights to the hidden states of the LSTM in the Encoder model to distinguish the importance of information at different time steps. Finally, the stock time series data and the sentiment indicator are used together to make the stock prediction. In addition, because stocks from different industry backgrounds may differ in their sensitivity to the model's hyperparameters, suitable hyperparameters are selected for the stocks of each industry to improve prediction performance. With reference to the "2019 China Top 500 Listed Companies" list, nine listed companies from three industries are selected as research objects, and four comparison models and four evaluation metrics are used for experimental analysis. The experimental results show that the proposed hybrid model has certain advantages in financial time series forecasting under differing industry backgrounds: it attains the lowest MAE, MAPE, MSE, and RMSE among the five models for all nine stocks (Tables 2 to 4).

Key words: stock prediction, attention mechanism, sentiment analysis, deep learning
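The dual attention idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, and the attention scores are passed in as plain numbers, whereas in a trained model they would be produced by learned layers. The sketch only shows how softmax weights rescale input features (Encoder side) and combine hidden states across time steps (Decoder side).

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax: non-negative weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def feature_attention(x_t: List[float], feature_scores: List[float]) -> List[float]:
    """Encoder side: rescale each input feature at one time step by its weight."""
    weights = softmax(feature_scores)
    return [w * x for w, x in zip(weights, x_t)]


def temporal_attention(hidden_states: List[List[float]],
                       time_scores: List[float]) -> List[float]:
    """Decoder side: context vector as a weighted sum of hidden states over time."""
    weights = softmax(time_scores)
    dim = len(hidden_states[0])
    return [sum(w * h[d] for w, h in zip(weights, hidden_states))
            for d in range(dim)]


# Hypothetical inputs: equal scores give uniform weights.
weighted_features = feature_attention([2.0, 4.0], [0.0, 0.0])
context = temporal_attention([[1.0, 0.0], [3.0, 2.0]], [0.0, 0.0])
print(weighted_features, context)
```

With equal scores every feature or time step receives the same weight; larger scores shift weight toward the corresponding feature or time step.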

CLC number:

  • O211.61

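Figure 1. Structure of the WBED hybrid model

Figure 2. Sentiment classification module

Figure 3. Stock prediction module

Figure 4. Structure of an LSTM unit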

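Table 1. Statistics of the stock datasets

Industry                          Stock name             Train  Test
Financial Industry                ICBC                   1076   269
Financial Industry                CCB                    1076   269
Financial Industry                ABC                    1076   269
Communications Services Industry  China Mobile           1076   269
Communications Services Industry  China Unicom           1076   269
Communications Services Industry  China Telecom          1076   269
Consumer Goods Industry           Yili                   1076   269
Consumer Goods Industry           Qingdao Beer           1076   269
Consumer Goods Industry           Shuanghui Development  1076   269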
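Table 1 lists 1076 training and 269 test samples per stock, which is consistent with an 80/20 split of 1345 samples. The splitting rule itself is not stated here, so the following is only a sketch under the assumption that the split is chronological (earlier samples for training, later ones for testing). The function name and the placeholder series are hypothetical.

```python
from typing import List, Sequence, Tuple


def chronological_split(series: Sequence[float],
                        train_size: int) -> Tuple[List[float], List[float]]:
    """Split a time-ordered series without shuffling: earliest samples train, latest test."""
    if not 0 < train_size < len(series):
        raise ValueError("train_size must be between 1 and len(series) - 1")
    return list(series[:train_size]), list(series[train_size:])


# Placeholder values standing in for 1345 daily observations (not real stock data).
prices = [float(i) for i in range(1345)]
train, test = chronological_split(prices, train_size=1076)
print(len(train), len(test))  # 1076 269
```

Keeping the order intact avoids leaking future observations into the training set, which matters for any time series forecast.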

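Figure 5. Batch size for the financial industry

Figure 6. Sliding window size for the financial industry

Figure 7. Batch size for the communications services industry

Figure 8. Sliding window size for the communications services industry

Figure 9. Batch size for the consumer goods industry

Figure 10. Sliding window size for the consumer goods industry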
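The two hyperparameters tuned per industry, sliding window size and batch size, can be made concrete with a small sketch. It assumes a one-step-ahead target (each window predicts the next value), which this page does not confirm; the helper names are hypothetical.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[List[float], float]


def make_windows(series: Sequence[float], window: int) -> List[Sample]:
    """Turn a series into (input window, next value) pairs."""
    if window < 1 or window >= len(series):
        raise ValueError("window must be between 1 and len(series) - 1")
    return [(list(series[i:i + window]), series[i + window])
            for i in range(len(series) - window)]


def make_batches(samples: List[Sample], batch_size: int) -> List[List[Sample]]:
    """Group consecutive samples into batches; the last batch may be smaller."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    return [samples[i:i + batch_size]
            for i in range(0, len(samples), batch_size)]


samples = make_windows([1.0, 2.0, 3.0, 4.0, 5.0], window=3)
batches = make_batches(samples, batch_size=2)
print(samples)       # [([1.0, 2.0, 3.0], 4.0), ([2.0, 3.0, 4.0], 5.0)]
print(len(batches))  # 1
```

A larger window gives the model more history per sample but yields fewer samples, which is one reason the best value can differ between industries.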

Table 2. Comparison of experimental results of each model in the financial industry

Stock  Metric  RNN       LSTM      Sen-LSTM  Encoder-Decoder  WBED
ICBC   MAE     0.285014  0.250135  0.224366  0.157262         0.097654
ICBC   MAPE    0.050134  0.043967  0.040291  0.028210         0.017394
ICBC   MSE     0.090665  0.072300  0.057489  0.030707         0.015640
ICBC   RMSE    0.301106  0.268887  0.239769  0.175236         0.125061
CCB    MAE     0.248625  0.242767  0.147669  0.146527         0.124838
CCB    MAPE    0.034389  0.033575  0.021003  0.020818         0.017826
CCB    MSE     0.082119  0.076877  0.032594  0.032590         0.024452
CCB    RMSE    0.286565  0.277267  0.180539  0.180527         0.156372
ABC    MAE     0.070234  0.068528  0.063193  0.043974         0.036610
ABC    MAPE    0.019553  0.019045  0.017485  0.012107         0.010084
ABC    MSE     0.006858  0.006498  0.005413  0.003054         0.002211
ABC    RMSE    0.082816  0.080613  0.073576  0.055268         0.047021
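The four evaluation metrics are not defined on this page, so the sketch below uses their standard definitions. Reporting MAPE as a fraction rather than a percentage is an assumption, made because the tabulated MAPE values (for example 0.017394) are well below 1. The function names are illustrative.

```python
import math
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error as a fraction; y_true must not contain 0."""
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error, the square root of MSE."""
    return math.sqrt(mse(y_true, y_pred))


# Toy values for illustration only (not taken from the paper's data).
actual = [1.0, 2.0, 4.0]
predicted = [1.0, 3.0, 2.0]
print(mae(actual, predicted), mape(actual, predicted),
      mse(actual, predicted), rmse(actual, predicted))
```

Because RMSE is the square root of MSE, the two rows of each stock can be cross-checked: for ICBC under WBED, the square root of 0.015640 is about 0.12506, matching the reported RMSE of 0.125061.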

Table 3. Comparison of experimental results of each model in the communications services industry

Stock          Metric  RNN       LSTM      Sen-LSTM  Encoder-Decoder  WBED
China Mobile   MAE     1.561025  1.362705  0.905887  1.185438         0.862780
China Mobile   MAPE    0.035964  0.032284  0.020014  0.028132         0.019668
China Mobile   MSE     4.202466  4.298428  1.461361  3.219015         1.285173
China Mobile   RMSE    2.049991  2.073265  1.208867  1.794161         1.133654
China Unicom   MAE     0.617615  0.601485  0.533421  0.495868         0.478745
China Unicom   MAPE    0.058250  0.057912  0.048561  0.048677         0.046683
China Unicom   MSE     0.482751  0.472544  0.405933  0.412370         0.397476
China Unicom   RMSE    0.694803  0.687418  0.637129  0.642160         0.630457
China Telecom  MAE     2.452282  2.106227  1.720879  1.673557         1.585590
China Telecom  MAPE    0.049911  0.044283  0.036937  0.035283         0.034321
China Telecom  MSE     7.252281  5.467614  4.886478  4.304365         3.885948
China Telecom  RMSE    2.693005  2.338293  2.210538  2.074696         1.971281

Table 4. Comparison of experimental results of each model in the consumer goods industry

Stock                  Metric  RNN        LSTM       Sen-LSTM  Encoder-Decoder  WBED
Yili                   MAE     1.085213   0.936832   0.774853  0.765181         0.716834
Yili                   MAPE    0.036526   0.031689   0.027571  0.026800         0.025210
Yili                   MSE     1.725206   1.489737   0.916554  0.976234         0.806238
Yili                   RMSE    1.313471   1.220548   0.957368  0.988045         0.897908
Qingdao Beer           MAE     2.842005   2.675460   2.290435  2.103617         1.890686
Qingdao Beer           MAPE    0.061493   0.057423   0.054351  0.050026         0.043225
Qingdao Beer           MSE     12.020910  11.020715  7.029692  6.261495         5.439980
Qingdao Beer           RMSE    3.467118   3.319746   2.651356  2.502298         2.332376
Shuanghui Development  MAE     1.118873   1.066058   1.036308  1.025740         0.902006
Shuanghui Development  MAPE    0.040802   0.037997   0.038786  0.036496         0.033398
Shuanghui Development  MSE     3.271670   2.796698   2.014827  2.657202         1.862085
Shuanghui Development  RMSE    1.808775   1.672333   1.419446  1.630092         1.364582
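One way to read Tables 2 to 4 is as a relative error reduction of WBED against a baseline. The short sketch below computes that ratio for a single pair of values copied from Table 2 (ICBC, MAE, RNN versus WBED). The helper name is hypothetical, and this summary statistic is an illustration rather than an analysis reported by the authors.

```python
def relative_reduction(baseline_error: float, model_error: float) -> float:
    """Fraction by which model_error is lower than baseline_error."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - model_error) / baseline_error


# MAE values for ICBC copied from Table 2 (RNN vs. WBED).
icbc_mae_rnn = 0.285014
icbc_mae_wbed = 0.097654
reduction = relative_reduction(icbc_mae_rnn, icbc_mae_wbed)
print(f"{reduction:.1%}")  # 65.7%
```

The same function can be applied to any other row and baseline column; the reduction is much smaller where the baselines are already close to WBED, as for China Unicom.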

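Figure 11. Prediction results of each model for China Mobile

Figure 12. Prediction results of each model for ICBC

Figure 13. Prediction results of each model for Shuanghui Development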
