Journal of Nanjing University (Natural Science), 2023, Vol. 59, Issue 2: 256–262. doi: 10.13232/j.cnki.jnju.2023.02.008


Self-supervised multivariate time series anomaly detection based on GAN

Yehan Zhou1,2, Ziyu Shen1,2, Qing Zhou1,2, Yun Li1,2

  1. College of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, 210023, China
  2. Jiangsu Key Laboratory of Big Data Security & Intelligent Processing, Nanjing, 210023, China
  • Received: 2022-07-18 Online: 2023-03-31 Published: 2023-04-07
  • Contact: Yun Li E-mail: liyun@njupt.edu.cn
  • Funding: Postgraduate Research Innovation Program of Jiangsu Province (KYCX_0760)

Abstract:

Anomaly detection is an important research direction in data mining. Sensors monitor the indicators of industrial equipment as multivariate time series, and detecting anomalies in these series is critical for ensuring safety and improving service quality. However, the definition of an anomaly is relatively vague, and data with anomaly labels are scarce. In addition, multivariate time series exhibit complex temporal dependence and stochasticity, which leaves many open problems in anomaly detection. This paper proposes CPCGAN, a model that uses self-supervised learning to detect anomalies in multivariate time series data. CPCGAN first obtains representation vectors of the multivariate time series through contrastive learning. These representation vectors, which carry prior information, are then used as input for training a generative adversarial network, and anomalies are determined from the reconstruction error of that network. The method is compared with five unsupervised anomaly detection methods on five datasets. The experimental results show that it effectively detects both types of anomalies (anomalous points and anomalous segments) and outperforms the other methods on most datasets.

Key words: anomaly detection, multivariate time series, self-supervised learning, contrastive learning, generative adversarial network
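The abstract states that anomalies are determined from the reconstruction error of the generative adversarial network but does not give the scoring rule. The following is a minimal sketch of one common way to turn reconstruction errors into anomaly flags, not the authors' implementation. The function names, the mean-squared-error score, and the quantile-based threshold are all assumptions made for illustration; the reconstructions would come from a trained generator, which is not shown.

```python
import math


def reconstruction_errors(windows, reconstructions):
    """Mean squared error between each window and its reconstruction.

    Each window is a list of time steps; each time step is a list of
    sensor readings (one value per variable).
    """
    errors = []
    for window, recon in zip(windows, reconstructions):
        squared = [
            (x - y) ** 2
            for step, recon_step in zip(window, recon)
            for x, y in zip(step, recon_step)
        ]
        errors.append(sum(squared) / len(squared))
    return errors


def quantile_threshold(train_errors, q=0.99):
    """Assumed rule: take the q-quantile of errors seen on normal training data."""
    ordered = sorted(train_errors)
    index = min(len(ordered) - 1, math.ceil(q * len(ordered)) - 1)
    return ordered[max(index, 0)]


def flag_anomalies(errors, threshold):
    """A window is flagged when its error exceeds the threshold."""
    return [e > threshold for e in errors]


# Toy data: two windows of two time steps with two variables each.
windows = [[[1.0, 2.0], [1.1, 2.1]], [[1.0, 2.0], [5.0, 9.0]]]
reconstructions = [[[1.0, 2.0], [1.0, 2.0]], [[1.0, 2.0], [1.0, 2.0]]]

errors = reconstruction_errors(windows, reconstructions)
threshold = quantile_threshold([0.004, 0.006, 0.005, 0.007])
flags = flag_anomalies(errors, threshold)
print(flags)
```

In this toy run the second window is reconstructed poorly and is the only one flagged. In practice the threshold would be tuned on held-out data rather than fixed at a quantile chosen in advance.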

CLC number: TP391

Fig. 1 Contrastive predictive coding
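For orientation, contrastive predictive coding is typically trained with the InfoNCE loss, which asks the model to pick the true future encoding out of a set of noise samples. The sketch below computes that loss for a single prediction in plain Python. The function name `info_nce_loss`, the dot-product score, and the toy vectors are illustrative assumptions, and the source does not say how CPCGAN configures this step.

```python
import math


def info_nce_loss(prediction, positive, negatives):
    """InfoNCE loss for one context prediction.

    prediction: vector predicted from the context for a future step
    positive:   encoding of the true future step
    negatives:  encodings of steps drawn from elsewhere (noise samples)
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    # Score every candidate against the prediction; the positive comes first.
    scores = [dot(prediction, positive)] + [dot(prediction, n) for n in negatives]

    # Log-sum-exp with the maximum subtracted for numerical stability.
    top = max(scores)
    log_total = top + math.log(sum(math.exp(s - top) for s in scores))

    # Negative log-probability of choosing the positive among all candidates.
    return log_total - scores[0]


# A prediction aligned with the positive yields a small loss,
# while one aligned with a negative yields a large loss.
aligned = info_nce_loss([1.0, 0.0], [5.0, 0.0], [[0.0, 5.0], [-5.0, 0.0]])
misaligned = info_nce_loss([1.0, 0.0], [-5.0, 0.0], [[0.0, 5.0], [5.0, 0.0]])
print(aligned < misaligned)
```

When the prediction carries no information (all scores equal), the loss reduces to the logarithm of the number of candidates, which is a convenient sanity check for an untrained model.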

Fig. 2 Generative adversarial network (GAN)

Fig. 3 Overall structure of CPCGAN

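Table 1 Datasets used in the experiments

| Dataset | Training samples | Test samples | Time series dimensions | Proportion of anomalous points |
|---|---|---|---|---|
| SWaT | 496800 | 449919 | 51 | 11.98% |
| WADI | 1048571 | 172801 | 123 | 5.99% |
| SMD | 708405 | 708420 | 28×38 | 4.16% |
| SMAP | 135183 | 427617 | 55×25 | 13.13% |
| MSL | 58317 | 73729 | 27×55 | 10.72% |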

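Table 2 Evaluation metrics of CPCGAN and the five comparison methods for anomalous point detection (P: precision, R: recall)

| Method | SWaT P | SWaT R | SWaT F1 | WADI P | WADI R | WADI F1 | SMD P | SMD R | SMD F1 | SMAP P | SMAP R | SMAP F1 | MSL P | MSL R | MSL F1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CPCGAN | 0.9815 | 0.661 | 0.7899 | 0.991 | 0.1316 | 0.2323 | 0.9511 | 0.9484 | 0.9497 | 0.7581 | 0.9822 | 0.8557 | 0.882 | 0.9686 | 0.9232 |
| AE | 0.9324 | 0.5734 | 0.7101 | 0.3074 | 0.179 | 0.2262 | 0.5684 | 0.7894 | 0.6609 | 0.5633 | 0.6223 | 0.5915 | 0.571 | 0.6641 | 0.614 |
| MAD-GAN | 0.9585 | 0.6166 | 0.7504 | 0.9842 | 0.1351 | 0.2375 | 0.8722 | 0.8075 | 0.8386 | 0.7106 | 0.9521 | 0.8138 | 0.8457 | 0.9546 | 0.8968 |
| LSTM-VAE | 0.9655 | 0.6218 | 0.7564 | 0.9845 | 0.1334 | 0.2349 | 0.8592 | 0.8012 | 0.8291 | 0.7056 | 0.9752 | 0.8187 | 0.8601 | 0.9663 | 0.9101 |
| DAGMM | 0.4576 | 0.671 | 0.5441 | 0.0851 | 0.9117 | 0.1556 | 0.6573 | 0.8549 | 0.7431 | 0.6234 | 0.9776 | 0.7613 | 0.7467 | 0.9817 | 0.8482 |
| TadGAN | 0.9525 | 0.6481 | 0.7713 | 0.9561 | 0.1246 | 0.2204 | 0.9141 | 0.9362 | 0.925 | 0.7413 | 0.9867 | 0.8465 | 0.9052 | 0.8932 | 0.8991 |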
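The F1 columns can be checked against the precision and recall columns, since F1 is the harmonic mean of the two. The helper below is a small sketch (the name `f1_score` is chosen here for illustration) applied to the CPCGAN values reported for SWaT and WADI.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# CPCGAN values taken from the table above.
swat_f1 = f1_score(0.9815, 0.661)   # SWaT: P = 0.9815, R = 0.661
wadi_f1 = f1_score(0.991, 0.1316)   # WADI: P = 0.991,  R = 0.1316
print(round(swat_f1, 3), round(wadi_f1, 3))
```

With P = 0.9815 and R = 0.661 this gives roughly 0.790, in line with the reported 0.7899. For WADI, the low recall of 0.1316 pulls F1 down to about 0.232 despite a precision of 0.991, which shows how the harmonic mean is dominated by the smaller of the two values.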

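Table 3 Evaluation metrics of the CPCGAN model with and without the self-supervised module

| Model | SWaT P | SWaT R | SWaT F1 | WADI P | WADI R | WADI F1 | SMD P | SMD R | SMD F1 | SMAP P | SMAP R | SMAP F1 | MSL P | MSL R | MSL F1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CPCGAN (with) | 0.9815 | 0.661 | 0.7899 | 0.991 | 0.1316 | 0.2323 | 0.9511 | 0.9484 | 0.9497 | 0.7581 | 0.9822 | 0.8557 | 0.882 | 0.9686 | 0.9232 |
| CPCGAN (without) | 0.842 | 0.5912 | 0.7014 | 0.871 | 0.1416 | 0.1992 | 0.8833 | 0.9026 | 0.837 | 0.6518 | 0.8872 | 0.785 | 0.7966 | 0.9181 | 0.8395 |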

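Table 4 Evaluation metrics of CPCGAN and the five comparison methods for anomalous segment detection

| Method | SWaT (segment) P | SWaT (segment) R | SWaT (segment) F1 | WADI (segment) P | WADI (segment) R | WADI (segment) F1 |
|---|---|---|---|---|---|---|
| CPCGAN | 0.8142 | 0.8010 | 0.8075 | 0.7691 | 0.7827 | 0.7758 |
| AE | 0.7513 | 0.7334 | 0.7422 | 0.5423 | 0.5737 | 0.5575 |
| MAD-GAN | 0.7225 | 0.6866 | 0.7040 | 0.6022 | 0.6714 | 0.6349 |
| LSTM-VAE | 0.7468 | 0.7918 | 0.7686 | 0.7621 | 0.7001 | 0.7297 |
| DAGMM | 0.6221 | 0.7510 | 0.6805 | 0.6366 | 0.8782 | 0.7381 |
| TadGAN | 0.7392 | 0.8581 | 0.7942 | 0.7782 | 0.7075 | 0.7411 |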