Self-supervised multivariate time series anomaly detection based on GAN

doi:10.13232/j.cnki.jnju.2023.02.008

Journal of Nanjing University (Natural Sciences), 2023, Vol. 59, Issue (2): 256–262.

Yehan Zhou¹,², Ziyu Shen¹,², Qing Zhou¹,², Yun Li¹,²

1. College of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, 210023, China
2. Jiangsu Key Laboratory of Big Data Security & Intelligent Processing, Nanjing, 210023, China

Received: 2022-07-18  Online: 2023-03-31  Published: 2023-04-07
Contact: Yun Li  E-mail: liyun@njupt.edu.cn
Funding: Jiangsu Provincial Graduate Research and Innovation Program (KYCX_0760)

Abstract

Anomaly detection is an important research direction in data mining. Sensors monitor the indicators of industrial devices in the form of multivariate time series, and detecting anomalies in these series is critical for ensuring safety and improving service quality. However, the definition of an anomaly is relatively vague, and data with anomaly labels is scarce. In addition, multivariate time series exhibit complex temporal dependence and stochasticity, which leaves many problems in anomaly detection unresolved. In this paper, we propose CPCGAN, a self-supervised learning method for anomaly detection on multivariate time series data. We first use contrastive learning to obtain representation vectors of the multivariate time series. These representation vectors, which carry prior information, are then used as input to train a generative adversarial network, and the reconstruction error of the generative adversarial network is used to identify anomalies. We compare our method with five unsupervised anomaly detection methods on five datasets. Experimental results show that our method detects both types of anomalies effectively and outperforms the other methods on most datasets.

Key words: anomaly detection, multivariate time series, self-supervised learning, contrastive learning, generative adversarial network
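
The last step described in the abstract, identifying anomalies from reconstruction error, can be illustrated with a minimal NumPy sketch. It does not implement CPCGAN: the reconstruction `x_hat` is a synthetic stand-in for a generator's output, and the scoring function, the mean-plus-three-standard-deviations threshold, and every name below are assumptions made for illustration only.

```python
import numpy as np


def reconstruction_scores(x, x_hat):
    """Per-time-step score: mean squared reconstruction error across features."""
    x = np.asarray(x, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)
    return np.mean((x - x_hat) ** 2, axis=1)


def fit_threshold(train_scores, k=3.0):
    """Hypothetical rule: mean + k * std of scores on anomaly-free data."""
    return float(np.mean(train_scores) + k * np.std(train_scores))


rng = np.random.default_rng(0)
n_steps, n_features = 200, 5

# Stand-in for a trained generator's reconstruction (NOT the paper's model).
t = np.linspace(0.0, 8.0 * np.pi, n_steps)
x_hat = np.tile(np.sin(t)[:, None], (1, n_features))

# Anomaly-free "training" data: the reconstruction plus small noise.
x_train = x_hat + 0.01 * rng.standard_normal((n_steps, n_features))
threshold = fit_threshold(reconstruction_scores(x_train, x_hat))

# "Test" data with one injected point anomaly at step 120.
x_test = x_hat + 0.01 * rng.standard_normal((n_steps, n_features))
x_test[120] += 3.0

scores = reconstruction_scores(x_test, x_hat)
flags = scores > threshold
print("flagged steps:", np.flatnonzero(flags))
```

In practice the threshold rule is a design choice; calibrating it on anomaly-free data, as sketched here, is one common option and is not necessarily the one used in the paper.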

CLC number: TP391

Cite this article: Yehan Zhou, Ziyu Shen, Qing Zhou, Yun Li. Self-supervised multivariate time series anomaly detection based on GAN[J]. Journal of Nanjing University (Natural Sciences), 2023, 59(2): 256–262.




Table 1
Dataset	Training samples	Test samples	Time series dimensions	Anomaly ratio
SWaT	496800	449919	51	11.98%
WADI	1048571	172801	123	5.99%
SMD	708405	708420	28×38	4.16%
SMAP	135183	427617	55×25	13.13%
MSL	58317	73729	27×55	10.72%

Table 2 (P: precision; R: recall; F1: F1 score)
Method	SWaT P	SWaT R	SWaT F1	WADI P	WADI R	WADI F1	SMD P	SMD R	SMD F1	SMAP P	SMAP R	SMAP F1	MSL P	MSL R	MSL F1
CPCGAN	0.9815	0.661	0.7899	0.991	0.1316	0.2323	0.9511	0.9484	0.9497	0.7581	0.9822	0.8557	0.882	0.9686	0.9232
AE	0.9324	0.5734	0.7101	0.3074	0.179	0.2262	0.5684	0.7894	0.6609	0.5633	0.6223	0.5915	0.571	0.6641	0.614
MAD-GAN	0.9585	0.6166	0.7504	0.9842	0.1351	0.2375	0.8722	0.8075	0.8386	0.7106	0.9521	0.8138	0.8457	0.9546	0.8968
LSTM-VAE	0.9655	0.6218	0.7564	0.9845	0.1334	0.2349	0.8592	0.8012	0.8291	0.7056	0.9752	0.8187	0.8601	0.9663	0.9101
DAGMM	0.4576	0.671	0.5441	0.0851	0.9117	0.1556	0.6573	0.8549	0.7431	0.6234	0.9776	0.7613	0.7467	0.9817	0.8482
TadGAN	0.9525	0.6481	0.7713	0.9561	0.1246	0.2204	0.9141	0.9362	0.925	0.7413	0.9867	0.8465	0.9052	0.8932	0.8991
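
The F1 values in the comparison table above are the harmonic mean of the corresponding precision (P) and recall (R). The short sketch below recomputes F1 for three CPCGAN entries copied from the table, as a sanity check. The helper name `f1_score` and the 1e-3 tolerance are illustrative choices, and the last digit may differ slightly from the reported values because P and R are themselves rounded.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R, reported F1) for CPCGAN, copied from the table above.
cpcgan_rows = {
    "SWaT": (0.9815, 0.661, 0.7899),
    "SMD": (0.9511, 0.9484, 0.9497),
    "MSL": (0.882, 0.9686, 0.9232),
}

# Absolute difference between recomputed and reported F1 per dataset.
differences = {
    name: abs(f1_score(p, r) - reported)
    for name, (p, r, reported) in cpcgan_rows.items()
}

for name, diff in differences.items():
    print(f"{name}: |recomputed F1 - reported F1| = {diff:.5f}")
```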

Table 3 (P: precision; R: recall; F1: F1 score)
Method	SWaT P	SWaT R	SWaT F1	WADI P	WADI R	WADI F1	SMD P	SMD R	SMD F1	SMAP P	SMAP R	SMAP F1	MSL P	MSL R	MSL F1
CPCGAN (with)	0.9815	0.661	0.7899	0.991	0.1316	0.2323	0.9511	0.9484	0.9497	0.7581	0.9822	0.8557	0.882	0.9686	0.9232
CPCGAN (without)	0.842	0.5912	0.7014	0.871	0.1416	0.1992	0.8833	0.9026	0.837	0.6518	0.8872	0.785	0.7966	0.9181	0.8395

Table 4 (P: precision; R: recall; F1: F1 score)
Method	SWaT (segment) P	SWaT (segment) R	SWaT (segment) F1	WADI (segment) P	WADI (segment) R	WADI (segment) F1
CPCGAN	0.8142	0.8010	0.8075	0.7691	0.7827	0.7758
AE	0.7513	0.7334	0.7422	0.5423	0.5737	0.5575
MAD-GAN	0.7225	0.6866	0.7040	0.6022	0.6714	0.6349
LSTM-VAE	0.7468	0.7918	0.7686	0.7621	0.7001	0.7297
DAGMM	0.6221	0.7510	0.6805	0.6366	0.8782	0.7381
TadGAN	0.7392	0.8581	0.7942	0.7782	0.7075	0.7411

