Journal of Nanjing University (Natural Sciences) ›› 2016, Vol. 52 ›› Issue (2): 261–.


Anomalous data detection algorithm for wireless sensor networks based on Top-k(σ)

 Hu Shi 1,2, Li Guanghui 1,2,3*, Feng Hailin 1,2

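  • Publication date: 2016-03-26  Release date: 2016-03-26
  • About the authors: 1. School of Information Engineering, Zhejiang A&F University, Lin'an 311300; 2. Zhejiang Provincial Key Laboratory of Intelligent Monitoring in Forestry and Information Technology, Lin'an 311300; 3. School of Internet of Things Engineering, Jiangnan University, Wuxi 214122
  • Funding:
    Supported by: National Natural Science Foundation of China (61472368, 61174023, 61272313); Zhejiang Provincial International Science and Technology Cooperation Project (2013C24026)
    Received: 2015-09-15
    *Corresponding author, E-mail: ghli@jiangnan.edu.cn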

Top-k(σ) outlier detection algorithm for wireless sensor networks

Hu Shi 1,2, Li Guanghui 1,2,3*, Feng Hailin 1,2

  • Online: 2016-03-26  Published: 2016-03-26
  • About author: 1. School of Information Engineering, Zhejiang A&F University, Lin'an, 311300, China; 2. Zhejiang Provincial Key Laboratory of Intelligent Monitoring in Forestry and Information Technology, Lin'an, 311300, China; 3. School of Internet of Things, Jiangnan University, Wuxi, 214122, China

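Abstract (translated from the Chinese): Anomalous data detection plays a very important role in environmental monitoring systems based on wireless sensor networks: it helps monitor the health of the sensor network itself, and it allows sudden events in the external environment (such as forest fires or environmental pollution) to be discovered promptly. By improving the top-k algorithm, this paper proposes a top-k(σ)-based anomalous data detection algorithm for wireless sensor networks. Unlike the top-k algorithm, the proposed algorithm constructs a suitable data grid according to the distribution of the data collected by the sensor nodes, normalizes the multidimensional data, and places it into the corresponding grid cells. It then reconstructs the PC list (populated-cells list) by introducing a distance threshold σ. Besides sorting the number of data points in each cell and in its neighborhood, it also computes the Euclidean distance between different data subsets and compares it with the threshold σ to determine how far a data subset deviates from the set of normal values, thereby improving the accuracy of the detection results. MATLAB simulation experiments show that the choice of the distance threshold σ strongly affects the algorithm's performance: when σ ∈ [2.5, 3], the top-k(σ) algorithm maintains a high detection rate while reducing the false positive rate as far as possible. With σ = 3, for the five given datasets, the detection rate of the top-k(σ) algorithm averages 93.70%, which is 4.94% higher on average than that of the top-k algorithm, and its false positive rate is on average 4.48% lower than that of the top-k algorithm.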

Abstract: Outlier detection plays an important role in wireless sensor network (WSN) application systems for environmental monitoring: it helps people monitor the condition of the WSN itself and can also detect emergent events in the environment, such as forest fires and air pollution. This paper proposes a top-k(σ) outlier detection algorithm for WSNs that improves on the top-k algorithm. Unlike the top-k algorithm, the proposed algorithm uses the distribution of the data collected by the sensor nodes to construct an appropriate data grid, places the data into the grid after normalization, and then sets a distance threshold σ to reconstruct the PC list (populated-cells list). The algorithm sorts the number of data points in each cell and in its neighborhood, computes the Euclidean distance R_D between two data subsets, and compares R_D with σ to determine how far a subset deviates from the normal data set. In this way the top-k(σ) algorithm improves the precision of outlier detection. Simulation results on the MATLAB platform for several given datasets show that the threshold σ has a large effect on the performance of the outlier detection algorithm. When σ ∈ [2.5, 3], the top-k(σ) algorithm achieves a higher detection accuracy and a lower false positive rate. With σ = 3, for the five given datasets, the average outlier detection accuracy of the top-k(σ) algorithm is 93.70%, which is 4.94% higher than that of the top-k algorithm, and its average false positive rate is 4.48% lower than that of the top-k algorithm.
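
The abstract only outlines the steps (normalize, place points in a grid, build the populated-cells list, rank cells by how many points they and their neighbors hold, then compare a Euclidean distance R_D with σ), so the following Python sketch is one possible reading of that outline rather than the authors' implementation. The function names `topk_sigma_sketch` and `rates`, the z-score normalization, the `cell_width` parameter, the use of the densest cell as the "normal" reference, and the metric definitions are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import product

import numpy as np


def topk_sigma_sketch(data, k, sigma=3.0, cell_width=1.0):
    """Return indices of up to k flagged points; data has shape (n, d)."""
    x = np.asarray(data, dtype=float)

    # Assumption: z-score normalization per dimension.
    std = x.std(axis=0)
    std[std == 0] = 1.0
    z = (x - x.mean(axis=0)) / std

    # Populated-cells (PC) list: grid cell index -> indices of its points.
    cells = defaultdict(list)
    for i, row in enumerate(np.floor(z / cell_width).astype(int)):
        cells[tuple(int(v) for v in row)].append(i)

    # Number of points in a cell plus all directly adjacent cells.
    offsets = list(product((-1, 0, 1), repeat=z.shape[1]))

    def neighbourhood_count(key):
        return sum(
            len(cells.get(tuple(a + b for a, b in zip(key, o)), ()))
            for o in offsets
        )

    counts = {key: neighbourhood_count(key) for key in cells}

    # Assumption: the cell with the densest neighbourhood represents normal data.
    densest = max(counts, key=counts.get)
    normal_centre = z[cells[densest]].mean(axis=0)

    # Visit the sparsest cells first; flag a cell only if its centroid lies
    # farther than sigma (Euclidean distance R_D) from the normal centre.
    outliers = []
    for key in sorted(cells, key=lambda c: (counts[c], c)):
        r_d = np.linalg.norm(z[cells[key]].mean(axis=0) - normal_centre)
        if r_d > sigma:
            outliers.extend(cells[key])
    return outliers[:k]


def rates(flagged, true_outliers, n_points):
    """Detection rate and false positive rate (assumed definitions)."""
    flagged, true_outliers = set(flagged), set(true_outliers)
    detection_rate = len(flagged & true_outliers) / len(true_outliers)
    false_positive_rate = len(flagged - true_outliers) / (
        n_points - len(true_outliers)
    )
    return detection_rate, false_positive_rate
```

In this sketch the σ check acts as a filter on top of the sparsity ranking: a sparse cell that still lies close to the dense region is not reported, which is one plausible way a distance threshold could lower the false positive rate relative to ranking by cell counts alone.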
