Journal of Nanjing University (Natural Sciences) ›› 2011, Vol. 47 ›› Issue (5): 551–558.


 A p-anatomy l-diversity algorithm for protecting transactional data privacy

 Wu Ying-Jie, Wang Yi-Lei, Liao Shang-Bin, Wang Xiao-Dong

  • Online: 2015-04-30 Published: 2015-04-30
  • About author: (College of Mathematics and Computer Science, Fuzhou University, Fuzhou, 350108, China)
  • Supported by: National Natural Science Foundation of China (61003057), Natural Science Foundation of Fujian Province (2010J1330), Science and Technology Development Foundation of Fuzhou University (2010-XY-20)

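Abstract (translated from the Chinese):  Most current research on privacy-preserving data publishing targets low-dimensional relational data, and its models and algorithms cannot be applied directly to the privacy leakage that may arise when publishing sparse, high-dimensional transactional data. Building on the anatomy technique, this paper designs a p-anatomy l-diversity anonymization algorithm for privacy-preserving transactional data publishing. The algorithm computes the mean-square contingency coefficient between attributes of the transactional data to split the high-dimensional attribute set into p mutually disjoint attribute subsets, and then partitions the records so that the partitioned data satisfies the l-diversity requirement with respect to the p attribute subsets. The experiments compare the association rule mining results on the transactional data before and after anonymization. Theoretical analysis and experimental results show that the algorithm safely protects privacy in transactional data publishing while keeping the utility of the published data high.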

Abstract:  Privacy-preserving data publishing has recently become a hot topic in data privacy research. Existing work on privacy-preserving data publishing focuses mainly on relational data with low dimensionality. However, many applications require privacy-preserving publishing of transactional data, which has no fixed structure and can be extremely high-dimensional. Furthermore, unlike in most previous work on relational data publishing, it is hard to separate the items of transactional data into sensitive and non-sensitive ones, which makes traditional models and methods unusable. In this paper, we treat every combination of items in the transactional data as both a potential quasi-identifier and potential sensitive data, depending on the adversary's point of view. Inspired by the anatomy technique, we propose a p-anatomy l-diversity algorithm for privacy-preserving transactional data publishing. The algorithm first anatomizes the attribute set of the transactional data into p disjoint subsets by calculating the mean-square contingency coefficient between attributes, and then partitions the tuples of the transactional data into equivalence classes, each of which satisfies the l-diversity requirement with respect to those p attribute subsets. The experiments compare the number and accuracy of the association rules mined from the transactional data before and after publishing. Theoretical analysis and experimental results show that our algorithm can safely preserve privacy in transactional data publication while keeping the utility of the published data high.
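
The abstract names the mean-square contingency coefficient as the measure used to group attributes, but does not spell out how it is computed. The following is a minimal sketch, assuming the standard definition φ² = χ²/n applied to two binary item columns (1 if the item occurs in a transaction, 0 otherwise). The function name and the zero-denominator convention are illustrative assumptions, not taken from the paper, and the paper's actual attribute-grouping procedure is not reproduced here.

```python
def mean_square_contingency(x, y):
    """Return phi^2 = chi^2 / n for two equal-length binary (0/1) columns.

    Hypothetical helper for illustration; for a 2x2 table,
    phi^2 = (n11*n00 - n10*n01)^2 / (row1 * row0 * col1 * col0).
    """
    if len(x) != len(y):
        raise ValueError("columns must have the same length")
    # Cell counts of the 2x2 contingency table.
    n11 = sum(1 for a, b in zip(x, y) if a and b)
    n10 = sum(1 for a, b in zip(x, y) if a and not b)
    n01 = sum(1 for a, b in zip(x, y) if not a and b)
    n00 = len(x) - n11 - n10 - n01
    # Row and column marginals.
    row1, row0 = n11 + n10, n01 + n00
    col1, col0 = n11 + n01, n10 + n00
    denom = row1 * row0 * col1 * col0
    if denom == 0:
        # Assumption: a constant column carries no association.
        return 0.0
    return (n11 * n00 - n10 * n01) ** 2 / denom


# Four toy transactions encoded as item-presence columns.
bread = [1, 1, 0, 0]
milk = [1, 1, 0, 0]
diapers = [0, 0, 0, 1]

print(mean_square_contingency(bread, milk))     # identical columns
print(mean_square_contingency(bread, diapers))  # weaker association
```

Under this definition, strongly associated items score close to 1 and unrelated items close to 0, so a grouping step could place highly associated attributes in the same subset; whether the paper does exactly that cannot be confirmed from the abstract alone.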


