Journal of Nanjing University (Natural Sciences), 2016, Vol. 52, Issue 6: 1050.
汪小寒1,2,罗永龙1,2*,江叶峰1,赵传信1,2,吴文莉1,郭良敏1,2
Wang Xiaohan1,2,Luo Yonglong1,2*,Jiang Yefeng1,Zhao Chuanxin1,2,Wu Wenli1,Guo Liangmin1,2
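Abstract: To address the "locally optimal" partitioning problem in existing privacy-preserving data publishing algorithms, this paper proposes a k-anonymity algorithm based on optimal projection partitioning with a KD-tree. First, every attribute dimension is traversed over the global range; the variance of the projection distances measures the dispersion of each dimension, and the optimal dimension is chosen accordingly. Then, on the optimal attribute dimension, the partition coefficient is computed and the optimal split point is determined. An improved KD-tree structure is also introduced: unlike a traditional KD-tree, in which each node is a single data point, every node in the newly designed KD-tree is a set. A hyperplane that passes through the split point and is perpendicular to the optimal dimension divides a node into two parts, which become its left and right children. Finally, theoretical analysis proves the correctness of the algorithm, and experiments compare and verify its performance. The experimental results show that the proposed algorithm reduces the average generalization range by 10% to 22%, achieving better partitioning and better utility of the published dataset.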
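The partitioning idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define the projection distance variance or the partition coefficient, so this sketch assumes plain per-attribute variance (unnormalized) as the dispersion measure and a median value as the split point. All names (`Node`, `best_dimension`, `build`, `leaves`, `generalize`) are hypothetical.

```python
from dataclasses import dataclass
from statistics import pvariance
from typing import List, Optional, Tuple

Record = Tuple[float, ...]


@dataclass
class Node:
    """A KD-tree node that holds a set of records rather than a single point."""
    records: List[Record]
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def best_dimension(records: List[Record]) -> int:
    """Pick the attribute with the largest variance (assumed dispersion measure)."""
    dims = len(records[0])
    return max(range(dims), key=lambda d: pvariance([r[d] for r in records]))


def build(records: List[Record], k: int) -> Node:
    """Recursively split a node while both children keep at least k records."""
    node = Node(list(records))
    if len(records) < 2 * k:
        return node  # too small to split without violating k-anonymity
    d = best_dimension(records)
    ordered = sorted(records, key=lambda r: r[d])
    split = ordered[len(ordered) // 2][d]  # assumed split point: median value
    # The hyperplane x[d] == split is perpendicular to dimension d.
    left = [r for r in ordered if r[d] < split]
    right = [r for r in ordered if r[d] >= split]
    if len(left) < k or len(right) < k:
        return node  # simplification: stop instead of trying another dimension
    node.left = build(left, k)
    node.right = build(right, k)
    return node


def leaves(node: Node) -> List[List[Record]]:
    """Collect the leaf sets; each one is an equivalence class of size >= k."""
    if node.left is None or node.right is None:
        return [node.records]
    return leaves(node.left) + leaves(node.right)


def generalize(group: List[Record]) -> List[Tuple[float, float]]:
    """Replace each attribute of a group by its (min, max) generalization range."""
    dims = len(group[0])
    return [(min(r[d] for r in group), max(r[d] for r in group)) for d in range(dims)]
```

Calling `leaves(build(records, k))` yields groups of at least `k` records each, and `generalize` gives the interval published for each group. Narrower intervals correspond to the smaller generalization range that the abstract reports as its utility measure.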