南京大学学报(自然科学版) ›› 2014, Vol. 50 ›› Issue (4): 494.
宋威,刘明渊,李晋宏
Song Wei, Liu Mingyuan, Li Jinhong
摘要: 由于能反映用户的偏好,可以弥补传统频繁项集挖掘仅由支持度来衡量项集重要性的不足,高效用项集正在成为当前数据挖掘研究的热点。为使高效用项集挖掘更好地适应数据流环境,提出了一种基于事务型滑动窗口的数据流中高效用项集挖掘算法MHUIDS。首先在二进制向量的基础上,提出了高事务加权效用项集树(HTWUI-树)的结构。进而分别给出了事务型滑动窗口初始化与滑动的算法描述。最后,提出了高效用项集的剪枝策略与挖掘算法。实验结果表明,MHUIDS算法具有较高的挖掘效率及较低的存储开销。
[1] Agrawal R, Srikant R. Fast algorithms for mining association rules in large databases. In: Bocca J B, Jarke M, Zaniolo C. The 20th International Conference on Very Large Data Bases. San Francisco: Morgan Kaufmann, 1994: 487~499. [2] 宋 威,李晋宏,徐章艳等.一种新的频繁项集精简表示方法及其挖掘算法的研究.计算机研究与发展,2010,47(2): 277~285. [3] Liu Y, Liao W K, Choudhary A N. A two-phase algorithm for fast discovery of high utility itemsets. In: Ho T B, Cheung D W L, Liu H. The 9th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining. Berlin: Springer, 2005: 689~695. [4] Lan G C, Hong T P, Tseng V S. An efficient projection-based indexing approach for mining high utility itemsets. Knowledge and Information Systems, 2014, 38(1): 85~107. [5] 高 阳.中国数据挖掘研究进展.南京大学学报(自然科学), 2011, 47(4): 351~353. [6] Yao H, Hamilton H J, Butz C J. A foundational approach to mining itemset utilities from databases. In: Berry M W, Dayal U, Kamath C, et al. The 4th SIAM International Conference on Data Mining. Philadelphia: SIAM Press, 2004, 482~486. [7] Yao H, Hamilton H J. Mining itemset utilities from transaction databases. Data & Knowledge Engineering, 2006, 59(3): 603~626. [8] Li Y C, Yeh J S, Chang C C. Isolated items discarding strategy for discovering high utility itemsets. Data & Knowledge Engineering, 2008, 64(1): 198~217. [9] Han J, Pei J, Yin Y, et al. Mining frequent patterns without candidate generation: A frequent-pattern tree approach. Data Mining and Knowledge Discovery, 2004, 8(1): 53~87. [10] Tseng V S, Wu CW, Shie BE, et al. UP-growth: An efficient algorithm for high utility itemset mining. In: Rao B, Krishnapuram B, Tomkins A, et al. The 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM Press, 2010, 253~262. [11] Tseng V S, Shie B E, Wu C W, et al. Efficient algorithms for mining high utility itemsets from transactional databases. IEEE Transactions on Knowledge and Data Engineering, 2013, 25(8): 1772~1786. [12] Ahmed C F, Tanbeer S K, Jeong B-S, et al. Efficient tree structures for high utility pattern mining in incremental databases. IEEE Transactions on Knowledge and Data Engineering, 2009, 21(12): 1708~1721. [13] Song W, Liu Y, Li J. Mining high utility itemsets by dynamically pruning the tree structure. Applied Intelligence, 2014, 40(1): 29~43. [14] 李海峰,章 宁,朱建明等.时间敏感数据流上的频繁项集挖掘算法.计算机学报,2012,35(11):2283~2293. [15] 吴 枫,仲 妍,吴泉源.基于时间衰减模型的数据流频繁模式挖掘.自动化学报,2010,36(5):674~684. [16] Chu C J, Tseng V S, Liang T. An efficient algorithm for mining temporal high utility itemsets from data streams. Journal of Systems and Software, 2008, 81(7):1105~1117. [17] Li H F, Huang H Y, Chen Y C, et al. Fast and memory efficient mining of high utility itemsets in data streams. In: The 8th IEEE International Conference on Data Mining.New York: IEEE Press, 2008: 881~886. [18] Ahmed C F, Tanbeer S K, Jeong B S. Efficient mining of high utility patterns over data streams with a sliding window method. In: Chowdhury M U, Ray S, Lee R Y. The 11th IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing. New York: IEEE Press, 2010: 99~113. |
No related articles found! |
|