Journal of Nanjing University (Natural Science), 2023, Vol. 59, Issue 4: 570–579. doi: 10.13232/j.cnki.jnju.2023.04.004


Query time-conditional preference query based on data flow

Jincan Tian, Xuejiao Sun

  1. College of Computer and Control Engineering, Yantai University, Yantai, 264005, China
  • Received: 2023-06-05 Online: 2023-07-31 Published: 2023-08-18
  • Contact: Xuejiao Sun E-mail: sunxuejiao6@sina.com
  • Supported by:
    National Natural Science Foundation of China (62072392)

Abstract:

Traditional preference reasoning uses tradeoff-enhanced conditional preference networks (TCP-nets), which efficiently represent qualitative preference relations over tuples, optimize the user's preference results, and describe the preference relations among attributes. This line of work focuses mainly on preferences over individual attributes of relational tuples. Extending conditional preference query techniques to conditional extraction over data streams remains a challenge; the main technical difficulties are extracting sequences from the data stream and performing dominance search on the extracted sequences. First, for preference data streams, a temporal conditional query language, StreamPref, is proposed to process the streams. Second, a time index is added to StreamPref to reason about and standardize the temporal conditional preferences of sequences extracted from data streams, and three algorithms are proposed: an algorithm for extracting object sequences, an algorithm for finding dominant objects and dominant sequences, and an algorithm for dominance comparison among data stream sequences. Finally, the effectiveness of the proposed algorithms is analyzed and verified on datasets. Experimental results show that the proposed algorithms obtain more accurate results than the min Top-k, Partition, and Incpartition algorithms.

Key words: TCP-nets, preference query, continuous query language, time index, dominance comparison
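
The idea of finding the dominant (most preferred) objects can be illustrated with a minimal sketch. It is not the algorithm from the paper: the rule in `prefers`, the attribute names `position` and `action`, and the helper `best_objects` are all hypothetical, and the sketch compares objects directly rather than reasoning over a full TCP-net. It assumes every object has the same attributes.

```python
from typing import Callable, Dict, List

Obj = Dict[str, str]


def prefers(a: Obj, b: Obj) -> bool:
    """Hypothetical conditional rule: when position == 'offense',
    action 'shot' is preferred to 'pass', all other attributes being equal."""
    if a["position"] != "offense" or b["position"] != "offense":
        return False
    others_equal = all(a[k] == b[k] for k in a if k != "action")
    return others_equal and a["action"] == "shot" and b["action"] == "pass"


def best_objects(objs: List[Obj], dominates: Callable[[Obj, Obj], bool]) -> List[Obj]:
    """Keep every object that no other object dominates."""
    return [o for o in objs if not any(dominates(p, o) for p in objs)]


objs = [
    {"position": "offense", "action": "shot"},
    {"position": "offense", "action": "pass"},
    {"position": "defense", "action": "pass"},
]
# The second object is dominated by the first; the third is incomparable.
best = best_objects(objs, prefers)
```

The quadratic scan is chosen only for readability; a real dominance search would also have to follow chains of rules, which this sketch deliberately leaves out.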

CLC number: TP391

Figure 1. Movement trajectories of the athletes

Figure 2. Search tree of the dominant-object search algorithm

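Table 1. Generation parameters of the synthetic dataset

Parameter               Values                 Default
Number of attributes    8, 10, 12, 14, 16      10
Number of sequences     4, 8, 16, 24, 32       8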

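Table 2. Parameters for extracting data stream sequences from the synthetic dataset

Parameter             Values                      Default
Time range (s)        10, 20, 40, 60, 80, 100     20
Slide interval (s)    1, 10, 20, 30, 40           1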
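
The two extraction parameters in the table above, time range and slide interval, can be read as a sliding window over the stream. The following is a minimal sketch under that assumption, not the paper's extraction algorithm: the function name `extract_sequences`, the `(timestamp, key, value)` tuple layout, and the half-open window `(t - time_range, t]` are all illustrative choices.

```python
from collections import defaultdict


def extract_sequences(stream, time_range, slide):
    """Group stream tuples into per-key sequences, one set per window.

    stream: iterable of (timestamp, key, value), timestamps in seconds.
    Each window ends at an instant t and covers (t - time_range, t];
    t advances by `slide` seconds from the first to the last timestamp.
    """
    records = sorted(stream, key=lambda r: r[0])
    if not records:
        return []
    first, last = records[0][0], records[-1][0]
    windows = []
    t = first
    while t <= last:
        seqs = defaultdict(list)
        for ts, key, value in records:
            if t - time_range < ts <= t:
                seqs[key].append(value)  # values stay in timestamp order
        windows.append((t, dict(seqs)))
        t += slide
    return windows


stream = [(1, "p1", "a"), (2, "p1", "b"), (2, "p2", "c"), (4, "p1", "d")]
windows = extract_sequences(stream, time_range=2, slide=1)
```

A larger time range makes each sequence longer, while a larger slide interval produces fewer windows; rescanning all records per window keeps the sketch short at the cost of efficiency.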

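Figure 3. Running time of the two algorithms on the synthetic dataset under different parameters: (a) varying number of attributes; (b) varying number of sequences; (c) varying time range; (d) varying slide interval. All other parameters take their default values.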

Table 3. Parameters of the real dataset

Attribute             Variable    Time instant
Teams                 32          32
Matches               64          62
Players               736         736
Actions               1670812621
Movement direction    1306072040
Field position        1376212150

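Table 4. Parameters for extracting data stream sequences from the real dataset

Parameter             Values                Default
Time range (s)        6, 12, 18, 24, 30     24
Slide interval (s)    1, 3, 6, 9, 12        1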

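Figure 4. Running time of the two algorithms on the real dataset under different parameters: (a) varying time range (slide interval of 1 s); (b) varying slide interval (time range of 24 s).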

Table 5. Parameters of the real data

Parameter             Values                     Default
Time range (s)        5, 10, 20, 40, 80, 160     40
Slide interval (s)    1, 3, 6, 9, 12             1

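Figure 5. Running time of the continuous query algorithms on the real dataset under different parameters: (a) varying time range (slide interval of 1 s); (b) varying slide interval (time range of 24 s).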
