南京大学学报(自然科学版) ›› 2019, Vol. 55 ›› Issue (6): 960972.doi: 10.13232/j.cnki.jnju.2019.06.009
Jiaming Hong1,Yun Huang2(),Shaopeng Liu3,Jian Yin4
摘要:
针对大型图中的各种top?k近似子图查询算法存在的顶点重叠度高、无法满足多样性匹配结果输出等问题,提出具有最大顶点覆盖集的多样性近似子图查询算法.该算法建立基于近邻关系和基于区域划分的双重索引,并为相互关系紧密的同标号顶点建立簇索引.在图查询过程中,利用近邻特征为查询图中的每个顶点快速筛选出满足局部匹配要求的候选顶点集,并从不同区域找到多个满足要求的近似匹配子图,避免了查询结果间的高重复率.同时,基于区域和同标号近邻簇的划分,优先查找属于不同划分或不同簇顶点的匹配,减少了不同区域划分间的交互,提高了查询的效率.在大量数据集上的实验结果验证了该算法在查询效率和结果多样性等方面的有效性.
中图分类号:
1 | Zhang S J , Yang J , Jin W . Sapper:subgraph indexing and approximate matching in large graphs. Proceedings of the VLDB Endowment,2010,3(1-2):1185-1194. |
2 | Zhu G P , Lin X M , Zhu K ,et al . TreeSpan:Efficiently computing similarity all?matching∥Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data. Scottsdale,AZ,USA:ACM,2012:529-540. |
3 | Kim J , Choi D H , Li C . Inves:incremental partitioning?based verification for graph similarity search∥Proceedings of the 22nd International Conference on Extending Database Technology. Lisbon,Portugal:Open Proceedings,2019:229-240. |
4 | Wang R , Fang Y X , Feng X . Efficient parallel computing of graph edit distance∥Proceedings of the 35th International Conference on Data Engineering Workshops. Macao,China:IEEE,2019:233-240. |
5 | Shan X H , Wang G X , Ding L L ,et al . Top?k subgraph query based on frequent structure in |
large?scale dynamic graphs. IEEE Access,2018,6:78471-78482. | |
6 | Jin J H , Luo J Z , Khemmarat S ,et al . GStar:an efficient framework for answering top?k star queries on billion?node knowledge graphs. World Wide Web,2019,22(4):1611-1638. |
7 | Khan A , Wu Y H , Aggarwal C C ,et al . NeMa:fast graph search with label similarity. Proceedings of the VLDB Endowment,2013,6(3):181-192. |
8 | 黄云,洪佳明,覃遵跃 . 基于双索引的近似子图匹配. 计算机应用,2012,32(7):1994-1997.(Huang Y,Hong J M,Qin Z Y. Approximate subgraph matching based on dual index. Journal of Computer Applications,2012,32(7):1994-1997.) |
9 | Yang J , Yang X , Zhou Z B ,et al . Graph matching based on fast normalized cut∥25th International Conference on Neural Information Processing. Springer Berlin Heidelberg,2018:519-528. |
10 | Goyal P , Ferrara E . Graph embedding techniques,applications,and performance:a survey. Knowledge?Based Systems,2018,151:78-94. |
11 | Bi F , Chang L J , Lin X M ,et al . Efficient subgraph matching by postponing cartesian products∥Proceedings of the 2016 ACM SIGMOD International Conference on Management of Data. San Francisco,CA,USA:ACM,2016:1199-1214. |
12 | Zheng W Z , Zou L , Zhao D Y . Answering subgraph queries over large graphs∥12th International Conference on Web?Age Information Management. Springer Berlin Heidelberg,2011:390-402. |
13 | Zhu L H , Ng W K , Cheng J . Structure and attribute index for approximate graph matching in large graphs. Information Systems,2011,36(6):958-972. |
14 | Malliaros F D , Vazirgiannis M . Clustering and community detection in directed networks:a survey. Physics Reports,2013,533(4):95-142. |
15 | Yang Z W , Fu A W C , Liu R F . Diversified top?k subgraph querying in a large graph∥Proceedings of the 2016 ACM SIGMOD International Conference on Management of Data. San Francisco,CA,USA:ACM,2016:1167-1182. |
16 | Yu H W , Yuan D Y . Subgraph search in large graphs with result diversification∥Proceedings of the 2014 SIAM International Conference on Data Mining. Philadelphia,PA,USA:SIAM,2014:1046-1054. |
[1] | 吕国俊,曹建军,郑奇斌,常宸,翁年凤. 基于结构保持对抗网络的跨模态实体分辨[J]. 南京大学学报(自然科学版), 2020, 56(2): 197-205. |
[2] | 吴静怡,吴钟强,商琳. 基于Shapelet的不相关情感子序列挖掘方法[J]. 南京大学学报(自然科学版), 2020, 56(1): 57-66. |
[3] | 李家辉, 周忠眉. 基于多次学习和关联度的关联分类改进算法[J]. 南京大学学报(自然科学版), 2019, 55(4): 564-572. |
[4] | 李藤, 杨田, 代建华, 陈鸰. 基于模糊区分矩阵的结直肠癌基因选择[J]. 南京大学学报(自然科学版), 2019, 55(4): 633-643. |
[5] | 胡 淼,王开军,李海超,陈黎飞. 模糊树节点的随机森林与异常点检测[J]. 南京大学学报(自然科学版), 2018, 54(6): 1141-1151. |
|