Journal of Nanjing University (Natural Sciences) ›› 2012, Vol. 48 ›› Issue (4): 383-389.
Guo Jian-Yi (郭剑毅)1,2**, Li Zhen (李真)1,2, Yu Zheng-Tao (余正涛)1,2, Zhang Zhi-Kun (张志坤)1,2
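Abstract: This paper studies how a collaborative classifier, which combines conditional random fields (CRFs) with a support vector machine (SVM), can be used to extract domain concept instances, attributes, and attribute values and to predict the correspondence among the three. First, concept instances, attributes, and attribute values are treated as three types of entity, so that their extraction becomes a named entity recognition problem, which is modeled with conditional random fields. On this basis, correspondence relations between entities are defined and the correspondence among concept instances, attributes, and attribute values is predicted: a vector in which a concept instance, an attribute, and an attribute value are related is labeled 1, and otherwise 0, and a support vector machine is trained to predict the relation. Six groups of related experiments were carried out on concept instances, attributes, and attribute values of Yunnan tourist attractions. The experiments show that in the open test the collaborative classifier reaches a precision of 84.4%, a recall of 82.7%, and an F-measure of 83.6%, an F-measure 20 percentage points higher than that of the word co-occurrence method.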
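The two-stage idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a CRF has already produced BIO tags; the tag names `INS`, `ATT`, and `VAL`, the helper functions, and the toy sentence are all hypothetical. The sketch decodes the tagged entities, enumerates candidate (instance, attribute, value) triples, labels each 1 or 0 against a gold set as the abstract describes for SVM training, and computes precision, recall, and F-measure. The feature vectors and the SVM itself are left out, since the abstract does not specify them.

```python
from itertools import product

# Hypothetical tag set: INS = concept instance, ATT = attribute, VAL = attribute value.
ENTITY_TYPES = ("INS", "ATT", "VAL")


def decode_entities(tokens, tags):
    """Group BIO-tagged tokens (e.g. the output of a CRF tagger) into (type, text) pairs."""
    entities, current_type, current_tokens = [], None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts; close the one in progress, if any.
            if current_type is not None:
                entities.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)
        else:
            # An "O" tag or an inconsistent "I-" tag ends the current entity.
            if current_type is not None:
                entities.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = None, []
    if current_type is not None:
        entities.append((current_type, " ".join(current_tokens)))
    return entities


def label_triples(entities, gold_triples):
    """Enumerate every (instance, attribute, value) candidate and label it 1 or 0."""
    by_type = {etype: [] for etype in ENTITY_TYPES}
    for etype, text in entities:
        if etype in by_type:
            by_type[etype].append(text)
    candidates = product(by_type["INS"], by_type["ATT"], by_type["VAL"])
    return [(triple, 1 if triple in gold_triples else 0) for triple in candidates]


def precision_recall_f(predicted, gold):
    """Precision, recall, and F-measure over two sets of triples."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    # Invented toy sentence and tags, for illustration only.
    tokens = ["Park", "A", "area", "400km2", "altitude", "1750m"]
    tags = ["B-INS", "I-INS", "B-ATT", "B-VAL", "B-ATT", "B-VAL"]
    gold = {("Park A", "area", "400km2"), ("Park A", "altitude", "1750m")}

    entities = decode_entities(tokens, tags)
    for triple, label in label_triples(entities, gold):
        print(label, triple)
```

In a full pipeline each labeled triple would be turned into a feature vector and passed to an SVM trainer; the 1/0 labels produced here would serve as the training targets.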