Journal of Nanjing University (Natural Sciences) ›› 2011, Vol. 47 ›› Issue (5): 544-550.
蔡金凤,白清源**
Cai Jin-Feng, Bai Qing-Yuan
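Abstract: When building the classifier, association-rule classification algorithms consider only whether a feature term is present and ignore the weights of text features. To address this problem, this paper builds on the association-rule-based text classification method (ARC-BC) and proposes
ISARC (ItemSet Significance-based ARC), an algorithm that improves the accuracy of associative text classification. The algorithm uses feature-term weights to define the significance of a k-itemset, generates association rules by mining significant itemsets, and takes into account the effect of lift on the text to be classified.
Experimental results show that ISARC, which mines significant itemsets, improves the accuracy of associative text classification.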
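The abstract names two quantities, lift and a weight-based itemset significance, without giving their formulas. The sketch below illustrates the general idea under stated assumptions: `lift` uses the standard definition (rule confidence divided by the class prior), while `significance` is a hypothetical stand-in that averages, over all documents, the smallest weight among the itemset's terms. It is not the definition used by ISARC, and the sample documents and weights are invented for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

# A document is (term -> weight, class label). Weights are illustrative only.
Doc = Tuple[Dict[str, float], str]


def lift(docs: List[Doc], itemset: FrozenSet[str], label: str) -> float:
    """Standard lift of the rule itemset -> label: confidence / P(label)."""
    # Labels of the documents that contain every term of the itemset.
    covered = [c for terms, c in docs if itemset.issubset(terms)]
    if not covered:
        return 0.0
    confidence = covered.count(label) / len(covered)
    prior = sum(1 for _, c in docs if c == label) / len(docs)
    return confidence / prior if prior else 0.0


def significance(docs: List[Doc], itemset: FrozenSet[str]) -> float:
    """HYPOTHETICAL weighted support (an assumption, not the paper's formula).

    Each document containing the whole itemset contributes the smallest
    weight among the itemset's terms; other documents contribute 0.
    """
    total = 0.0
    for terms, _ in docs:
        if itemset.issubset(terms):
            total += min(terms[t] for t in itemset)
    return total / len(docs)


# Invented sample data: four weighted documents in two classes.
docs: List[Doc] = [
    ({"goal": 0.8, "match": 0.6}, "sports"),
    ({"goal": 0.4, "market": 0.9}, "finance"),
    ({"goal": 0.6, "match": 0.2}, "sports"),
    ({"market": 0.7}, "finance"),
]

rule_items = frozenset({"goal", "match"})
print(lift(docs, rule_items, "sports"))
print(significance(docs, rule_items))
```

A lift above 1 means the itemset occurs with the class more often than the class prior alone would predict, which is why lift is a natural signal for ranking rules when classifying a new text.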