南京大学学报(自然科学版) ›› 2016, Vol. 52 ›› Issue (4): 682.
屈太国*,蔡自兴
Qu Taiguo*,Cai Zixing
摘要: FastMap是经典多维标度法(classical multidimensional scaling,CMDS)的一种快速算法,它包含一系列的投影.在每次投影中,两个相距较远的点被选为枢轴点,连接枢轴点得到一个枢轴;然后将各样本投影到枢轴上;最后,修改所有样本间的距离.FastMap的不足在于只能得到CMDS的近似解.对FastMap进行了深入分析,指出FastMap的本质就是把各样本投影到由枢轴确定的一组正交向量上.由于这组向量通常不同于样本集的主轴,使得FastMap只能得到CMDS的近似解;并指出FastMap算法的最大投影次数等于样本集的内在维数.在此理论分析的基础上,提出了一种改进的FastMap 算法—iFastMap(improved FastMap)算法.通过对FastMap坐标进行主成分分析,iFastMap得到了与CMDS完全一致的解;此外,从样本集中选取一个内在维数等于整个样本集内在维数的子集,将枢轴点的选取限定在这个子集上,并在每次投影后只修改枢轴点与各样本间的距离,iFastMap的速度得到进一步提高.实验验证了iFastMap与CMDS解的完全一致性及其高效性.
[1] Cox T F,Cox M A A.Multidimensional scaling.Boca Raton:Chapman & Hall/CRC,2001,9-56. [2] 吴纯青,任沛阁,王小峰.基于语义的网络大数据组织与搜索.计算机学报,2015,38(1):1-17.(Wu C Q,Ren P G,Wang X F.Survery on semanticbased organization and search technologies for network big data.Chinese Journal of Computers,2015,38(1):1-17.) [3] Ingram S,Munzner T.Dimensionality reduction for documents with nearest neighbor queries.Neurocomputing,2015,150,Part B:557-569. [4] Mjolsness E,DeCoste D.Machine learning for science:State of the art and future prospects.Science,2001,293(5537):2051-2055. [5] Patel A P,Tirosh I,Trombetta J J,et al.Singlecell RNAseq highlights intratumoral heterogeneity in primary glioblastoma.Science,2014,344(6190):1396-1401. [6] 明 亮,赵 刚,谢桂海等.面向智能空间的位置感知方法研究.软件学报,2009,20(3):671-681.(Ming L,Zhao G,Xie G H,et al.Research on smart space oriented location awareness method.Journal of Software,2009,20(3):671-681.) [7] Wang S,Zhuo Z,Yang H,et al.An approach to facial expression recognition integrating radial basis function kernel and multidimensional scaling analysis.Soft Computing,2014,18(7):1363-1371. [8] Vogelstein J T,Park Y,Ohyama T,et al.Discovery of brainwide neuralbehavioral maps via multiscale unsupervised structure learning.Science,2014,344(6182):386-392. [9] Arkin A,Shen P,Ross J.A test case of correlation metric construction of a reaction pathway from measurements.Science,1997,277(5330):1275-1279. [10] Li S S M,Wang C C L,Hui K C.Bendinginvariant correspondence matching on 3D human bodies for feature point extraction.IEEE Transactions on Automation Science and Engineering,2011,8(4):805-814. [11] 徐江山,路海明,马卫民等.基于MDS变换聚类算法的数字电视用户特征提取.电子学报,2008,36(9):1786-1789.(Xu J S,Lu H M,Ma W M,et al.The extraction of digital TV user feature based on MDS clustering.Acta Electronica Sinica,2008,36(9):1786-1789.) [12] Tenenbaum J B,Silva V D,Langford J C.A global geometric framework for nonlinear dimensionality reduction.Science,2000,290(5500):2319-2323. [13] Chalmers M.A linear iteration time layout algorithm for visualizing high dimensional data.In:Proceedings of the 7th IEEE Visualization Conference.Los Alamitos,California,USA:IEEE Computer Society Press,1996:127-132. [14] Morrison A,Ross G,Chalmers M.Fast multidimensional scaling through sampling,Springs and Interpolation.Information Visualization,2003,2(1):68-77. [15] Williams M,Munzner T.Steerable,progressive multidimensional scaling.In:IEEE Symposium on Information Visualization 2004.Austin,TX,USA:IEEE Press,2004:57-64. [16] Wang J T L,Wang X,Shasha D,et al.Metricmap:An embedding technique for processing distancebased queries in metric spaces.IEEE Transactions on Systems,Man,and Cybernetics,Part B:Cybernetics,2005,35(5):973-987. [17] Silva V D,Tenenbaum J B.Sparse multidimensional scaling using landmark points.Technical Report.Department of Mathematics,Stanford University,Palo Alto,June 2004. [18] Faloutsos C,Lin K.FastMap:A fast algorithm for indexing,datamining,and visualization.In:Proceedings of the 1995 ACM SIGMOD International Conference on Management of Data.San Jose,California,USA:ACM Press,1995,24(2):163-174. [19] Mignotte M.A bicriteriaoptimizationapproachbased dimensionalityreduction model for the color display of hyperspectral images.IEEE Transactions on Geoscience and Remote Sensing,2012,50(2):501-513. [20] Wang X,Zhang L,Wang Y,et al.3D model features coclustering based on heterogeneous semantic network.In:The 4th IEEE International Conference on Information Science and Technology(ICIST 2014).Shenzhen,Guangdong,China:IEEE Press,2014:75-78. [21] Tejada E,Minghim R,Nonato L G.On improved projection techniques to support visual exploration of multidimensional data sets.Information Visualization,2003,2(4):218-231. [22] Latsoudas G,Sidiropoulos N D.A fast and effective multidimensional scaling approach for node localization in wireless sensor networks.IEEE Transactions on Signal Processing,2007,55(10):5121-5127. [23] Zhang D Y,Bai X,Liu P N.multidimensional scaling localization algorithm in wireless sensor networks.Sensors & Transducers,2014,164(2):71-79. [24] 张培珍,付 平,肖 军等.基于快速聚类索引的图像检索系统.吉林大学学报(信息科学版),2004,22(6):638-642.(Zhang P Z,Fu P,Xiao J,et al.Image retrieval system based on fast clustering indexing.Journal of Jilin University(Information Science Edition),2004,22(6):638-642.) [25] Liu Y,Kender J R.Fast video segment retrieval by SortMerge feature selection,boundary refinement,and lazy evaluation.Computer Vision and Image Understanding,2003,92(2-3):147-175. [26] 王 菁,杨寿保,高鹰等.FCAN:一种基于快速映射的内容访问网络.系统仿真学报,2007,19(17):3955-3960.(Wang J,Yang S B,Gao Y,et al.FCAN:Fastmapbased content addressable network.Journal of System Simulation,2007,19(17):3955-3960.) [27] Khan I,Huang J Z,Tung N T,et al.Ensemble clustering of high dimensional data with FastMap projection.In:Peng W C.PAKDD 2014 Workshops.Tainan,Taiwan:Springer International Publishing,2014:483-493. [28] Touati R,Mignotte M.MDSbased multiaxial dimensionality reduction model for human action recognition.In:2014 Canadian Conference on Computer and Robot Vision(CRV).Montreal,QC,Canada:IEEE Press,2014:262-267. [29] Zheng Y,Xiao L,Tang W,et al.A music recommendation method for largescale music library on a heterogeneous platform.In:Sun X H,Qu W Y,Ivan Stojmenovic I,et al.The 14th International Conference on Algorithms and Architectures for Parallel Processing.Dalian,China:Springer International Publishing,2014:472-482. [30] Moevus A,Mignotte M,de Guise J A,et al.Evaluating perceptual maps of asymmetries for gait symmetry quantification and pathology detection.In:The 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society(EMBC 2014).Chicago,Illinois:IEEE Press,2014:3317-3320. [31] Platt J.FastMap,MetricMap,and Landmark MDS are all nystrom algorithms.Technical Report.Microsoft Research,Microsoft Corporation,New York,MSR-TR-2004-26. [32] AbuKhzam F N,Samatova N F,Ostrouchov G,et al.Distributed dimension reduction algorithms for widely dispersed data.In:Gonzales T,Akl S G.The 14th IASTED International Conference on Parallel and Distributed Computer Systems.Cambridge,USA:Acta Press,2002:167-174. [33] Ostrouchov G,Samatova N F.On FastMap and the convex hull of multivariate data:Toward fast and robust dimension reduction.IEEE Transactions on Pattern Analysis and Machine Intelligence,2005,27(8):1340-1343. [34] An J Y,Yu J X,Ratanamahatana C A,et al.A dimensionality reduction algorithm and its application to interactive visualization.Journal of Visual Languages and Computing,2007,18(1):48-70. [35] Young G,Householder A S.Discussion of a set of points in terms of their mutual distances.Psychometrika,1938,3(1):19-22. [36] 张润楚.多元统计分析.北京:科学出版社,2006,288-311.(Zhang R C.Multivariate statistical analysis.Beijing:Science Press,2006,288-311.) [37] Pearson K.On lines and planes of closest fit to system of points in space.Philosophical Magazine,1901,2:559-572. [38] Hotelling H.Analysis of a complex of statistical variables into principal components.Journal of Educational Psychology,1933,24(6):417-441. [39] http://archive.ics.uci.edu/ml/datasets.html [40] http://www.cs.toronto.edu/~roweis/data.html [41] Kruskal J B.Nonmetric multidimensional scaling:A numerical method.Psychometrika,1964,29(2):115-129. |
No related articles found! |
|