A tempered transition based learning algorithm for undirected topic model

Jiang Xiaojuan, Zhang Wensheng*, Yang Yang

Journal of Nanjing University (Natural Sciences), 2016, Vol. 52, Issue 2: 335.


Abstract

The Replicated Softmax model, an undirected topic model for text data mining, provides a powerful framework for extracting semantic topics from document collections. Compared with directed topic models, it handles documents of different lengths more naturally, and computing the posterior distribution over the latent topic values is easy. However, because of the global normalizing constant, maximum likelihood learning for this model is intractable. The Contrastive Divergence (CD) algorithm is one of the dominant learning schemes for restricted Boltzmann machines (RBMs), based on Markov chain Monte Carlo (MCMC) methods. It approximates the negative-phase contribution to the gradient with samples drawn from a short alternating Gibbs Markov chain started at the observed training sample. These short chains yield a low-variance but biased estimate of the gradient, which makes the learning procedure rather slow. The main problem is the inability of the Markov chain to efficiently explore distributions with many isolated modes. In this paper, a new class of stochastic approximation algorithms is considered for learning the Replicated Softmax model. To efficiently explore highly multimodal distributions, we use an MCMC sampling scheme based on tempered transitions to generate sample states of a thermodynamic system. A tempered transition moves systematically from the desired distribution to an easily sampled distribution and back to the desired distribution. This allows the Markov chain to produce less correlated samples between successive parameter updates, and hence considerably improves the parameter estimates. Experiments are conducted on three popular text datasets, and the results demonstrate that we can successfully learn a good generative model of real text data that performs well on topic modelling and document retrieval.

Cite this article

Jiang Xiaojuan, Zhang Wensheng*, Yang Yang. A tempered transition based learning algorithm for undirected topic model[J]. Journal of Nanjing University (Natural Sciences), 2016, 52(2): 335.


 
