plda - A parallel C++ implementation of fast Gibbs sampling of Latent Dirichlet Allocation
plda is a parallel C++ implementation of Latent Dirichlet Allocation (LDA) (1,2). We aim to provide a highly optimized parallel implementation of the Gibbs sampling algorithm for LDA training and inference (3). The architecture is carefully designed so that it can support extensions of this algorithm.
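For readers unfamiliar with the algorithm, the following is a minimal single-threaded sketch of one collapsed Gibbs sampling sweep for LDA. It is not taken from the plda sources and says nothing about how plda parallelizes the work. The names `LdaModel`, `InitModel`, and `GibbsSweep`, and the choice of symmetric priors `alpha` and `beta`, are assumptions made for illustration only.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical sampler state (not plda's actual data structures).
struct LdaModel {
    int num_topics = 0;
    int vocab_size = 0;
    double alpha = 0.0;  // assumed symmetric document-topic prior
    double beta = 0.0;   // assumed symmetric topic-word prior
    std::vector<std::vector<int>> doc_topic;   // [doc][topic] token counts
    std::vector<std::vector<int>> topic_word;  // [topic][word] token counts
    std::vector<int> topic_total;              // [topic] total token count
    std::vector<std::vector<int>> assignment;  // [doc][pos] topic of each token
};

// Assign every token a uniformly random topic and build the count tables.
// Each document is a vector of word ids in [0, vocab_size).
LdaModel InitModel(const std::vector<std::vector<int>>& docs, int num_topics,
                   int vocab_size, double alpha, double beta,
                   std::mt19937& rng) {
    LdaModel m;
    m.num_topics = num_topics;
    m.vocab_size = vocab_size;
    m.alpha = alpha;
    m.beta = beta;
    m.doc_topic.assign(docs.size(), std::vector<int>(num_topics, 0));
    m.topic_word.assign(num_topics, std::vector<int>(vocab_size, 0));
    m.topic_total.assign(num_topics, 0);
    m.assignment.resize(docs.size());
    std::uniform_int_distribution<int> pick(0, num_topics - 1);
    for (std::size_t d = 0; d < docs.size(); ++d) {
        for (int w : docs[d]) {
            const int k = pick(rng);
            m.assignment[d].push_back(k);
            ++m.doc_topic[d][k];
            ++m.topic_word[k][w];
            ++m.topic_total[k];
        }
    }
    return m;
}

// One sweep: resample the topic of every token from its conditional
// distribution given all other current assignments.
void GibbsSweep(const std::vector<std::vector<int>>& docs, LdaModel& m,
                std::mt19937& rng) {
    std::vector<double> weights(m.num_topics);
    for (std::size_t d = 0; d < docs.size(); ++d) {
        for (std::size_t i = 0; i < docs[d].size(); ++i) {
            const int w = docs[d][i];
            const int old_k = m.assignment[d][i];
            // Remove the token's current assignment from the counts.
            --m.doc_topic[d][old_k];
            --m.topic_word[old_k][w];
            --m.topic_total[old_k];
            // Unnormalized conditional probability of each topic.
            for (int k = 0; k < m.num_topics; ++k) {
                weights[k] = (m.doc_topic[d][k] + m.alpha) *
                             (m.topic_word[k][w] + m.beta) /
                             (m.topic_total[k] + m.vocab_size * m.beta);
            }
            std::discrete_distribution<int> dist(weights.begin(), weights.end());
            const int new_k = dist(rng);
            // Record the new assignment in the counts.
            m.assignment[d][i] = new_k;
            ++m.doc_topic[d][new_k];
            ++m.topic_word[new_k][w];
            ++m.topic_total[new_k];
        }
    }
}
```

A caller would build the model once with `InitModel` and then call `GibbsSweep` repeatedly. In this sketch the count tables are the only state shared across documents.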
We will release an enhanced parallel implementation of LDA, named PLDA+ (2), which can improve the scalability of LDA by significantly reducing the unparallelizable communication bottleneck and achieving good load balancing.
If you wish to publish any work based on plda, please cite our paper as:
Zhiyuan Liu, Yuzhou Zhang, Edward Y. Chang, Maosong Sun, PLDA+: Parallel Latent Dirichlet Allocation with Data Placement and Pipeline Processing. ACM Transactions on Intelligent Systems and Technology, special issue on Large Scale Machine Learning. 2011. Software available at http://code.google.com/p/plda.
If you have any questions, please visit http://groups.google.com/group/plda