Comparison of LSA and PLSA

Source: Internet
Author: User


                              LSA               pLSA
1. Theoretical background     Linear algebra    Probability and statistics
2. Objective function         Frobenius norm    Likelihood function
3. Polysemy                   No                Yes
4. Folding-in                 Straightforward   Complicated

1. LSA stems from linear algebra: it is nothing more than a singular value decomposition (SVD) of the term-document matrix. pLSA, on the other hand, has a strong probabilistic grounding in latent variable models.

2. SVD is a least-squares method: it finds the low-rank matrix approximation that minimizes the Frobenius norm of the difference with the original matrix. Moreover, as is well known in machine learning, the least-squares solution corresponds to the maximum likelihood solution when the experimental errors are Gaussian. LSA therefore makes an implicit assumption of Gaussian noise on the term counts. The objective function maximized in pLSA, on the other hand, is the likelihood function of multinomial sampling.
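As a rough illustration of the least-squares view, the NumPy sketch below truncates the SVD of a small made-up term-document matrix and checks that the Frobenius error of the rank-k approximation equals the square root of the sum of the squared discarded singular values (the Eckart-Young result). The matrix values and the helper name rank_k_approximation are assumptions chosen for this example, not data from any real corpus.

```python
import numpy as np


def rank_k_approximation(X, k):
    """Return the rank-k truncated-SVD approximation of X and all singular values."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Keep only the k largest singular values and their singular vectors.
    X_k = (U[:, :k] * s[:k]) @ Vt[:k, :]
    return X_k, s


# Toy term-document count matrix (rows: terms, columns: documents); values are made up.
X = np.array([
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 1.0],
    [0.0, 1.0, 1.0, 2.0],
    [1.0, 0.0, 0.0, 1.0],
])

k = 2
X_k, s = rank_k_approximation(X, k)

# Frobenius norm of the residual, i.e. the quantity the truncated SVD minimizes.
frob_error = np.linalg.norm(X - X_k, ord="fro")
# Eckart-Young: the minimum equals the norm of the discarded singular values.
expected_error = np.sqrt(np.sum(s[k:] ** 2))

print(frob_error, expected_error)
```

The two printed numbers should agree up to floating-point error.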

The values in the concept-term matrix found by LSA are not normalized and may even be negative. The values found by pLSA, on the other hand, are probabilities, which means they are interpretable and can be combined with other models.
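The point about unnormalized and negative values can be seen on a toy example. The sketch below, which assumes a small made-up count matrix, builds a concept-term matrix from the top two left singular vectors and inspects its entries. The variable names are illustrative only.

```python
import numpy as np

# Toy term-document count matrix (rows: terms, columns: documents); values are made up.
X = np.array([
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 1.0],
    [0.0, 1.0, 1.0, 2.0],
    [1.0, 0.0, 0.0, 1.0],
])

U, s, Vt = np.linalg.svd(X, full_matrices=False)

k = 2
# Concept-term matrix: one row per latent concept, one column per term.
concept_term = U[:, :k].T

# Even though every count in X is non-negative, the factors are not.
has_negative = bool((concept_term < 0).any())
# Rows have unit Euclidean length, but they do not sum to 1 like a distribution.
row_sums = concept_term.sum(axis=1)

print(has_negative, row_sums)
```

A pLSA model fitted to the same counts would instead produce rows P(w|z) that are non-negative and sum to one.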

NOTE: SVD is equivalent to PCA (principal component analysis) when the data is centered (has zero mean).
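A minimal numerical check of this note, assuming synthetic random data with rows as samples and columns as features: PCA is computed from the eigendecomposition of the sample covariance matrix, and the result is compared with the SVD of the centered data. The data and the per-feature scales are arbitrary choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data: 200 samples, 4 features with clearly different variances.
data = rng.normal(size=(200, 4)) * np.array([4.0, 3.0, 2.0, 1.0])

# Center each feature so that it has zero mean.
centered = data - data.mean(axis=0)
n_samples = centered.shape[0]

# PCA via the eigendecomposition of the sample covariance matrix.
cov = centered.T @ centered / (n_samples - 1)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]          # sort by decreasing variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# SVD of the centered data matrix.
_, s, Vt = np.linalg.svd(centered, full_matrices=False)
svd_variances = s ** 2 / (n_samples - 1)

# Explained variances match, and principal directions match up to sign.
same_variances = np.allclose(eigvals, svd_variances)
same_directions = np.allclose(np.abs(eigvecs.T @ Vt.T), np.eye(4), atol=1e-6)

print(same_variances, same_directions)
```

Without the centering step the two decompositions generally differ, which is why the note is conditional on zero-mean data.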

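3. Both LSA and pLSA can handle synonymy, but LSA cannot handle polysemy, because each word is represented by a single point in the latent space. In pLSA, by contrast, a word can have high probability under several latent topics.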

4. LSA and pLSA analyze a corpus of documents in order to find a new low-dimensional representation of it. To be comparable, new documents that were not originally in the corpus must be projected into the lower-dimensional space too. This is called "folding-in". Clearly, folded-in documents do not contribute to learning the factored representation, so it is necessary to rebuild the model from all the documents from time to time.

In LSA, folding-in is as easy as a matrix-vector product. In pLSA, it requires several iterations of the EM algorithm.
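The contrast between the two folding-in procedures can be sketched as follows. For LSA, the new document's term vector is multiplied by the transposed term factor and rescaled by the singular values. For pLSA, the topic-word distributions P(w|z) are held fixed and only the new document's topic mixture P(z|d) is re-estimated by EM. The function names, the toy count matrix, and the topic-word probabilities are all hypothetical values picked for illustration; in practice P(w|z) would come from a previously trained pLSA model.

```python
import numpy as np


def lsa_fold_in(term_counts, U_k, s_k):
    """Project a term-count vector into the k-dimensional LSA space."""
    # One matrix-vector product, then a rescaling by the singular values.
    return (U_k.T @ term_counts) / s_k


def plsa_fold_in(term_counts, p_w_given_z, n_iter=50):
    """Estimate P(z|d) for a new document by EM, keeping P(w|z) fixed."""
    n_topics = p_w_given_z.shape[0]
    p_z_given_d = np.full(n_topics, 1.0 / n_topics)  # uniform start
    for _ in range(n_iter):
        # E-step: P(z|d,w) is proportional to P(w|z) * P(z|d).
        joint = p_w_given_z * p_z_given_d[:, None]
        posterior = joint / joint.sum(axis=0, keepdims=True)
        # M-step: P(z|d) is proportional to sum_w n(d,w) * P(z|d,w).
        p_z_given_d = (posterior * term_counts[None, :]).sum(axis=1)
        p_z_given_d /= p_z_given_d.sum()
    return p_z_given_d


# LSA: toy term-document counts (rows: terms, columns: documents); values are made up.
X = np.array([
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 1.0],
    [0.0, 1.0, 1.0, 2.0],
    [1.0, 0.0, 0.0, 1.0],
])
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
U_k, s_k, V_k = U[:, :k], s[:k], Vt[:k, :].T

# Sanity check: folding in a training document recovers its row of V_k.
refolded = lsa_fold_in(X[:, 0], U_k, s_k)
# A document that was not in the corpus.
new_doc_lsa = lsa_fold_in(np.array([1.0, 1.0, 0.0, 0.0, 0.0]), U_k, s_k)

# pLSA: hypothetical topic-word distributions for 2 topics over 4 words.
p_w_given_z = np.array([
    [0.50, 0.40, 0.05, 0.05],
    [0.05, 0.05, 0.40, 0.50],
])
new_counts = np.array([3.0, 2.0, 0.0, 0.0])
p_z_given_new_doc = plsa_fold_in(new_counts, p_w_given_z)

print(new_doc_lsa, p_z_given_new_doc)
```

Here the new pLSA document only uses words that are likely under the first topic, so its estimated mixture concentrates on that topic. Note that neither procedure updates the learned factors, which is why periodic retraining is still needed.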
