I've written two related articles before:
1) A review of matrix decomposition: scikit-learn: 2.5. Matrix factorization problems
2) A brief introduction to TruncatedSVD: Scikit-learn: Implementing LSA (latent semantic analysis) via TruncatedSVD
Today I found that NMF is also a very good and practical model, so here is a brief introduction. It, too, belongs to scikit-learn's 2.5. Matrix factorization problems section.
NMF is an alternative approach to decomposition that assumes the data matrix is non-negative. In cases where the data matrix contains no negative values, NMF can be plugged in instead of PCA or its variants. It decomposes X into two non-negative matrices W and H by minimizing the squared Frobenius norm of the reconstruction error:

    (1/2) * ||X - W H||_Fro^2
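As a minimal sketch of this decomposition with scikit-learn's `NMF` class: the small non-negative matrix and the parameters (`n_components`, `init`, `random_state`, `max_iter`) below are illustrative choices, not values from the text.

```python
import numpy as np
from sklearn.decomposition import NMF

# A small non-negative example matrix (values are illustrative).
X = np.array([[1.0, 1.0, 2.0],
              [2.0, 1.0, 3.0],
              [3.0, 1.2, 4.0],
              [4.0, 1.0, 5.0]])

# Decompose X ≈ W @ H with both factors constrained to be non-negative.
model = NMF(n_components=2, init='random', random_state=0, max_iter=500)
W = model.fit_transform(X)   # shape (4, 2), non-negative
H = model.components_        # shape (2, 3), non-negative

# Frobenius norm of the reconstruction error, the quantity NMF minimizes.
error = np.linalg.norm(X - W @ H)
print(W.shape, H.shape, error)
```

Unlike PCA, the factors W and H contain no negative entries, which often makes them easier to interpret as additive parts of the data.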
This norm is an obvious extension of the Euclidean norm to matrices. (Other optimization objectives have been suggested in the NMF literature, in particular the Kullback-Leibler divergence, but these are not currently implemented.)
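To see how the Frobenius norm extends the Euclidean norm, here is a quick NumPy check (the matrix is an illustrative example): it is just the Euclidean norm applied to all the entries of the matrix at once.

```python
import numpy as np

A = np.array([[3.0, 4.0],
              [0.0, 12.0]])

fro = np.linalg.norm(A)           # Frobenius norm (NumPy's default for matrices)
euclid = np.sqrt((A ** 2).sum())  # square root of the sum of squared entries

print(fro, euclid)  # both equal 13.0
```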
Work in progress, to be continued...
Copyright notice: This is the blogger's original article; please do not reproduce without permission.