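from numpy import mean, cov, linalg, mat, argsort

# dataMat is expected to be a numpy matrix, so that * below means matrix multiplication
def pca(dataMat, topNfeat=9999999):  # dataMat: data matrix; keep the top topNfeat features
    meanVals = mean(dataMat, axis=0)  # calculate the mean of each feature
    meanRemoved = dataMat - meanVals  # center the data
    covMat = cov(meanRemoved, rowvar=0)  # compute the covariance matrix
    eigVals, eigVects = linalg.eig(mat(covMat))  # eigenvalues and eigenvectors
    eigValInd = argsort(eigVals)  # sort ascending; the largest eigenvalues mark the directions of greatest variance
    eigValInd = eigValInd[:-(topNfeat + 1):-1]  # reverse, keeping the indices of the top topNfeat eigenvalues
    redEigVects = eigVects[:, eigValInd]  # reduced set of eigenvectors, one per column
    lowDDataMat = meanRemoved * redEigVects  # map the data into the new feature space
    reconMat = (lowDDataMat * redEigVects.T) + meanVals  # reconstruct the data in the original space
    return lowDDataMat, reconMat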
The mathematical principle behind principal component analysis can be stated simply: find the directions along which the data varies the most, and use them as the new features.
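To make "the direction of greatest variation" concrete, here is a minimal sketch on hypothetical toy data (the points, the random seed, and every variable name are assumptions made for illustration). It uses `numpy.linalg.eigh` rather than the `linalg.eig` call in the listing above, because a covariance matrix is symmetric and `eigh` returns its eigenvalues in ascending order.

```python
import numpy as np

# Hypothetical toy data: 200 points stretched along the line y = x.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
data = np.column_stack([t, t + 0.1 * rng.normal(size=200)])

# Center the data and compute its covariance matrix (columns are features).
centered = data - data.mean(axis=0)
cov_mat = np.cov(centered, rowvar=False)

# eigh handles symmetric matrices and returns eigenvalues in ascending order,
# so the last column is the eigenvector with the largest eigenvalue.
eig_vals, eig_vects = np.linalg.eigh(cov_mat)
top_direction = eig_vects[:, -1]

# Variance of the data projected onto that direction.
projected_var = np.var(centered @ top_direction, ddof=1)

# Variance along one of the original axes, for comparison.
x_axis_var = np.var(centered[:, 0], ddof=1)

print("top direction:", top_direction)
print("variance along it:", projected_var, "largest eigenvalue:", eig_vals[-1])
print("variance along the x axis:", x_axis_var)
```

The variance of the projected data should match the largest eigenvalue, and no other unit direction, including the original axes, should give a larger variance. That is why the eigenvectors with the largest eigenvalues are chosen as the new features.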
If you want to infer what the reduced dimensions mean from the program's output, redEigVects is the key: it gives the mapping from the original features to the new ones.
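As a rough illustration of reading that mapping, the sketch below builds a hypothetical three-feature data set and inspects the weights of the first principal direction. The helper `top_components`, the data, and the variable names are assumptions for this example and are not part of the original program; it also uses `numpy.linalg.eigh` on plain arrays instead of `linalg.eig` on a matrix.

```python
import numpy as np

def top_components(data, top_n):
    """Return the top_n eigenvectors (as columns) and their eigenvalues."""
    centered = data - data.mean(axis=0)
    cov_mat = np.cov(centered, rowvar=False)
    eig_vals, eig_vects = np.linalg.eigh(cov_mat)   # ascending eigenvalues
    order = np.argsort(eig_vals)[::-1][:top_n]      # largest first
    return eig_vects[:, order], eig_vals[order]

# Hypothetical data: feature 1 is exactly twice feature 0,
# and feature 2 is low-variance noise.
rng = np.random.default_rng(1)
base = rng.normal(size=300)
data = np.column_stack([base, 2.0 * base, 0.01 * rng.normal(size=300)])

red_eig_vects, top_vals = top_components(data, top_n=1)

# Each column holds the weights that combine the centered original
# features into one new feature.
weights = red_eig_vects[:, 0]
centered = data - data.mean(axis=0)
low_d = centered @ red_eig_vects

for i, w in enumerate(weights):
    print(f"original feature {i}: weight {w:+.4f}")
```

In this setup the weight on feature 1 should be twice the weight on feature 0, and the weight on the noise feature should be close to zero, so the new feature can be read as "mostly features 0 and 1, in a 1:2 ratio". The overall sign of an eigenvector is arbitrary, so only the relative sizes and relative signs of the weights carry meaning.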
[Mathematical Model] A Python Implementation of Principal Component Analysis