Example: computing cosine similarity in a PHP data analysis engine
This example shows how a PHP data analysis engine computes the cosine similarity between an analysis vector and a reference (benchmark) vector. It is shared here for reference. The details are as follows:
For more information about cosine similarity, see the Baidu Baike entry on cosine similarity.
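For quick reference, the quantity being computed is the standard cosine of the angle between two vectors. The symbols below are assumptions made for illustration: k is the reference vector and j is the analysis vector (mirroring the index prefixes used in the code), and n is the number of elements.

```latex
\cos\theta = \frac{\sum_{i=0}^{n-1} k_i \, j_i}{\sqrt{\sum_{i=0}^{n-1} k_i^{2}} \; \sqrt{\sum_{i=0}^{n-1} j_i^{2}}}
```

The result is 1 when the two vectors point in the same direction, 0 when they are orthogonal, and -1 when they point in opposite directions.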
<?php
/**
 * Data analysis engine.
 *
 * The analysis vector must have the same elements as the reference
 * (benchmark) vector. The larger element count is used, and missing
 * elements of the analysis vector are filled with zero.
 *
 * Computes the cosine of the angle between the analysis vector and the
 * reference vector.
 *
 * @author yu.guo@okhqb.com
 */

/**
 * Get the modulus (Euclidean norm) of a vector.
 *
 * @param array $arrParam n-dimensional vector of the reference point
 *                        of the analysed data, e.g. array(..., 1)
 * @return float
 */
function getMarkMod($arrParam)
{
    $strModDouble = 0;
    foreach ($arrParam as $val) {
        $strModDouble += $val * $val;
    }
    $strMod = sqrt($strModDouble);
    // Optionally round to a fixed number of decimal places here.
    return $strMod;
}

/**
 * Get the number of elements in the benchmark vector.
 *
 * @param array $arrParam
 * @return int
 */
function getMarkLenth($arrParam)
{
    $intLenth = count($arrParam);
    return $intLenth;
}

/**
 * Assign prefixed indexes to the input array. The reference vector
 * must use the prefix 'k' and the analysis vector the prefix 'j'.
 *
 * @param array  $arrParam
 * @param string $index
 * @return array $arrBack
 */
function handIndex($arrParam, $index = 'k')
{
    $arrBack = array(); // initialise so an empty input still returns an array
    foreach ($arrParam as $key => $val) {
        $in = $index . $key;
        $arrBack[$in] = $val;
    }
    return $arrBack;
}

/**
 * Get the cosine of the angle between the two vectors.
 *
 * @param array $arrMark    benchmark vector (indexes already processed)
 * @param array $arrAnaly   analysis vector (indexes already processed),
 *                          e.g. array('j0' => 1, 'j1' => 2 ....)
 * @param float $strMarkMod modulus of the benchmark vector
 * @param int   $intLenth   vector length
 * @return float
 */
function getCosine($arrMark, $arrAnaly, $strMarkMod, $intLenth)
{
    $strVector = 0;
    $strCosine = 0;
    for ($i = 0; $i < $intLenth; $i++) {
        $strMarkVal = $arrMark['k' . $i];
        // Missing analysis elements count as zero, as described above.
        $strAnalyVal = isset($arrAnaly['j' . $i]) ? $arrAnaly['j' . $i] : 0;
        $strVector += $strMarkVal * $strAnalyVal;
    }
    $arrAnalyMod = getMarkMod($arrAnaly); // modulus of the analysis vector
    $strFenzi = $strVector;                 // numerator
    $strFenMu = $arrAnalyMod * $strMarkMod; // denominator
    // Divide only when the denominator is non-zero. Compare as float:
    // an (int) cast would also treat denominators below 1 as zero.
    if (0.0 !== (float) $strFenMu) {
        $strCosine = $strFenzi / $strFenMu;
    }
    return $strCosine;
}
?>
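The engine's header comment says that an analysis vector shorter than the reference vector is filled with zeros before the cosine is taken. The following is a minimal, self-contained sketch of that idea, not the author's implementation: the function name `cosineSimilarity` and the sample vectors are hypothetical, and returning 0.0 for a zero vector is an assumed convention.

```php
<?php
// Hypothetical helper for illustration only; it is not part of the engine.
// Pads the shorter vector with zeros, then returns the cosine of the angle.
function cosineSimilarity(array $a, array $b): float
{
    $n = max(count($a), count($b));
    // array_values() drops any custom keys so both vectors are 0-indexed.
    $a = array_pad(array_values($a), $n, 0);
    $b = array_pad(array_values($b), $n, 0);

    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;
    for ($i = 0; $i < $n; $i++) {
        $dot   += $a[$i] * $b[$i];
        $normA += $a[$i] * $a[$i];
        $normB += $b[$i] * $b[$i];
    }

    $denominator = sqrt($normA) * sqrt($normB);
    // A zero vector has no direction; return 0.0 by convention (assumption).
    return $denominator == 0.0 ? 0.0 : $dot / $denominator;
}

$reference = [1, 1, 1, 1];
$analysis  = [1, 1]; // shorter, so it is treated as [1, 1, 0, 0]
printf("%.4f\n", cosineSimilarity($reference, $analysis)); // prints 0.7071
```

Padding inside the function keeps the caller from having to align vector lengths by hand, at the cost of silently accepting inputs of different sizes.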