The dissimilarity between two objects i and j can be computed from the mismatch rate:
d(i,j) = (p - m) / p
where m is the number of matches (that is, the number of attributes on which i and j are in the same state), and p is the total number of attributes that describe the objects.
The similarity is then:
sim(i,j) = 1 - d(i,j)
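The mismatch-rate formula and the similarity derived from it can be sketched in a few lines of Python. This is only a minimal illustration, assuming each object is a sequence of categorical (nominal) attribute values of the same length; the function names nominal_dissimilarity and nominal_similarity and the sample data are made up for this example and do not come from the original notes.

```python
def nominal_dissimilarity(obj_i, obj_j):
    """Return d(i,j) = (p - m) / p for two objects of categorical attributes."""
    if len(obj_i) != len(obj_j):
        raise ValueError("objects must have the same number of attributes")
    p = len(obj_i)  # total number of attributes
    if p == 0:
        raise ValueError("objects must have at least one attribute")
    # m counts the attributes on which both objects are in the same state
    m = sum(1 for a, b in zip(obj_i, obj_j) if a == b)
    return (p - m) / p


def nominal_similarity(obj_i, obj_j):
    """Return sim(i,j) = 1 - d(i,j)."""
    return 1 - nominal_dissimilarity(obj_i, obj_j)


# Hypothetical sample objects with three attributes each (illustration only)
a = ("red", "circle", "small")
b = ("red", "square", "small")
print(nominal_dissimilarity(a, b))  # one mismatch out of three attributes
print(nominal_similarity(a, b))     # two matches out of three attributes
```

Here p = 3 and m = 2, so the dissimilarity is 1/3 and the similarity is 2/3.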
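For symmetric binary attributes, both states are equally important. Dissimilarity based on symmetric binary attributes is called symmetric binary dissimilarity:
d(i,j) = (r + s) / (q + r + s + t)
where q is the number of attributes equal to 1 for both objects, r is the number equal to 1 for i but 0 for j, s is the number equal to 0 for i but 1 for j, and t is the number equal to 0 for both objects.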
For asymmetric binary attributes, the two states are not equally important. In asymmetric binary dissimilarity, the number of negative matches t is considered unimportant and is dropped from the denominator:
d(i,j) = (r + s) / (q + r + s)
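The two binary formulas differ only in whether negative matches appear in the denominator, which a small Python sketch can make concrete. It assumes each object is an equal-length sequence of 0/1 values and uses the conventional contingency counts spelled out in the docstring; the function names and the sample data are hypothetical, chosen only for illustration.

```python
def binary_counts(obj_i, obj_j):
    """Return (q, r, s, t) for two equal-length sequences of 0/1 values.

    q: 1 for both objects     r: 1 for i, 0 for j
    s: 0 for i, 1 for j       t: 0 for both objects (negative matches)
    """
    if len(obj_i) != len(obj_j):
        raise ValueError("objects must have the same number of attributes")
    q = r = s = t = 0
    for a, b in zip(obj_i, obj_j):
        if a == 1 and b == 1:
            q += 1
        elif a == 1 and b == 0:
            r += 1
        elif a == 0 and b == 1:
            s += 1
        elif a == 0 and b == 0:
            t += 1
        else:
            raise ValueError("attribute values must be 0 or 1")
    return q, r, s, t


def symmetric_binary_dissimilarity(obj_i, obj_j):
    """d(i,j) = (r + s) / (q + r + s + t): both states count equally."""
    q, r, s, t = binary_counts(obj_i, obj_j)
    if q + r + s + t == 0:
        raise ValueError("objects must have at least one attribute")
    return (r + s) / (q + r + s + t)


def asymmetric_binary_dissimilarity(obj_i, obj_j):
    """d(i,j) = (r + s) / (q + r + s): negative matches t are ignored."""
    q, r, s, _ = binary_counts(obj_i, obj_j)
    if q + r + s == 0:
        raise ValueError("undefined when the objects share only negative matches")
    return (r + s) / (q + r + s)


# Hypothetical sample objects with six binary attributes (illustration only)
i_obj = [1, 0, 1, 0, 0, 0]
j_obj = [1, 0, 1, 0, 1, 0]
print(binary_counts(i_obj, j_obj))
print(symmetric_binary_dissimilarity(i_obj, j_obj))
print(asymmetric_binary_dissimilarity(i_obj, j_obj))
```

For this sample, q = 2, r = 0, s = 1 and t = 3, so the symmetric dissimilarity is 1/6 while the asymmetric one is 1/3: ignoring the three negative matches makes the single mismatch weigh more.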
Dissimilarity of numeric attributes: Euclidean distance, Manhattan distance, Minkowski distance.
Euclidean distance: d(i,j) = sqrt((x1 - y1)^2 + (x2 - y2)^2 + ... + (xn - yn)^2)
Manhattan distance: d(i,j) = abs(x1 - y1) + abs(x2 - y2) + ... + abs(xn - yn)
Supremum distance: the maximum absolute difference between the two objects over all dimensions, d(i,j) = max(abs(x1 - y1), abs(x2 - y2), ..., abs(xn - yn))
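The numeric distances listed above can be compared side by side in a short Python sketch. It assumes the objects are equal-length sequences of numbers. The notes name the Minkowski distance without writing it out, so the function below uses the standard definition with an order parameter h (h = 1 gives Manhattan, h = 2 gives Euclidean). All function names and the sample points are illustrative assumptions.

```python
import math


def _check(x, y):
    """Validate that both vectors are non-empty and of equal length."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("vectors must be non-empty and of equal length")


def euclidean_distance(x, y):
    """Square root of the sum of squared per-dimension differences."""
    _check(x, y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def manhattan_distance(x, y):
    """Sum of absolute per-dimension differences."""
    _check(x, y)
    return sum(abs(a - b) for a, b in zip(x, y))


def minkowski_distance(x, y, h):
    """Standard Minkowski distance of order h (assumed h >= 1)."""
    _check(x, y)
    if h < 1:
        raise ValueError("h must be at least 1")
    return sum(abs(a - b) ** h for a, b in zip(x, y)) ** (1 / h)


def supremum_distance(x, y):
    """Largest absolute per-dimension difference."""
    _check(x, y)
    return max(abs(a - b) for a, b in zip(x, y))


# Hypothetical sample points in two dimensions (illustration only)
x = (1, 2)
y = (3, 5)
print(euclidean_distance(x, y))     # sqrt(2^2 + 3^2) = sqrt(13)
print(manhattan_distance(x, y))     # 2 + 3
print(minkowski_distance(x, y, 2))  # same as Euclidean, up to rounding
print(supremum_distance(x, y))      # max(2, 3)
```

For these two points the per-dimension differences are 2 and 3, so the Manhattan distance is 5, the Euclidean distance is sqrt(13), and the supremum distance is 3.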
This article is from the "Welcome" blog; please retain this source attribution: http://friendsforever.blog.51cto.com/3916357/1612048
Data Mining Notes