1: Locality-sensitive hashing
Locality-sensitive hashing (LSH) constructs a family of hash functions {g | g: R^D -> U}, where D is the dimension of the points, such that for any two points p and q:
-- If |p-q| <= R, then Pr[g(p) = g(q)] must be high
-- If |p-q| > cR, then Pr[g(p) = g(q)] must be very low
2: Principle of projection-based LSH
For example:
Take four points A(xa, ya), B(xb, yb), C(xc, yc), D(xd, yd). If the hash function is h(A(xa, ya)) = xa (the projection onto the x-axis), then A, B, C and D are mapped to xa, xb, xc and xd on the x-axis. Points that are close in space also have close projections on the x-axis, so we can use this property to query neighboring points. This is the basic principle of projection-based LSH.
However, the preceding method only guarantees that points that are close in space have close one-dimensional hash values. It does not guarantee the converse: points whose hash values are close are not necessarily close in space, for example A and D.
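The weakness described above can be checked with a few lines of Python. This is only a minimal sketch: the coordinates of A, B, C and D are hypothetical values chosen for illustration (the original figure is not available), and the helper names h, distance and hash_gap are not from any library.

```python
import math

# Hypothetical coordinates, chosen only to illustrate the argument.
points = {
    "A": (1.0, 1.0),
    "B": (1.2, 1.1),   # close to A in space
    "C": (5.0, 1.0),   # far from A along the x-axis
    "D": (1.1, 9.0),   # far from A, but almost the same x-coordinate
}

def h(point):
    """Hash a 2-D point by projecting it onto the x-axis."""
    return point[0]

def distance(p, q):
    """Euclidean distance between two points."""
    return math.dist(p, q)

def hash_gap(p, q):
    """Absolute difference between the hash values of two points."""
    return abs(h(p) - h(q))

# Compare A with every other point: true distance versus hash difference.
for name in ("B", "C", "D"):
    d = distance(points["A"], points[name])
    g = hash_gap(points["A"], points[name])
    print(f"A-{name}: distance={d:.3f}, hash gap={g:.3f}")
```

With these assumed coordinates, D is far from A in space, yet its hash value is even closer to A's than B's is, which is exactly the failure case mentioned above.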
3: Definition of (P1, P2, R, cR)-sensitive LSH
A family H of functions h: R^D → U is called (P1, P2, R, cR)-sensitive if, for any p, q:
- If |p-q| < R, then Pr[h(p) = h(q)] > P1
- If |p-q| > cR, then Pr[h(p) = h(q)] < P2
For the family to be useful, we need c > 1 and P1 > P2.
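The function h(A(xa, ya)) = xa (projection onto the x-axis) constructed above cannot meet this requirement: two points that are far apart but have almost the same x-coordinate still receive almost the same hash value.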
The solution is:
The basic idea is also very simple: take several lines in space. Points that are close to each other have close projections no matter which line they are projected onto. Points that are not neighbors may have close projections in a certain direction, but they are likely to be far apart in the other directions. So we use the projection results in all directions together (these results can then be hashed).
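The idea of combining several projection directions can be sketched as follows. The evenly spaced directions, the point coordinates and the function names (directions, project, signature, max_gap) are all assumptions made for this illustration, not part of any particular LSH library.

```python
import math

def directions(k):
    """Return k unit vectors at evenly spaced angles in [0, pi).

    Evenly spaced angles are an illustrative choice only.
    """
    return [(math.cos(math.pi * i / k), math.sin(math.pi * i / k))
            for i in range(k)]

def project(point, direction):
    """Scalar projection of a 2-D point onto a unit direction."""
    return point[0] * direction[0] + point[1] * direction[1]

def signature(point, dirs):
    """Projections of a point onto every direction, as a tuple."""
    return tuple(project(point, d) for d in dirs)

def max_gap(p, q, dirs):
    """Largest difference between the two points' projections."""
    return max(abs(a - b)
               for a, b in zip(signature(p, dirs), signature(q, dirs)))

# Hypothetical points: B is near A; D is far from A but shares its x-range.
A, B, D = (1.0, 1.0), (1.2, 1.1), (1.1, 9.0)
dirs = directions(4)

print("A-B max gap:", round(max_gap(A, B, dirs), 3))
print("A-D max gap:", round(max_gap(A, D, dirs), 3))
```

Because every direction is a unit vector, the projection gap of two points can never exceed their true distance, so near points stay close on every line. The far pair A and D looks close on the x-axis (the first direction) but is separated clearly by at least one of the other directions.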
Now the question is: how do we obtain these lines, and how many lines do we need?
In E2LSH, the direction of each line is drawn from the standard normal distribution. Why? I think it is because this choice makes it easy to prove theoretically that the resulting family satisfies the definition of (P1, P2, R, cR)-sensitive LSH. As for the proof itself, you still need to read the author's thesis and the related documents.
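As a rough illustration of drawing the lines from the standard normal distribution, the sketch below uses a hash of the form h(v) = floor((a·v + b) / w), where each component of a is a standard normal sample and b is uniform in [0, w]. This form is an assumption based on the commonly described Gaussian projection scheme, not a copy of the E2LSH source code; the bucket width w = 4.0, the test points and the names make_hash and collision_rate are hypothetical choices for this example.

```python
import math
import random

def make_hash(dim, w, rng):
    """Build one random hash function h(v) = floor((a . v + b) / w).

    a has independent standard normal components (the random "line"),
    b is a random offset in [0, w], and w is the bucket width.
    """
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    b = rng.uniform(0.0, w)

    def h(v):
        dot = sum(ai * vi for ai, vi in zip(a, v))
        return math.floor((dot + b) / w)

    return h

def collision_rate(p, q, dim, w, trials, seed=0):
    """Estimate Pr[h(p) = h(q)] over many independently drawn hashes."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        h = make_hash(dim, w, rng)
        if h(p) == h(q):
            hits += 1
    return hits / trials

# Hypothetical near and far pairs of 2-D points.
near = collision_rate((1.0, 1.0), (1.2, 1.1), dim=2, w=4.0, trials=2000)
far = collision_rate((1.0, 1.0), (1.1, 9.0), dim=2, w=4.0, trials=2000)
print("near pair collision rate:", near)
print("far pair collision rate:", far)
```

Under these assumptions the near pair collides far more often than the far pair, which is the behavior the (P1, P2, R, cR)-sensitive definition asks for. The number of lines and the bucket width are tuning parameters that this sketch does not try to optimize.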