Basic Principles of LSH

Source: Internet
Author: User

1: Locality-sensitive hashing

Locality-sensitive hashing (LSH) constructs a family of hash functions {g | g: R^d -> U}, where d is the dimension of the points, such that for any two points p and q:

-- If |p - q| <= R, then Pr[g(p) = g(q)] must be high

-- If |p - q| > cR, then Pr[g(p) = g(q)] must be very low


2: Principle of projection-based LSH

For example:

 

Consider four points A(xA, yA), B(xB, yB), C(xC, yC), and D(xD, yD). If the hash function is h(A) = xA (the projection onto the X axis), then A, B, C, and D are mapped to xA, xB, xC, and xD. Points that are close in space also have close projections on the X axis, so this property can be used to query neighboring points. This is the basic principle of projection-based LSH.

However, this method only guarantees that nearby points have close one-dimensional hash values. The converse does not hold: points whose hash values are close are not necessarily close in space, for example A and D.
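The limitation above can be checked with a minimal Python sketch. The coordinates are hypothetical (the source gives no numeric values) and are chosen so that D lies almost directly above A; the function names h and distance are illustrative only.

```python
import math

# Hypothetical coordinates, chosen only for illustration.
points = {"A": (1.2, 1.3), "B": (1.3, 1.4), "C": (5.0, 1.5), "D": (1.3, 9.0)}

def h(point):
    """Hash a 2-D point by its projection onto the X axis."""
    return point[0]

def distance(p, q):
    """Euclidean distance between two 2-D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

# Compare A with every other point: gap between hash values vs. true distance.
for name in ("B", "C", "D"):
    a, other = points["A"], points[name]
    print(name,
          "hash gap:", round(abs(h(a) - h(other)), 3),
          "true distance:", round(distance(a, other), 3))
```

With these values, B and D are about equally close to A after hashing, yet D is far from A in the plane, which is exactly the false-neighbor case described above.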

3: Definition of (P1, P2, R, cR)-sensitive LSH

A family H of functions h: R^d → U is called (P1, P2, R, cR)-sensitive (with c > 1 and P1 > P2) if for any p, q:
- If |p - q| < R, then Pr[h(p) = h(q)] > P1
- If |p - q| > cR, then Pr[h(p) = h(q)] < P2

The hash function h(A) = xA (projection onto the X axis) constructed in part 2 cannot meet this requirement.

The solution is:

The basic idea is also very simple: take several lines in the space. For points that are close to each other, the projections are close no matter which line they are mapped onto. Points that are not neighbors may have close projections in a certain direction, but their projections will be far apart in other directions. So the projection results in every direction are used together (these results can then be hashed).
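A minimal Python sketch of this idea follows. The three directions, the bucket width, and the points are hypothetical choices for illustration; they are not taken from the source or from any particular library. Each point is projected onto every line, each projection is cut into buckets, and the bucket indices are combined into one tuple that can serve as a hash key.

```python
import math

# Hypothetical points: B is near A, D is far from A but has a similar x value.
A, B, D = (1.2, 1.3), (1.3, 1.4), (1.3, 9.0)

# Hypothetical unit direction vectors for three lines through the origin.
s = 1.0 / math.sqrt(2.0)
directions = [(1.0, 0.0), (0.0, 1.0), (s, s)]

def project(point, direction):
    """Scalar projection of a 2-D point onto a unit direction vector."""
    return point[0] * direction[0] + point[1] * direction[1]

def signature(point, width=1.0):
    """Bucket index of the projection on each line, combined into a tuple."""
    return tuple(math.floor(project(point, d) / width) for d in directions)

print("A:", signature(A))
print("B:", signature(B))
print("D:", signature(D))
```

Here A and B receive the same signature, while D agrees with A only in the component for the X-axis line. Fixed bucket boundaries can still split two nearby points, which is one reason practical schemes add a random offset before bucketing.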


Now the questions are: how are the lines in space obtained, and how many lines are needed?

In E2LSH, the lines are obtained by sampling from the standard normal distribution. Why? I think it is because this choice makes it easy to prove theoretically that the definition of (P1, P2, R, cR)-sensitive LSH is satisfied. As for the proof itself, you still need to read the author's paper and the relevant documents.
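The sketch below illustrates sampling the line direction from the standard normal distribution. It assumes the commonly described form h(v) = floor((a·v + b) / w), where each coordinate of a is drawn from N(0, 1) and b is drawn uniformly from [0, w]. The bucket width w, the offset b, and the helper names make_hash and collision_rate are assumptions made for this illustration; the source does not state them. The code estimates the collision probability empirically and proves nothing.

```python
import math
import random

def make_hash(dim, w, rng):
    """Build one hash function h(v) = floor((a.v + b) / w).

    a: direction with i.i.d. standard normal coordinates (the random line).
    b: random offset in [0, w]; w: bucket width (both assumed here).
    """
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    b = rng.uniform(0.0, w)

    def h(v):
        dot = sum(ai * vi for ai, vi in zip(a, v))
        return math.floor((dot + b) / w)

    return h

def collision_rate(p, q, w=4.0, trials=2000, seed=0):
    """Fraction of randomly drawn hash functions for which h(p) == h(q)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        h = make_hash(len(p), w, rng)
        if h(p) == h(q):
            hits += 1
    return hits / trials

origin = (0.0, 0.0)
near = collision_rate(origin, (0.5, 0.0))   # distance 0.5
far = collision_rate(origin, (10.0, 0.0))   # distance 10
print("near pair collision rate:", near)
print("far pair collision rate:", far)
```

The near pair should collide far more often than the far pair, which is the behavior that the P1 and P2 bounds in the definition describe.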

 
