p-stable Distributions in LSH


1:Cauchy distribution

Probability density function

The Cauchy distribution has the probability density function

$$f(x; x_0, \gamma) = \frac{1}{\pi\gamma \left[ 1 + \left( \frac{x - x_0}{\gamma} \right)^2 \right]} = \frac{1}{\pi} \left[ \frac{\gamma}{(x - x_0)^2 + \gamma^2} \right],$$

where x0 is the location parameter, specifying the location of the peak of the distribution, and γ is the scale parameter, which specifies the half-width at half-maximum (HWHM). γ is also equal to half the interquartile range and is sometimes called the probable error. In 1827 Cauchy himself used such a density function, with an infinitesimal scale parameter, to define a Dirac delta function.

[Figure: probability density function of the Cauchy distribution. The purple curve is the standard Cauchy distribution.]

 

 

The special case when x0 = 0 and γ = 1 is called the standard Cauchy distribution, with the probability density function

$$f(x; 0, 1) = \frac{1}{\pi (1 + x^2)}.$$

Cumulative distribution function

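The cumulative distribution function (cdf) is:

$$F(x; x_0, \gamma) = \frac{1}{\pi} \arctan\left( \frac{x - x_0}{\gamma} \right) + \frac{1}{2}$$

[Figure: cumulative distribution function of the Cauchy distribution.]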
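The properties stated in this section can be checked numerically. The following is a minimal sketch using the standard closed forms of the Cauchy density and cdf; the function names `cauchy_pdf` and `cauchy_cdf` and the sample parameter values are chosen here for illustration only.

```python
import math

def cauchy_pdf(x, x0=0.0, gamma=1.0):
    """Density of the Cauchy distribution with location x0 and scale gamma."""
    return gamma / (math.pi * ((x - x0) ** 2 + gamma ** 2))

def cauchy_cdf(x, x0=0.0, gamma=1.0):
    """Cumulative distribution function of the same distribution."""
    return math.atan((x - x0) / gamma) / math.pi + 0.5

x0, gamma = 2.0, 0.5  # illustrative values

# HWHM: one scale unit away from the peak, the density is half its maximum.
half_max_ratio = cauchy_pdf(x0 + gamma, x0, gamma) / cauchy_pdf(x0, x0, gamma)

# Half the interquartile range: [x0 - gamma, x0 + gamma] holds half the mass.
central_mass = cauchy_cdf(x0 + gamma, x0, gamma) - cauchy_cdf(x0 - gamma, x0, gamma)
```

Both `half_max_ratio` and `central_mass` should come out as 0.5 up to floating-point error, which matches the description of γ as the HWHM and as half the interquartile range.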

 

2:p-stable distributions

 

 

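A distribution D over the real numbers is called p-stable if there exists p ≥ 0 such that, for any n real numbers v_1, ..., v_n and i.i.d. random variables X_1, ..., X_n with distribution D, the sum Σ_i v_i X_i has the same distribution as (Σ_i |v_i|^p)^(1/p) X, where X is a random variable with distribution D. The standard Cauchy distribution from section 1 is 1-stable.

From this definition it is easy to show that the standard normal distribution is 2-stable: a sum of independent normal variables v_i X_i is itself normal with mean 0 and variance Σ_i v_i^2, so it has the same distribution as (Σ_i v_i^2)^(1/2) X.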
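The stability property can also be checked empirically. The sketch below is an illustration under stated assumptions, not part of the original derivation: it draws many random projections Σ_i v_i X_i and compares their spread with the corresponding norm of v. For standard normal X_i the standard deviation should approach ‖v‖_2 (2-stable); for standard Cauchy X_i the median of the absolute value should approach ‖v‖_1 (1-stable). The helper names, the test vector, and the sample size are arbitrary choices.

```python
import math
import random
import statistics

def projections(v, draw, n, rng):
    """n samples of sum_i v[i] * X_i, with the X_i drawn i.i.d. by draw(rng)."""
    return [sum(vi * draw(rng) for vi in v) for _ in range(n)]

def gaussian(rng):
    return rng.gauss(0.0, 1.0)

def cauchy(rng):
    # Inverse-CDF sampling of the standard Cauchy distribution.
    return math.tan(math.pi * (rng.random() - 0.5))

rng = random.Random(0)
v = [3.0, -4.0, 12.0]  # ||v||_2 = 13, ||v||_1 = 19

# 2-stability: the projection behaves like ||v||_2 * N(0, 1).
gauss_scale = statistics.pstdev(projections(v, gaussian, 20000, rng))

# 1-stability: the projection behaves like ||v||_1 * Cauchy(0, 1),
# and the median of |Cauchy(0, gamma)| equals gamma.
cauchy_scale = statistics.median(abs(x) for x in projections(v, cauchy, 20000, rng))
```

With this many samples, `gauss_scale` should land near 13 and `cauchy_scale` near 19, up to sampling noise.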

 

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

 

Questions:

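1:How to precompute the value of k

Randomly sample a small number of points from the dataset and run the algorithm on them. Repeat this while increasing k step by step, and keep the value of k that gives the shortest running time.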
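The tuning loop described above could be sketched as follows. This is a minimal illustration, not the author's or any library's implementation: `choose_k`, `time_once`, and the `run(k)` callback (assumed to build the index on the sample with that k and answer the sample queries) are hypothetical names.

```python
import time

def time_once(run, k):
    """Wall-clock seconds taken by one call of run(k)."""
    start = time.perf_counter()
    run(k)
    return time.perf_counter() - start

def choose_k(run, k_max, measure=time_once):
    """Try k = 1..k_max and return the k with the smallest measured time."""
    best_k, best_time = None, float("inf")
    for k in range(1, k_max + 1):
        elapsed = measure(run, k)
        if elapsed < best_time:
            best_k, best_time = k, elapsed
    return best_k
```

Passing the timing function as a parameter keeps the search itself deterministic and easy to test. In practice each measurement would probably be averaged over several runs, since single timings on a small sample are noisy.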

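2:How points are placed into buckets

Each point has L k-dimensional vectors. All elements of these vectors are of the same kind; they differ only in which hash function produced them. How the points are actually distributed over the buckets depends on the function h1.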
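To make the bucket placement concrete, here is a minimal sketch assuming the usual p-stable hash form h(v) = floor((a·v + b) / w), with the entries of a drawn from the standard normal distribution and b uniform in [0, w]. The bucket width `w`, the helper names, and in particular the form of `h1` (a random integer linear combination of the k-vector, reduced modulo a prime and then modulo the table size) are assumptions for illustration; the source does not say how h1 is defined.

```python
import math
import random

PRIME = 2**32 - 5  # a large prime; this particular choice is an assumption

def make_functions(dim, k, w, rng):
    """k hash functions for one table: a ~ N(0, 1)^dim, b uniform in [0, w]."""
    return [([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, w))
            for _ in range(k)]

def k_vector(point, functions, w):
    """The k-dimensional integer vector: floor((a.v + b) / w) per function."""
    return tuple(
        math.floor((sum(ai * vi for ai, vi in zip(a, point)) + b) / w)
        for a, b in functions
    )

def h1(vec, coeffs, table_size):
    """Hypothetical h1: maps a k-vector to a bucket index in [0, table_size)."""
    return (sum(c * x for c, x in zip(coeffs, vec)) % PRIME) % table_size

rng = random.Random(1)
dim, k, L, w, table_size = 4, 3, 5, 4.0, 101
tables = [make_functions(dim, k, w, rng) for _ in range(L)]
coeffs = [rng.randrange(1, PRIME) for _ in range(k)]

point = [0.5, -1.2, 3.0, 0.1]
vectors = [k_vector(point, fns, w) for fns in tables]        # L k-dimensional vectors
buckets = [h1(vec, coeffs, table_size) for vec in vectors]   # one bucket per table
```

Each of the L tables uses its own k hash functions, so the same point gets L different k-vectors and is stored once per table, in the bucket that `h1` selects.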

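3:How accuracy is guaranteed

The manual explains this in detail. The author chose the standard normal distribution precisely because it is 2-stable, which gives a mathematical guarantee on accuracy.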
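One way to see the guarantee: because the standard normal distribution is 2-stable, the difference a·v1 − a·v2 of two projections is distributed as ‖v1 − v2‖_2 times a standard normal variable, so the chance that two points share a hash value depends only on their Euclidean distance. The sketch below evaluates that collision probability in closed form for a single hash of the form floor((a·v + b) / w). The hash form, the bucket width `w`, and the function name are assumptions not stated in the source.

```python
import math

def collision_probability(c, w):
    """P[h(v1) == h(v2)] when ||v1 - v2||_2 = c > 0 and the bucket width is w > 0,
    for h(v) = floor((a.v + b) / w) with a ~ N(0, I) and b uniform in [0, w)."""
    s = w / c
    return (math.erf(s / math.sqrt(2.0))
            - 2.0 / (math.sqrt(2.0 * math.pi) * s) * (1.0 - math.exp(-s * s / 2.0)))

# Closer pairs collide more often than distant ones.
probs = [collision_probability(c, w=4.0) for c in (0.5, 1.0, 2.0, 4.0, 8.0)]
```

The probability decreases monotonically as the distance c grows, which is the property an LSH family needs. With k concatenated hashes per table and L tables, two points at distance c land in the same bucket in at least one table with probability 1 − (1 − p^k)^L, where p is the value returned above.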
