Is there a mathematical formula for generating random numbers with a controllable distribution?

Source: Internet
Author: User
For example, suppose a random value rand is generated in the range 1-100. Small values should come up with very high probability and large values with very low probability. I am hoping for a specific formula; the underlying random value can simply be written as rand. Code would be best, for example in PHP or Python. Thank you!
Reply: Given a cumulative distribution function (CDF), as long as its inverse function can be obtained, uniformly distributed random numbers can be mapped onto the target distribution. This is called inverse transform sampling.
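The inverse transform idea in this reply can be sketched in Python as follows. This is only a minimal illustration, not a definitive answer: the choice of a truncated exponential distribution, the function name skewed_rand, and the rate parameter are all assumptions made for the example. A larger rate makes small values even more dominant.

```python
import math
import random

def skewed_rand(low=1, high=100, rate=0.05, rng=random):
    """Return an integer in [low, high]; small values are more likely.

    Hypothetical helper: the truncated exponential shape and the default
    rate are assumptions, not something specified in the thread.
    """
    width = high - low + 1
    u = rng.random()  # uniform in [0, 1)
    # Inverse CDF of an exponential truncated to [0, width):
    #   F(x) = (1 - exp(-rate * x)) / (1 - exp(-rate * width))
    x = -math.log(1.0 - u * (1.0 - math.exp(-rate * width))) / rate
    return min(high, low + int(x))  # clamp guards against float rounding

if __name__ == "__main__":
    print([skewed_rand() for _ in range(10)])
```

Passing a seeded random.Random instance as rng makes the draws reproducible.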

numpy.random already implements sampling from many predefined distributions, for example:

beta(a, b[, size])  Draw samples from a Beta distribution.
binomial(n, p[, size])  Draw samples from a binomial distribution.
chisquare(df[, size])  Draw samples from a chi-square distribution.
dirichlet(alpha[, size])  Draw samples from the Dirichlet distribution.
exponential([scale, size])  Draw samples from an exponential distribution.
f(dfnum, dfden[, size])  Draw samples from an F distribution.
gamma(shape[, scale, size])  Draw samples from a Gamma distribution.
geometric(p[, size])  Draw samples from the geometric distribution.
gumbel([loc, scale, size])  Draw samples from a Gumbel distribution.

Suppose the original poster wants an integer from 1 to 100, where the probability of 100 is twice that of 99, the probability of 99 is twice that of 98, and so on.
Then, taking the weight of 1 as the unit,
SUM = 1 + 2 + 4 + ... + 2^99 = 2^100 - 1
Then draw r = rand(1, SUM). If r falls between 2^29 and 2^30 - 1, the result is 30. The simple formula is floor(log(r) / log(2)) + 1. To favor small values instead, as the question asks, return 101 minus that result.
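A minimal Python sketch of this doubling scheme, assuming the weights 1, 2, 4, ..., 2^99 belong to the values 1 to 100. The function names are hypothetical. The logarithm is computed here as the bit length of the drawn integer, which equals floor(log2(r)) + 1 and avoids floating-point rounding on numbers as large as 2^100. The mirrored variant is an assumption added so that small values dominate, as the question asks.

```python
import random

def doubling_rand(n=100, rng=random):
    """Return k in [1, n], where P(k) is proportional to 2**(k - 1)."""
    total = 2**n - 1           # 1 + 2 + 4 + ... + 2**(n - 1)
    r = rng.randint(1, total)  # uniform integer in [1, total]
    # r lies in [2**(k - 1), 2**k - 1] exactly when it has k binary digits,
    # so bit_length() equals floor(log2(r)) + 1 with no float error.
    return r.bit_length()

def halving_rand(n=100, rng=random):
    """Mirror image: P(k) halves with each step up, so small k dominate."""
    return n + 1 - doubling_rand(n, rng)

if __name__ == "__main__":
    print([halving_rand() for _ in range(10)])
```

Because each value is twice as likely as its neighbor, almost all draws land within a handful of values at one end of the range; a gentler ratio than 2 would need a different weight table.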


I ran into this problem in a design of my own, so I would like to share my approach.

Assume that the probability of each value is already known. Accumulate the probabilities and lay the running totals out on a single number axis. Take 10 values to illustrate:

[Figure: a number axis from 0 to 1 divided into ten intervals, one per value]

[i] denotes the i-th value. On the axis, the interval (0-0.1) represents the probability of [1], the interval (0.5-0.75) represents the probability of [7], and (0-0.75) represents the sum of the probabilities of [1] through [7].

Next, use a function such as rand() in MATLAB to generate a uniformly distributed random number between 0 and 1. Suppose the number produced is 0.65. It falls in the interval (0.5-0.75), which represents the probability of [7], so [7] is chosen as the result of this probability-based selection.

MATLAB code:

% max_column is the number of values; prob(j) is the probability of value j.
% prob_array holds the cumulative sums, with prob_array(1) = 0, so value j
% owns the interval (prob_array(j), prob_array(j + 1)].
% largest_cumulative_prob is the largest cumulative probability.

prob_array = zeros(1, max_column + 1);
for j = 1:max_column
    prob_array(j + 1) = prob_array(j) + prob(j);
end

largest_cumulative_prob = prob_array(max_column + 1);
% prob_array and largest_cumulative_prob are not normalized in this code;
% scaling choice by largest_cumulative_prob has the same effect.

choice = rand() * largest_cumulative_prob;

% Binary search for the interval that contains choice.
low = 1;
high = max_column + 1;
while (high > (low + 1))
    middle = floor((high + low) / 2);
    if (choice > prob_array(middle))
        low = middle;
    else
        high = middle;
    end
end
% Finally, take low as the selected value.
% I will not explain the search in detail; the idea is the same as described above.
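For readers who prefer Python, the same cumulative-sum lookup can be sketched with the standard library. The function name pick_index and the optional u argument are assumptions made for illustration. The ten probabilities are hypothetical, chosen only to agree with the two intervals mentioned above ([1] on 0-0.1 and [7] on 0.5-0.75).

```python
import bisect
import itertools
import random

def pick_index(probs, u=None):
    """Return a 1-based index i with probability proportional to probs[i - 1].

    u is an optional uniform draw in [0, 1); it is exposed only so the
    lookup can be checked with a fixed number.
    """
    if u is None:
        u = random.random()
    cumulative = list(itertools.accumulate(probs))  # running sums
    choice = u * cumulative[-1]                     # scale instead of normalizing
    # Binary search for the first running sum that exceeds choice.
    i = bisect.bisect_right(cumulative, choice)
    return min(i, len(probs) - 1) + 1               # clamp guards against rounding

# Hypothetical probabilities for the values [1] to [10].
probs = [0.10, 0.08, 0.08, 0.08, 0.08, 0.08, 0.25, 0.10, 0.10, 0.05]

if __name__ == "__main__":
    print(pick_index(probs, 0.65))  # the draw from the walkthrough above
    print([pick_index(probs) for _ in range(10)])
```

With these probabilities, pick_index(probs, 0.65) selects [7], matching the walkthrough. When many draws are needed from the same table, the cumulative list can be computed once and reused.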

This method works for discrete values. I still do not know how to handle continuous values; my guess is that the probability density could be used. I hope someone can answer this!
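One possible way to carry the idea over to continuous values, offered as a sketch rather than a definitive answer: integrate the probability density numerically to get a tabulated CDF, then invert that table by interpolation. When the CDF has a closed-form inverse, that inverse can be used directly instead. The names make_sampler and steps, and the example density 1/x, are assumptions made for illustration.

```python
import bisect
import random

def make_sampler(pdf, low, high, steps=1000):
    """Build a sampler for a continuous density on [low, high].

    pdf must be non-negative and positive somewhere on the interval;
    it does not need to be normalized.
    """
    dx = (high - low) / steps
    xs = [low + i * dx for i in range(steps + 1)]
    # Trapezoid rule: cdf[i] approximates the integral of pdf from low to xs[i].
    cdf = [0.0]
    for i in range(steps):
        cdf.append(cdf[-1] + 0.5 * (pdf(xs[i]) + pdf(xs[i + 1])) * dx)
    total = cdf[-1]

    def sample(u=None):
        if u is None:
            u = random.random()
        target = u * total
        # Locate the grid cell [xs[i], xs[i + 1]] whose CDF range holds target.
        i = min(bisect.bisect_right(cdf, target), steps) - 1
        span = cdf[i + 1] - cdf[i]
        frac = (target - cdf[i]) / span if span > 0 else 0.0
        return xs[i] + frac * dx  # linear interpolation inside the cell

    return sample

if __name__ == "__main__":
    # Density proportional to 1/x on [1, 100]: small values are favored.
    sample = make_sampler(lambda x: 1.0 / x, 1.0, 100.0)
    print([round(sample(), 2) for _ in range(10)])
```

For the 1/x density the exact inverse CDF is 100^u, so a draw of u = 0.5 should come out at roughly 10, which gives a quick sanity check on the table. Accuracy depends on steps and on how fast the density changes.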
If the amount of computation is not a concern:
I have read the source code of ndnSIM, and the way it generates a Zipf distribution is interesting; I hope it is a useful reference for the original poster. Given the cumulative distribution function and a range, for example 1-50, compute the CDF value at each of the 50 points and store the values in an array. Then generate a uniformly distributed random number and compare it with the 50 CDF values in order, returning the index of the first value that exceeds it. The returned index can of course be remapped as required.

Other suggested expressions: 100/rand(1,100) or rand(1, rand(1, rand(1,100))).
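The precomputed-CDF approach described above can be sketched in Python as follows. This is not ndnSIM's actual code, only an illustration of the same idea; the function name make_zipf_sampler, the exponent parameter s, and its default of 1.0 are assumptions.

```python
import random

def make_zipf_sampler(n=50, s=1.0):
    """Precompute the Zipf CDF for ranks 1..n and return a sampler."""
    weights = [1.0 / k**s for k in range(1, n + 1)]
    total = sum(weights)
    cdf = []
    running = 0.0
    for w in weights:
        running += w / total
        cdf.append(running)

    def sample(u=None):
        if u is None:
            u = random.random()
        # Sequential comparison against the stored CDF values: O(n) per draw.
        for rank, threshold in enumerate(cdf, start=1):
            if u < threshold:
                return rank
        return n  # rounding may leave cdf[-1] slightly below 1

    return sample

if __name__ == "__main__":
    sample = make_zipf_sampler(50)
    print([sample() for _ in range(10)])
```

The linear scan is simple and fast enough for 50 entries; for much larger ranges a binary search over the same array would reduce the cost per draw.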
