Quickly generate a random string

Last Update: 2018-09-02 Source: Internet

Author: User


How do you generate a random string efficiently? It may look like a simple question, but icza, working through a series of examples, optimizes the solution step by step into an increasingly efficient algorithm. The material comes from a StackOverflow question, "How to generate a random string of a fixed length in Go?", where many people contributed good solutions and feedback, icza's answer in particular. This article is translated and compiled from that question and its answers.

The problem is this:

I want a Go implementation that generates a fixed-length random string (uppercase and lowercase letters only, no digits). What is the fastest and simplest way to do it?

The optimizations build on a solution proposed by Paul Hankin (the first one shown here), which is the most basic and the easiest to understand. icza's optimizations all start from it.

The most common scenario

The most common solution is to pick each character at random, so the string as a whole is random too. Its advantage is that you control exactly which characters may appear.

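    import (
        "math/rand"
        "time"
    )

    // Seed the global random source once at startup.
    func init() {
        rand.Seed(time.Now().UnixNano())
    }

    // letterRunes is the set of characters to choose from.
    var letterRunes = []rune("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")

    // RandStringRunes returns a random string of n letters.
    func RandStringRunes(n int) string {
        b := make([]rune, n)
        for i := range b {
            b[i] = letterRunes[rand.Intn(len(letterRunes))]
        }
        return string(b)
    }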

Replace rune with byte

If the requirement is to use only English letters (uppercase and lowercase), we can replace rune with byte, because in UTF-8 each English letter is encoded as exactly one byte.

    const letterBytes = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

    // RandStringBytes builds the string from bytes instead of runes.
    func RandStringBytes(n int) string {
        b := make([]byte, n)
        for i := range b {
            b[i] = letterBytes[rand.Intn(len(letterBytes))]
        }
        return string(b)
    }
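The one-byte-per-letter claim that justifies this switch can be checked directly. The following is a minimal sketch, not part of the original answer; the helper name isSingleByteAlphabet is made up for illustration. It compares the byte length of a string with its rune count, which are equal only when every character occupies a single byte.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// letterBytes mirrors the 52-letter alphabet used in the article.
const letterBytes = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

// isSingleByteAlphabet (hypothetical helper) reports whether every character
// of s is encoded as exactly one byte in UTF-8. Only then is indexing the
// string byte by byte equivalent to indexing it rune by rune.
func isSingleByteAlphabet(s string) bool {
	return len(s) == utf8.RuneCountInString(s)
}

func main() {
	fmt.Println(len(letterBytes), utf8.RuneCountInString(letterBytes))
	fmt.Println(isSingleByteAlphabet(letterBytes))  // plain English letters
	fmt.Println(isSingleByteAlphabet("h\u00e9llo")) // contains a two-byte character
}
```

If the alphabet ever gains characters outside ASCII, this check fails and the rune-based version is the one to use.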

Use remainder

In the previous step we used rand.Intn to select a random character. rand.Intn delegates to Rand.Intn, which in turn delegates to Rand.Int31n. That chain is slower than calling rand.Int63 directly, which produces a random integer with 63 random bits.

We can call rand.Int63 and take the remainder of dividing the result by len(letterBytes) to select a character:

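    // RandStringBytesRmndr picks each letter as rand.Int63() modulo the alphabet size.
    func RandStringBytesRmndr(n int) string {
        b := make([]byte, n)
        for i := range b {
            b[i] = letterBytes[rand.Int63()%int64(len(letterBytes))]
        }
        return string(b)
    }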

This implementation is noticeably faster than the previous one, but it has a small flaw: the characters are not chosen with exactly equal probability. The distortion is extremely small, though, because the number of characters (52) is far smaller than 1<<63 - 1. The difference exists only in theory and can be ignored in practice.
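To make the size of that bias concrete, here is a small sketch (the function name remainderCounts is invented for this illustration). For a generator with a given number of random bits, it counts how many raw values map to each index under the remainder operation. A toy case with 3 bits and 3 letters shows the effect clearly; the 63-bit, 52-letter case shows why it is negligible.

```go
package main

import "fmt"

// remainderCounts (hypothetical helper) returns, for each index in [0, n),
// how many of the 2^bits possible random values map to it under "value % n".
func remainderCounts(bits uint, n int) []uint64 {
	total := uint64(1) << bits
	counts := make([]uint64, n)
	for i := range counts {
		counts[i] = total / uint64(n)
		if uint64(i) < total%uint64(n) {
			counts[i]++ // the lowest indices absorb the leftover values
		}
	}
	return counts
}

func main() {
	// Toy case: 8 possible values spread over 3 letters is visibly uneven.
	fmt.Println(remainderCounts(3, 3))

	// The article's case: 2^63 values over 52 letters.
	c := remainderCounts(63, 52)
	fmt.Println("extra values for the most favored letter:", c[0]-c[51])
	fmt.Println("number of favored letters:", (uint64(1)<<63)%52)
}
```

In the toy case the counts are 3, 3 and 2. In the 63-bit case a favored letter receives just one extra value out of roughly 1.77e17 per letter, which is why the bias is only theoretical.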

Mask

The previous solution shows that we do not need many bits to pick a letter: the lowest few bits of a random integer are enough. For the 52 English letters (both cases), 6 bits suffice (52 = 110100b), so we use only the lowest 6 bits of the value returned by rand.Int63. To keep the distribution uniform, we accept the 6-bit number only if it falls in the range 0..len(letterBytes)-1; otherwise we discard it and draw again. The lowest 6 bits of an integer can be obtained with a mask.

    const letterBytes = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
    const (
        letterIdxBits = 6                    // 6 bits to represent a letter index
        letterIdxMask = 1<<letterIdxBits - 1 // All 1-bits, as many as letterIdxBits
    )

    func RandStringBytesMask(n int) string {
        b := make([]byte, n)
        for i := 0; i < n; {
            // Keep the lowest 6 bits; retry if they do not name a letter.
            if idx := int(rand.Int63() & letterIdxMask); idx < len(letterBytes) {
                b[i] = letterBytes[idx]
                i++
            }
        }
        return string(b)
    }
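The cost of the discard-and-redraw rule is easy to quantify. The sketch below is an illustration only (lowBits and acceptedCount are made-up names). It applies the same 6-bit mask to every possible candidate and counts how many are accepted as letter indices.

```go
package main

import "fmt"

const (
	letterCount   = 52                   // size of the article's alphabet
	letterIdxBits = 6                    // 6 bits to represent a letter index
	letterIdxMask = 1<<letterIdxBits - 1 // binary 111111
)

// lowBits (hypothetical helper) keeps only the lowest letterIdxBits bits of v.
func lowBits(v int64) int {
	return int(v & letterIdxMask)
}

// acceptedCount (hypothetical helper) counts how many of the 64 possible
// 6-bit candidates are valid letter indices.
func acceptedCount() int {
	n := 0
	for v := int64(0); v <= letterIdxMask; v++ {
		if lowBits(v) < letterCount {
			n++
		}
	}
	return n
}

func main() {
	fmt.Println(lowBits(0x1234)) // the high bits are dropped by the mask
	acc := acceptedCount()
	fmt.Printf("%d of %d candidates accepted, about %.2f draws per letter\n",
		acc, letterIdxMask+1, float64(letterIdxMask+1)/float64(acc))
}
```

Since 52 of the 64 candidates are accepted and each accepted candidate is equally likely, every letter has the same probability, at an average cost of about 1.23 draws per letter.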

Mask-enhanced version

The weakness of the previous solution is waste: besides the candidates that get discarded and redrawn, it uses only 6 of the bits from each call. rand.Int63 generates 63 random bits, and if we split them into 6-bit pieces, a single call yields 10 random 6-bit numbers. This greatly reduces the waste.

    const letterBytes = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
    const (
        letterIdxBits = 6                    // 6 bits to represent a letter index
        letterIdxMask = 1<<letterIdxBits - 1 // All 1-bits, as many as letterIdxBits
        letterIdxMax  = 63 / letterIdxBits   // # of letter indices fitting in 63 bits
    )

    func RandStringBytesMaskImpr(n int) string {
        b := make([]byte, n)
        // A rand.Int63() generates 63 random bits, enough for letterIdxMax letters!
        for i, cache, remain := n-1, rand.Int63(), letterIdxMax; i >= 0; {
            if remain == 0 {
                cache, remain = rand.Int63(), letterIdxMax
            }
            if idx := int(cache & letterIdxMask); idx < len(letterBytes) {
                b[i] = letterBytes[idx]
                i--
            }
            cache >>= letterIdxBits
            remain--
        }
        return string(b)
    }
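The cache-and-shift step is the heart of this version, so it may help to see it in isolation. This is a sketch under the same constants as the article; splitIndices is a name invented here. It slices one 63-bit value into its 6-bit candidates, lowest bits first, exactly as the loop above consumes them.

```go
package main

import (
	"fmt"
	"math/rand"
)

const (
	letterIdxBits = 6                    // 6 bits to represent a letter index
	letterIdxMask = 1<<letterIdxBits - 1 // All 1-bits, as many as letterIdxBits
	letterIdxMax  = 63 / letterIdxBits   // 10 candidates per 63-bit value
)

// splitIndices (hypothetical helper) slices a 63-bit value into letterIdxMax
// 6-bit candidates, lowest bits first. Candidates of 52 or more would still
// have to be rejected by the caller.
func splitIndices(v int64) []int {
	out := make([]int, 0, letterIdxMax)
	for i := 0; i < letterIdxMax; i++ {
		out = append(out, int(v&letterIdxMask))
		v >>= letterIdxBits
	}
	return out
}

func main() {
	fmt.Println(splitIndices(129))          // 129 = 2*64 + 1
	fmt.Println(splitIndices(rand.Int63())) // ten candidates from one call
}
```

The first line prints 1 and 2 followed by eight zeros, since 129 holds the value 1 in its lowest 6 bits and 2 in the next 6. The top 3 of the 63 bits are left over and simply dropped.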

Source

The code above is already quite good. There is not much left to improve, and any further gain would cost a lot of complexity.

We can optimize from another angle instead: speed up the source of the random numbers.

The crypto/rand package provides a Read(b []byte) function that returns as many random bytes as we need, but because it is designed to be cryptographically secure, it is much slower.

So we go back to math/rand. rand.Rand uses a rand.Source to generate its random bits. rand.Source is an interface that provides an Int63() int64 method, which is exactly what we need.

We can therefore use a rand.Source directly instead of the global or shared random source.

    var src = rand.NewSource(time.Now().UnixNano())

    func RandStringBytesMaskImprSrc(n int) string {
        b := make([]byte, n)
        // A src.Int63() generates 63 random bits, enough for letterIdxMax characters!
        for i, cache, remain := n-1, src.Int63(), letterIdxMax; i >= 0; {
            if remain == 0 {
                cache, remain = src.Int63(), letterIdxMax
            }
            if idx := int(cache & letterIdxMask); idx < len(letterBytes) {
                b[i] = letterBytes[idx]
                i--
            }
            cache >>= letterIdxBits
            remain--
        }
        return string(b)
    }

The global (default) random source is safe for concurrent use, which it achieves with a lock, so using a rand.Source of our own directly is faster.

The following code, from the math/rand package, implements the global random source. You can see it calling Lock/Unlock.

    func Int63() int64 { return globalRand.Int63() }

    var globalRand = New(&lockedSource{src: NewSource(1).(Source64)})

    type lockedSource struct {
        lk  sync.Mutex
        src Source64
    }

    func (r *lockedSource) Int63() (n int64) {
        r.lk.Lock()
        n = r.src.Int63()
        r.lk.Unlock()
        return
    }

Go 1.7 added the rand.Read() function and the Rand.Read() method, so we can try using them to fetch a batch of random bytes in one call for higher performance.

One small question is how many random bytes to request. We could say: as many as there are output characters. That is an upper-bound estimate, because a letter index needs fewer than 8 bits. On the other hand, to keep the letters uniformly distributed we have to discard some of the random values, so we may need to fetch more. The figure of n * letterIdxBits / 8.0 bytes is therefore only an estimate.

Of course, the best way to verify all this is to write a benchmark. The benchmark code is listed in the next section; these are the results:

    BenchmarkRunes                   1000000              1703 ns/op
    BenchmarkBytes                   1000000              1328 ns/op
    BenchmarkBytesRmndr              1000000              1012 ns/op
    BenchmarkBytesMask               1000000              1214 ns/op
    BenchmarkBytesMaskImpr           5000000               395 ns/op
    BenchmarkBytesMaskImprSrc        5000000               303 ns/op

Benchmark Code

Benchmarkrandomstring_test.go

    package main

    import (
        "math/rand"
        "testing"
        "time"
    )

    // Implementations

    func init() {
        rand.Seed(time.Now().UnixNano())
    }

    var letterRunes = []rune("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")

    func RandStringRunes(n int) string {
        b := make([]rune, n)
        for i := range b {
            b[i] = letterRunes[rand.Intn(len(letterRunes))]
        }
        return string(b)
    }

    const letterBytes = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
    const (
        letterIdxBits = 6                    // 6 bits to represent a letter index
        letterIdxMask = 1<<letterIdxBits - 1 // All 1-bits, as many as letterIdxBits
        letterIdxMax  = 63 / letterIdxBits   // # of letter indices fitting in 63 bits
    )

    func RandStringBytes(n int) string {
        b := make([]byte, n)
        for i := range b {
            b[i] = letterBytes[rand.Intn(len(letterBytes))]
        }
        return string(b)
    }

    func RandStringBytesRmndr(n int) string {
        b := make([]byte, n)
        for i := range b {
            b[i] = letterBytes[rand.Int63()%int64(len(letterBytes))]
        }
        return string(b)
    }

    func RandStringBytesMask(n int) string {
        b := make([]byte, n)
        for i := 0; i < n; {
            if idx := int(rand.Int63() & letterIdxMask); idx < len(letterBytes) {
                b[i] = letterBytes[idx]
                i++
            }
        }
        return string(b)
    }

    func RandStringBytesMaskImpr(n int) string {
        b := make([]byte, n)
        // A rand.Int63() generates 63 random bits, enough for letterIdxMax letters!
        for i, cache, remain := n-1, rand.Int63(), letterIdxMax; i >= 0; {
            if remain == 0 {
                cache, remain = rand.Int63(), letterIdxMax
            }
            if idx := int(cache & letterIdxMask); idx < len(letterBytes) {
                b[i] = letterBytes[idx]
                i--
            }
            cache >>= letterIdxBits
            remain--
        }
        return string(b)
    }

    var src = rand.NewSource(time.Now().UnixNano())

    func RandStringBytesMaskImprSrc(n int) string {
        b := make([]byte, n)
        // A src.Int63() generates 63 random bits, enough for letterIdxMax characters!
        for i, cache, remain := n-1, src.Int63(), letterIdxMax; i >= 0; {
            if remain == 0 {
                cache, remain = src.Int63(), letterIdxMax
            }
            if idx := int(cache & letterIdxMask); idx < len(letterBytes) {
                b[i] = letterBytes[idx]
                i--
            }
            cache >>= letterIdxBits
            remain--
        }
        return string(b)
    }

    // Benchmark functions

    // Length of each generated string. The value was lost in this copy of the
    // article; 16 is an assumption.
    const n = 16

    func BenchmarkRunes(b *testing.B) {
        for i := 0; i < b.N; i++ {
            RandStringRunes(n)
        }
    }

    func BenchmarkBytes(b *testing.B) {
        for i := 0; i < b.N; i++ {
            RandStringBytes(n)
        }
    }

    func BenchmarkBytesRmndr(b *testing.B) {
        for i := 0; i < b.N; i++ {
            RandStringBytesRmndr(n)
        }
    }

    func BenchmarkBytesMask(b *testing.B) {
        for i := 0; i < b.N; i++ {
            RandStringBytesMask(n)
        }
    }

    func BenchmarkBytesMaskImpr(b *testing.B) {
        for i := 0; i < b.N; i++ {
            RandStringBytesMaskImpr(n)
        }
    }

    func BenchmarkBytesMaskImprSrc(b *testing.B) {
        for i := 0; i < b.N; i++ {
            RandStringBytesMaskImprSrc(n)
        }
    }

Further improvement

In fact, swapping in a faster random number generation algorithm may improve performance even more. I implemented a fast random number generator based on the xorshift algorithm and compared it with the implementations above; it performs better still.

    BenchmarkRunes-4                       1000000      1396 ns/op
    BenchmarkBytes-4                          2000       799 ns/op
    BenchmarkBytesRmndr-4                  3000000       627 ns/op
    BenchmarkBytesMask-4                   2000000       719 ns/op
    BenchmarkBytesMaskImpr-4              10000000       260 ns/op
    BenchmarkBytesMaskImprSrc-4           10000000       227 ns/op
    BenchmarkBytesMaskImprXorshiftSrc-4   10000000       205 ns/op
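The article does not show its xorshift generator, so the following is only a sketch of what such a source might look like, not the author's implementation. It assumes Marsaglia's 64-bit xorshift with the shift triple 13, 7, 17; the type name xorshiftSource and the zero-seed fallback constant are made up here. Because it implements rand.Source, it can be passed wherever src is used above. Like other xorshift generators, it is not suitable for security-sensitive strings.

```go
package main

import (
	"fmt"
	"math/rand"
)

// xorshiftSource (hypothetical) is a rand.Source built on a 64-bit xorshift
// step. It is not safe for concurrent use.
type xorshiftSource struct {
	state uint64
}

// Compile-time check that the type satisfies rand.Source.
var _ rand.Source = (*xorshiftSource)(nil)

func newXorshiftSource(seed int64) *xorshiftSource {
	s := &xorshiftSource{}
	s.Seed(seed)
	return s
}

// Seed sets the state. A xorshift state of zero would stay zero forever,
// so an arbitrary non-zero constant is substituted in that case.
func (s *xorshiftSource) Seed(seed int64) {
	s.state = uint64(seed)
	if s.state == 0 {
		s.state = 0x9E3779B97F4A7C15
	}
}

// Int63 advances the state and returns its upper 63 bits as a
// non-negative int64.
func (s *xorshiftSource) Int63() int64 {
	x := s.state
	x ^= x << 13
	x ^= x >> 7
	x ^= x << 17
	s.state = x
	return int64(x >> 1)
}

func main() {
	r := rand.New(newXorshiftSource(42))
	fmt.Println(r.Intn(52), r.Intn(52), r.Intn(52))
}
```

Any measured gain would need to be confirmed with the same benchmark harness, since it depends on the machine and the Go version.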

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
