The company wants to migrate some of its business to Windows Azure and will mainly use Windows Azure Storage. To verify the storage's reliability, a series of tests were carried out against Windows Azure Storage. However, during the read stress test phase, file reads showed intermittent latency. Since I am fairly experienced at writing test applications (a veteran coder), I assumed the problem lay in Windows Azure Storage. Only after several rounds of communication and assistance from Microsoft's technical staff was the problem clarified: although it was not caused by the program code, it was related to the way the test data was constructed. The following is the test process.
Target
The company hopes to move some website storage resources to Windows Azure, with a total capacity of about 3 TB. To implement the plan rigorously, a stress test plan was developed, testing the pressure the storage can sustain from both the read and write sides. The write test writes files of several different sizes (8 K, 16 K, 32 K, 64 K, and others), and the read test fetches the written files at random.
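As a rough illustration of the write phase, the sketch below uploads random-content blobs of the listed sizes with the old Microsoft.WindowsAzure.StorageClient library and records their names for the read phase. The connection string, container name, blob naming scheme, iteration count, and output file are my own assumptions, not the original test code.

// Rough write-test sketch (assumed names and counts; not the original test program).
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

class WriteTest
{
    static void Main()
    {
        CloudStorageAccount account = CloudStorageAccount.Parse("<storage connection string>");
        CloudBlobContainer container = account.CreateCloudBlobClient()
                                              .GetContainerReference("stresstest");

        int[] sizes = { 8 * 1024, 16 * 1024, 32 * 1024, 64 * 1024 };
        Random rnd = new Random();
        List<string> written = new List<string>();

        for (int i = 0; i < 100000; i++)
        {
            int size = sizes[i % sizes.Length];
            byte[] data = new byte[size];
            rnd.NextBytes(data);

            string name = string.Format("file-{0}-{1}k", i, size / 1024);
            CloudBlockBlob blob = container.GetBlockBlobReference(name);
            using (MemoryStream ms = new MemoryStream(data))
            {
                blob.UploadFromStream(ms);   // synchronous upload of one blob per iteration
            }
            written.Add(name);               // the read test later picks from this list
        }

        File.WriteAllLines("images.txt", written.ToArray());   // feed the read phase
    }
}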
Test
The write test went very smoothly: the storage node's write bandwidth basically reached 3 GB, and the write stress test results were very satisfactory. However, an exception occurred during the read stress test. Read latency appeared roughly every 2-3 minutes, lasting 3-6 seconds each time, and then everything returned to normal...
while (true)
{
    stream.Position = 0;
    TimeOutLog log = new TimeOutLog();
    watch.Restart();
    // Pick the next URL round-robin from the pre-built list.
    long index = System.Threading.Interlocked.Increment(ref mIndex);
    string url = mImages[(int)(index % mImages.Count)];
    // Download the blob into the reusable memory stream.
    CloudBlob blob = container.GetBlobReference(url);
    blob.DownloadToStream(stream);
    System.Threading.Interlocked.Increment(ref mCount);
}
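The loop refers to several fields (mImages, mIndex, mCount, container, stream, watch) that are not shown in the post. Below is a plausible reconstruction of that scaffolding, with multiple reader threads sharing the counters; the thread count, connection string, container name, and the images.txt file are assumptions on my part, and the TimeOutLog timing added later is omitted here.

// Reconstructed scaffolding around the read loop (assumed names, not the original program).
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

class ReadTestHost
{
    static List<string> mImages;            // blob names produced by the write phase
    static long mIndex = 0;                 // round-robin cursor, advanced with Interlocked
    static long mCount = 0;                 // total completed downloads

    static void Main()
    {
        CloudStorageAccount account = CloudStorageAccount.Parse("<storage connection string>");
        CloudBlobContainer container = account.CreateCloudBlobClient()
                                              .GetContainerReference("stresstest");
        // Assumed: the write phase saved the blob names to a plain text file.
        mImages = new List<string>(File.ReadAllLines("images.txt"));

        // Start several reader threads, each running the download loop from the post.
        for (int i = 0; i < 32; i++)
        {
            Thread t = new Thread(() => DownloadLoop(container)) { IsBackground = true };
            t.Start();
        }

        // Print throughput once per second from the shared counter.
        while (true)
        {
            long before = Interlocked.Read(ref mCount);
            Thread.Sleep(1000);
            Console.WriteLine("downloads/sec: {0}", Interlocked.Read(ref mCount) - before);
        }
    }

    static void DownloadLoop(CloudBlobContainer container)
    {
        MemoryStream stream = new MemoryStream();
        while (true)
        {
            stream.Position = 0;
            long index = Interlocked.Increment(ref mIndex);
            string url = mImages[(int)(index % mImages.Count)];
            CloudBlob blob = container.GetBlobReference(url);
            blob.DownloadToStream(stream);
            Interlocked.Increment(ref mCount);
        }
    }
}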
Because the test code is very simple, and the CPU and memory of the test machine were more than adequate, the problem was pointed directly at the storage node, and I summarized it for the Windows Azure technical staff. After debugging and troubleshooting on their side, they said the large number of resource URLs constructed by the test program might be the main cause of the problem.
When I received that answer, I really did not understand it. Even if the test URLs occupy a large amount of memory, that should not affect the program's execution; after all, there was no pressure on the test server's CPU. Because I could not determine whether the program was at fault, I tried to find a way to prove that storage was the cause (after all, the solution the company adopts cannot be decided carelessly). So I modified the program to record the time taken by each step.
TimeOutLog log = new TimeOutLog();
watch.Restart();
long index = System.Threading.Interlocked.Increment(ref mIndex);
string url = mImages[(int)(index % mImages.Count)];
watch.Stop();
log.GetUrlTime = watch.Elapsed.TotalMilliseconds;   // time spent picking the URL
log.Url = url;
watch.Restart();
CloudBlob blob = container.GetBlobReference(url);
blob.DownloadToStream(stream);
watch.Stop();
log.GetBlobTime = watch.Elapsed.TotalMilliseconds;  // time spent downloading the blob
// Record any step that took longer than one second.
if (log.GetBlobTime > 1000 || log.GetUrlTime > 1000)
    mQueue.Enqueue(log);
System.Threading.Interlocked.Increment(ref mCount);
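The TimeOutLog type and mQueue used above were also not shown in the post. A minimal sketch of what they might look like follows: a plain data holder plus a background method that drains a ConcurrentQueue and prints slow requests. Everything beyond the names appearing in the snippet is my own assumption.

// Assumed definitions for TimeOutLog and mQueue (not shown in the original post).
using System;
using System.Collections.Concurrent;
using System.Threading;

class TimeOutLog
{
    public string Url;              // which blob was slow
    public double GetUrlTime;       // ms spent picking the URL from the list
    public double GetBlobTime;      // ms spent in GetBlobReference + DownloadToStream
}

class SlowRequestLogger
{
    // Shared with the reader threads, which enqueue any request slower than 1000 ms.
    public static ConcurrentQueue<TimeOutLog> mQueue = new ConcurrentQueue<TimeOutLog>();

    // Background thread: drain the queue and write slow requests to the console.
    public static void Run()
    {
        while (true)
        {
            TimeOutLog log;
            while (mQueue.TryDequeue(out log))
            {
                Console.WriteLine("{0:HH:mm:ss} slow request: url={1} getUrl={2:F1}ms download={3:F1}ms",
                    DateTime.Now, log.Url, log.GetUrlTime, log.GetBlobTime);
            }
            Thread.Sleep(200);
        }
    }
}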
After the timing was added, it was clear that the blob's DownloadToStream had intermittent latency. Since this Windows Azure API is officially provided, I concluded that the program did not have a problem and that it must be a storage exception, and I sent an email asking the Windows Azure team to follow up. After further communication, they suggested splitting the tested URLs into N parts and then running multiple program instances for the stress test. That adjustment was easy for me to make, so I split the URLs into N parts and tested again. Unexpectedly, the test passed.
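Splitting the URL list is straightforward. A sketch of how the full list could be cut into N contiguous parts, each written to its own file so that a separate test process can load it, is shown below; the value of N and the file names are assumptions, not details from the original test.

// Sketch: split the full URL list into N files, one per test process (assumed approach).
using System;
using System.IO;

class SplitUrls
{
    static void Main()
    {
        int n = 10;                                        // assumed number of partitions
        string[] urls = File.ReadAllLines("images.txt");   // assumed: one blob name per line
        int chunk = (urls.Length + n - 1) / n;             // ceiling division

        for (int i = 0; i < n; i++)
        {
            int start = i * chunk;
            int count = Math.Min(chunk, urls.Length - start);
            if (count <= 0) break;

            string[] part = new string[count];
            Array.Copy(urls, start, part, 0, count);
            File.WriteAllLines(string.Format("images-{0}.txt", i), part);
        }
    }
}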
Summary
Why is DownloadToStream's latency affected by a .NET program holding a large amount of memory? (Since it no longer affected the storage test, I did not study it further.) Having some experience with .NET programs, I had been convinced from the start that it was not a program problem... At first I rejected the idea that the problem was caused by the test data or the program; my motive for splitting the URLs into N smaller sets was only to further prove that point, but the result proved my idea wrong. Sometimes it is really bad to judge things subjectively based on experience. The test results of Windows Azure Storage are as follows:
Windows Azure Storage's read/write performance on a single node is still very strong; a single account can basically reach the 3 GB of read/write traffic stated by Microsoft.
Supplement: Microsoft's technical service is actually quite good. Although the company has not yet officially purchased the service, the support was already in place (thank you for your support here).
Azure Storage stress test problem (a veteran coder's strong subjective assumptions led him into a pit)