Introduction
Because browsers prohibit cross-origin XMLHTTP calls, Ajax websites need a server-side proxy to fetch content from external domains such as Flickr or Digg. Client-side JavaScript code makes an XMLHTTP call to the proxy, which is hosted in the same domain as the page; the proxy then downloads the content from the external server and returns it to the browser. Generally, all Ajax sites that show content from external servers use this proxy approach, except for the rare ones that use JSONP. When many components on a website download content from external domains, such a proxy gets a very large number of hits. Once the proxy is being called millions of times, it becomes a scalability problem. Moreover, the overall load performance of a page depends largely on how fast the proxy delivers content to the page. In this article, we will look at how to take a traditional Ajax proxy and make it faster, asynchronous, and continuously streaming so that it scales better.
Ajax proxy in action
When you visit pageflakes.com, you can see such a proxy at work. You will see a lot of different content, such as weather forecasts, Flickr images, YouTube videos, and RSS feeds, loaded from many different external domains. All of it is loaded through a content proxy. This content proxy served almost 42.3 million URLs in the last month, so making it fast and scalable is a great challenge for us. Sometimes the content proxy has to serve megabytes of data, which makes the challenge even greater. Because the proxy is called so often, saving an average of 100 ms on each call saves about 4.23 million seconds of download, upload, and processing time every month. That is around 1,175 man-hours that millions of people around the world would otherwise waste in front of their browsers waiting for content to download.
This content proxy takes the URL of the external server as a query parameter. It downloads the content from that URL and then writes the content as the response back to the browser.
Image: The content proxy works between the browser and external domain like a man-in-the-middle.
The timeline above shows how a request arrives at the server, how the server then sends a request to the external server, downloads the response, and transmits it to the browser. The response arrow from proxy to browser is longer than the arrow from external server to proxy because the proxy server's hosting environment generally has a faster connection than the user's Internet connection.
A basic proxy
This content proxy is also available in my open-source Ajax website, dropthings.com. You can go to CodePlex to see how its code implements such a proxy.
Below is a very simple, synchronous, non-streaming blocking proxy.
Although it shows the general principle, it is nowhere near a real-world proxy because:
(1) It is a synchronous proxy and therefore does not scale. Every call to this web method makes the ASP.NET thread wait until the call to the external URL completes.
(2) It is non-streaming. It first downloads the entire content on the server, stores it in a string, and only then writes the entire content to the browser. If you request the MSDN feed URL, it downloads a large RSS XML document on the server and stores it in an equally long string (actually double the size in memory, since .NET strings are Unicode), and then writes that content to the ASP.NET Response buffer, which consumes another UTF-8 byte array of the same size in memory. That content is then passed to IIS for transmission to the browser.
(3) It does not generate proper response headers for caching the response. It also does not forward important headers from the source, such as Content-Type.
(4) If the external URL serves gzip-compressed content, it decompresses the content into a string, wasting server memory.
(5) It does not cache content on the server. Repeated calls to the same external URL therefore download the data from the external URL every time, which wastes your server's bandwidth.
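The shape of such a blocking proxy can be illustrated with a minimal, language-neutral sketch in Python. This is not the ASP.NET code the article refers to; `naive_proxy` and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for the real HTTP call to the external server.

```python
from typing import Callable


def naive_proxy(url: str, fetch: Callable[[str], bytes]) -> str:
    """Blocking, non-streaming proxy: nothing is returned until the whole
    download has finished and has been decoded into one big string."""
    raw = fetch(url)            # the calling thread waits here for the full download
    text = raw.decode("utf-8")  # a second full copy of the content, now as a string
    return text                 # only now can anything be written to the browser


def fake_fetch(url: str) -> bytes:
    # Hypothetical stand-in for the HTTP request to the external server.
    return b"<rss><channel><title>demo</title></channel></rss>"


body = naive_proxy("http://example.com/feed", fake_fetch)
print(len(body), "characters held in memory before the first byte is sent")
```

Every problem in the list above is visible here: the caller is blocked for the whole download, the content exists twice in memory, and no headers or caching are involved.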
We need an asynchronous streaming proxy that transmits content to the browser while it is still being downloaded from the external server. It downloads bytes from the external server in small chunks and transmits them to the browser immediately. As a result, the browser sees a continuous transmission of bytes right after it calls the web service, and there is no delay while the content is being fully downloaded on the server.
A better proxy
Before showing the more complex streaming proxy code, let's take an incremental approach and first create a slightly better content proxy than the one above. It is still synchronous and non-streaming, but it has none of the other problems listed above. We will build an HTTP handler named regular.ashx, which takes the URL as a query parameter. It also takes cache as a query parameter, which it uses to generate proper response headers so that the content is cached in the browser. This saves the browser from repeatedly downloading the same content.
This proxy has the following enhancements:
(1) It allows the server to cache content. For requests for the same URL from different browsers within a period of time, the data is not downloaded again but is served from the server cache.
(2) It generates proper response cache headers so that the content is cached in the browser.
(3) It does not decompress the downloaded content in memory. It keeps the original byte stream intact, which saves memory.
(4) It sends data in a non-buffered fashion, which means that the ASP.NET Response object does not buffer the response, which also saves memory.
However, this is still a "blocking" proxy.
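The two caching enhancements can be sketched as follows in Python. This is an illustration under assumptions, not the handler's actual code: `cache_headers`, `ServerCache`, `regular_proxy`, and `fake_fetch` are hypothetical names, and the `Cache-Control` values are just one reasonable choice.

```python
import time
from typing import Callable, Dict, Optional, Tuple

Fetched = Tuple[str, bytes]  # (content type from the source, raw body bytes)


def cache_headers(cache_minutes: int, content_type: str) -> Dict[str, str]:
    """Response headers that forward Content-Type and control browser caching."""
    headers = {"Content-Type": content_type}
    if cache_minutes > 0:
        headers["Cache-Control"] = f"public, max-age={cache_minutes * 60}"
    else:
        headers["Cache-Control"] = "no-cache"
    return headers


class ServerCache:
    """Tiny in-memory cache keyed by URL; bodies stay as raw bytes."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._items: Dict[str, Tuple[float, Fetched]] = {}

    def get(self, url: str) -> Optional[Fetched]:
        entry = self._items.get(url)
        if entry is None:
            return None
        expires_at, item = entry
        if self._clock() >= expires_at:
            del self._items[url]  # expired: forget it and report a miss
            return None
        return item

    def put(self, url: str, item: Fetched, minutes: int) -> None:
        self._items[url] = (self._clock() + minutes * 60, item)


def regular_proxy(url, cache_minutes, fetch, cache):
    item = cache.get(url)
    if item is None:
        item = fetch(url)  # still blocking: waits for the whole download
        if cache_minutes > 0:
            cache.put(url, item, cache_minutes)
    content_type, body = item
    return cache_headers(cache_minutes, content_type), body


calls = []


def fake_fetch(url: str) -> Fetched:
    # Hypothetical stand-in for the external server.
    calls.append(url)
    return "text/xml", b"<rss/>"


cache = ServerCache()
headers, body = regular_proxy("http://example.com/feed", 10, fake_fetch, cache)
headers_again, body_again = regular_proxy("http://example.com/feed", 10, fake_fetch, cache)
print(headers, len(calls))
```

The second call is answered from the server cache, so the external server is contacted only once, and the bytes are never decoded into a string.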
An even better proxy: stream-based
We need to build a stream-based asynchronous proxy to get better performance. The following figure illustrates why:
Image: continuous stream proxy
As you can see, while the server is still downloading content, the data already received is sent on to the browser, so the delay caused by waiting for the download to finish on the server is eliminated. If the server spends some time downloading content from an external source and then more time sending it to the browser, streaming can save up to the download time, because the asynchronous mode overlaps download and transmission. This solution is very effective when the external server is slow and takes a long time to deliver its content. The slower the external site is, the more time you save by using this continuous streaming proxy. When the external site is far away, this is a great improvement in performance over the blocking solution.
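The saving can be made concrete with an idealized timing model, sketched here in Python. The model assumes that download and transmission overlap perfectly and ignores per-chunk overhead; the millisecond figures are hypothetical examples, not measurements.

```python
def blocking_total_ms(download_ms: int, send_ms: int) -> int:
    # Blocking proxy: the send starts only after the download has finished.
    return download_ms + send_ms


def streaming_total_ms(download_ms: int, send_ms: int) -> int:
    # Idealized streaming proxy: both phases run at the same time,
    # so the slower phase determines the total.
    return max(download_ms, send_ms)


download_ms, send_ms = 300, 700  # hypothetical example values
saved_ms = blocking_total_ms(download_ms, send_ms) - streaming_total_ms(download_ms, send_ms)
print(saved_ms)
```

Under this model the saving equals the shorter of the two phases, which is the download time whenever the browser connection is the slower link.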
The continuous streaming proxy works as follows:
(1) Use a separate thread (the reader thread) to read bytes from the external server in chunks of 8 KB, so that the ASP.NET thread is not blocked by the download.
(2) Store the chunks that were read in an in-memory queue called a pipe stream.
(3) Write the chunks from that queue to the ASP.NET Response object.
(4) If the queue is empty, wait until the reader thread has downloaded more bytes.
The pipe stream needs to be thread-safe, and it needs to support blocking reads. A blocking read means that if a thread tries to read a chunk and the stream is empty, the thread is suspended until another thread writes something to the stream. Once a write occurs, the reading thread is resumed and continues reading. I took the pipe stream code from a CodeProject article by James Kolpack and adapted it for high performance: it stores chunks of bytes instead of single bytes and supports timeouts on waits.
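The blocking-read behavior described above can be sketched in Python with a condition variable. This is a minimal illustration, not the pipe stream from the CodeProject article; the class and method names are assumptions.

```python
import threading
import time
from collections import deque
from typing import Deque, Optional


class PipeStream:
    """Thread-safe queue of byte chunks with blocking reads and timeouts."""

    def __init__(self) -> None:
        self._chunks: Deque[bytes] = deque()
        self._cond = threading.Condition()
        self._closed = False

    def write(self, chunk: bytes) -> None:
        with self._cond:
            self._chunks.append(chunk)
            self._cond.notify()      # wake a reader waiting for data

    def close(self) -> None:
        with self._cond:
            self._closed = True
            self._cond.notify_all()  # wake readers so they can see the end

    def read(self, timeout: Optional[float] = None) -> bytes:
        """Return the next chunk, or b"" once the pipe is closed and drained.
        Raises TimeoutError if nothing arrives within `timeout` seconds."""
        with self._cond:
            ready = self._cond.wait_for(
                lambda: bool(self._chunks) or self._closed, timeout)
            if not ready:
                raise TimeoutError("no data arrived in time")
            return self._chunks.popleft() if self._chunks else b""


def producer(pipe: PipeStream) -> None:
    for chunk in (b"first ", b"second"):
        time.sleep(0.05)             # simulate a slow external server
        pipe.write(chunk)
    pipe.close()


pipe = PipeStream()
threading.Thread(target=producer, args=(pipe,)).start()
received = []
while True:
    chunk = pipe.read(timeout=2.0)   # blocks until a chunk is written
    if not chunk:
        break
    received.append(chunk)
print(b"".join(received))
```

Storing whole chunks rather than single bytes keeps the number of lock acquisitions proportional to the number of reads from the network, not to the number of bytes.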
I have made some comparisons between the regular proxy (blocking, synchronous, transmitting data only after the download completes) and the streaming proxy (continuously transmitting data from the external server to the browser). Both proxies download the MSDN feed and deliver it to the browser. The times shown are measured from the moment the browser sends a request to the proxy until the entire response has been received by the client.
This is not a very scientific comparison, and the response time also depends on the connection speed between the browser and the proxy server and between the proxy server and the external server. However, it shows that the streaming proxy outperforms the regular proxy most of the time.
Building the streaming proxy
Building a streaming proxy that outperforms a regular proxy is not that easy. I tried three approaches before I found the combination that performs better than a regular proxy.
The streaming proxy uses HttpWebRequest and HttpWebResponse to download data from the external server. They give more control over how the data is read; more specifically, they allow reading in chunks of bytes, which WebClient does not offer. In addition, there are some optimizations that help make the proxy fast and scalable.
The DownloadData method downloads data from the response stream (connected to the external server) and delivers it to the ASP.NET Response stream.
Here, I tried three different approaches. TransmitDataAsyncOptimized is the best one. I will explain all three. The DownloadData method first prepares the ASP.NET response stream before any data is sent. Then it uses one of the three approaches to send the data and to cache the downloaded bytes in a memory stream.
The first approach is to read 8192 bytes from the response stream connected to the external server and then write them directly to the response.
Here, readStream is the response stream returned by the HttpWebResponse.GetResponseStream call. It downloads from the external server. responseBuffer is just a memory stream that holds the entire response so that we can cache it.
This approach is even slower than a regular proxy. After some code-level performance analysis, it appears that writing to OutputStream takes quite a long time, because IIS tries to send the data to the browser right away, so each write incurs network latency and transmission delay. The accumulated network latency from the frequent calls to OutputStream.Write significantly delays the entire operation.
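The first approach can be sketched in Python as a plain read-and-write loop. This is an illustration only; the function name `transmit_in_chunks` is an assumption, and in-memory streams stand in for the network streams.

```python
import io
from typing import BinaryIO

BUFFER_SIZE = 8192


def transmit_in_chunks(read_stream: BinaryIO, output: BinaryIO,
                       response_buffer: BinaryIO) -> int:
    """Copy read_stream to output chunk by chunk and return the number of writes.
    Every chunk is also appended to response_buffer so it can be cached later."""
    writes = 0
    while True:
        chunk = read_stream.read(BUFFER_SIZE)  # may return fewer bytes on a real socket
        if not chunk:
            break
        output.write(chunk)  # on a real server, this is where the thread waits on the network
        response_buffer.write(chunk)
        writes += 1
    return writes


source = io.BytesIO(b"x" * 20000)  # stands in for the external server's response
browser = io.BytesIO()             # stands in for the response output stream
cached = io.BytesIO()
write_count = transmit_in_chunks(source, browser, cached)
print(write_count)
```

Because reading and writing happen on the same thread, every slow write also delays the next read, which matches the slowdown described above.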
The second approach is to try multithreading. A new thread launched from the ASP.NET thread continuously reads data from the socket, without ever waiting for Response.OutputStream to send bytes to the browser. The main ASP.NET thread waits until bytes are available and then transmits them to the response immediately.
Here, the ASP.NET thread reads from the pipe stream rather than from the socket. A new thread is started, which writes data to the pipe stream as it downloads bytes from the external site. As a result, the ASP.NET thread continuously writes data to the OutputStream while another thread continuously downloads data from the external server. The following code downloads data from the external server and stores it in the pipe stream.
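The reader-thread arrangement can be sketched in Python as follows. This is not the article's code: Python's standard `queue.Queue` plays the role of the pipe stream, and the function names are assumptions.

```python
import io
import queue
import threading
from typing import BinaryIO

BUFFER_SIZE = 8192
_END = None  # sentinel the reader thread puts on the queue when it is done


def read_into_pipe(read_stream: BinaryIO, pipe: "queue.Queue") -> None:
    """Reader thread: keep downloading without ever waiting for the browser."""
    try:
        while True:
            chunk = read_stream.read(BUFFER_SIZE)
            if not chunk:
                break
            pipe.put(chunk)
    finally:
        pipe.put(_END)  # always signal the end, even if the read fails


def transmit_async(read_stream: BinaryIO, output: BinaryIO) -> int:
    """Request thread: write chunks to the output as soon as they are available."""
    pipe: "queue.Queue" = queue.Queue()
    reader = threading.Thread(target=read_into_pipe,
                              args=(read_stream, pipe), daemon=True)
    reader.start()
    writes = 0
    while True:
        chunk = pipe.get()  # blocks while the queue is empty
        if chunk is _END:
            break
        output.write(chunk)
        writes += 1
    reader.join()
    return writes


source = io.BytesIO(bytes(range(256)) * 100)  # 25,600 bytes of sample data
browser = io.BytesIO()
write_count = transmit_async(source, browser)
print(write_count)
```

Downloading and sending now overlap, but the number of writes to the output still equals the number of chunks the reader happened to receive.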
The problem with this approach is that it still makes too many Response.OutputStream.Write calls. The external server delivers varying numbers of bytes per read: sometimes 3592 bytes, sometimes 8192 bytes, and sometimes only 501 bytes. It depends entirely on how fast the connection from your server to the external server is. Generally, Microsoft's servers are very fast, so when you call _ResponseStream.Read against MSDN, you almost always get 8192 bytes (the full capacity of the buffer). However, when you connect to a less reliable server, for example one in Australia, you will not get 8192 bytes on every read call, so you end up with more calls to Response.OutputStream.Write than expected. A better, final approach is therefore to introduce another buffer that collects the bytes destined for the ASP.NET Response and flushes itself to Response.OutputStream as soon as 8192 bytes are ready for transmission. This intermediate buffer ensures that 8192 bytes are always sent to Response.OutputStream, except for the final, possibly shorter, chunk.
The above method ensures that bytes are written to the ASP.NET Response stream only in chunks of 8192. In this way, the number of writes is the total number of bytes divided by 8192, rounded up.
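The intermediate buffer can be sketched in Python like this. It is a minimal illustration under assumptions: `transmit_coalesced` and `CountingSink` are hypothetical names, and the input chunk sizes reuse the example sizes mentioned above.

```python
import io
from typing import Iterable, List

BUFFER_SIZE = 8192


class CountingSink(io.BytesIO):
    """Hypothetical output stream that records the size of every write."""

    def __init__(self) -> None:
        super().__init__()
        self.write_sizes: List[int] = []

    def write(self, data) -> int:
        self.write_sizes.append(len(data))
        return super().write(data)


def transmit_coalesced(chunks: Iterable[bytes], output, size: int = BUFFER_SIZE) -> None:
    """Collect incoming chunks of any size and write them out in blocks of
    exactly `size` bytes; only the final block may be shorter."""
    pending = bytearray()
    for chunk in chunks:
        pending += chunk
        while len(pending) >= size:
            output.write(bytes(pending[:size]))
            del pending[:size]
    if pending:
        output.write(bytes(pending))  # flush whatever is left at the end


sink = CountingSink()
transmit_coalesced([b"a" * 3592, b"b" * 8192, b"c" * 501], sink)
print(sink.write_sizes)
```

Three irregular reads totaling 12,285 bytes become two writes: one full 8192-byte block and one final block with the remainder.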
Building the streaming proxy with an asynchronous HTTP handler
Now that we are transmitting bytes as a stream, we need to make this proxy asynchronous so that it does not hold the main ASP.NET thread for too long. This means that as soon as the ASP.NET thread has sent the call to the external server, it is released. When the external server call completes and bytes are available for download, the handler takes a thread from the ASP.NET thread pool again and completes the execution.
When the proxy is not asynchronous, it keeps the ASP.NET thread busy until the entire connect and download operation completes. If the external server responds slowly, the ASP.NET thread is held for too long unnecessarily. As a result, if the proxy is making many requests to a very slow server, the ASP.NET threads will soon be exhausted and your server will stop responding to any new requests.
The first step in building an asynchronous proxy is to implement IHttpAsyncHandler and split the ProcessRequest method into two parts: BeginProcessRequest and EndProcessRequest. The begin method makes a call to HttpWebRequest.BeginGetResponse, and then the thread returns to the ASP.NET thread pool.
When the BeginGetResponse call completes and the external server has begun to send response data to us, ASP.NET calls the EndProcessRequest method. This method downloads data from the external server and sends it back to the browser.
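The benefit of not holding a thread while waiting can be illustrated with Python's asyncio. This is a different mechanism from the IHttpAsyncHandler begin/end pair and is only an analogy; all names and the simulated delay are assumptions.

```python
import asyncio
from typing import AsyncIterator, List


async def slow_external_server(chunks: List[bytes], delay: float) -> AsyncIterator[bytes]:
    """Simulated external server that delivers each chunk after a delay."""
    for chunk in chunks:
        await asyncio.sleep(delay)  # waiting on the network: no thread is held here
        yield chunk


async def handle_request(source: AsyncIterator[bytes], sent: List[bytes]) -> None:
    """Forward each chunk to the 'browser' (a list) as soon as it arrives."""
    async for chunk in source:
        sent.append(chunk)


async def main() -> List[List[bytes]]:
    # One hundred concurrent proxy requests are served by a single thread,
    # because each request gives up control while it waits for bytes.
    outputs: List[List[bytes]] = [[] for _ in range(100)]
    await asyncio.gather(*(
        handle_request(slow_external_server([b"a", b"b"], 0.01), out)
        for out in outputs
    ))
    return outputs


outputs = asyncio.run(main())
print(len(outputs))
```

With a blocking design, the same hundred requests would each occupy a thread for the whole wait, which is how a slow external server can exhaust the thread pool.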
There you have it: a fast, scalable, continuous streaming proxy that can outperform a regular proxy.
In case you are wondering what the HttpHelper, AsyncState, and SyncResult classes are, they are ready-made helper classes. The following is the code of these helper classes:
Source code:
http://download.csdn.net/detail/yanghua_kobe/3702484