From: http://www.infoq.com/cn/news/2013/02/advices-on-superbowl-spike
Reposter's note: some of the site-optimization strategies listed in this article are worth referencing when building high-performance services.
On February 4, 2013 (Beijing time), America's annual spectacle, Super Bowl XLVII, the championship game of the National Football League (NFL), was staged at the Superdome in New Orleans. The Super Bowl has three big attractions: the game itself, the halftime show, and the commercials aired during breaks.
This Super Bowl's peak household rating reached 71%, making it a battleground for advertisers: a 30-second spot averaged as much as $4 million. Many of the ads urged viewers to interact with a website, and as a result, a number of sites were overwhelmed.
Yottaa is a company dedicated to website optimization services. According to its monitoring, 13 websites were dragged down to varying degrees during the event. Coca-Cola's site, for example, saw load times reach 62 seconds. The sites of SodaStream, Calvin Klein, Axe, and Got Milk?, along with several movie and car sites, were also affected.
So how can similar situations be avoided? Yottaa offers four suggestions:
- Reduce the number of elements on the webpage and create smaller and more lightweight pages to ensure faster page loading;
- Use website performance monitoring to detect any website problems around the clock;
- Use a CDN to bring content geographically closer to users around the world and offload most of the site's traffic;
- Perform realistic load tests based on predicted traffic. This is the only way to measure how your website actually performs under high load.
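As a rough illustration of the load-testing advice, the sketch below fires concurrent requests and summarizes latency. The `handle_request` function is a hypothetical stand-in for a real HTTP call to the site under test; in practice you would use a dedicated load-testing tool.

```python
import time
import statistics
from concurrent.futures import ThreadPoolExecutor

def handle_request(i: int) -> float:
    """Hypothetical stand-in for an HTTP call to the site under test."""
    start = time.perf_counter()
    time.sleep(0.001)  # simulate ~1 ms of server work
    return time.perf_counter() - start

def load_test(concurrency: int, total_requests: int) -> dict:
    """Run total_requests at the given concurrency and summarize latency."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(handle_request, range(total_requests)))
    return {
        "median_s": statistics.median(latencies),
        "p95_s": latencies[int(0.95 * (len(latencies) - 1))],
    }

result = load_test(concurrency=50, total_requests=200)
print(result)
```

The point of measuring percentiles rather than averages is that a traffic spike shows up first in the tail of the latency distribution.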
Independent developer Michael Hamrah wrote a post on his blog about how to handle peak traffic like the Super Bowl's. It offers some further suggestions:
- Make assumptions about your traffic patterns
Doing so lets you optimize for your specific situation. For most consumer-facing sites, traffic is anonymous, especially during a spike like the Super Bowl. Because every anonymous user can be served exactly the same content, they can all be given the same static page. Cache-Control headers determine how long that content stays fresh and let it be distributed through HTTP accelerators and CDNs. There is no need to optimize for everyone; most optimizations depend on classifying users. Setting a page's cache lifetime to even one minute reduces the load on the application and frees valuable resources: anonymous users download static cached content faster, and logged-in users get quicker responses from the less-burdened servers.
For highly dynamic content you can also create separate rendering pipelines for anonymous and known users. If you can identify anonymous users early, you can skip costly database queries, external API calls, or page rendering.
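A minimal sketch of this split, choosing response headers by user class (the function name and framework-free shape are illustrative, not from the original article):

```python
def cache_headers(is_anonymous: bool) -> dict:
    """Pick Cache-Control headers by user class.

    Anonymous users all get the same page, so shared caches (an HTTP
    accelerator or CDN) may serve it for 60 seconds; known users get
    personalized responses that shared caches must not store.
    """
    if is_anonymous:
        # One identical page for every anonymous visitor.
        return {"Cache-Control": "public, max-age=60"}
    # Personalized content: keep it out of shared caches.
    return {"Cache-Control": "private, no-store"}

print(cache_headers(True), cache_headers(False))
```

With the anonymous branch in place, the accelerator answers most spike traffic without the request ever reaching the application server.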
Reposter's note: this is essentially capacity-model design.
- Understand HTTP
HTTP is the foundation of the web, and a deeper understanding of it lets you make better use of tools to optimize your pages. Pay special attention to the HTTP caching headers, which let you exploit web accelerators such as Varnish as well as CDNs. Using different headers for anonymous and known users gives you fine-grained control over the content each class of user receives. The Expires header determines how fresh content is. The worst thing you can do is serve static content with no cache headers at all, which prevents browsers from caching it locally.
- Use Varnish
Varnish is an HTTP accelerator that caches the dynamic content a website generates. Web frameworks often ship their own content-caching features, but Varnish lets you bypass the application stack entirely for faster response times: rendered dynamic pages can be served to many more connections, as if they were static pages held in memory.
- Use Edge Side Includes
Edge Side Includes (ESI) combines static and dynamic content. If a page is 90% identical for everyone, you can cache that 90% in Varnish and let the application server supply the other 10%. ESI support is only beginning to appear in web frameworks; in Rails 4 it takes a more prominent role.
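To make the 90%/10% split concrete, here is a toy assembly step. The `<esi:include src=...>` tag follows real ESI markup, but the substitution function below is a sketch of what the edge cache does, not Varnish itself, and `render_fragment` is a hypothetical application-server call.

```python
import re

# The cached shell: shared markup for everyone, with one ESI include
# marking the dynamic fragment the application server must supply.
CACHED_SHELL = (
    "<html><body><h1>Scores</h1>"
    "<esi:include src='/user-box'/>"
    "</body></html>"
)

def render_fragment(src: str) -> str:
    """Hypothetical application-server call rendering the dynamic 10%."""
    return "<div>Hello, Alice</div>" if src == "/user-box" else ""

def assemble(shell: str) -> str:
    """What the edge does: replace each ESI include with a fresh fragment,
    leaving the cached shell untouched."""
    return re.sub(r"<esi:include src='([^']+)'/>",
                  lambda m: render_fragment(m.group(1)), shell)

page = assemble(CACHED_SHELL)
print(page)
```

The shell never hits the application server again until its cache lifetime expires; only the fragment request does.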
- Use CDN and multiple data centers
If you run Varnish servers in multiple data centers, you have effectively built your own CDN. Your database and content may live on the East Coast, but with a Varnish server on the West Coast, users in San Francisco get faster response times and you reduce the load on the application servers. Even when Varnish must fetch the 10% of dynamic content via ESI from the East Coast, it can do so over the fast links between data centers.
- Use auto scaling groups or alerting
Auto scaling groups are an excellent AWS feature: capacity is added or removed automatically when a metric crosses a threshold. If you are not on AWS, good monitoring tools can prompt you to act instead. Designing your application for auto scaling from the start, and putting load balancers in front of internal services, is also a good choice.
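The threshold policy behind such a group can be sketched in a few lines. The function name, thresholds, and bounds below are illustrative defaults, not AWS's actual algorithm:

```python
def desired_capacity(current: int, cpu_percent: float,
                     scale_up_at: float = 70.0, scale_down_at: float = 30.0,
                     minimum: int = 2, maximum: int = 20) -> int:
    """Threshold policy in the spirit of an auto scaling group:
    add an instance above the high-CPU threshold, remove one below the
    low threshold, and always stay within [minimum, maximum]."""
    if cpu_percent > scale_up_at:
        current += 1
    elif cpu_percent < scale_down_at:
        current -= 1
    return max(minimum, min(maximum, current))

print(desired_capacity(4, 85.0))  # scales up to 5
```

The minimum floor keeps the site from scaling to zero during a lull right before a spike, and the maximum cap bounds cost.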
- Compress and serialize data
Enabling compression can dramatically reduce page size: web traffic is mostly text, which compresses well. Don't forget that internal traffic can be compressed too. In today's API-driven world, efficient serialization protocols such as Protocol Buffers can significantly reduce network traffic, and many RPC tools support some form of optimized serialization. SOAP was popular in the early 2000s, but XML is among the slowest data-serialization formats. Smaller payloads also let you fit more into caches and reduce network I/O.
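A quick demonstration of how well repetitive text payloads compress, using stdlib gzip on a made-up JSON score feed (the payload is invented for illustration):

```python
import gzip
import json

# A repetitive text payload, typical of API responses.
payload = json.dumps({"scores": [{"team": "Ravens", "points": 34},
                                 {"team": "49ers", "points": 31}] * 200})
raw = payload.encode("utf-8")
compressed = gzip.compress(raw)

print(f"{len(raw)} bytes -> {len(compressed)} bytes "
      f"({len(compressed) / len(raw):.0%} of original)")
```

Highly repetitive JSON like this typically shrinks by an order of magnitude, which is bandwidth saved on every one of millions of requests.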
- Disable poorly performing features
When traffic is high, turning off badly performing features is the fastest way to resolve problems.
Note: the design needs flexible, multi-level gray-release (gradual rollout) capability.
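One common way to get the multi-level gray-release capability the note refers to is percentage-based feature gating keyed on a stable user hash. This is a generic sketch, not any particular feature-flag library:

```python
import hashlib

def in_rollout(feature: str, user_id: str, percent: int) -> bool:
    """Deterministically admit `percent`% of users to a feature.

    Hashing the (feature, user) pair into a 0-99 bucket gives each user
    a stable answer, so the rollout can be widened gradually
    (1% -> 10% -> 100%) or snapped back to 0% to disable the feature.
    """
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent

print(in_rollout("new-scoreboard", "user42", 50))
```

Setting `percent` to 0 is exactly the "disable the feature" lever from the suggestion above, and it takes effect without a deploy.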
- Embrace asynchronous programming
Asynchronous programming is full of challenges, and it may be the final frontier of scalability. Sometimes a web server grinds to a halt without appearing to hit any threshold: requests are slow, yet memory, CPU, and network metrics all look normal. This is often caused by threads blocked waiting on some form of I/O; the blocked threads keep other work from running. Asynchronous frameworks such as Node.js push asynchrony to the front end so servers can handle more concurrent requests. Asynchronous programming also paves the way for queue-based architectures: if every request passes through a queue, traffic spikes are absorbed, and the queue depth tells you how many workers you need. Asynchrony may not be easy to reason about, but it is an important route to scale.
Note: the architecture should support asynchrony, with critical paths processed asynchronously.
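The queue-based pattern above can be sketched with stdlib `asyncio`: a burst of requests lands in a queue, and a small pool of non-blocking workers drains it. The worker count and burst size are arbitrary illustration values.

```python
import asyncio

async def worker(name: str, queue: asyncio.Queue, results: list) -> None:
    """Drain requests from the queue; queue depth, not thread count,
    is what absorbs the traffic spike."""
    while True:
        item = await queue.get()
        await asyncio.sleep(0)  # stand-in for non-blocking I/O
        results.append(f"{name}:{item}")
        queue.task_done()

async def main() -> list:
    queue: asyncio.Queue = asyncio.Queue()
    results: list = []
    workers = [asyncio.create_task(worker(f"w{i}", queue, results))
               for i in range(3)]
    for request_id in range(10):   # a sudden burst of incoming requests
        queue.put_nowait(request_id)
    await queue.join()             # wait until the burst is fully drained
    for w in workers:
        w.cancel()
    return results

processed = asyncio.run(main())
print(len(processed), "requests processed")
```

Because workers never block on I/O, a spike only lengthens the queue momentarily instead of exhausting a thread pool.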
Michael's last suggestion is summative:
- Think in terms of scale
In a high-load environment, everything has to be considered. Approaches that cause no trouble for thousands of users can spin out of control when applied to millions; even the smallest problem grows exponentially and becomes unmanageable.
Scaling is not just about choosing tools to handle load; it is about how your application works. The most important decision is how fresh page content needs to be for each class of user: per-user updates within seconds, or the same content for anonymous users refreshed every few minutes. At millions of concurrent requests, the former brings enormous engineering complexity, while the latter can be handled quickly.
Note: The architecture must be horizontally scalable.