PHP solutions for websites with big data, heavy traffic, and high concurrency

Source: Internet
Author: User
Tags apc dedicated server hosting nginx server nginx reverse proxy


This article describes how a PHP website can cope with big data, heavy traffic, and high concurrency. It is shared here as a reference for anyone who needs it.



1: Hardware aspects

A typical single P4 server can support up to about 100,000 IPs per day. If traffic exceeds 100,000 IPs, a dedicated server is needed; if the hardware is not up to the job, no amount of software optimization will help. The main factors that affect server speed are: network, disk read/write speed, memory size, and CPU processing speed.

2: Software aspects

The first thing to address is the database. Start with a good architecture. In queries, avoid correlated subqueries as far as possible, add indexes for frequently run queries, and replace non-sequential access with sequential access. If conditions allow, a MySQL server is generally best installed on Linux.

For the web server, Nginx is recommended: in high-concurrency situations it is a good alternative to Apache. Nginx has low memory consumption, official tests show that it can support 50,000 concurrent connections, and it also sustains high concurrent connection counts in actual production environments.

In PHP, disable modules that are not needed as far as possible. Use memcached, a high-performance distributed in-memory object caching system: data is fetched directly from memory instead of from the database, which greatly improves speed. Enable gzip compression in IIS or Apache to compress site content and greatly reduce site traffic.
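
As a rough illustration of why gzip compression saves traffic, the sketch below compresses a repetitive HTML fragment with Python's standard gzip module and compares the sizes. The page content is a made-up example, and a real site would enable compression in the web server rather than in application code.

```python
import gzip

# Hypothetical page body: repetitive HTML markup compresses well.
html = ('<li class="item">Product name and description</li>\n' * 200).encode("utf-8")

compressed = gzip.compress(html)

print(len(html), "bytes before compression")
print(len(compressed), "bytes after compression")

# Decompressing gives back exactly the original bytes.
restored = gzip.decompress(compressed)
```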

3: Prohibit external hotlinking

Hotlinking of images or files by external websites often brings a heavy load, so you should strictly restrict external hotlinking of your own images and files. Fortunately, hotlinking can now be controlled simply through the Referer header: Apache can be configured to prohibit hotlinking on its own, and IIS has third-party ISAPI filters that achieve the same thing. Of course, a forged Referer can also be produced in code to get around this check, but deliberate Referer forgery for hotlinking is not common at present. You can either ignore it or handle it by non-technical means, such as adding a watermark to images.
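
The Referer check can be sketched as follows. This is a minimal illustration, not the Apache or ISAPI implementation: the host names in `ALLOWED_HOSTS` and the function name `is_hotlink` are hypothetical, and allowing an empty Referer is a policy choice made here for simplicity.

```python
from urllib.parse import urlparse

# Hypothetical list of hosts that are allowed to embed our images.
ALLOWED_HOSTS = {"example.com", "www.example.com"}

def is_hotlink(referer, allowed_hosts=ALLOWED_HOSTS):
    """Return True when the request should be treated as external hotlinking."""
    if not referer:
        # Policy choice: an empty Referer (direct visit, privacy tools) is allowed.
        return False
    host = urlparse(referer).hostname
    # An unparsable Referer yields host=None, which is treated as external.
    return host not in allowed_hosts

print(is_hotlink("http://www.example.com/gallery"))
print(is_hotlink("http://other-site.test/page.html"))
```

Because the header is supplied by the client, a check like this only stops casual hotlinking, which matches the caveat about forged Referers.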

4: Control large file downloads

Large file downloads consume a lot of traffic, and on non-SCSI drives a large number of file downloads also consumes CPU, which reduces the website's responsiveness. Therefore, try not to offer downloads of files larger than 2 MB. If they are required, it is recommended that large files be placed on another server.

5: Use different hosts to divert the main traffic

Put files on different hosts and provide different mirrors for users to download from. For example, if you feel that RSS files occupy a large amount of traffic, use a service such as FeedBurner or Feedsky to put the RSS output on other hosts. Most of the traffic pressure from readers then falls on the FeedBurner host, and RSS no longer occupies too many of your own resources.

6: Use traffic analysis and statistics software

Install traffic analysis and statistics software on the website so that you can see immediately which areas consume a lot of traffic and which pages need to be optimized. Solving traffic problems requires accurate statistical analysis. For example: Google Analytics.

Constraints for high concurrency and high load: hardware, deployment, operating system, Web server, PHP, MySQL, testing

Deployment: server separation, database clustering and database/table hashing, mirroring, load balancing
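
Table hashing can be illustrated with a short sketch: a stable hash of a key decides which of several identically structured tables holds the row. The table count, the `user_NN` naming scheme, and the function name `table_for` are assumptions made for this example only.

```python
import zlib

TABLE_COUNT = 16  # hypothetical number of shard tables

def table_for(user_key, table_count=TABLE_COUNT):
    """Map a key to one of the shard tables, named user_00 .. user_15."""
    # crc32 is stable across processes, unlike Python's built-in hash() for strings.
    bucket = zlib.crc32(user_key.encode("utf-8")) % table_count
    return f"user_{bucket:02d}"

print(table_for("alice"))
print(table_for("bob"))
```

The same key always maps to the same table, so reads and writes for one user land in one place while the overall data is spread across tables.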

Load balancing classification: 1) DNS round-robin; 2) proxy server load balancing; 3) address translation gateway load balancing; 4) NAT load balancing; 5) reverse proxy load balancing; 6) hybrid load balancing
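
The round-robin idea behind several of these methods can be sketched in a few lines: requests are handed to the backends in turn. The class name and the backend addresses are hypothetical, and real balancers add weighting and health checks on top of this.

```python
import itertools

class RoundRobinBalancer:
    """Hand out backends one after another, wrapping around at the end."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._cycle = itertools.cycle(list(backends))

    def next_backend(self):
        return next(self._cycle)

# Hypothetical backend addresses.
balancer = RoundRobinBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
picked = [balancer.next_backend() for _ in range(4)]
print(picked)  # the fourth pick wraps around to the first backend
```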

Deployment Scenario 1:

Scope of application: websites and application systems that consist mainly of static content; websites and application systems with high security requirements.

Main server:

Hosts the main body of the program and bears the pressure of handling dynamic requests in the website or application system;

Pushes static pages to multiple publishing servers;

Pushes attachment files to the file server;

For mainly static websites with high security requirements, this server can be placed on an internal network that is shielded from external access.

DB server:

Bears the database read/write pressure;

Exchanges data only with the main server and is shielded from extranet access.

File/video server:

Carries the data flows in the system that consume the most system and bandwidth resources;

Serves as the storage and read/write repository for large attachments;

When used as a video server, it will have the ability to process video automatically.

Publishing server group:

Only responsible for publishing static pages, and handles the overwhelming majority of web requests;

Load balanced via Nginx.

Deployment Scenario 2:

Scope of application: websites or application systems that consist mainly of dynamic, interactive content; websites or application systems with heavy load and a sufficient budget.

Web server group:

The web servers have no master-slave relationship; they form a parallel, redundant design;

Load is balanced by a front-end load balancing device or an Nginx reverse proxy;

Dedicated file/video servers are set apart to effectively separate light and heavy traffic;

Each web server can connect to all databases via DEC, with masters and slaves designated.

Database server group:

Bears the database read/write pressure in a relatively balanced way;

Data synchronization among multiple databases is achieved through mapping of the databases' physical files.

Shared disk/disk array:

Unified reading and writing of the physical data files;

Storage repository for large attachments;

Ensures overall system IO efficiency and data security through the balancing and redundancy of its own physical disks.

Scenario features:

Front-end load balancing distributes web pressure sensibly;

Separating the file/video servers from the regular web servers distributes light and heavy data flows sensibly;

The database server group distributes database IO pressure sensibly;

Each web server is usually connected to only one database server; through DEC heartbeat detection, it can switch automatically to a redundant database server within a very short time;

The disk array greatly improves system IO efficiency and strengthens data security.
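
The heartbeat-and-switch behavior can be sketched generically. The article does not describe how DEC works internally, so the class below is only an assumed illustration: `FailoverPool`, the server names, and the `is_alive` callback are all hypothetical.

```python
class FailoverPool:
    """Use the preferred database until a health check fails, then a standby."""

    def __init__(self, servers, is_alive):
        self.servers = list(servers)   # first entry is the preferred server
        self.is_alive = is_alive       # callable: server name -> bool
        self.current = self.servers[0]

    def heartbeat(self):
        """Run one health check; switch servers if the current one is down."""
        if self.is_alive(self.current):
            return self.current
        for candidate in self.servers:
            if candidate != self.current and self.is_alive(candidate):
                self.current = candidate
                return self.current
        raise RuntimeError("no database server is reachable")

# Simulated server status; a real check would try to connect or run a ping query.
status = {"db-primary": True, "db-standby": True}
pool = FailoverPool(["db-primary", "db-standby"], lambda name: status[name])

before = pool.heartbeat()
status["db-primary"] = False   # simulate an outage of the primary
after = pool.heartbeat()
print(before, "->", after)
```

How short the switch-over time is depends mainly on how often the heartbeat runs.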

Web server:

A large portion of a web server's resource usage comes from processing web requests. In general, this is the pressure that Apache generates, and with high numbers of concurrent connections Nginx is a good substitute for Apache. Nginx ("engine x") is a high-performance HTTP and reverse proxy server developed in Russia. In China, a number of websites and channels use Nginx, including Sina, Sohu Passport, NetEase News, NetEase Blog, Kingsoft Xoyo, Kingsoft iCIBA, Xiaonei, Yupoo, Douban, and Xunlei Kankan.

The advantages of Nginx:

High concurrent connections: official tests show support for 50,000 concurrent connections, and Nginx also sustains high concurrent connection counts in actual production environments.

Low memory consumption: with 30,000 concurrent connections, the 10 Nginx processes that are running consume 150 MB of memory (15 MB × 10 = 150 MB).

Built-in health check: if a web server behind the Nginx proxy goes down, front-end access is not affected.

Strategy: instead of the older Apache, choose lighttpd or Nginx, web servers with a smaller resource footprint and a higher load capacity.

MySQL:

MySQL itself has a very strong load capacity. MySQL optimization is a complex task, because it ultimately requires a good understanding of system optimization. Database workloads consist of a large number of short queries, reads, and writes. Besides software development practices such as building indexes and improving query efficiency, the hardware factors that affect MySQL efficiency are mainly disk seeks, disk IO throughput, CPU cycles, and memory bandwidth.

Optimize MySQL according to the hardware and software conditions of the server. The core of MySQL optimization is the allocation of system resources, which does not mean giving MySQL unlimited resources. The following parameters in the MySQL configuration file deserve the most attention:

Change the index buffer size (key_buffer)

Change the buffer size used for sequential table scans (read_buffer_size)

Set the maximum number of open tables (table_cache)

Set the time threshold for slow queries (long_query_time)

If conditions allow, a MySQL server is generally best installed on Linux rather than on FreeBSD.

Strategy: MySQL optimization calls for different approaches depending on the read/write characteristics of the business system's database and the server hardware configuration. A MySQL master-slave structure can be deployed as needed.
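
One common way to make use of a master-slave structure is read/write splitting in the application. The sketch below is a simplified illustration under stated assumptions: `ReadWriteRouter` and the server names are hypothetical, and statements are classified only by their first keyword.

```python
import itertools

class ReadWriteRouter:
    """Send SELECT statements to slaves in turn and everything else to the master."""

    def __init__(self, master, slaves):
        self.master = master
        # Fall back to the master when no slave is configured.
        self._readers = itertools.cycle(list(slaves) or [master])

    def route(self, sql):
        parts = sql.split(None, 1)
        first_word = parts[0].upper() if parts else ""
        if first_word == "SELECT":
            return next(self._readers)
        return self.master

router = ReadWriteRouter("master", ["slave1", "slave2"])
print(router.route("SELECT * FROM articles"))
print(router.route("UPDATE articles SET hits = hits + 1"))
```

A production router also has to account for replication lag and for reads inside transactions, which should normally go to the master.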

PHP:

1. Load as few modules as possible;

2. On the Windows platform, use IIS or Nginx where possible instead of the usual Apache;

3. Install an accelerator (these all improve the execution speed of PHP code by caching precompiled PHP code and database results):

eAccelerator: eAccelerator is a free, open-source PHP accelerator that optimizes and caches dynamic content. It improves the caching performance of PHP scripts, so that the server overhead of scripts in their compiled state is almost completely eliminated.

APC: Alternative PHP Cache (APC) is a free, openly available optimizing code cache for PHP. It provides a free, open, and robust framework for caching and optimizing PHP intermediate code.

Memcache: memcache is a high-performance, distributed in-memory object caching system developed by Danga Interactive to reduce database load and increase access speed in dynamic applications. It works by maintaining one large unified hash table in memory, and it can store data in many formats, including images, videos, files, and database query results.

XCache: a cache developed in China.

Strategy: install an accelerator for PHP.
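
The way an object cache such as memcache reduces database load can be sketched with the cache-aside pattern: look in the cache first, and only query the database on a miss. The `TTLCache` class below is an in-process stand-in written for this example, not the memcache client API, and `get_article` and `load_from_db` are hypothetical names.

```python
import time

class TTLCache:
    """Tiny in-process stand-in for a cache client: get/set with expiry."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]   # expired entries are dropped on access
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, time.monotonic() + ttl_seconds)

def get_article(article_id, cache, load_from_db, ttl_seconds=60):
    """Cache-aside read: try the cache, fall back to the database, then fill the cache."""
    key = f"article:{article_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = load_from_db(article_id)
    cache.set(key, value, ttl_seconds)
    return value

db_calls = []

def load_from_db(article_id):
    db_calls.append(article_id)   # record each simulated database query
    return {"id": article_id, "title": "Example"}

cache = TTLCache()
first = get_article(7, cache, load_from_db)
second = get_article(7, cache, load_from_db)
print(len(db_calls), "database query for two reads")
```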

Proxy server (cache server):

Squid cache (Squid) is a popular free-software (GNU General Public License) proxy server and web cache server. Squid has a wide range of uses: acting as a cache in front of a web server to speed it up by caching repeated requests; caching World Wide Web, domain name system, and other network lookups for a group of people sharing network resources; aiding network security by filtering traffic; and giving a LAN access to the Internet through a proxy. Squid is designed primarily to run on Unix-like systems.

Strategy: installing a Squid reverse proxy server can greatly improve server efficiency.

Stress testing: stress testing is a basic quality assurance activity that is part of every significant software testing effort. The basic idea is simple: rather than running manual or automated tests under normal conditions, run them with fewer machines or scarce system resources. The resources typically constrained in stress testing include memory, CPU availability, disk space, and network bandwidth. Concurrency is usually used to generate the load.

Stress testing tools: webbench, ApacheBench, etc.
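
The concurrency-based approach can be sketched as a small harness that calls a target many times from a pool of threads and reports how many calls succeeded. This is not how webbench or ApacheBench work internally; `stress` and its parameters are hypothetical, and the target here is a stand-in function rather than a real HTTP request.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def stress(target, total_requests=200, concurrency=20):
    """Call target() total_requests times using `concurrency` worker threads."""

    def one_call(_):
        start = time.perf_counter()
        try:
            target()
            ok = True
        except Exception:
            ok = False
        return ok, time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(one_call, range(total_requests)))

    successes = sum(1 for ok, _ in results if ok)
    slowest = max(elapsed for _, elapsed in results)
    return {"requests": total_requests, "successes": successes, "slowest_seconds": slowest}

def fake_request():
    time.sleep(0.001)   # stand-in for the time a real request would take

report = stress(fake_request, total_requests=50, concurrency=5)
print(report)
```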

Vulnerability testing: vulnerabilities in our own system include SQL injection, XSS (cross-site scripting) attacks, and so on. Security also covers system software, such as vulnerabilities in the operating system, MySQL, and Apache, which can generally be resolved by upgrading.

Vulnerability testing tool: Acunetix Web Vulnerability Scanner
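
To make the SQL injection risk concrete, the sketch below contrasts a query built by string concatenation with a parameterized query. It uses Python's built-in sqlite3 module and an in-memory table purely for illustration; the table, the data, and the function names are made up, and the same principle applies to PHP and MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("alice", "a1"), ("bob", "b2")])

def find_user_unsafe(name):
    # Vulnerable: user input is pasted directly into the SQL text.
    return conn.execute(f"SELECT name FROM users WHERE name = '{name}'").fetchall()

def find_user_safe(name):
    # Parameterized: the value is passed separately and never parsed as SQL.
    return conn.execute("SELECT name FROM users WHERE name = ?", (name,)).fetchall()

payload = "x' OR '1'='1"
print(len(find_user_unsafe(payload)), "rows leaked by the concatenated query")
print(len(find_user_safe(payload)), "rows returned by the parameterized query")
```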

