Building high-performance systems with PHP

Source: Internet
Author: User

How do we solve the performance problems a system may have?

First, we need to know what performance the business actually requires.

Second, we design the system around those performance requirements.

Third, during development we watch for local performance problems that may appear.

Evaluate the system's performance requirements:

If you have never developed a performance-sensitive system, it is easy to make the mistake of not thinking about how many people will use the system in the future, how high concurrent access will be, and how much data needs to be stored. Development starts right away, with the attitude that performance problems can be dealt with when they show up. Once the system goes live, the performance problems are exposed. At that point, solving them carries a heavy price, often a rewrite of the system. Schedule pressure becomes the excuse for not testing quality and performance. The result is a familiar phenomenon: we do not have time to build a system well once, but we do have time to build the same system several times. This is efficient in the short term but inefficient in the long term.

When developing a new system, please understand the following questions:
    • How many users does the system have? Tens of thousands, hundreds of thousands, millions, or more?
    • How many people use the system at peak times? How many requests per second (QPS) must it process?
    • How many data records are stored in the system? Is the amount of data 1G, 10G, or 100G?

Let me illustrate:

* Ad Delivery System:

10 users have write access to the system; on the order of 10 million users access it.
Ads are mainly placed on websites, and a single page often shows several ads. Daily traffic reaches hundreds of millions of PV. The peak of the QPS reached 1 billion.
The amount of data per day is within 1G, and there is no historical data to deal with.

* Interactive post bar (forum):

Tens of millions of users, with both read and write operations.
During hot periods the system needs to support 100 million PV per day, with a peak of 1250 * 2 = 2500 QPS.
Posts: 10,000 bars * 1,000 topics * 100 posts = 1 billion posts; at an average of 200 characters each, that is 400G of data.

* B2B website:

The user base should be in the millions.
Counting 2 visits per user per day on average, that is 1 million * 2 * 20 (average PV per visit) = 40 million PV per day; the peak is taken as twice the average QPS.
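
The estimates above can be reproduced with a few lines of arithmetic. The sketch below uses Python purely for illustration; the function names are hypothetical, and two constants are assumptions inferred from the figures rather than stated in the text: a day is treated as 80,000 effective seconds (which is what turns 100 million PV into 1250 QPS), and each character is assumed to take 2 bytes (which is what turns 1 billion posts of 200 characters into 400G).

```python
# Back-of-the-envelope capacity estimates for the examples above.
# Assumptions (inferred from the figures, not stated in the text):
#   - a day counts as 80,000 "effective" seconds
#   - peak load is twice the average load
#   - one character occupies 2 bytes

EFFECTIVE_SECONDS_PER_DAY = 80_000
PEAK_FACTOR = 2
BYTES_PER_CHAR = 2


def average_qps(daily_pv: int) -> float:
    """Average requests per second for a given number of page views per day."""
    return daily_pv / EFFECTIVE_SECONDS_PER_DAY


def peak_qps(daily_pv: int) -> float:
    """Peak requests per second, taken as a fixed multiple of the average."""
    return average_qps(daily_pv) * PEAK_FACTOR


def storage_bytes(records: int, avg_chars: int) -> int:
    """Raw text storage for `records` items of `avg_chars` characters each."""
    return records * avg_chars * BYTES_PER_CHAR


# Post bar: 100 million PV per day
print(peak_qps(100_000_000))            # 2500.0

# Post bar: 10,000 bars * 1,000 topics * 100 posts, 200 characters each
posts = 10_000 * 1_000 * 100
print(storage_bytes(posts, 200) / 1e9)  # 400.0 (decimal gigabytes)

# B2B: 1 million users * 2 visits * 20 PV per visit, same assumptions
print(peak_qps(1_000_000 * 2 * 20))
```

Changing the effective seconds per day or the peak factor changes every result, so these two constants are worth agreeing on with the business before any design work starts.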

Seeing these figures, many of us will object:

"This is too far ahead. The business today is nowhere near needing that level of performance. Achieving it at great cost would be wasteful, and what the business urgently needs right now is functionality and ease of use, not performance."
This is indeed a realistic situation, and I will come back to it in the third part. For now, assume that we do need to address these high-performance challenges.

Guiding principles for performance design

We face tens of millions of users, thousands of QPS, and hundreds of gigabytes of data. A single server cannot support this, so the first thing to consider when designing the system is:

Horizontal scalability of the system:

Everyone will find this approach easy to understand and agree with. But it is not simple to carry out; it has to be designed around the specific application.

Say we need 5000 QPS. We break it down into 500 * 10 and use 10 groups of servers to support it. The questions we need to consider at this point are:

    • Can a user complete an operation on any of the servers?
    • Is the user's session state stored on the client or on the server?
    • Does data need to be synchronized on the server side? How is that implemented across multiple servers?
    • How many database servers are needed to store the data? Does the data need to be partitioned?
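
One way to let any of the server groups handle a user's request is to keep session state on the client in a signed token, so that no server-side session synchronization is needed. The following is a minimal sketch of that idea (in Python, for illustration only), not a production session mechanism; `SECRET_KEY`, `sign_session`, and `verify_session` are hypothetical names, and it assumes every web server shares the same secret.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional

# Assumption: the same secret is deployed to every web server.
SECRET_KEY = b"replace-with-a-real-secret"


def _signature(payload: str) -> str:
    """HMAC-SHA256 of the payload, as a hex string."""
    return hmac.new(SECRET_KEY, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def sign_session(data: dict) -> str:
    """Serialize session data and append a signature: '<payload>.<signature>'."""
    raw = json.dumps(data, sort_keys=True).encode("utf-8")
    payload = base64.urlsafe_b64encode(raw).decode("ascii")
    return f"{payload}.{_signature(payload)}"


def verify_session(token: str) -> Optional[dict]:
    """Return the session data if the signature is valid, otherwise None."""
    payload, separator, signature = token.rpartition(".")
    if not separator:
        return None
    expected = _signature(payload)
    # Constant-time comparison to avoid leaking how many characters matched.
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    return json.loads(base64.urlsafe_b64decode(payload.encode("ascii")))


token = sign_session({"user_id": 42})
print(verify_session(token))        # {'user_id': 42}
print(verify_session(token + "x"))  # None
```

The token is signed, not encrypted, so it should only carry data the user is allowed to see. Storing sessions on the server instead is equally valid, but then the storage itself has to be shared or synchronized across the server groups.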

For data storage, once you exceed tens of millions of records or 10G of data, you need to consider horizontal scaling. Whether it is MySQL or Oracle, the amount of data a single table can handle has a performance inflection point, so you need to consider splitting tables. The processing capacity of a single database machine is also limited, so depending on data volume you need to consider a database cluster.
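
Splitting tables and databases usually comes down to a deterministic rule that maps a record key to a shard. Below is a minimal sketch (Python, for illustration only), assuming a simple modulo rule on a numeric ID; the table prefix `post`, the shard counts, and the function name `locate` are hypothetical, and a real scheme also has to plan for adding shards later.

```python
from typing import Tuple

# Assumed layout: 4 databases, each holding 16 tables (64 shards in total).
DATABASES = 4
TABLES_PER_DATABASE = 16


def locate(record_id: int, prefix: str = "post") -> Tuple[str, str]:
    """Map a numeric record ID to a (database, table) pair."""
    shard = record_id % (DATABASES * TABLES_PER_DATABASE)
    database = f"db_{shard // TABLES_PER_DATABASE}"
    table = f"{prefix}_{shard % TABLES_PER_DATABASE:02d}"
    return database, table


print(locate(17))  # ('db_1', 'post_01')
print(locate(63))  # ('db_3', 'post_15')
```

Because the rule is deterministic, any web server can compute where a record lives without a lookup service.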

Horizontal scalability is the first principle in solving problems for a performance-sensitive system: it lets the system improve performance by adding more hardware.
This expansion also needs to be easy to carry out.

In addition, even if the system scales horizontally very well, it is still necessary to raise the QPS of a single machine. If single-machine QPS is too low, the cost of servers becomes unacceptable.
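
The cost argument is easy to see with a little arithmetic: the number of servers is the target QPS divided by what one machine can sustain, rounded up. A small sketch in Python, for illustration only; the function name is hypothetical and the per-machine figures other than 500 are made up.

```python
def servers_needed(target_qps: int, qps_per_server: int) -> int:
    """Smallest number of servers that can carry the target load."""
    if qps_per_server <= 0:
        raise ValueError("qps_per_server must be positive")
    # Integer ceiling division, avoiding floating-point rounding.
    return -(-target_qps // qps_per_server)


for per_server in (50, 500, 1000):
    print(per_server, servers_needed(5000, per_server))
# 50 100
# 500 10
# 1000 5
```

A tenfold drop in single-machine QPS means ten times as many machines for the same load, before counting any headroom.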

Treat read and write access to the system differently

When facing high-performance access, you find that in Internet web applications the ratio of user reads to writes generally reaches 100:1 or even higher. The performance problem of high concurrent access therefore reduces to this: how to speed up concurrent read access while ensuring that write access is correct and timely.

When developing systems, we often see one of the following two practices:

    1. The whole system is developed in pursuit of performance. The system has no structure, and every feature is implemented by assembling SQL directly in the page to read and write data.
    2. The structure of the program is considered comprehensively, pursuing extensibility and maintainability of the system.

Both of these methods have their scope of application, but they also have their shortcomings.

    • Take the second approach first. In the face of business and system-level complexity, abstraction is an effective means of solving problems. Guided by abstraction, the design acquires encapsulation, implementation hiding, isolation of change, and layering. The problem with such a design is too many layers of indirection: in a dynamic scripting language like PHP, more code and deeper call hierarchies degrade performance. This approach suits systems with complex logic but low performance (QPS) requirements.
    • As for the first approach, there is a saying: "The fastest way down from the top of a mountain is to jump." Modifying data directly cannot express abstractions effectively and cannot manage the various factors of change. Data is no longer changed under unified logical constraints, we lose control over change, and what we end up with is a riddled, half-finished building. If, after many modifications, the correctness of the business can be guaranteed less and less, what is performance worth? This approach is appropriate for features with simple logic.

Facing these different requirements, we need to treat them differently. The idea of read/write separation is not only a separation of programs and development models; it also means:

    • In the system architecture, establish centralized writing (single-point write) and multi-point reading. This covers the deployment of front-end servers, data storage, synchronization schemes, and so on.
    • The read logic can be kept as simple as possible, free from interference by the write logic. Abstraction can be given up for read access logic: the data is not changed, so no unified logical constraints are required. In the face of performance requirements, we can give up program structure and allow some code to be duplicated.
    • With simple read logic, more performance optimization options are available to us.
    • Read/write separation is also one way to cope with a fragmented network environment (Telecom and Netcom).
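
The single-point write, multi-point read arrangement can be sketched as a small router. This is an in-memory illustration in Python, with plain dictionaries standing in for a primary database and its read replicas; the class and method names are hypothetical, and the synchronization here is immediate, whereas real replication is usually asynchronous.

```python
import itertools


class ReadWriteRouter:
    """Send writes to one primary store and spread reads over replicas."""

    def __init__(self, replica_count: int) -> None:
        if replica_count < 1:
            raise ValueError("at least one read replica is required")
        self.primary: dict = {}
        self.replicas = [dict() for _ in range(replica_count)]
        self._next_replica = itertools.cycle(range(replica_count))

    def write(self, key, value) -> None:
        # All changes go through the single write point ...
        self.primary[key] = value
        # ... and are then synchronized to every read replica.
        for replica in self.replicas:
            replica[key] = value

    def read(self, key):
        # Reads never touch the primary; they rotate over the replicas.
        replica = self.replicas[next(self._next_replica)]
        return replica.get(key)


router = ReadWriteRouter(replica_count=3)
router.write("ad:1", "banner")
print([router.read("ad:1") for _ in range(3)])  # ['banner', 'banner', 'banner']
```

The write path is the only place where business rules need to be enforced, which is what allows the read path to stay short and duplicated where convenient.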

To illustrate:
* The affiliate ad display code is under 100 lines, and the ad display code is under 500 lines (because of targeted delivery). The logic only reads local data and applies a presentation template. All data is delivered to the front-end web servers. Creating and changing advertising data is managed by a very complex system, which basically does not consider performance.
* The comment system and post bar system developed by the interactive team were also designed with read/write separation in mind: the write logic emphasizes good code structure, and the read logic pursues performance.

Consider a caching mechanism:

The closer a cache is to the user and the more closely it is tied to the specific application, the more efficient it is. It is therefore worth considering a cache on the web server itself. For the design and use of caching mechanisms, the interactive team's post bar system is a project worth learning from. Good tools are also essential for caching: with Memcached, BDB, and Varnish, you need to be clear about which problems each one is suited to solve.

However the cache mechanism works, we have to make sure that, even without the cache, performance basically meets the needs of the business. A cache only ever has a certain hit rate, and we must also make sure that the system still functions properly after cached entries expire.
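
A cache that degrades gracefully follows the read-through pattern: try the cache, and on a miss or an expired entry fall back to the real data source. The sketch below is a minimal in-process version in Python, for illustration only; `TTLCache` and `loader` are hypothetical names, and it does not use the API of Memcached, BDB, or Varnish.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Read-through cache with a fixed time-to-live per entry."""

    def __init__(
        self,
        loader: Callable[[Hashable], Any],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader      # the slow path: database, file, remote call
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be tested
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: Hashable) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        # Miss or expired: the system must still work through the slow path.
        self.misses += 1
        value = self._loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value


cache = TTLCache(loader=lambda key: f"row for {key}", ttl_seconds=60)
print(cache.get("post:1"), cache.get("post:1"))
print(cache.hits, cache.misses)  # 1 1
```

Counting hits and misses makes the hit rate observable, which is what tells you how much load still reaches the slow path.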

In my understanding, "a cache is a performance amplifier."
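
The amplifier idea can be put into numbers under a simple model: if only cache misses reach the backend, a backend that sustains a given QPS can serve that QPS divided by the miss rate in total. This is an idealized sketch in Python, for illustration only; it assumes cache hits cost nothing, and the function name and the 500 QPS figure are assumptions.

```python
def max_total_qps(backend_qps: float, hit_rate: float) -> float:
    """Total QPS the system can serve if only cache misses reach the backend."""
    if not 0 <= hit_rate < 1:
        raise ValueError("hit_rate must be in [0, 1)")
    return backend_qps / (1 - hit_rate)


for hit_rate in (0.0, 0.5, 0.9):
    print(hit_rate, round(max_total_qps(500, hit_rate)))
# 0.0 500
# 0.5 1000
# 0.9 5000
```

The amplification multiplies whatever the backend can already do, so a weak backend stays weak whenever the hit rate drops.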

How to handle performance risk during the system development process

Managing the development process is essentially risk-driven: whatever is most likely to cause serious problems should be addressed first. If the performance evaluation shows that we need a high-performance system, and the system will face that load as soon as it goes live, then I recommend the following:

    1. Follow the design principles described above when designing, and draw on the experience of people who have done it before.
    2. Using stubs, quickly set up the system skeleton and run performance tests to determine overall performance. Overall performance matters more than local performance.
    3. Identify the performance-critical subsystems or components and fix their problems, that is, address local performance issues.
    4. Run performance tests before submitting for functional testing; do not wait until the system is live to discover performance issues.
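
Step 2 can be sketched as follows: wire stubs together in the shape of the real request path, then measure the throughput of the skeleton before any real component exists. This is a single-process illustration in Python, not a real load test; every name is hypothetical, and the stubs return fixed data.

```python
import time
from typing import Callable


def stub_load_ad(ad_id: int) -> dict:
    """Stub standing in for the real data layer: returns fixed data."""
    return {"id": ad_id, "title": "placeholder"}


def stub_render(ad: dict) -> str:
    """Stub standing in for the real template layer."""
    return f"<div>{ad['title']}</div>"


def handle_request(ad_id: int) -> str:
    """The skeleton wires the stubs together the way the real system will."""
    return stub_render(stub_load_ad(ad_id))


def measure_qps(handler: Callable[[int], str], requests: int = 10_000) -> float:
    """Call the handler `requests` times and return requests per second."""
    start = time.perf_counter()
    for i in range(requests):
        handler(i)
    elapsed = time.perf_counter() - start
    return requests / elapsed if elapsed > 0 else float("inf")


print(f"skeleton throughput: {measure_qps(handle_request):.0f} requests/s")
```

As each stub is replaced by a real component, rerunning the same measurement shows how much of the overall budget that component consumes, which points to the parts that need attention in step 3.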

For a project with a tight schedule that expects no performance challenges in the next six months, item 3 above can be done first. But you must be clear about the current state of the system, so that when the performance challenge does arrive, the problem can be solved calmly.
