Easily handle 1 billion requests per week with HAProxy, PHP, Redis and MySQL



English original: The Easy Way of Building a Growing Startup Architecture Using HAProxy, PHP, Redis and MySQL to Handle 1 Billion Requests a Week


This case study is a guest post written by Antoni Orfin, co-founder and software architect at Octivi.



In this article, I'll show you how we developed a very simple architecture based on HAProxy, PHP, Redis and MySQL that seamlessly handles about 1 billion requests per week. The article also outlines possible ways to scale the platform further, and points out the patterns that are unusual for this project.


Stats:
    • Servers:

      • 3x application nodes

      • 2x MySQL + 1x backup node

      • 2x Redis

    • Application:

      • The application handles 1,000,000,000 requests per week

      • Peak of 700 req/s on a single Symfony2 instance (average of 550 req/s on a regular workday)

      • Average response time: 30 ms

      • Varnish: more than 12,000 req/s (reached during stress tests)

    • Data storage:

      • Redis: 160,000,000 records, around 100 GB of data (our main data store!)

      • MySQL: 300,000,000 records of data (used as a third-tier cache)

Platform:
  • Monitoring:

      • Icinga

      • collectd

  • Application:

      • HAProxy with Keepalived

      • Varnish

      • PHP (PHP-FPM) with the Symfony2 framework

  • Data storage:

      • MySQL (master-master) with HAProxy load balancing

      • Redis (master-slave)
    • Background
    • Almost a year ago, a friend of ours came to our office with a tough problem. They were running a fast-growing e-commerce startup that they wanted to scale internationally.

      As they were still a startup, the proposed solution had to be cost-effective, rather than burn the remaining budget on yet more servers. The legacy system had been built on a standard LAMP stack, and they already had a strong team of PHP developers. Any new technology had to be introduced carefully, must not over-complicate the architecture, and had to allow their existing staff to keep maintaining the platform.

      The system architecture had to be designed in a scalable way, so the plan of expanding into the next markets could be carried out. So we came in to check their infrastructure...


    • The previous system had been designed in a monolithic manner. Specifically, there were several separate, PHP-based web applications (the startup had many so-called front-end websites). Most of them used a single shared database, and they shared some common code that implemented the business logic.
    • Maintaining such applications can be a nightmare. Because parts of the code were duplicated, a change to one website could lead to inconsistencies in the business logic: the same change always had to be applied to every web application.

      From a project-management point of view this was a problem as well: who is responsible for "that part" of the code when it is scattered across multiple code bases?

    • Based on this observation, our first step was to extract the core, business-critical functionality into a separate service (this is the scope of this article). It is the service-oriented architecture pattern: the principle of "separation of concerns" applied across the whole system. Each service maintains one logical, specific, higher-level piece of business functionality. To give you a real example: a service can be a search engine, a sales system, and so on.
    • The front-end websites communicate with the service through a REST API. Responses are JSON-encoded. We chose this combination for its simplicity; SOAP has always been hard on developers (nobody likes to parse XML... ;-)). A minimal sketch of such an endpoint follows below.

      The extracted service does not handle things like authentication and session management. This is essential: those concerns belong at a higher level. The front-end websites take care of them, because only they can identify their users. This keeps the service leaner, both in its code and in the scaling problems still ahead. There is nothing wrong with that, because the service simply has a different set of tasks to handle.
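
      To make the API shape concrete, here is a minimal sketch of what such a Symfony2 REST endpoint could look like. The route, bundle, service id and field names are hypothetical, not taken from the original project:

      <?php
      // src/Acme/ApiBundle/Controller/ProductController.php
      // Hypothetical Symfony2 endpoint sketch; the bundle, service id and
      // field names are illustrative only.

      namespace Acme\ApiBundle\Controller;

      use Symfony\Bundle\FrameworkBundle\Controller\Controller;
      use Symfony\Component\HttpFoundation\JsonResponse;

      class ProductController extends Controller
      {
          // GET /api/products/{id} -- returns a single resource as JSON
          public function getProductAction($id)
          {
              $product = $this->get('acme.product_repository')->find($id);

              if ($product === null) {
                  return new JsonResponse(array('error' => 'not found'), 404);
              }

              return new JsonResponse(array(
                  'id'    => $product->getId(),
                  'name'  => $product->getName(),
                  'price' => $product->getPrice(),
              ));
          }
      }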

    • Advantages:
    • -Different subsystems (services) can easily be developed by completely separate development teams. The developers do not get in each other's way.

      -The service does not handle user authorization or sessions, so this common scaling problem disappears.

      -The business logic lives in one place: no redundant functionality across the different front-end websites.

      -It is easy to make the service publicly accessible.

      Disadvantages:

      -More work for the system administrators: the service runs on its own infrastructure, so there is more infrastructure to look after.

      -Backward compatibility must be maintained: the API methods changed countless times during a year of maintenance. The catch is that none of those changes may break backward compatibility; otherwise the code of every front-end website would have to be modified, and programmers would be needed to deploy all the sites at the same time... After a year, all the methods are still compatible with the first version of the documentation.

    • Application Layer
    • Following the request flow, the first tier is the application layer, which consists of the HAProxy load balancer, Varnish, and the Symfony2 web application. Requests coming from the front-end websites first reach HAProxy, which distributes them across the application nodes.

      Application node configuration
      • Xeon CPU, 64GB RAM, SATA drives

      • Varnish

      • Apache2

      • PHP 5.4.X running as PHP-FPM, with the APC bytecode cache

      We currently have three such application servers. It is an N+1 setup running active-active: the "backup" server actively handles requests as well.

      Keeping a separate Varnish on each node lowers the cache hit rate, but it avoids a SPoF (single point of failure: with a shared cache, one failing node could bring the whole system down). We did it this way on purpose, putting availability above performance (and performance is not a problem in our case).

      We chose Apache2 because it was already used on the front-end web servers. Avoiding a mix of too many technologies makes the stack easier for the system administrators to maintain.

    • The Symfony2 app
    • The application itself is built on top of Symfony2, a full-stack PHP framework that provides a rich set of useful components which speed up development. Building a typical REST service on top of a complex full-stack framework may sound unreasonable to some people, so let me explain why we did it anyway:

      • Easy adoption by PHP/Symfony developers - the customer's IT team consisted of PHP developers. Introducing a new technology, such as Node.js, would have meant hiring new developers to maintain the system.

      • Clear project structure - Symfony2 does not impose an overly complex project structure; its default layout is very clear. Bringing new developers into the project is simple, because Symfony2 code reads very familiar to them.

      • Off-the-shelf components - following the DRY philosophy... nobody likes reinventing the wheel, so neither did we. We make extensive use of the Symfony2 Console component, which is a great framework for writing CLI commands, as well as the application profiler (debug toolbar), profiling tools, and loggers. A minimal command sketch follows this list.
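
      To illustrate the Console component mentioned above, here is a minimal, hypothetical command; its name and wiring are ours, not taken from the original project:

      <?php
      // Hypothetical Symfony2 Console command; the command name and the
      // 'acme.cache_warmer' service are illustrative only.

      namespace Acme\ApiBundle\Command;

      use Symfony\Bundle\FrameworkBundle\Command\ContainerAwareCommand;
      use Symfony\Component\Console\Input\InputInterface;
      use Symfony\Component\Console\Output\OutputInterface;

      class WarmCacheCommand extends ContainerAwareCommand
      {
          protected function configure()
          {
              $this
                  ->setName('acme:cache:warm')
                  ->setDescription('Pre-loads the hottest records into Redis');
          }

          protected function execute(InputInterface $input, OutputInterface $output)
          {
              // Fetch a service from the DI container, do the work, report back.
              $warmed = $this->getContainer()->get('acme.cache_warmer')->warmUp();

              $output->writeln(sprintf('%d records warmed.', $warmed));
          }
      }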

      Before committing to it, we ran performance tests to make sure it could handle the planned load. We developed a proof of concept and ran JMeter against it. The results were impressive: 700 req/s with response times of up to 50 ms. It convinced us that such a complex framework could be used in our project.

    • Application profiling and monitoring
    • We use Symfony2 tools to monitor the application. Symfony2 ships with an excellent profiler component that we use to collect the execution times of specific methods, especially the ones that talk to third-party services. That way we can spot potential weaknesses and the most time-consuming parts of the application.

      Verbose logging is a must. To achieve it we use PHP's Monolog library, which produces friendly, well-formatted log entries that both developers and system administrators can understand. The rule of thumb is to log in as much detail as possible: we found that the more verbose the logs, the better. We use the different log levels as follows:

      • Debug - things that are about to happen, e.g. the request payload before an external web service is called, and things that have just happened, e.g. the response returned from an API call;

      • Error - something went wrong, but the request flow was not interrupted (for example, an error response returned from a third-party API);

      • Critical - oops... the application just crashed.

      In the production environment you see logs from the error level upward, including critical. In the dev/test environments the debug logs are written as well.

      We split the logs into separate files ("channels", in Monolog terminology). The main log file stores all application-wide error messages plus short entries from each specific channel; the detailed entries from each channel go into their own files.
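
      A minimal sketch of such a setup with plain Monolog (the channel names and file paths are hypothetical):

      <?php
      // Minimal Monolog sketch: an application-wide channel plus a detailed
      // per-feature channel with its own file. Names and paths are hypothetical.

      use Monolog\Logger;
      use Monolog\Handler\StreamHandler;

      require 'vendor/autoload.php';

      // Application-wide channel: only errors and above go to the main file.
      $main = new Logger('main');
      $main->pushHandler(new StreamHandler('/var/log/app/main.log', Logger::ERROR));

      // Dedicated channel for external API traffic, verbose in dev/test.
      $api = new Logger('external_api');
      $api->pushHandler(new StreamHandler('/var/log/app/external_api.log', Logger::DEBUG));

      $api->debug('Calling payment provider', array('order_id' => 1234));
      $api->error('Provider returned an error response', array('status' => 502));
      $main->critical('Application crashed');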

    • Scalability
    • Scaling the application layer of the platform is not a difficult task. HAProxy's capacity is nowhere near exhausted; the only thing to consider is the redundancy needed to avoid a single point of failure (SPoF).

      In this model, all you need to do is add more application nodes.

      Data layer

      We use Redis and MySQL to store all of our data. Redis is the primary data store, and MySQL is used as a third-tier cache store.

    • Redis
    • When designing our system, we needed a database that would meet the requirements we had set:

      • Store large amounts of data (approximately 250 million records) without degrading performance

      • Serve mostly simple GETs of a specific resource by its identifier (no lookups or complex SELECTs)

      • Fetch a large number of resources in a single request to minimize latency

      After some investigation, we decided to use Redis:

      • All of the operations we use have O(1) or O(n) complexity, where n is the number of keys to retrieve. This means that the size of the key space does not affect performance.

      • We mostly use the MGET command to retrieve more than 100 keys at a time. This avoids paying the network latency once per key, as multiple GETs issued in a loop would (see the sketch after this list).
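
      For illustration, a minimal sketch with the Predis client (the key naming scheme is hypothetical) showing the difference:

      <?php
      // MGET sketch using the Predis client; the key names are hypothetical.

      require 'vendor/autoload.php';

      $redis = new Predis\Client('tcp://127.0.0.1:6379');

      $keys = array();
      for ($id = 1; $id <= 100; $id++) {
          $keys[] = 'product:' . $id;
      }

      // One round trip for all 100 keys...
      $values = $redis->mget($keys);

      // ...instead of 100 round trips, each paying the network latency:
      // foreach ($keys as $key) { $values[] = $redis->get($key); }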

      We currently run two Redis servers in master-slave replication mode. Each of them is configured with: Xeon CPU, 128GB RAM, SSD, and a memory limit of 100GB... and that memory is usually fully used :-)


    • Since the application does not come close to exhausting the performance of even a single Redis server, the slave is mainly a backup that keeps the system highly available. If the master goes down, we can easily switch the application over to the slave. Replication is also handy during maintenance work or server migrations: switching servers is simple.
    • You may wonder why our Redis constantly sits at its maximum-memory limit. Most of the keys are of a persistent type, about 90% of the key space; the rest are a pure cache, and for those we set a TTL (time-to-live) expiration. The key space is thus divided into two parts: keys with a TTL set (the cache) and keys without a TTL (the persistent data). Fortunately, Redis's "volatile-lru" maxmemory policy (one of Redis's eviction policies: only keys with an expiration set are candidates for eviction) automatically removes the least recently used cache keys, and only those.

      This way, a single Redis instance can act both as the primary store and as a typical cache.
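
      A minimal sketch of this pattern (the key names, payloads and TTL value are hypothetical; it assumes maxmemory-policy volatile-lru in redis.conf):

      <?php
      // Persistent keys vs. cache keys under the volatile-lru policy.
      // Assumes redis.conf contains e.g. `maxmemory 100gb` and
      // `maxmemory-policy volatile-lru`. Names and TTL are hypothetical.

      require 'vendor/autoload.php';

      $redis = new Predis\Client('tcp://127.0.0.1:6379');

      // Persistent record: no TTL, so volatile-lru will never evict it.
      $redis->set('product:1234', json_encode(array('name' => 'Widget')));

      // Cache entry: has a TTL, so it is a candidate for LRU eviction
      // whenever Redis hits its memory limit.
      $redis->setex('cache:search:widgets', 3600, json_encode(array(1234, 5678)));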

    • When using this pattern, it is important to monitor the number of "expirable" keys:
    • db.redis1:6379> INFO keyspace

      # Keyspace

      db0:keys=16xxxxxxx,expires=11xxxxxx,avg_ttl=0

      When the "expires" count gets close to the dangerous value of 0, it is time to start sharding or to add memory ;-)

      How do we monitor it? An Icinga check watches whether the "expires" number has dropped to a critical point, and we also graph the ratio of evicted keys in Redis to visualize it.
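
      A minimal sketch of what such a check could look like (the threshold, the parsing and the Nagios-style exit codes are our assumptions, not the original Icinga plugin):

      <?php
      // Sketch of an Icinga/Nagios-style check for the number of expirable
      // keys. The threshold and output format are assumptions, not the
      // original plugin.

      require 'vendor/autoload.php';

      $redis = new Predis\Client('tcp://127.0.0.1:6379');

      // INFO keyspace returns lines like: db0:keys=...,expires=...,avg_ttl=0
      $info = $redis->executeRaw(array('INFO', 'keyspace'));

      if (!preg_match('/db0:keys=(\d+),expires=(\d+)/', $info, $m)) {
          echo "UNKNOWN - cannot parse INFO keyspace\n";
          exit(3);
      }

      list(, $keys, $expires) = $m;

      if ($expires < 1000000) {       // hypothetical critical threshold
          echo "CRITICAL - only $expires expirable keys left of $keys total\n";
          exit(2);
      }

      echo "OK - $expires expirable keys of $keys total\n";
      exit(0);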


      After a year, I can say that we are fully committed to Redis. Throughout this project it has never let us down: no downtime, no incidents.

    • MySQL
    • Besides Redis, we also use a traditional MySQL database. The uncommon part is that we use it only as a third-tier cache layer. It stores the data that will not be needed in the near future and would occupy too much of Redis's memory, so we can keep it on hard drives instead. There is no novel technology here; we simply want to keep the stack as simple as possible, so it stays easy to maintain.
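
      A minimal sketch of such a tiered read, falling back from Redis to MySQL and re-populating the cache (the table, columns, key names and TTL are hypothetical; the original code is not public):

      <?php
      // Tiered read sketch: try Redis first, fall back to MySQL (the
      // third-tier cache) and re-populate Redis. All names are hypothetical.

      require 'vendor/autoload.php';

      function fetchRecord(Predis\Client $redis, PDO $mysql, $id)
      {
          $key = 'record:' . $id;

          $cached = $redis->get($key);
          if ($cached !== null) {
              return json_decode($cached, true);   // hit: served from memory
          }

          // Miss: read the cold record from MySQL...
          $stmt = $mysql->prepare('SELECT payload FROM records WHERE id = ?');
          $stmt->execute(array($id));
          $payload = $stmt->fetchColumn();
          if ($payload === false) {
              return null;                         // not stored anywhere
          }

          // ...and put it back into Redis as an evictable cache entry.
          $redis->setex($key, 3600, $payload);

          return json_decode($payload, true);
      }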

      We have two MySQL servers, each configured with: Xeon CPU, 64GB RAM, SSD. They run native asynchronous master-master replication, and there is one separate slave node for backups.

    • High availability of MySQL
    • As you can see from the physical architecture, each MySQL box also runs an HAProxy, in a hot-standby setup. Connections to MySQL are made through HAProxy.

      Putting an HAProxy on each database server keeps the stack highly reliable without spending yet another server on load balancing alone.

      The HAProxies run in active-passive mode (only one of them is active at a time). The hot-standby mechanism, Keepalived, controls their availability: it manages a floating IP (VIP) and checks the availability of the active load-balancer node. When that node crashes, the second (passive) HAProxy node takes over the IP.
    • Scalability
    • The database is usually the biggest bottleneck of an application. So far there has been no need to scale out; instead, we scale the Redis and MySQL servers up (vertically). Redis runs on a server with 128GB of memory, and there is still headroom: we could migrate it to a node with 256GB. A very large instance does make some operations less convenient, though, such as taking snapshots or restarting the server: starting a big Redis server simply takes longer.

      Once vertical scaling is exhausted, we can scale out (horizontally). Fortunately, our data comes with a straightforward sharding structure already prepared.

      We have four "heavy" record types in Redis, so the records can be split across four servers by data type. We prefer to shard by record type rather than by hash: this way we can still use MGET, which performs well within a single class of keys (a sketch of this routing follows).
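
      A minimal sketch of such type-based routing (the record types and server addresses are hypothetical):

      <?php
      // Type-based sharding sketch: each "heavy" record type lives on its
      // own Redis server, so MGET still works within one type. The record
      // types and server addresses below are hypothetical.

      require 'vendor/autoload.php';

      $shards = array(
          'product' => new Predis\Client('tcp://redis-products:6379'),
          'price'   => new Predis\Client('tcp://redis-prices:6379'),
          'stock'   => new Predis\Client('tcp://redis-stock:6379'),
          'session' => new Predis\Client('tcp://redis-sessions:6379'),
      );

      function mgetByType(array $shards, $type, array $ids)
      {
          $keys = array();
          foreach ($ids as $id) {
              $keys[] = $type . ':' . $id;
          }

          // All keys of one type sit on one shard, so a single MGET suffices.
          return $shards[$type]->mget($keys);
      }

      $prices = mgetByType($shards, 'price', array(1, 2, 3));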

      In MySQL, the data tables are structured so that they are easy to migrate to different servers; the tables, too, are grouped by record type.

      Having weighed the advantages of splitting by data type, we saw no need to resort to hash-based sharding.

    • Lessons learned
    • Do not share your database - once, a front-end website wanted to move its session handling to Redis and connected to our database. It exhausted our Redis cache space, and our application started being refused when storing new cache keys. All the caches began going to the MySQL server alone, which put far too much load on it.
      • Have detailed logs - when there is not enough information in the logs, you cannot quickly debug what went wrong. Once, for lack of a single piece of information, we could not find the cause of a problem and had to wait for it to occur again (after adding the required logging).

      • Using a complex framework does not mean "slowing the website down" - some people are surprised that we handle this many requests per second with a full-stack framework. It all depends on how cleverly you use the tools you have; you can be slow even on Node.js. Choose a technology that provides a good development environment rather than an unfriendly tool you keep complaining about (which lowers the development team's morale).




Who is behind the app


The platform was designed by Octivi, a Polish software company focused on scalable, high-performance architectures. We would also like to thank the IT department on the client's side.


Related articles
    • Handling 1 billion requests a week with Symfony2 - an overview of our whole application

    • Pushing Symfony2 to its limits - Symfony2 meets high-performance demands; a detailed description of the Symfony2 software architecture




