10 Rules for Large, High-Performance Websites

In our work we have seen many different kinds of websites and systems, some good and some bad. A few have a sound server and network architecture and are properly tuned and monitored, while far more suffer from security and performance problems, run poorly, and such systems are becoming increasingly common.
In China, the open source LAMP stack is the most popular web architecture: applications written in PHP, running on Apache servers with MySQL as the database, all on Linux. It is a reliable platform that works well and is now the most popular Internet system architecture in the world. It is, however, hard to scale properly and keep secure, because each layer of the stack has its own problems, flaws, and best practices. Our job is to help businesses build and run high-performance, scalable, and secure systems at the lowest possible operating cost, so we have a great deal of experience with these kinds of issues.
The reality is that many sites are built quickly and cheaply by developers, usually without any IT staff or system administrators; the programmers themselves run the systems. The result is that while the site can get started at very low cost, it usually runs into real problems once it attracts a large number of users and needs to scale. After all, China has 380 million Internet users; if even 0.01% of them visit a site, that is 38,000 users, easily generating hundreds of thousands of page views. These problems arise at every layer, and the rules below summarize the most common ones, explain why they matter, and describe the best ways to fix them. Sites that follow these recommendations improve their scalability, security, and operational stability.
1. Use appropriate session management
The first way people think of to scale a system is to add more hardware, for example two servers instead of one. That sounds reasonable, but it creates a potential problem: session management. This is a serious issue for Java applications, and in PHP it creates scalability problems as well, especially for the database workload.
A session is a single end user's login or connection over a period of time. It typically spans multiple TCP/IP HTTP connections and several web pages, and often dozens or even hundreds of page elements such as frames, menus, Ajax updates, and so on. All of these HTTP requests need to know who the user is, both to meet security requirements and to deliver the right content, because they are all part of the same session. Each session usually carries related session data such as the user name, user ID, history, shopping cart, statistics, and so on.
The problem is that with two web servers and many HTTP connections, a user's traffic is spread across and moves between the two servers, and it is hard for either server to know who the user is and to track all of this data, because each page or page component can come from a different server. In PHP this is usually solved by creating a session ID on the first connection or login and placing it in a cookie, which is then sent along with every HTTP request.
Each PHP script then has to look up the session data from that ID. Since PHP does not keep state between executions (unlike Java), the session data has to be stored somewhere, usually in a database. But if a complex page looks up session data 10 times per page load (which is common), every page load issues 10 SQL queries, which puts a heavy load on the database.
In the earlier example of 0.01% of China's Internet users, it is easy to end up with hundreds of queries per second just to manage sessions. The solution is to keep using the session ID in the cookie, but cache the session data in a service such as memcached for high performance.
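A minimal sketch of that pattern, assuming the memcached PHP extension and a hypothetical get_user_from_db() helper for the fallback path:

<?php
// Cache-backed session lookup: hit memcached first, fall back to the
// database only on a miss. The key prefix, the 30-minute TTL, and
// get_user_from_db() are illustrative assumptions, not a fixed API.
$mc = new Memcached();
$mc->addServer('127.0.0.1', 11211);

function load_session(Memcached $mc, string $sessionId): array
{
    $key  = 'sess:' . $sessionId;
    $data = $mc->get($key);
    if ($data !== false) {
        return $data;                       // cache hit: zero SQL queries
    }
    $data = get_user_from_db($sessionId);   // hypothetical DB lookup
    $mc->set($key, $data, 1800);            // cache for 30 minutes
    return $data;
}

With ten session lookups per page now served from memory, the database sees almost none of that traffic.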
Note the security issue here as well: a hacker can forge another user's session ID, which is easy to sniff or observe, especially on public Wi-Fi. The fix is to properly encrypt or sign the session ID and bind it to a timestamp, the IP address, and other identifying details such as the browser. There are plenty of good examples of session management on the Internet, and you can find one that fits your needs.
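One possible shape of such a signature is a sketch like this, which binds the session ID to the client IP and browser with an HMAC; the secret and cookie layout are assumptions:

<?php
// Sign the session ID over the client IP and user agent.
// SECRET_KEY is a placeholder; keep the real key out of the web root.
const SECRET_KEY = 'replace-with-a-long-random-secret';

function client_context(): string
{
    return $_SERVER['REMOTE_ADDR'] . '|' . ($_SERVER['HTTP_USER_AGENT'] ?? '');
}

function make_session_cookie(string $sessionId): string
{
    $sig = hash_hmac('sha256', $sessionId . '|' . client_context(), SECRET_KEY);
    return $sessionId . '.' . $sig;
}

function verify_session_cookie(string $cookie): ?string
{
    $parts = explode('.', $cookie, 2);
    if (count($parts) !== 2) {
        return null;
    }
    [$sessionId, $sig] = $parts;
    $expected = hash_hmac('sha256', $sessionId . '|' . client_context(), SECRET_KEY);
    return hash_equals($expected, $sig) ? $sessionId : null;  // constant-time compare
}

A forged or replayed ID from a different IP or browser then fails verification; binding to a timestamp works the same way, by folding an expiry time into the signed string.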
2. Always consider security
Writing secure code involves many issues, such as preventing SQL injection and securing logins, yet unfortunately almost nobody thinks about security, and those who do often do not understand it well enough. This article is concerned with operational system security, where we focus on three areas: firewalls, service users, and file permissions.
In addition to a dedicated hardware firewall (such as a Cisco ASA), every server should also run a software firewall such as iptables, which protects it from other threats and attacks. These may come from the public Internet, from other servers, or from the local network, and that includes developers and operators connecting over VPN or SSH tunnels. Open only the ports that are really needed, and only to the specified IPs. iptables can be complex, but there are plenty of good templates, and we regularly help customers build their iptables rules from them. For example, the default Red Hat or CentOS firewall configuration is only about 10 lines, which is clearly not adequate in practice; our best-practice iptables configuration runs about 5 pages and provides the highest level of security Linux can offer.
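The full policy is far too long to reproduce here, but a minimal sketch of the default-deny idea looks like this (the office/VPN range 203.0.113.0/24 is a placeholder):

# Default-deny inbound; allow loopback and replies to established connections
iptables -P INPUT DROP
iptables -A INPUT -i lo -j ACCEPT
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
# Public web traffic only
iptables -A INPUT -p tcp --dport 80 -j ACCEPT
# SSH only from a known office/VPN range
iptables -A INPUT -p tcp --dport 22 -s 203.0.113.0/24 -j ACCEPT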
All public services should run under dedicated users, such as apache. Never run them as root: that would let anyone who breaks into Apache take over the entire server. If Apache runs as the apache user or as nobody, a break-in gains far less. Files that the web server runs or serves (such as .php and .html files) should not be writable by the web server user; that is, the apache or nginx user should not have write access to the web directories. There are many ways to arrange this, and the simplest is to have the files owned by another user, with the apache/nginx user in a group that can read them under 640 permissions. This protects against almost all hacking and page-based attacks.
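Assuming a hypothetical deploy user that owns the code, the arrangement looks like this:

# Code owned by the deploy user; the apache group may read, never write
chown -R deploy:apache /var/www/html
find /var/www/html -type f -exec chmod 640 {} \;
find /var/www/html -type d -exec chmod 750 {} \;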
Also, never use FTP to upload files, especially on public Wi-Fi, where hackers can easily steal usernames and passwords; use SFTP instead, which is far more secure. In addition, each employee should have their own user ID and a random password.
3. Use standard paths and installation configurations
A nasty deployment problem is that developers rarely consider where their software will live on the production web servers or how it will be deployed. We have seen many large systems with their PHP code deployed under paths like /home/xiaofeng or /web/code. Both are completely nonstandard and create operational and security problems. When these systems move from development to testing to production, the nonstandard installation layout causes problems at every step, and developers have to make adjustments just to get things working.
Always install servers such as Apache from standard packages and binaries. Do not compile and install from source tarballs: that causes long-term stability and management problems, and having several different versions installed on one server creates confusion.
Websites should always be tested and deployed under the standard path for the platform and Linux distribution, such as /var/www/html on Red Hat or CentOS. This makes permissions management, backup, configuration, monitoring, and other operational work much easier.
The web server's logs should be stored under /var/log or /var/log/app_name, not in the main code area. This matters partly because standard paths are important in their own right, and partly because servers are normally configured with /var as a separate filesystem. If the application suddenly writes a flood of logs and consumes all the disk space, that separation keeps it from crashing the system or causing other serious problems; with the logs anywhere else, it could.
4. Always use logs
You can hardly log too much in a web system. Every system should write important data to logs, whether its own log files or the system's syslog. Cron jobs and other shell scripts or C programs have good standard facilities and simple functions for logging. In a shell script, the logger command is all you need to write a log entry. Log when a script starts and stops, when important steps execute, and when real data is produced. Then, when something happens, a look at the main system log makes it easy to see what is going on.
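For example, a cron-driven script might log its lifecycle like this (the report job itself is hypothetical):

#!/bin/sh
logger "nightly-report: started"
if ! /usr/local/bin/build_report.sh; then    # hypothetical job
    logger "nightly-report: FAILED"
    exit 1
fi
logger "nightly-report: finished, report generated"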
Large systems often log to a dedicated syslog facility such as local5, and configure syslog or syslog-ng to store those entries in a separate file, which is much easier to work with. Note that the syslog tools and logger (and most syslog calls) default to the user.notice facility and priority, which you can adjust as needed.
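For instance, a script can send entries to local5 explicitly, and one classic syslog-style selector line (shown here in rsyslog's format; syslog-ng has an equivalent filter and destination) routes that facility to its own file. The file name is an assumption:

# In the script: log at local5.info instead of the default user.notice
logger -p local5.info "myapp: cache rebuilt in 2.3s"

# In /etc/rsyslog.conf:
local5.*    /var/log/myapp.log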
A good system lets you turn logging on or off in configuration, ideally with different log levels per module or feature. This makes it possible to record very detailed, powerful logs for analyzing and debugging problems that occur in production.
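A minimal PHP sketch of per-module levels, where the module names and the config array are illustrative assumptions:

<?php
// Log only when a message's level reaches the module's configured
// threshold; modules not listed in the config default to 'info'.
const LEVELS = ['debug' => 0, 'info' => 1, 'warn' => 2, 'error' => 3];

$logConfig = ['cart' => 'debug', 'search' => 'warn'];  // e.g. from a config file

function app_log(array $cfg, string $module, string $level, string $msg): void
{
    $threshold = $cfg[$module] ?? 'info';
    if (LEVELS[$level] >= LEVELS[$threshold]) {
        syslog(LOG_NOTICE, "[$module][$level] $msg");
    }
}

app_log($logConfig, 'cart', 'debug', 'recalculated totals');  // written
app_log($logConfig, 'search', 'info', 'query parsed');        // suppressed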
5. Use good database design and SQL
In almost any system, the database is the biggest performance bottleneck. The two biggest factors in database performance are the database design and the quality of the SQL code. Many systems have a good, or at least usable, database design, but for lack of proper performance testing the SQL code is usually poor. Such SQL may run very fast in development, where the dataset is small and the load minimal, but when thousands of users are reading millions of records in the database at the same time, it is likely to collapse.
Unfortunately, these problems are not obvious at first; they surface only once the system grows and suddenly starts to fall over. While the system is growing, the database looks very fast (the data fits in memory and there are few concurrent queries) and responds quickly to users, even though its internal operation is inefficient. That is why the focus should be on finding and fixing these issues before the system grows into real performance trouble.
There are many good books and websites on this subject, and tools such as the slow query log, the InnoDB status output, and MySQL's statistics describe current performance. We have seen systems reading 500,000 rows per second, a clear sign of bad SQL, yet companies often have no idea until the servers start to crash.
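Turning on the slow query log is the cheapest first step. In my.cnf (variable names as in MySQL 5.1 and later; the threshold is a judgment call):

[mysqld]
slow_query_log      = 1
slow_query_log_file = /var/log/mysql/slow.log
long_query_time     = 1    # seconds; lower it as your latency budget tightens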
MySQL systems should use the InnoDB storage engine for all data: InnoDB is faster and more stable than the older MyISAM, and performance management and backups are easier and quicker. InnoDB should be set as the default engine in the main configuration file, and the system should be checked from time to time for accidentally created MyISAM tables.
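Setting default-storage-engine = InnoDB in my.cnf covers new tables; a periodic query against information_schema catches strays:

-- List MyISAM tables that slipped past the InnoDB default
SELECT table_schema, table_name
FROM   information_schema.tables
WHERE  engine = 'MyISAM'
  AND  table_schema NOT IN ('mysql', 'information_schema');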
6. Always have a good DB configuration and backup
Many companies have no good backup mechanism and do not know how to do this job properly. mysqldump alone is not enough; the best backup method is to use LVM snapshots with InnoDB to take hot backups of the system, which is extremely fast and highly reliable.
In addition, compress and encrypt all backup files before transferring them off the server. Also make sure the MySQL configuration itself is well designed: a default MySQL installation sets only 5~10 configuration values, which is not adequate for production use at all. The best-practice configuration document we give customers is about 10 pages long, with roughly 100 useful settings covering security, performance, and stability, including preventing data corruption; many of them are important.
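A stripped-down sketch of such a backup, with volume names, mount points, and the key file as placeholders (tools such as mylvmbackup wrap the same steps and also take a brief global read lock for a consistent binlog position):

#!/bin/sh
# Snapshot the MySQL data volume; InnoDB recovers it like a clean crash
lvcreate --size 5G --snapshot --name mysql-snap /dev/vg0/mysql
mount -o ro /dev/vg0/mysql-snap /mnt/snap
# Compress and encrypt before the file ever leaves the server
tar czf - -C /mnt/snap . \
  | openssl enc -aes-256-cbc -pass file:/root/.backup-key \
  > /backup/mysql-$(date +%F).tar.gz.enc
umount /mnt/snap
lvremove -f /dev/vg0/mysql-snap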
7. Use read/write database separation
As systems grow larger, especially when the SQL is poor, a single database server is often no longer enough to handle the load. Multiple databases mean replication, unless you partition the data. Most commonly this means setting up master/slave replication, where the application sends all UPDATE, INSERT, and DELETE statements to the master, and all SELECTs read from the slave database (or from several slaves).
Although conceptually simple, a clean and correct implementation is not easy and can require a lot of code work. Therefore, even if you start with a single database server, plan early to use separate DB connections for reads and writes in your PHP code. Done right, the system can then scale to 2, 3, or even 12 database servers with high availability and stability.
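A minimal sketch of that split in PHP, with hostnames and credentials as placeholders; even while both handles point at the same server, routing reads through read_db() makes the later scale-out a configuration change:

<?php
// Route writes to the master and reads to a slave via separate PDO handles.
function write_db(): PDO
{
    static $pdo = null;
    if ($pdo === null) {
        $pdo = new PDO('mysql:host=master.db.internal;dbname=app', 'app', 'secret');
    }
    return $pdo;
}

function read_db(): PDO
{
    static $pdo = null;
    if ($pdo === null) {
        // With several slaves, pick one here (round-robin or random)
        $pdo = new PDO('mysql:host=slave1.db.internal;dbname=app', 'app', 'secret');
    }
    return $pdo;
}

// Writes go to the master...
write_db()->prepare('UPDATE users SET name = ? WHERE id = ?')->execute(['Li', 42]);
// ...reads come from a slave
$name = read_db()->query('SELECT name FROM users WHERE id = 42')->fetchColumn();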
8. Use a database cache such as memcached
Even with good database design, good SQL, and read/write separation, large systems still need more speed, especially for things like session state, friend lists, and BBS posts. The answer is a data cache such as memcached, a simple, high-performance cache used by all of the largest sites. Be careful, though, not to rely 100% on a single memcached server for performance: if that server crashes, it will drag down the performance of the entire system. Instead, run a small cluster of two or more memcached servers, optionally with a cache-warming process that reloads data after a cache server restarts, so the cache fills quickly again.
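A cache-aside sketch over two memcached servers, where the hosts, key format, TTL, and the fetch_friends_from_db() helper are illustrative assumptions:

<?php
// Keys hash across both servers, so losing one loses only part of the cache.
$mc = new Memcached();
$mc->addServers([['10.0.0.5', 11211], ['10.0.0.6', 11211]]);

function get_friend_list(Memcached $mc, int $userId): array
{
    $key  = "friends:$userId";
    $list = $mc->get($key);
    if ($list === false) {                       // miss: rebuild from the DB
        $list = fetch_friends_from_db($userId);  // hypothetical slow query
        $mc->set($key, $list, 600);              // cache for 10 minutes
    }
    return $list;
}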
9. Build a test and development environment
Many companies have only developer desktops and their production servers. As systems grow larger and more complex, testing and managing code this way leads to serious problems. The best practice is to have two test systems: one for integration testing of developers' code and features, and another that exactly matches the production environment, making the transition to production smooth. Fortunately, cloud computing (or a private cloud) now makes this easy: a 5~10 server production environment can be replicated for testing on a single server in the office or IDC, and that one server can serve multiple customer projects.
10. Use version control
Finally, use version control for everything, including what you test and deploy to production. Many developers use SVN or something similar; ideally it covers all code, scripts, HTML, images, configuration, documentation, and tests. Version control should be the only way code moves to the test environment, never a plain copy or a tar file, because both are unreliable. Developers check everything in and tag it, then check the tag out onto the test system; if all is well, the same tag is checked out to production.
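With SVN, that flow might look like this (the repository URL and tag name are placeholders):

# Tag the revision that passed integration testing
svn copy http://svn.example.com/repo/trunk \
         http://svn.example.com/repo/tags/release-1.4 \
         -m "Release 1.4: passed integration tests"

# Deploy by checking the tag out on the test, then the production, server
svn checkout http://svn.example.com/repo/tags/release-1.4 /var/www/html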
Summary

Building a reliable, high-performance web system requires attention to many things, in development as much as in operations. This article has tried to cover the most important points from the perspective of operability and reliability. When you build and run your site, do not forget these issues: following these rules will help keep your system running well for a long time.