I recently finished reading Li Zhihui's "Large-Scale Website Technical Architecture: Core Principles and Case Analysis" and took notes on the key content of each chapter, both to reinforce my memory and for future review.
First, characteristics of large-scale website software systems
High concurrency, heavy traffic: the system must handle a large number of concurrent users and high-volume access.
High availability: the system provides uninterrupted service 24 hours a day, 7 days a week.
Massive data: storing and managing massive amounts of data requires a large number of servers.
Widely distributed users, complex network conditions: many large Internet sites serve users around the world, and network conditions vary widely from region to region.
Hostile security environment: because the Internet is open, websites are more exposed to attack, and large websites are targeted by hackers almost every day.
Rapidly changing requirements, frequent releases: unlike traditional software with its slower versioned release cycle, Internet products must adapt quickly to the market and meet user needs, so they are released very frequently.
Incremental development: unlike traditional software products or enterprise applications, where all functional and non-functional requirements are planned from the outset, almost all large Internet sites start as small websites and evolve incrementally.
Second, the evolution of large-scale website architecture
1. Initial-stage site architecture
All resources, such as the application, database, and files, are on a single server. Commonly used technologies are Linux, PHP, Apache, and MySQL.
2. Separating application and data services
Once the application is separated from the data, the site uses three servers: an application server, a file server, and a database server. After the separation, servers with different characteristics take on different service roles.
3. Using caches to improve website performance
The caches a website uses fall into two types: a local cache held on the application server, and a remote cache held on dedicated distributed cache servers. With caching in place, the data access pressure on the database is effectively reduced.
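The lookup order described above (local cache, then remote cache, then database) can be sketched as follows. This is a minimal illustration, not the book's implementation: the class name TwoLevelCache, the load_from_db callback, and the use of a plain dict as a stand-in for the distributed cache server are all assumptions made for the example.

```python
from typing import Callable, Dict, Optional


class TwoLevelCache:
    """Look up a key in the local cache, then the remote cache, then the database."""

    def __init__(
        self,
        remote: Dict[str, str],
        load_from_db: Callable[[str], Optional[str]],
    ) -> None:
        self.local: Dict[str, str] = {}    # in-process cache on the application server
        self.remote = remote               # stand-in for a distributed cache server
        self.load_from_db = load_from_db   # hypothetical database accessor
        self.db_reads = 0                  # counts how often the database is hit

    def get(self, key: str) -> Optional[str]:
        # 1. Local cache: cheapest, no network round trip.
        if key in self.local:
            return self.local[key]
        # 2. Remote cache: shared by all application servers.
        if key in self.remote:
            value = self.remote[key]
            self.local[key] = value
            return value
        # 3. Cache miss: fall back to the database and fill both caches.
        self.db_reads += 1
        value = self.load_from_db(key)
        if value is not None:
            self.remote[key] = value
            self.local[key] = value
        return value
```

Only the first read of a key reaches the database; later reads are served from a cache. Expiry and invalidation are left out of the sketch to keep it short.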
4. Using an application server cluster to improve the site's concurrent processing capacity
Clustering is a common way for websites to handle high concurrency and massive data. A load-balancing server can distribute access requests from users' browsers to any server in the application server cluster.
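As a rough illustration of how a load balancer spreads requests over the cluster, here is a round-robin sketch. Round-robin is only one possible scheduling policy and is chosen here as an assumption; the notes do not say which policy is used. The class name and server names are hypothetical.

```python
import itertools
from typing import List


class RoundRobinBalancer:
    """Hand out application servers in a fixed rotating order."""

    def __init__(self, servers: List[str]) -> None:
        if not servers:
            raise ValueError("at least one server is required")
        # itertools.cycle repeats the server list endlessly.
        self._cycle = itertools.cycle(servers)

    def pick(self) -> str:
        """Return the server that should handle the next request."""
        return next(self._cycle)


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
first_four = [balancer.pick() for _ in range(4)]
# first_four is ["app-1", "app-2", "app-3", "app-1"]
```

Because any server in the cluster can receive any request, this scheme assumes the application servers keep no per-user state locally.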
5. Database read/write separation
Once the site uses a cache, the vast majority of read operations can be served without touching the database, but some reads (cache misses, expired cache entries) and all writes still need to access it. By configuring a master-slave relationship between two database servers, data updates on one server can be synchronized to the other. To make it easy for the application to access the read/write-separated databases, the application server usually uses a dedicated data access module, which makes the read/write separation transparent to the application.
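The dedicated data access module mentioned above can be pictured as a small router that sends writes to the master and reads to a slave. The sketch below is an assumption-laden simplification: it treats any statement starting with SELECT as a read, and the class and host names are hypothetical.

```python
import itertools
from typing import List


class ReadWriteRouter:
    """Choose a database server for a SQL statement: reads go to slaves, writes to the master."""

    def __init__(self, master: str, slaves: List[str]) -> None:
        self.master = master
        # Rotate over the slaves; with no slaves, every statement goes to the master.
        self._slaves = itertools.cycle(slaves) if slaves else None

    def route(self, sql: str) -> str:
        # Simplifying assumption: only statements starting with SELECT are reads.
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self._slaves is not None:
            return next(self._slaves)
        return self.master


router = ReadWriteRouter("db-master", ["db-slave-1", "db-slave-2"])
read_target = router.route("SELECT name FROM users WHERE id = 1")
write_target = router.route("UPDATE users SET name = 'bob' WHERE id = 1")
# read_target is "db-slave-1", write_target is "db-master"
```

The application code only calls route (or a query method built on it) and never needs to know which server is the master, which is what makes the separation transparent.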
6. Accelerate website response with reverse proxy and CDN
CDNs and reverse proxies both work on the principle of caching. The difference is that a CDN is deployed in the data centers of network providers, so a user requesting the site can get data from the nearest provider's data center, while a reverse proxy is deployed in the website's own central data center. When a user request reaches the central data center, the first server it hits is the reverse proxy server; if the requested resource is cached there, it is returned directly to the user.
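The request path described above (CDN first, then the reverse proxy in the central data center, then the application server) can be sketched as a chain of cache lookups. This is an illustrative model only: the function name fetch, the dict-based caches, and the origin callback are assumptions, and it treats every response as cacheable, which real sites do not.

```python
from typing import Callable, Dict, Tuple


def fetch(
    path: str,
    cdn_cache: Dict[str, str],
    proxy_cache: Dict[str, str],
    origin: Callable[[str], str],
) -> Tuple[str, str]:
    """Return (body, served_by) for a requested path."""
    # 1. The CDN node closest to the user answers if it has the resource.
    if path in cdn_cache:
        return cdn_cache[path], "cdn"
    # 2. Otherwise the request reaches the central data center,
    #    where the reverse proxy is the first server it hits.
    if path in proxy_cache:
        body = proxy_cache[path]
        cdn_cache[path] = body
        return body, "reverse-proxy"
    # 3. Only on a miss in both caches does the application server do the work.
    body = origin(path)
    proxy_cache[path] = body
    cdn_cache[path] = body
    return body, "origin"
```

In this model the application server is called once per resource; repeat requests stop at the CDN or, failing that, at the reverse proxy.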
7. Using distributed file systems and distributed database systems
A distributed database is the last resort for splitting a website's database. The more common approach is to split by business, giving each business its own database, with the databases of different businesses deployed on different physical servers.
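The two splitting approaches can be contrasted in a short sketch: first route by business to a separate database server, and only then, as a last resort, spread one business's data across shards. The business names, host names, and the CRC32-based shard function are all assumptions chosen for illustration.

```python
import zlib

# Splitting by business: each business has its own database server (hypothetical hosts).
BUSINESS_DATABASES = {
    "user": "db-user.internal",
    "order": "db-order.internal",
    "product": "db-product.internal",
}


def database_for(business: str) -> str:
    """Return the database server that holds the given business's data."""
    try:
        return BUSINESS_DATABASES[business]
    except KeyError:
        raise ValueError(f"unknown business: {business}") from None


def shard_for(key: str, shard_count: int) -> int:
    """Last resort: pick a shard for one row when a single business's data outgrows one server.

    CRC32 is used because it gives the same result in every process,
    unlike Python's built-in hash() for strings.
    """
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    return zlib.crc32(key.encode("utf-8")) % shard_count
```

A call such as database_for("order") picks the server for the order business; shard_for("user-42", 4) would then pick one of four shards within a business, if that business ever needs sharding.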
8. Using NoSQL and search engines
Both NoSQL and search engines are technologies that grew out of the Internet, and both offer better support for scalability and distribution.
9. Business Split
To cope with increasingly complex business scenarios, large websites use divide and conquer: the site's business is divided into different product lines and the site is split into many applications, each deployed and maintained independently.
10. Distributed Services
Third, the value of large-scale website architecture evolution
No website in the world is born large, and none has a huge number of users, high-concurrency access, and massive data on day one. Large websites grow out of small ones. A site's value lies in what it can offer its users, in what the site can do, not in how it is built. Chasing architecture while the site is still small therefore puts the cart before the horse, and the cost outweighs the benefit. What a small site needs to do is serve its users well, create value, win their recognition, survive, and grow aggressively.
1. The core value of large-scale website architecture technology is responding flexibly to the website's needs
The core value of large-scale website architecture technology is not building a large website from scratch, but evolving gradually alongside a small site's business as it grows into a large website.
2. The main force driving the development of large-scale website technology is the website's business development
Innovative business models place progressively higher demands on the website architecture, which pushes innovative website architecture technology to develop and mature.
Fourth, common mistakes in website architecture design
1. Blindly following big companies' solutions
2. Technology for technology's sake
Website technology exists to serve the business; apart from that, it is meaningless.
3. Attempt to solve all problems with technology
Technology is used to solve business problems, but business problems can also be solved by business means.