One, characteristics of large websites:
1, high concurrency and heavy traffic.
2, large volumes of data to process (an inevitable consequence of point 1).
3, users are widely distributed and network conditions are complex.
4, high stability.
5, easy to expand.
6, high security.
Two, infrastructure of large websites
1, introduction to servers
1.1, Tower server: shaped like an ordinary PC, but relatively large. It has many slots and is highly extensible, so it is often used as an entry-level or workgroup-level server, but it takes up a relatively large amount of space.
1.2, Rack server: 19 inches wide, with height measured in rack units (1U = 1.75 inches = 44.45 mm).
This type of server takes up little space, and several of them can be installed in one cabinet for unified management, but because of the limited internal room its expandability is more restricted. Common standard sizes are 1U, 2U, 3U, 4U, 5U and 7U. Rack servers are mostly used by large enterprises that run many servers. Many companies also buy this type of server and hand it over to a dedicated server hosting provider, which is especially common for websites today.
1.3, Blade server: the main structure is a large chassis into which many "blades" can be inserted. Each blade is really a system motherboard that acts like a separate server and can boot its own operating system from a local hard disk. In this mode every blade runs its own system and serves its own designated user group, with no relationship between blades. Alternatively, system software can combine the blades into a single server cluster. In cluster mode all blades are linked over a high-speed network, share resources, and serve the same user base, and inserting new blades into the cluster raises overall performance. Because each blade is hot-swappable, blades are easy to replace and maintenance time is kept to a minimum. Blade servers save more space than rack servers and dissipate heat somewhat better, but they are relatively expensive.
2, virtual hosts, VPS hosts, and cloud hosts
2.1, Virtual host: a form of shared hosting. Many websites live on one server and share its hardware and bandwidth, so if that server fails, all of those sites become inaccessible.
2.2, VPS host: one server is split into multiple virtual private servers (VPS). Each VPS can be assigned its own public IP address, its own operating system, its own runtime space, and its own CPU resources. Compared with a virtual host it gives much more freedom of operation, and it is somewhat similar to running a virtual machine under Windows.
2.3, Cloud host: the site is no longer tied to a single server but runs on all the servers in the cloud. The main feature is stability: if one server or one network line has a problem, the system automatically reassigns your website and IP address to other servers, so a single failure does not make the site inaccessible.
3, network and security devices (switches, routers, firewalls)
4, network storage technologies
4.1, DAS (Direct Attached Storage): a storage device is connected directly to a single server through a SCSI interface. Acquisition cost is low, configuration is simple, and using it is essentially no different from using a local disk.
4.2, NAS (Network Attached Storage): a NAS is a storage device with a built-in thin server; you can think of it as a network file server. The NAS device connects directly to the TCP/IP network, and servers on the network access and manage its data over TCP/IP. As a thin server system, a NAS is easy to install and deploy and convenient to manage. Because clients can access data on the NAS directly without going through a server, it also reduces the load on the servers. Common brands include Synology and QNAP.
4.3, SAN (Storage Area Network): a specialized network built for storage, separate from the TCP/IP network. Thanks to high-end RAID array technology, it can reach very high transfer rates of 2-4 Gb/s.
The cost is relatively high, however, because a SAN requires Fibre Channel and a SAN disk array cabinet, so small businesses typically do not use it.
Three, basic concepts
1, HTTP protocol: Hypertext Transfer Protocol, default port 80. For HTTPS the default port is 443. With HTTPS the user's request is encrypted; in other words, if someone intercepts our request data, they cannot extract useful information from it.
2, the mainstream HTTP versions are 1.0 and 1.1. The biggest differences are that 1.1 supports persistent (long-lived) connections by default and requires the Host request header field. When we visit a site with a browser it uses HTTP/1.1 by default, although 1.0 can also be chosen.
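The difference between the two versions is easiest to see in the raw request text. The following is a minimal sketch, not a full client: the function name build_request and the host www.example.com are placeholders chosen for illustration.

```python
def build_request(path: str, host: str, version: str = "1.1") -> bytes:
    """Build the raw bytes of a minimal GET request.

    HTTP/1.1 must carry a Host header and keeps the connection open by
    default. A plain HTTP/1.0 request needs no Host header, and the
    connection is closed after the response unless keep-alive is negotiated.
    """
    lines = [f"GET {path} HTTP/{version}"]
    if version == "1.1":
        lines.append(f"Host: {host}")  # required by HTTP/1.1
    lines.append("")  # blank line marks the end of the headers
    lines.append("")
    return "\r\n".join(lines).encode("ascii")


if __name__ == "__main__":
    # www.example.com is a placeholder host, not taken from the notes.
    print(build_request("/index.html", "www.example.com", "1.0"))
    print(build_request("/index.html", "www.example.com", "1.1"))
```

The Host header is what lets one IP address serve many sites, which is how the shared virtual hosts described earlier can work.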
3, request methods: the specific operation we perform when we visit a site. Common methods are GET, POST, HEAD, PUT, and DELETE.
3.1 GET: requests the specified resource.
3.2 POST: submits data to the specified resource (for example a form or a file).
3.3 PUT: uploads content to the location of the specified resource.
3.4 DELETE: deletes a resource, similar to a DELETE in a database.
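To make the semantics of these methods concrete, here is a toy in-memory model. It is only a sketch: ResourceStore and its status-code choices are assumptions made for illustration, not the behavior of any particular web server, and no real network traffic is involved.

```python
class ResourceStore:
    """Toy in-memory stand-in for the resources behind a web server."""

    def __init__(self):
        self._items = {}   # path -> content
        self._next_id = 0  # used to name resources created by POST

    def get(self, path):
        """GET: return the resource without changing anything."""
        if path in self._items:
            return 200, self._items[path]
        return 404, None

    def head(self, path):
        """HEAD: same status as GET, but no body is returned."""
        status, _ = self.get(path)
        return status, None

    def post(self, collection, data):
        """POST: submit data; the server decides where to store it."""
        self._next_id += 1
        path = f"{collection}/{self._next_id}"
        self._items[path] = data
        return 201, path

    def put(self, path, data):
        """PUT: store the content at exactly the given location."""
        created = path not in self._items
        self._items[path] = data
        return (201 if created else 200), path

    def delete(self, path):
        """DELETE: remove the resource at the given location."""
        if path not in self._items:
            return 404, None
        del self._items[path]
        return 204, None


if __name__ == "__main__":
    store = ResourceStore()
    print(store.post("/articles", "first draft"))
    print(store.put("/articles/1", "edited"))
    print(store.get("/articles/1"))
    print(store.delete("/articles/1"))
    print(store.get("/articles/1"))
```

Repeating the same PUT leaves the store unchanged, while repeating the same POST creates another resource each time; that is the practical difference between the two.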
4, common status codes: 200, 301, 302, 304, 401, 403, 404, 500, 502, 503, 504
200, OK.
301, permanent redirect.
302, temporary redirect.
304, document not modified.
401, user authentication required.
403, no permission to access.
404, resource not found.
500, internal server error.
502, the request was not completed; typically nginx cannot reach PHP.
503, the request was not completed because the server is temporarily overloaded or down. Usually this means there is a problem with the backend server when the server is acting as a proxy.
504, gateway timeout; database problems often show up as this code.
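The codes listed above can be looked up with Python's standard library, which carries the official reason phrases. This is a small sketch; the describe helper and the family labels are names chosen here for illustration.

```python
from http import HTTPStatus

# The status codes discussed in the notes.
CODES = [200, 301, 302, 304, 401, 403, 404, 500, 502, 503, 504]

# The first digit of a status code tells which family it belongs to.
FAMILIES = {
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code: int) -> str:
    """Return the code with its standard reason phrase and family."""
    status = HTTPStatus(code)
    return f"{code} {status.phrase} ({FAMILIES[code // 100]})"


if __name__ == "__main__":
    for code in CODES:
        print(describe(code))
```

Grouping by the first digit is a quick way to tell whether a failure lies with the request (4xx) or with the server or its backend (5xx).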
5, elements that make up a website
JS, CSS, PNG, JPEG, RAR ...
6, URL and URI
URL stands for Uniform Resource Locator and URI for Uniform Resource Identifier. URLs are in fact a subset of URIs: a URL usually begins with http and opens a page, whereas some URIs cannot be accessed at all when typed into the address bar.
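The distinction can be illustrated with the standard urllib.parse module. This is a sketch under stated assumptions: the summarize helper, the is_web_url flag, and both sample addresses are made up for illustration, and the default ports are the 80 and 443 mentioned for HTTP and HTTPS.

```python
from urllib.parse import urlsplit

# Default ports for the two web schemes.
DEFAULT_PORTS = {"http": 80, "https": 443}


def summarize(uri: str) -> dict:
    """Split a URI into parts and flag whether it is a web URL."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        # Fall back to the scheme's default port when none is given.
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": parts.path,
        "query": parts.query,
        "is_web_url": parts.scheme in DEFAULT_PORTS,
    }


if __name__ == "__main__":
    # A URL: it says where the resource is and how to fetch it.
    print(summarize("https://www.example.com/index.php?id=7"))
    # A URI that only names something; a browser cannot open it.
    print(summarize("urn:isbn:0451450523"))
```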
7, static pages, dynamic pages, and pseudo-static pages
7.1 Static pages: typically pages ending in .html or .htm.
7.2 Dynamic pages: pages that generally end in .asp, .php, .jsp, or .cgi. The URL of a dynamic page often contains the telltale "?" character. The key point is that a dynamic page interacts with a backend database.
7.3 Pseudo-static pages: as the name implies, these are really dynamic pages that a rewrite rule presents as static pages, which helps the site's search ranking.
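The rewrite idea behind pseudo-static pages can be sketched with a regular expression. The rule below (mapping /article/123.html to /article.php?id=123) is a hypothetical example, not a rule from any real site; in practice this job is done by the web server's rewrite module rather than application code.

```python
import re

# Hypothetical rule: the visitor sees a static-looking address,
# the server runs a dynamic script with a query string.
REWRITE_RULES = [
    (re.compile(r"^/article/(\d+)\.html$"), r"/article.php?id=\1"),
]


def rewrite(path: str) -> str:
    """Return the internal dynamic path for a pseudo-static path.

    Paths that match no rule are returned unchanged.
    """
    for pattern, target in REWRITE_RULES:
        if pattern.match(path):
            return pattern.sub(target, path)
    return path


if __name__ == "__main__":
    print(rewrite("/article/123.html"))
    print(rewrite("/about.html"))
```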
8, common web server software
Apache, Nginx, IIS, Tomcat
9, terminology for site scale
9.1 IP, UV, PV
IP refers to the number of distinct IP addresses that access the site.
UV (unique visitors) refers to how many people visit the site, or how many computers access it.
PV (page views) refers to the number of pages viewed on the site, including refreshes.
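The three counts can be computed from a list of page hits as follows. This is a sketch with assumed inputs: the Hit record, the visitor_id field (for example a cookie value), and the sample addresses are all made up for illustration.

```python
from typing import Iterable, NamedTuple


class Hit(NamedTuple):
    ip: str          # client IP address
    visitor_id: str  # assumed per-visitor identifier, e.g. a cookie
    url: str         # page that was requested


def traffic_stats(hits: Iterable[Hit]) -> dict:
    """Count distinct IPs, unique visitors, and page views."""
    hits = list(hits)
    return {
        "ip": len({h.ip for h in hits}),          # distinct addresses
        "uv": len({h.visitor_id for h in hits}),  # distinct visitors
        "pv": len(hits),                          # every view counts
    }


if __name__ == "__main__":
    sample = [
        Hit("203.0.113.1", "alice", "/"),
        Hit("203.0.113.1", "alice", "/"),     # a refresh still adds to PV
        Hit("203.0.113.1", "bob", "/news"),   # second person, same IP
        Hit("203.0.113.2", "carol", "/"),
    ]
    print(traffic_stats(sample))
```

Two people behind the same office network share one IP, which is why UV can be larger than IP, while PV is always at least as large as UV.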
9.2 Concurrent connections and throughput
Concurrent connections: how many people are visiting the site at the same time.
Throughput: the number of requests our servers can handle in a given period of time.
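Both figures can be derived from request start and end times. The following is a sketch assuming each request is recorded as a (start, end) pair in seconds; the function names and the sample timings are illustrative only.

```python
from typing import List, Tuple

Request = Tuple[float, float]  # (start time, end time) in seconds


def peak_concurrency(requests: List[Request]) -> int:
    """Largest number of requests that were in progress at once."""
    events = []
    for start, end in requests:
        events.append((start, 1))   # a request begins
        events.append((end, -1))    # a request finishes
    # Tuples sort by time first; at the same instant the -1 (finish)
    # sorts before the +1 (begin), so back-to-back requests do not
    # count as overlapping.
    events.sort()
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


def throughput(requests: List[Request], window_start: float,
               window_end: float) -> float:
    """Requests completed per second within [window_start, window_end)."""
    done = sum(1 for _, end in requests if window_start <= end < window_end)
    return done / (window_end - window_start)


if __name__ == "__main__":
    sample = [(0.0, 2.0), (1.0, 3.0), (1.5, 2.5), (2.0, 4.0)]
    print(peak_concurrency(sample))
    print(throughput(sample, 0.0, 4.0))
```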
9.3, how to evaluate a site's maximum number of concurrent connections
9.3.1: professional tool: LoadRunner
9.3.2: ad hoc test tool: ab
yum install -y httpd provides the ab command; the URL under test must point to a specific file.
ab -c 100 -n N www.qq.com/index.php
-c: concurrency, the number of requests sent at the same time
-n: total number of requests to send (N above)
Key metrics:
Requests per second: the number of requests processed per second.
Time per request (mean): how long it takes to process one batch of concurrent requests; the concurrency here is 100.
Time per request (mean, across all concurrent requests): the average time for a single request.
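The three metrics are related by simple arithmetic, which the sketch below reproduces from the total request count, the concurrency, and the elapsed time. The function name ab_metrics and the sample figures (1000 requests in 5 seconds) are assumptions for illustration, not results from a real test run.

```python
def ab_metrics(total_requests: int, concurrency: int,
               elapsed_seconds: float) -> dict:
    """Derive ab-style summary metrics from a finished test run."""
    return {
        # Completed requests divided by the total test time.
        "requests_per_second": total_requests / elapsed_seconds,
        # Mean time for one batch of `concurrency` requests, in ms.
        "time_per_request_ms":
            concurrency * elapsed_seconds * 1000 / total_requests,
        # Mean time per single request across all concurrent ones, in ms.
        "time_per_request_all_ms":
            elapsed_seconds * 1000 / total_requests,
    }


if __name__ == "__main__":
    # Made-up run: 1000 requests at concurrency 100, finished in 5 s.
    for name, value in ab_metrics(1000, 100, 5.0).items():
        print(f"{name}: {value}")
```

The first time-per-request figure is simply the second one multiplied by the concurrency, so the two lines in the report always differ by that factor.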