A Large-Scale Website Architecture Based on Java Technology

Web Layer

The main architecture can be built on Struts 1.x/2.x, although there are many other good controller-layer frameworks to choose from; the guiding principle is fast, agile development. Abstract the core operations into a library that encapsulates calls to the controller and the middle tier. In a large cluster, session replication can cause serious performance problems. Consider replacing sessions with a cluster cache plus cookie validation for permission control.

Cache Layer

Configure Memcache nodes to form the cluster cache. Encapsulate the Memcache client and group the Memcached nodes into a pool. Schematic call: oplist(bizname, policy ...)

Middle tier

The "middle tier" is the layer between the application and the data. It provides Web applications with data caching and transparent data access, meaning the application does not need to know how data tables are split. It also exposes high-performance calls to the storage layer and distributed computing as services. Candidate frameworks: ICE; Hadoop; developing directly on Memcache (lower complexity, recommended).

Storage

MySQL is recommended. Reasons: it is free, proven in practice, and backed by a large number of mature cases, solutions, and technical support.
Small scale: a data table maintains the storage server array, content -> mount ...
Large scale: master-slave mode + MySQL Proxy to separate database reads and writes. Behind the middle tier's encapsulation, the following extensions can support larger data volumes:
Horizontal database/table split, for example user -> user33% + user33% + user34%
Vertical database/table split, for example user -> UserBaseInfo + UserAddrInfo
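
The horizontal split above needs a routing rule inside the middle tier so that applications never see which table holds a given user. The following is a minimal sketch, assuming three tables named user_0 to user_2 and routing by user ID modulo the table count. The class name, the table names, and the routing rule are illustrative assumptions, not part of the original scheme.

```java
// Hypothetical middle-tier helper: maps a user ID to one of N horizontally split tables.
final class UserShardRouter {
    private final String baseTable;
    private final int shardCount;

    UserShardRouter(String baseTable, int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        this.baseTable = baseTable;
        this.shardCount = shardCount;
    }

    // Math.floorMod keeps the index non-negative even for negative IDs.
    int shardIndex(long userId) {
        return (int) Math.floorMod(userId, (long) shardCount);
    }

    // Table name for the given user, e.g. "user_1" (naming scheme is an assumption).
    String tableFor(long userId) {
        return baseTable + "_" + shardIndex(userId);
    }

    public static void main(String[] args) {
        UserShardRouter router = new UserShardRouter("user", 3);
        System.out.println(router.tableFor(10L)); // user_1
        System.out.println(router.tableFor(12L)); // user_0
    }
}
```

Modulo routing spreads users roughly evenly, which matches the 33% / 33% / 34% split, but changing the number of tables later moves most rows, so the table count is best fixed generously up front.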

Storage can also use the Longstore (long storage) solution, with Longstore managing the storage array ...

Deploy

Divide the site into subdomains, with each subdomain deployed as its own Web application package so that they do not interfere with each other. Serve static resources (CSS, JS, images ...) from dedicated static servers.

Load Balancing

Small scale: DNS polling (round robin).
Large scale: F5, with 2*x F5 servers. F5 is an L4/L7 switch; each unit can handle at least 2 million connections (depending on server memory).
Nginx is an L7 switch; LVS load balancing is another option.
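
DNS polling distributes requests by handing out the server addresses in rotation. The following is a minimal sketch of that rotation, written as an in-process selector purely for illustration. The class name and the addresses are assumptions, and real DNS polling is configured on the DNS server rather than in application code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative round-robin selector; the addresses used below are made up.
final class RoundRobinSelector {
    private final List<String> servers;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinSelector(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("at least one server is required");
        }
        this.servers = new ArrayList<>(servers); // defensive copy
    }

    // floorMod keeps the index valid even after the counter overflows to negative values.
    String next() {
        int index = Math.floorMod(counter.getAndIncrement(), servers.size());
        return servers.get(index);
    }

    public static void main(String[] args) {
        RoundRobinSelector selector =
                new RoundRobinSelector(Arrays.asList("10.0.0.1", "10.0.0.2", "10.0.0.3"));
        for (int i = 0; i < 4; i++) {
            // Prints 10.0.0.1, 10.0.0.2, 10.0.0.3, then 10.0.0.1 again.
            System.out.println(selector.next());
        }
    }
}
```

Plain rotation ignores server health and load, which is one reason the larger-scale options in this section move to F5, Nginx, or LVS.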
Web Middleware Selection

Tomcat: up to 400 concurrent connections.
Apache: up to 2000 concurrent connections.
Nginx: superior to Apache.

Adopted scheme: Nginx + Resin. Reasons:
Resin provides a faster servlet engine, so choose Resin.
Gzip issue: Resin risks memory overflow when it handles gzip by itself, so put a layer of Nginx in front of it.
Nginx reduces the memory footprint compared with using Resin alone: Resin uses 1000 threads for 1000 connections, whereas with Nginx in front, its "asynchronous connection" and "long connection" mechanisms greatly reduce Resin's memory pressure.
Nginx has performance optimizations on Linux: zero copy, sendfile ...
Therefore use 1 Nginx + 1 Resin, one-to-one.
Static servers use Squid + Apache. Why? Because Squid provides caching ...

New changes: starting with version 0.7.48, Nginx supports caching similar to Squid's. The cache uses the URL and related elements combined as the key, stores entries on disk under the MD5 hash of that key, and can therefore handle any URL. It also supports non-200 status codes such as 404, 301, and 302. The official Nginx Web caching service can only set an expiration time for a given URL or status code; unlike Squid, it has no PURGE instruction for manually clearing a specific cached page. A third-party Nginx module, however, makes it possible to clear the cache for a specified URL. The Nginx Web caching service consists of the proxy_cache directive set and the fastcgi_cache directive set. The former caches content from back-end origin servers when Nginx acts as a reverse proxy; the latter mainly caches FastCGI dynamic programs. The two work in basically the same way. In Nginx 0.8.31, the latest version when this was written, proxy_cache and fastcgi_cache are fairly complete; together with the third-party ngx_cache_purge module (used to clear the cache for a specified URL), Nginx can completely replace Squid. Some websites have run the Nginx proxy_cache feature in production for more than two months; it is very stable and no slower than Squid.

Functionally, Nginx now has the Web cache acceleration that Squid offers, as well as the ability to clear the cache for a specified URL. In performance, Nginx makes far better use of multi-core CPUs than Squid does. Nginx is also much stronger than Squid in reverse proxying, load balancing, health checks, back-end failover, URL rewriting, and ease of use. A single Nginx instance can therefore serve as both a load-balancing server and a Web cache server. The following is a configuration fragment for reference:

http {
  ...
  client_body_buffer_size 512k;
  proxy_connect_timeout 5;
  proxy_read_timeout 60;
  proxy_send_timeout 5;
  proxy_buffer_size 16k;
  proxy_buffers 4 64k;
  proxy_busy_buffers_size 128k;
  proxy_temp_file_write_size 128k;

  # Note: the paths given to proxy_temp_path and proxy_cache_path must be on the same partition.
  proxy_temp_path /data0/proxy_temp_dir;
  # Name the Web cache zone cache_one: 200 MB of memory for the zone, entries not accessed for 1 day are removed, 30 GB of disk cache.
  proxy_cache_path /data0/proxy_cache_dir levels=1:2 keys_zone=cache_one:200m inactive=1d max_size=30g;

  server {
    ...
    location / {
      # If a back-end server returns 502 or 504, times out, or fails, forward the request to another server in the upstream load-balancing pool (failover).
      proxy_next_upstream http_502 http_504 error timeout invalid_header;
      proxy_cache cache_one;
      # Set different cache times for different HTTP status codes.
      proxy_cache_valid 304 12h;
      proxy_cache_valid 302 1h;
      # Combine the domain name, URI, and arguments into the Web cache key; Nginx hashes the key and stores the cached content in the two-level cache directory.
      proxy_cache_key $host$uri$is_args$args;
      proxy_set_header Host $host;
      proxy_set_header X-Forwarded-For $remote_addr;
      proxy_pass http://backend_server;
      expires 1d;
    }

    # Used to purge the cache: if a URL is http://192.168.1.44/test.txt, its cache can be cleared by requesting http://192.168.1.44/purge/test.txt.
    location ~ /purge(/.*) {
      # Allow only the specified IP addresses or ranges to purge the URL cache.
      allow 127.0.0.1;
      allow 192.168.0.0/16;
      deny all;
      proxy_cache_purge cache_one $host$1$is_args$args;
    }

    # Dynamic applications whose extension is .php, .jsp, or .cgi are not cached.
    location ~ .*\.(php|jsp|cgi)?$ {
      proxy_set_header Host $host;
      proxy_set_header X-Forwarded-For $remote_addr;
      proxy_pass http://backend_server;
    }
  }
}
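
To make the on-disk layout concrete: with levels=1:2, Nginx names each cache file after the MD5 hex digest of the cache key, placing it in a first-level directory named by the last hex character and a second-level directory named by the two characters before it. The sketch below reproduces that mapping in Java. The class and method names are hypothetical, and the sample key assumes a request for /test.txt on host 192.168.1.44 with no query string, so that $host$uri$is_args$args expands to "192.168.1.44/test.txt".

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper that mirrors how a levels=1:2 cache directory is laid out.
final class NginxCachePath {

    // Hex-encoded MD5 digest of the cache key.
    static String md5Hex(String key) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(key.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(32);
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    // levels=1:2: last hex character, then the two characters before it, then the full digest.
    static String relativePath(String md5Hex) {
        int n = md5Hex.length();
        String level1 = md5Hex.substring(n - 1);
        String level2 = md5Hex.substring(n - 3, n - 1);
        return level1 + "/" + level2 + "/" + md5Hex;
    }

    // Full path of the cache file for a given cache directory and key.
    static String pathForKey(String cacheDir, String cacheKey) {
        return cacheDir + "/" + relativePath(md5Hex(cacheKey));
    }

    public static void main(String[] args) {
        System.out.println(pathForKey("/data0/proxy_cache_dir", "192.168.1.44/test.txt"));
    }
}
```

A helper like this is mainly useful for debugging, for example to check whether a given URL currently has a file in the cache directory.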

In addition, static resources that affect page rendering, such as CSS and JS, can be hosted in an IDC (Internet Data Center) with high-quality bandwidth. High-quality, high-speed bandwidth costs more; you get what you pay for. Other static resources, such as images, can be hosted in a cheaper IDC. Use separate domain names for the two kinds of static resources to save every cent.

Network topology map

      / Nginx -1:1- Resin
F5 --
      \ Squid -1:n- Apache
Monitoring statistics platform

Business statistics: user access statistics
Software performance: application system monitoring, such as request response time ...
Hardware/network performance: Ganglia monitoring

Other points

The IE browser can establish only 2 connections to the same domain name (including subdomains); further requests can only queue ...
Dual F5 architecture: the two units take different roles, with mirroring and heartbeat takeover ...
RAID storage array ...
Linux operating system and its optimization ...

Reproduced from: http://blog.csdn.net/kthq/article/details/4456385
