Architecture and deployment issues for large Java Web projects

A user with the ID jackson1225 asked on JavaEye about architecture and deployment choices for a large web system, hoping to improve the service capacity of an existing Java-based web application. Because architectural patterns and deployment tuning are perennial hot topics in the Java community, the question drew an enthusiastic discussion, parts of which are instructive for other large web projects as well. At the start of the discussion, jackson1225 described the application's current architecture and deployment:

The system architecture is as follows:

  1. The web layer is implemented with Struts + Tomcat. The whole system uses more than 20 web servers, and load balancing is handled by F5 hardware.
  2. The middle layer is implemented with stateless session beans + DAOs + helper classes, on a total of 3 WebLogic servers with multiple EJBs deployed; load balancing is also handled by F5.
  3. Database access is implemented by common classes. Two Oracle database servers hold the user information and the business data respectively; a SQL Server database holds third-party business data.

The web layer accesses the middle layer through EJB remote interfaces. It looks up the appropriate remote interface from the EJB interface information configured in an XML configuration file and then invokes it.

One operation in this system involves access to two Oracle databases and one SQL Server database, i.e. three database connections, completed in one transaction.

Architectures like this are used by many companies: Struts and Tomcat are the most popular Java web MVC framework and servlet container respectively, and an F5 load balancer is a common solution for scaling out (for example, configured for sticky sessions). Because this system has transactions that span data sources, the WebLogic Server EJB container together with database drivers that support two-phase commit can guarantee the integrity of those cross-data-source transactions (although container-managed distributed transactions are neither the only solution nor necessarily the best one).
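
To make the two-phase commit idea mentioned above concrete, here is a minimal in-memory sketch. It is not how WebLogic or an XA driver implements it; the Participant interface, the InMemoryResource class, and the resource names are all hypothetical and exist only for illustration. The coordinator first asks every participant to prepare (vote), and commits only if all of them vote yes; otherwise it rolls everything back.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical participant contract; a real system would rely on
// XA-capable drivers coordinated by the application server.
interface Participant {
    boolean prepare();   // phase 1: vote yes (true) or no (false)
    void commit();       // phase 2, only if every vote was yes
    void rollback();     // phase 2, if any vote was no
}

// Hypothetical stand-in for a database connection taking part in the transaction.
class InMemoryResource implements Participant {
    final String name;
    final boolean canPrepare;
    String state = "ACTIVE";

    InMemoryResource(String name, boolean canPrepare) {
        this.name = name;
        this.canPrepare = canPrepare;
    }

    public boolean prepare() {
        if (canPrepare) {
            state = "PREPARED";
        }
        return canPrepare;
    }

    public void commit() { state = "COMMITTED"; }

    public void rollback() { state = "ROLLED_BACK"; }
}

class TwoPhaseCommitSketch {
    // Returns true if the transaction committed on all participants.
    static boolean run(List<? extends Participant> participants) {
        // Phase 1: collect votes; a single "no" aborts the whole transaction.
        for (Participant p : participants) {
            if (!p.prepare()) {
                for (Participant q : participants) {
                    q.rollback();
                }
                return false;
            }
        }
        // Phase 2: everyone voted yes, so commit everywhere.
        for (Participant p : participants) {
            p.commit();
        }
        return true;
    }

    public static void main(String[] args) {
        // Three resources, mirroring the two Oracle databases and one SQL Server database.
        List<InMemoryResource> resources = Arrays.asList(
                new InMemoryResource("oracle-users", true),
                new InMemoryResource("oracle-business", true),
                new InMemoryResource("sqlserver-third-party", true));
        System.out.println("committed: " + run(resources));
        for (InMemoryResource r : resources) {
            System.out.println(r.name + " -> " + r.state);
        }
    }
}
```

The extra round of voting is what makes the outcome all-or-nothing, and it is also part of why distributed transactions cost more than local ones.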

But with the popularity of Rod Johnson's influential book "Expert One-on-One J2EE Development without EJB" and of the Spring framework it led to, the concept of lightweight frameworks and lightweight containers has taken root. So most participants questioned the scenario jackson1225 described, arguing that the system misuses technology and wastes money. Most of them felt that SLSBs (stateless session beans) are not necessary in this scenario, and that having SLSBs access local resources through a remote interface carries a significant performance cost, which is exactly what Rod Johnson criticizes in "without EJB" as a major anti-pattern of EJB 2.x.

Because Java EE is a pattern-based solution, patterns and architecture play an important role in it, and many industry experts are wary of anti-patterns creeping in. As to whether the scheme described above is an anti-pattern, jackson1225 immediately spoke up in its defense:

In our project, EJB serves only as a facade that exposes remote interfaces for the web layer to call, and we use only stateless session beans, so the performance is acceptable.

This explanation was soon accepted by some participants, but it quickly became clear that whether an architecture is good or bad depends on whether it meets the users' needs, so Davexin (perhaps a colleague of jackson1225) described the system's user base and concurrency:

We currently have 40 million users and are about to merge with another company's membership system, which adds up to 90 million users. The larger tables hold more than 100 million records. That is the basic situation. I actually think the current architecture is still workable: it now supports about 5,000 concurrent users, and the goal of the overhaul is to support 10,000 concurrent users.

After the specific concurrency figures were posted, participants questioned them, arguing that each servlet container was supporting too few concurrent users and asking whether the configuration had been tuned at all. Davexin then added the server configuration for the project:

The front-end Tomcat instances all run on blade servers with 2 GB of memory and CPUs of around 2.0 GHz. Each machine supports only 250 to 400 concurrent users; beyond that, response times become very long, more than 20 seconds, which makes the service pointless. That is how we arrived at this figure.

A user with the ID Cauherk offered a more pertinent view. Rather than proposing improvements based purely on the web container's concurrency capacity, he gave some general improvement tips for applications of this kind, summarized here:

  1. Database load

    Databases can be split according to business, region, and other characteristics. Techniques such as separate databases, RAC, partitioning, and table splitting can be used to keep the database processing transactions normally.

  2. Transaction issues

    Operating on two databases means distributed transactions must be considered. Design the system carefully to avoid distributed transactions where possible, which avoids the extra database load and the other problems they bring. A deferred commit strategy (which does not guarantee data integrity) is recommended as a way to sidestep distributed transactions; after all, the probability of a commit failure is low.

  3. Web optimization

    Serve static files and images from separate servers. For ordinary static files, use ETags or client-side caching; Google does a lot of this. For hot content, consider loading it entirely into memory to guarantee responsiveness, and use a centralized cache (several can be load balanced) for hot-spot data that is accessed frequently, reducing the pressure on the database.

    For almost all files except binaries, configure hardware-based compression on the L4 device to reduce network traffic and improve perceived performance for users.

  4. Network issues

    Consider using mirroring, access over multiple networks, and DNS-based load balancing. If you have enough budget, you can use a CDN (content distribution network) to reduce the pressure on your servers.
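
The ETag suggestion in item 3 can be sketched roughly as follows. This is only an illustration of the idea, not code from the discussion: the class and method names are hypothetical, and deriving the tag from a SHA-256 hash of the response body is just one possible choice. The server hands out a tag with each response; when the client sends the tag back in If-None-Match and the content has not changed, the server can answer 304 Not Modified without resending the body.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class EtagSketch {
    // Builds a quoted ETag from the first 8 bytes of the body's SHA-256 digest.
    // (Assumption: hashing the body; real servers may use size and mtime instead.)
    static String etagFor(byte[] body) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(body);
            StringBuilder sb = new StringBuilder("\"");
            for (int i = 0; i < 8; i++) {
                sb.append(String.format("%02x", digest[i] & 0xff));
            }
            return sb.append('"').toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Decides the HTTP status for a GET: 304 if the client's cached copy is
    // still current, otherwise 200 (the full body would be sent).
    static int statusFor(String ifNoneMatch, byte[] body) {
        return etagFor(body).equals(ifNoneMatch) ? 304 : 200;
    }

    public static void main(String[] args) {
        byte[] page = "<html>hello</html>".getBytes(StandardCharsets.UTF_8);
        String tag = etagFor(page);
        System.out.println("first request:  " + statusFor(null, page));
        System.out.println("repeat request: " + statusFor(tag, page));
        byte[] changed = "<html>updated</html>".getBytes(StandardCharsets.UTF_8);
        System.out.println("after change:   " + statusFor(tag, changed));
    }
}
```

Note that hashing the body saves bandwidth but not the work of generating the page; to save server load as well, the tag would have to come from something cheaper, such as a version number or last-modified timestamp.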

Cauherk's analysis is on target. ETags in particular have been a hot topic recently; the InfoQ article "Using ETags to Reduce Web Application Bandwidth and Load" describes this approach in detail. Database-centric web applications typically hit their performance bottleneck in the database, which is why Cauherk listed the database and transaction issues first. But Davexin explained that the database is not the bottleneck in the project under discussion:

Our pressure is not in the database layer but in the web layer and the F5. At peak times the F5 is close to dying as well: with more than 300,000 clicks per second, the dynamic part of the web tier cannot cope. According to our application logs, the 20 web servers can withstand at most 5,000 concurrent users; beyond that, Tomcat stops responding, as if it were dead.

This reply steered the discussion toward tuning the web container, but JavaEye webmaster Robbin offered his own view and brought the topic back to the architecture of the project itself:

The most important thing in performance tuning is to locate where the bottleneck is and how it arises.

My guess is that the bottleneck is still in the EJB remote method calls!

The Java application on Tomcat accesses the stateless session beans on WebLogic through EJB remote method calls. Such a remote call generally takes 100 ms to 500 ms, or even more. Without remote method calls, even with heavy use of Spring's dynamic reflection, a complete web request processed inside the local JVM generally finishes in less than 20 ms. A web request that takes too long to execute ties up its servlet thread for longer, so the container cannot respond to subsequent requests in time.

If that is the case, then my suggestion is this: since you are not using distributed transactions, simply remove the EJBs. WebLogic can go too. Replace EJB with Spring in the business layer, drop the distributed architecture, and deploy the complete set of layers on each Tomcat instance.

In addition, under high concurrency Apache consumes considerable memory and CPU when serving static resources; consider replacing it with a lightweight web server such as lighttpd, LiteSpeed, or nginx.
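
Robbin's latency argument can be made concrete with a little arithmetic. If each request occupies one servlet thread for its whole duration, a pool of N threads can complete at most N * 1000 / latency_ms requests per second. The sketch below assumes a pool of 150 threads; that number is an illustrative assumption, not a figure from the discussion, and the class and method names are hypothetical. The latencies are the ones Robbin quotes.

```java
class ThreadThroughputSketch {
    // Upper bound on completed requests per second when every request
    // holds exactly one thread for latencyMs milliseconds.
    static double maxRequestsPerSecond(int threads, double latencyMs) {
        return threads * 1000.0 / latencyMs;
    }

    public static void main(String[] args) {
        int threads = 150; // assumed servlet thread pool size, for illustration only

        // Remote EJB call on the request path: 100 ms to 500 ms per request.
        System.out.println("100 ms per request: " + maxRequestsPerSecond(threads, 100) + " req/s");
        System.out.println("500 ms per request: " + maxRequestsPerSecond(threads, 500) + " req/s");

        // Everything handled inside the local JVM: about 20 ms per request.
        System.out.println(" 20 ms per request: " + maxRequestsPerSecond(threads, 20) + " req/s");
    }
}
```

With the same thread pool, cutting the time per request from 500 ms to 20 ms raises the ceiling by a factor of 25, which is the core of the argument for keeping the web and business layers in one JVM. This is only an upper bound; CPU, memory, and downstream systems will limit real throughput earlier.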

Robbin's inference was supported by other participants. Davexin also agreed with Robbin's view, but explained that the company considered it too risky to give up SLSBs and was therefore inclined to replace Tomcat with WebLogic Server 10 to raise the system's user capacity. Robbin immediately criticized this approach:

Frankly, I have never heard of a large-scale Internet application that uses EJB. Why can't large-scale Internet applications use EJB? Because EJB performance is simply too poor; using EJB almost inevitably leads to performance barriers.

The performance of a web container comes down to its servlet thread scheduling. Tomcat does not carry the many extra management features that WebLogic adds, so it is natural that it runs faster. Compare the WebLogic database connection pool with the C3P0 connection pool and you will reach a similar conclusion: C3P0 can be several times faster than WebLogic's pool. This is not to say that WebLogic performs poorly; it implements more features, so it sacrifices a good deal of raw speed.

In my experience, with Tomcat 5.5 or later, APR support configured, the necessary tuning done, and the BEA JRockit JVM, supporting 500 concurrent users on each of your current blades is entirely possible. With your current 20 blades, reaching 10,000 concurrent users is no problem. The precondition, of course, is that the EJBs must be thrown out and the web and business layers must run in the same JVM.

Next, Robbin analyzed Davexin's test data for Tomcat and WebLogic:

Quote:

2. One WebLogic 10 Express (equivalent to one Tomcat, used to publish the JSP application) plus one WebLogic 10 (publishing the EJB application) can support 1,000 concurrent users. .... 4. One Tomcat 4.1 plus one WebLogic 8 can support only 350 concurrent users; Tomcat connections time out, indicating that the bottleneck in this setup is Tomcat.

This indicates that the bottleneck is not in the EJB remote calls after all, and the problem is becoming clearer. Why can WebLogic, acting as the web container and issuing remote EJB calls, support 1,000 concurrent users while Tomcat manages only 350? There are only two possible causes:

  1. Your Tomcat is not configured properly, which seriously affects performance.
  2. Something is wrong with the interface between Tomcat and WebLogic.

Then Jiangnan White, initiator of the SpringSide project, also proposed an overall optimization guide:

1. Basic configuration optimization

Tomcat 6? Tune Tomcat parameters? JRockit JVM? Tune JVM parameters? Apache + Squid for static content?

2. Business layer optimization

Localize part of the functionality instead of going through remote session beans? Asynchronous submission via JMS? Cache hot-spot data?

3. Presentation layer optimization

Publish dynamic pages as static pages? Cache parts of dynamic page content?
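
The "cache hot-spot data" idea from this guide can be sketched with a small least-recently-used (LRU) cache. This is a minimal single-JVM illustration, not the centralized cache discussed earlier; the class name, the loader function, and the capacity are all hypothetical. It relies on the access-order mode of java.util.LinkedHashMap, which evicts the entry that was used longest ago once the capacity is exceeded.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

class HotspotCacheSketch<K, V> {
    private final Map<K, V> lru;
    private final Function<K, V> loader; // stands in for the expensive database query
    int loads = 0;                       // counts how often the loader was actually called

    HotspotCacheSketch(final int capacity, Function<K, V> loader) {
        this.loader = loader;
        // accessOrder = true keeps entries ordered from least to most recently used.
        this.lru = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    // Returns the cached value, loading it on a miss. Synchronized because
    // LinkedHashMap is not thread-safe and even get() reorders entries.
    synchronized V get(K key) {
        V value = lru.get(key);
        if (value == null) {
            value = loader.apply(key);
            loads++;
            lru.put(key, value);
        }
        return value;
    }

    public static void main(String[] args) {
        HotspotCacheSketch<String, String> cache =
                new HotspotCacheSketch<>(2, id -> "row-for-" + id);
        cache.get("user:1");
        cache.get("user:2");
        cache.get("user:1"); // served from memory, no load
        System.out.println("loader calls so far: " + cache.loads);
    }
}
```

A single coarse lock is the simplest correct choice here; under the concurrency levels discussed in this article, a production system would more likely use a dedicated caching library or an external cache server.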

After adjusting the Tomcat configuration, Davexin confirmed Robbin's suspicion about the Tomcat configuration and described the test results after tuning:

After testing, the number of concurrent users can reach roughly what Robbin said, about 600. If we push it to 700, about 15% of requests fail. Although concurrency went up after adjusting the parameters above, the number of transactions completed in the same period dropped by about 10%, and response time increased by about 1 second. Overall, though, sacrificing a little transaction throughput and response time so that the number of concurrent users can increase by 500 is worth it.

The discussion reached a fairly good outcome. Its value is not limited to one specific project; more important are the lines of reasoning the participants showed while analyzing the problem. The comments from Cauherk, Robbin, Jiangnan White, and several others help Java web developers understand the key issues to consider in the architecture and deployment of a large project, and they also dispel some misconceptions about the performance of lightweight servlet containers versus EJB containers.

There were also some side threads in the discussion. Davexin and Jiangnan White discussed whether the JRockit Real Time version could improve the responsiveness of the servlet container, and the answer was no. The user mfc42d moved from the concurrency capacity of the servlet container to Java thread scheduling and the significance of NIO for servlet containers, and recommended two of his blog posts, "Java thread implementation" and "The maximum memory a Java process can use". The posts analyze Java's threading support at the level of the JVM source code and are worth reading carefully when facing JVM performance tuning problems.
