What to focus on when developing high-performance websites with Java

Source: Internet
Author: User

Whether it is a large portal or a small or medium-sized vertical site, every website pursues stability, performance, and scalability. The technical experience shared by large websites is worth learning and borrowing from, but in concrete practice it does not apply to every site. I dare not say much about sites developed in other languages, but for systems developed in Java I can still offer a few words:

JVM
Correctly configuring the JVM parameters of a Java EE container directly affects the performance and processing capacity of the whole system. JVM tuning is mainly memory management tuning, and the optimization falls into the following 4 areas:
1. HeapSize: the heap size, in other words the memory usage strategy of the Java virtual machine. This is critical.
2. GarbageCollector: select among the 4 garbage collection algorithms (policies) in Java by configuring the relevant parameters.
3. StackSize: the stack is the JVM's memory area for instruction execution. Each thread has its own stack, so the stack size limits the number of threads.
4. Debug/Log: the JVM can also be configured to write logs at runtime and after a crash. This is critical, because suitable parameters can only be chosen based on these JVM logs.
JVM configuration tips are everywhere on the web, but I recommend reading Sun's 2 official articles to gain a proper understanding of the configuration parameters:
1. Java HotSpot VM Options
http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html
2. Troubleshooting Guide for Java SE 6 with HotSpot VM
http://www.oracle.com/technetwork/java/javase/index-137495.html
In addition, I believe that not every engineer deals with these JVM parameters every day. If you forget the key parameters, you can run java -X (uppercase X) to print a reminder.
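To check what a given set of flags actually does, it can help to ask the running JVM. The following is a minimal sketch using the standard java.lang.management API; the class name JvmSettings is made up for illustration, and the output depends on the JVM version and the flags it was started with.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reports the memory limit, collectors, and startup flags of the running JVM.
final class JvmSettings {

    // Maximum heap the JVM will attempt to use, in megabytes (influenced by -Xmx).
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    // Names of the garbage collectors in use; they vary with the JVM version and GC flags.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    // Arguments passed to the JVM itself at startup, such as -Xms, -Xmx, or -Xss.
    static List<String> startupFlags() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        System.out.println("Max heap (MB): " + maxHeapMb());
        System.out.println("Collectors:    " + collectorNames());
        System.out.println("Startup flags: " + startupFlags());
    }
}
```

Running it with different values, for example -Xmx512m and -Xss256k (illustrative numbers, not recommendations), shows how the reported limits change.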

JDBC


1. For example: autoReconnect, prepStmtCacheSize, cachePrepStmts, useNewIO, blobSendChunkSize, etc.
2. For example, in a clustered environment: roundRobinLoadBalance, failOverReadOnly, autoReconnectForPools, secondsBeforeRetryMaster.
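The parameter names above match connection properties of the MySQL JDBC driver (Connector/J), which are usually appended to the JDBC URL. The sketch below only assembles such a URL; the host, database name, and property values are made-up placeholders, and the helper MySqlUrlBuilder is hypothetical, not part of any driver.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper that assembles a MySQL JDBC URL with connection properties.
final class MySqlUrlBuilder {

    // Joins the properties as key=value pairs after '?'. Values are not URL-escaped here.
    static String buildUrl(String host, int port, String database, Map<String, String> props) {
        String base = "jdbc:mysql://" + host + ":" + port + "/" + database;
        if (props.isEmpty()) {
            return base;
        }
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> entry : props.entrySet()) {
            query.add(entry.getKey() + "=" + entry.getValue());
        }
        return base + "?" + query;
    }

    // Example with placeholder host, database, and values (assumptions, not tuning advice).
    static String exampleUrl() {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("cachePrepStmts", "true");
        props.put("prepStmtCacheSize", "250");
        props.put("autoReconnectForPools", "true");
        return buildUrl("db.example.com", 3306, "shop", props);
    }

    public static void main(String[] args) {
        System.out.println(exampleUrl());
    }
}
```

Which properties are available, and their defaults, depend on the driver version, so the driver documentation should be checked before relying on any of them.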

Database connection pool (DataSource)

Here I think there is one point to note:

Reference:
http://dev.mysql.com/doc/refman/5.1/zh/connectors.html#cj-connection-pooling

Data access
Database server optimization and data access: which type of data is best stored where is a question worth thinking about. Future storage is likely to be mixed, with cache, NoSQL, DFS, and a database all present in one system. Tableware and everyday clothes both need to be kept at home, but not in the same kind of furniture; nobody puts tableware and clothes in the same cupboard. Data in a system is similar: each type of data needs a suitable storage environment. For files and pictures, first classify by access popularity or by file size. Strongly relational data that requires transactions belongs in a traditional database; weakly relational data that does not require transactions can go into NoSQL; massive file storage can use a DFS that supports networked storage. As for the cache, it depends on the size of each stored item and on the read/write ratio.
Read/write splitting is also worth noting. Whether in a database or a NoSQL environment, reads usually far outnumber writes, so the design should both spread reads across multiple machines and consider data consistency between those machines. With MySQL, use one master and multiple slaves, then add MySQL Proxy or borrow some JDBC parameters (roundRobinLoadBalance, failOverReadOnly, autoReconnectForPools, secondsBeforeRetryMaster). Later application development can then separate reads from writes, spreading heavy read pressure across multiple machines while still ensuring data consistency.
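The routing idea behind read/write splitting can be sketched in a few lines. This is an application-level illustration only, not how MySQL Proxy or the JDBC driver implements it; the class ReadWriteRouter and the server addresses are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical router: writes go to the master, reads rotate over the replicas.
final class ReadWriteRouter {
    private final String master;
    private final List<String> replicas;
    private final AtomicInteger next = new AtomicInteger();

    ReadWriteRouter(String master, List<String> replicas) {
        this.master = master;
        this.replicas = replicas;
    }

    // All writes go to the single master so there is one authoritative copy.
    String routeWrite() {
        return master;
    }

    // Reads are spread round-robin; with no replicas they fall back to the master.
    String routeRead() {
        if (replicas.isEmpty()) {
            return master;
        }
        // floorMod keeps the index valid even after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), replicas.size());
        return replicas.get(index);
    }

    // Placeholder addresses, for illustration only.
    static ReadWriteRouter demo() {
        return new ReadWriteRouter("master:3306", Arrays.asList("replica-a:3306", "replica-b:3306"));
    }

    public static void main(String[] args) {
        ReadWriteRouter router = demo();
        System.out.println("write -> " + router.routeWrite());
        for (int i = 0; i < 3; i++) {
            System.out.println("read  -> " + router.routeRead());
        }
    }
}
```

Because replication is usually asynchronous, a read sent to a replica right after a write may see older data; reads that must see the latest write can be routed to the master instead.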

Cache
On the macro level, caches are generally divided into 2 types: local cache and distributed cache.
1. Local cache: for Java, a local cache means putting data into a static collection and taking it out of that collection when needed. For high-concurrency environments, ConcurrentHashMap or CopyOnWriteArrayList is recommended as the local cache. More specifically, using a cache means using system memory, so the amount of memory given to the cache needs an appropriate proportion. Exceeding the appropriate amount is counterproductive and makes the whole system run inefficiently.
2. Distributed cache: generally used in distributed environments, where the cache of every machine is stored centrally. It serves not only as a cache but also as a means of data synchronization and transmission in a distributed system. The most commonly used are memcached and Redis.
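A local cache along the lines of point 1 might look like the sketch below. It is built on ConcurrentHashMap and uses a deliberately crude entry cap to stand in for "an appropriate proportion" of memory; the class LocalCache and the cap policy are assumptions for illustration, and a production cache would normally also need eviction and expiry.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Hypothetical local cache: thread-safe lookups with an approximate upper bound on entries.
final class LocalCache<K, V> {
    private final ConcurrentMap<K, V> entries = new ConcurrentHashMap<>();
    private final int maxEntries;

    LocalCache(int maxEntries) {
        this.maxEntries = maxEntries;
    }

    // Returns the cached value, or loads it and caches it while there is room.
    V get(K key, Function<K, V> loader) {
        V cached = entries.get(key);
        if (cached != null) {
            return cached;
        }
        V loaded = loader.apply(key);
        // The size check is not atomic, so the cap is approximate under contention.
        if (loaded != null && entries.size() < maxEntries) {
            V previous = entries.putIfAbsent(key, loaded);
            return previous != null ? previous : loaded;
        }
        return loaded;
    }

    int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        LocalCache<String, Integer> cache = new LocalCache<>(100);
        System.out.println(cache.get("hello", String::length));
        System.out.println(cache.get("hello", String::length)); // second call is served from the map
    }
}
```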
Read/write efficiency differs across storage media. The question is how to use caching in the system so that your data stays closer to the CPU. On this point there is a chart from Google expert Jeff Dean (REF) that you should always keep in mind.

Concurrent/Multithreaded
In highly concurrent environments, developers are advised to use the concurrency package (java.util.concurrent) that comes with the JDK. Since JDK 1.5, the utility classes under java.util.concurrent simplify multithreaded development. They fall into the following main parts:
1. Thread pools: the thread pool interfaces (Executor, ExecutorService) and implementation classes (ThreadPoolExecutor, ScheduledThreadPoolExecutor). The thread pool framework that comes with the JDK manages the queuing and scheduling of tasks and allows controlled shutdown. Running a thread consumes CPU resources, and creating and ending a thread also has a CPU cost, so a thread pool not only manages multithreading effectively but also improves thread efficiency.
2. Queues: the ConcurrentLinkedQueue class provides an efficient, scalable, thread-safe, non-blocking FIFO queue. Five implementations in java.util.concurrent support the extended BlockingQueue interface, which defines blocking versions of put and take: LinkedBlockingQueue, ArrayBlockingQueue, SynchronousQueue, PriorityBlockingQueue, and DelayQueue. These classes cover the most common usage contexts for producer-consumer, messaging, parallel tasking, and related concurrent designs.
3. Synchronizers: four classes aid common special-purpose synchronization idioms. Semaphore is a classic concurrency tool. CountDownLatch is a very simple yet very common utility for blocking until a given number of signals, events, or conditions hold. CyclicBarrier is a resettable multiway synchronization point that is useful in some styles of parallel programming. Exchanger allows two threads to exchange objects at a rendezvous point, which is useful in several pipeline designs.
4. Concurrent collections: the package also provides Collection implementations designed for use in multithreaded contexts: ConcurrentHashMap, ConcurrentSkipListMap, ConcurrentSkipListSet, CopyOnWriteArrayList, and CopyOnWriteArraySet. When many threads are expected to access a given collection, ConcurrentHashMap is usually preferable to a synchronized HashMap, and ConcurrentSkipListMap is usually preferable to a synchronized TreeMap. CopyOnWriteArrayList is preferable to a synchronized ArrayList when the expected number of reads and traversals greatly outnumbers the number of updates to the list.
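Three of the parts listed above (a thread pool, a synchronizer, and a concurrent collection) can be combined in one small sketch. The task itself, measuring word lengths, is a made-up placeholder, and the class name WordLengthJob and the 10-second timeout are arbitrary choices for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical job: a fixed pool runs the tasks, a latch waits for all of them,
// and a ConcurrentHashMap collects results from several threads safely.
final class WordLengthJob {

    static Map<String, Integer> measure(List<String> words, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        Map<String, Integer> lengths = new ConcurrentHashMap<>();
        CountDownLatch done = new CountDownLatch(words.size());
        try {
            for (String word : words) {
                pool.execute(() -> {
                    try {
                        lengths.put(word, word.length());
                    } finally {
                        done.countDown(); // always signal, even if the task fails
                    }
                });
            }
            // Block until every task has counted down, or give up after 10 seconds.
            if (!done.await(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            pool.shutdown(); // controlled shutdown: no new tasks, queued ones may finish
        }
        return lengths;
    }

    public static void main(String[] args) {
        System.out.println(measure(Arrays.asList("pool", "latch", "map"), 2));
    }
}
```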

Queue
Queues can be divided into 2 types: local queues and distributed queues.
Local queues: commonly used for batch writes of data that is not time-critical. Cache the incoming data in an array or similar container, and once a certain count is reached, write it in one batch. This can be implemented with BlockingQueue or List/Map.
Related information: Sun Java API.
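The batch-write use of a local queue could be sketched as follows. Here the "bulk write" is simulated by appending each batch to an in-memory list, where a real system might issue one batched database insert; the class BatchBuffer and that stand-in are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical buffer: records accumulate in a queue and are written out in batches.
final class BatchBuffer {
    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();
    private final List<List<String>> written = new ArrayList<>(); // stand-in for bulk writes
    private final int batchSize;

    BatchBuffer(int batchSize) {
        this.batchSize = batchSize;
    }

    // Called by producers; triggers a batch write once enough records are waiting.
    void add(String record) {
        pending.add(record);
        flushIfFull();
    }

    private synchronized void flushIfFull() {
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    // Moves up to batchSize records out of the queue in one step and "writes" them.
    synchronized void flush() {
        List<String> batch = new ArrayList<>();
        pending.drainTo(batch, batchSize);
        if (!batch.isEmpty()) {
            written.add(batch);
        }
    }

    synchronized List<List<String>> writtenBatches() {
        return new ArrayList<>(written);
    }

    public static void main(String[] args) {
        BatchBuffer buffer = new BatchBuffer(2);
        buffer.add("a");
        buffer.add("b");
        buffer.add("c");
        buffer.flush(); // write out the remainder, e.g. on shutdown or on a timer
        System.out.println(buffer.writtenBatches());
    }
}
```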
Distributed queues: generally used as message middleware, building a communication bridge between subsystems in a distributed environment. In Java EE environments the most commonly used are Apache ActiveMQ and Sun's OpenMQ.
I have introduced some lightweight MQ middleware before, for example Kestrel and Redis (Ref http://www.javabloger.com/article/mq-kestrel-redis-for-java.html). I have also recently heard that LinkedIn's search technology team has launched an MQ product, Kafka (Ref http://sna-projects.com/kafka), which is worth keeping an eye on.
Related information:
1. ActiveMQ http://activemq.apache.org/getting-started.html
2. OpenMQ http://mq.java.net/about.html
3. Kafka http://sna-projects.com/kafka
4. JMS articles http://www.javabloger.com/article/category/jms

NIO
NIO was introduced in JDK 1.4. Before Java 1.4, the JDK provided a stream-oriented I/O system: reading or writing a file processed data one byte at a time, with an input stream producing one byte and an output stream consuming one byte. Stream-oriented I/O is very slow, and a packet is either received as an entire datagram or not received yet. Java NIO's non-blocking technique actually uses the reactor pattern: when content is available you are notified automatically, so there is no need to busy-wait in an endless loop, which greatly improves system performance. In practice NIO is used in two areas: 1. file read and write operations, and 2. operations on network data streams. NIO has several core objects that need to be mastered: 1. selectors (Selector), 2. channels (Channel), 3. buffers (Buffer).
A few remarks of my own:
1. Within Java NIO, a memory-mapped file is an efficient way to separate the hot and cold data held in a cache and to handle the cold part. It is much faster than regular stream-based or channel-based I/O. The data in the file appears as the contents of an in-memory array, but only the portions of the file that are actually read or written are mapped into memory, not the entire file.
2. The MySQL JDBC driver can also use NIO when operating on the database, which improves system performance.
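Mapping only part of a file can be shown with the standard FileChannel.map call. In this minimal sketch the temporary file, its contents, and the chosen offsets are made up for the demonstration, and the class name MappedRead is hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical example: read one region of a file through a memory mapping.
final class MappedRead {

    // Maps only the bytes [offset, offset + length) and decodes them as UTF-8.
    static String readRegion(Path file, long offset, int length) {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer region = channel.map(FileChannel.MapMode.READ_ONLY, offset, length);
            byte[] bytes = new byte[length];
            region.get(bytes); // copies out of the mapped region
            return new String(bytes, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes a small temporary file and maps just its second half.
    static String demo() {
        try {
            Path file = Files.createTempFile("mapped-demo", ".txt");
            file.toFile().deleteOnExit();
            Files.write(file, "cold-data|hot-data".getBytes(StandardCharsets.UTF_8));
            return readRegion(file, 10, 8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "hot-data"
    }
}
```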

Long connection/Servlet 3.0

Servlet 3.0 allows a request to be suspended, with a wait timeout set in between. If a background event fires during the wait, the result is returned to the client's request. If no event occurs within the wait timeout, the request is returned to the client and the client issues the request again. In this way the client and the server can keep interacting back and forth.
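The wait-with-timeout-then-retry behavior can be modeled in plain Java. The sketch below is not the Servlet API: in a container this would be done with HttpServletRequest.startAsync() and AsyncContext.setTimeout(), which free the request thread while waiting, whereas this simplified model blocks the calling thread. The class LongPollSketch and the RETRY marker are assumptions for illustration.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical model of long polling: wait for an event, or tell the client to ask again.
final class LongPollSketch {
    static final String RETRY = "RETRY";
    private final BlockingQueue<String> events = new LinkedBlockingQueue<>();

    // Background side: an event becomes available for a waiting request.
    void publish(String event) {
        events.add(event);
    }

    // Request side: wait up to timeoutMillis; on timeout the client is expected to re-request.
    String await(long timeoutMillis) {
        try {
            String event = events.poll(timeoutMillis, TimeUnit.MILLISECONDS);
            return event != null ? event : RETRY;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return RETRY;
        }
    }

    public static void main(String[] args) {
        LongPollSketch poll = new LongPollSketch();
        System.out.println(poll.await(50)); // no event yet, so the client should retry
        poll.publish("price-updated");
        System.out.println(poll.await(50)); // the event is delivered immediately
    }
}
```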

Log

Simply put, logs are output to different destinations according to the defined policies and levels, so that we can analyze and manage them easily. Conversely, without an output policy, with many machines and a long running time you end up with a big pile of messy logs, and when an error occurs you will not know where to look. The output policy is therefore the key to using logs well.
Reference: http://logging.apache.org/log4j/1.2/manual.html
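Routing by level can be illustrated without any extra dependency. The reference above is to log4j; the sketch below swaps in the JDK's own java.util.logging so that it runs on its own, and the in-memory ListHandler stands in for real destinations such as files. The class names and the logger name are made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical setup: one destination receives INFO and above, another only SEVERE.
final class LogRouting {

    // Collects messages in memory; a real system would write to a file or a log server.
    static final class ListHandler extends Handler {
        final List<String> messages = new ArrayList<>();

        ListHandler(Level threshold) {
            setLevel(threshold);
        }

        @Override
        public synchronized void publish(LogRecord record) {
            if (isLoggable(record)) { // applies this handler's own level threshold
                messages.add(record.getLevel() + " " + record.getMessage());
            }
        }

        @Override
        public void flush() { }

        @Override
        public void close() { }
    }

    // Attaches both destinations and stops records from also reaching the default console handler.
    static Logger configure(String name, ListHandler general, ListHandler errors) {
        Logger logger = Logger.getLogger(name);
        logger.setUseParentHandlers(false);
        logger.setLevel(Level.INFO);
        logger.addHandler(general);
        logger.addHandler(errors);
        return logger;
    }

    public static void main(String[] args) {
        ListHandler general = new ListHandler(Level.INFO);
        ListHandler errors = new ListHandler(Level.SEVERE);
        Logger logger = configure("demo.routing", general, errors);
        logger.fine("debug detail");   // below INFO: dropped by the logger
        logger.info("request served"); // general only
        logger.severe("db down");      // general and errors
        System.out.println("general: " + general.messages);
        System.out.println("errors:  " + errors.messages);
    }
}
```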

Package/deploy

If the web module and the timed-crawl module are packaged into a single project, then scaling out means every machine runs both a web application and a timer. Because the function modules are not separated, the timers working on every machine produce duplicate data in the database.

Framework
The popular, so-called lightweight SSH (Struts/Spring/Hibernate) framework is not lightweight at all for many small and medium-sized projects. Developers need to maintain not only code but also cumbersome XML configuration files, and one wrongly written configuration file can stop the whole project from running. There are really many products that need no configuration files and can replace the SSH (Struts/Spring/Hibernate) framework, and I have introduced some of them (REF) before.
I am not blindly against using the SSH (Struts/Spring/Hibernate) framework. In my eyes its real role is to standardize development, not to improve performance.
The SSH framework suits companies whose teams are very large and still need to keep growing. Such companies have to choose technologies that are recognized in the market and familiar to many people, and the SSH (Struts/Spring/Hibernate) framework is mature, so it is their first choice.
But small teams with technical experts can choose a more concise framework that really speeds up development. For a small team, abandoning the SSH framework early in favor of more concise technology is the wiser choice.

Other:

When using Java to build a high-traffic, high-concurrency site, how should the underlying architecture be designed? Which framework technologies work best?

Answer:

There are a few common measures:
1. Set up a cache module for commonly used functions.
2. Make the website as static as possible.
3. Use a separate picture server to reduce pressure on the main server, so that it does not crash under image loading.
4. Use mirroring to even out access differences between network providers and between regions.
5. Database clustering and table hashing.
6. Strengthen the hardware at the network layer; where hardware falls short, make up for it in software.
7. The ultimate approach: load balancing.
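Of the measures above, table hashing (item 5) is easy to show in code. This is a minimal sketch assuming rows are spread over a fixed number of tables by user ID; the table naming scheme and the class TableSharding are made up for illustration.

```java
// Hypothetical helper: picks one of tableCount tables for a given user ID.
final class TableSharding {

    // floorMod keeps the index in [0, tableCount) even for negative IDs.
    static String tableFor(String baseName, long userId, int tableCount) {
        if (tableCount <= 0) {
            throw new IllegalArgumentException("tableCount must be positive");
        }
        long index = Math.floorMod(userId, (long) tableCount);
        return baseName + "_" + index;
    }

    public static void main(String[] args) {
        System.out.println(tableFor("orders", 10L, 4)); // prints "orders_2"
    }
}
```

A fixed modulus is simple, but changing the table count later moves most rows to a different table, so the count is usually chosen with future growth in mind.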

Ride the Wind Water
Link: https://www.zhihu.com/question/19809311/answer/13181721
Source: Zhihu
Copyright belongs to the author. For commercial reprints, please contact the author for authorization; for non-commercial reprints, please cite the source.

Article transferred from http://www.javabloger.com/java-development-concern-those-things/
