From Java Multithreading to Clusters, Distribution, and Network Design Analysis

Source: Internet
Author: User


Java multithreading is used almost everywhere; hardly any modern system avoids it. In many cases the real question is how best to apply it, and there is a great deal to know about Java multithreading. This article introduces the common topics first; more complex ones can follow in later articles. The topics covered are:

1. When should multithreading be chosen during application development?

2. What should you pay attention to when using multiple threads?

3. Thread state transitions and control: how do you resolve deadlocks?

4. How do you design a scalable multi-thread processor?

5. Scaling multithreading across multiple hosts: clusters.

6. Multithreading and persistent-connection principles of web applications.

 

1. When to choose multithreading in application development.

Earlier articles touched on multi-threaded applications, such as controlling web threads and download traffic. That is only a small trick, and many problems remain to be solved, but with a small number of users it is generally not an issue.

In everyday terms, multithreading means handing several similar pieces of work to several people to complete in parallel, with a main thread in the middle acting as the scheduler; workers can depend on the main thread, or the main thread can be made to depend on them. Many people say that on a single-CPU machine multithreading is meaningless. I disagree. A single CPU only means that when threads are scheduled, the bottom layer executes one instruction at a time; it does not mean multithreading cannot improve efficiency, because contention happens at two levels, memory and CPU, and the efficiency gap between them is large. (Try it: have a single thread increment a counter one billion times, versus ten threads each incrementing their own counter one hundred million times. Do not print each value with System.out.println: printing is blocking, as described in the previous article, it will hammer the machine, and it will distort the test data, especially under concurrent congestion. Even on a single CPU the results will certainly differ. I do not have a single-core PC at the moment, so I cannot provide test data; if you need it, run the test yourself.)
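The experiment in the parenthesis can be sketched as follows. This is a rough benchmark with scaled-down loop counts (class and method names are mine, not from the original article), and absolute timings will vary by machine:

```java
import java.util.ArrayList;
import java.util.List;

public class CountBenchmark {
    // Sum `perThread` increments on each of `threads` threads, with no println
    // inside the loop (printing is blocking and would distort the timings).
    static long countParallel(int threads, long perThread) throws InterruptedException {
        long[] partial = new long[threads];
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            final int idx = i;
            Thread t = new Thread(() -> {
                long local = 0;
                for (long j = 0; j < perThread; j++) local++; // each thread counts independently
                partial[idx] = local;
            });
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) t.join(); // main thread acts as the coordinator
        long total = 0;
        for (long p : partial) total += p;
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        long t0 = System.currentTimeMillis();
        countParallel(1, 100_000_000L);
        long single = System.currentTimeMillis() - t0;
        t0 = System.currentTimeMillis();
        countParallel(10, 10_000_000L);
        long multi = System.currentTimeMillis() - t0;
        System.out.println("1 thread: " + single + " ms, 10 threads: " + multi + " ms");
    }
}
```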

In today's systems, the idea of multithreading is ever-present: clusters and distributed systems can both be understood through the same principle. So what is that principle, and what distinguishes multithreading from multi-processing?

In fact, the simplest form of distribution is multi-process. It resembles vertical separation when splitting a system: different business systems run on different nodes without interfering with each other. But processes are expensive to create and release, and their resource footprint is coarser than CPU-level. A thread is a finer unit inside a process: one process can spawn N threads that contend for the CPU in parallel, and if your machine is a multi-core processor, this concurrency yields a real performance gain. The principle is to squeeze the maximum performance advantage out of limited resources, provided those resources still have some headroom; as the saying goes, the work must not be stretched too thin.

In Java, three methods are commonly used to implement multithreading:

1. Extend the Thread class and override the run() method.

2. Implement the Runnable interface and its run() method.

3. Implement the Callable interface and its call() method (which has a return value).
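The three approaches above, in one compact sketch (class names are my own illustration):

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ThreeWays {
    // 1. Extend Thread and override run()
    static class Worker extends Thread {
        @Override public void run() { System.out.println("from Thread subclass"); }
    }

    // 3. Callable: the only variant with a return value
    static int runCallable() throws Exception {
        ExecutorService es = Executors.newSingleThreadExecutor();
        Callable<Integer> task = () -> 42;
        Future<Integer> result = es.submit(task);
        int value = result.get();   // blocks until call() finishes
        es.shutdown();
        return value;
    }

    public static void main(String[] args) throws Exception {
        new Worker().start();
        // 2. Implement Runnable (here as a lambda) and hand it to a Thread
        new Thread(() -> System.out.println("from Runnable")).start();
        System.out.println("Callable returned " + runCallable());
    }
}
```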

As for how to start them, you can call start() directly, or create a thread pool with java.util.concurrent.Executors. The commonly created pools are:

1. Executors.newSingleThreadScheduledExecutor() creates a pool that executes tasks sequentially; because execution is sequential, you do not need synchronized inside the run() method.

2. Executors.newCachedThreadPool() creates a pool whose threads execute in parallel, growing the thread count on demand.

3. Executors.newFixedThreadPool(10) creates a pool with a fixed size of 10: at most 10 tasks execute in parallel, and further tasks wait in the pool's queue, so the degree of parallelism is controlled.
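A quick sketch of the fixed pool's capped parallelism (the task counts and class name are arbitrary choices of mine):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PoolDemo {
    // Submit `tasks` jobs to a fixed pool of `size` threads; return how many completed.
    static int runOnFixedPool(int size, int tasks) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(size); // at most `size` run in parallel
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            pool.submit(done::incrementAndGet); // excess tasks wait in the pool's queue
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return done.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runOnFixedPool(10, 25) + " tasks completed"); // 25 tasks, 10 at a time
        // Executors.newSingleThreadScheduledExecutor() would run them strictly one by one;
        // Executors.newCachedThreadPool() would grow threads on demand.
    }
}
```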

 

If your system is a web application, we recommend that you do not create your own threads inside it, because thread control there belongs mainly to the web container. If you must create threads, create as few as possible, and release them promptly after use so the container can schedule what it manages, even if that means recreating them next time.

If your multithreaded program runs standalone and is used to receive and process messages, at least one thread is usually polling continuously (many programs sleep between polls, for example TimeUnit.MINUTES.sleep(SLEEP_TIME), which sleeps for the given period). For such programs it is best to make the thread a background (daemon) thread with setDaemon(true), which must be called before start(). The main difference between daemon and non-daemon threads is that once all non-daemon threads have died, daemon threads are killed and reclaimed automatically. Otherwise, as with any multithreaded program, even when your main method (main thread) finishes, the program will not end while child threads started from main are still running.
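A minimal daemon-thread sketch (the class name and sleep intervals are mine):

```java
import java.util.concurrent.TimeUnit;

public class DaemonDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread poller = new Thread(() -> {
            while (true) {                               // endless polling loop
                try { TimeUnit.MILLISECONDS.sleep(50); } // sleep between polls
                catch (InterruptedException e) { return; }
            }
        });
        poller.setDaemon(true); // must precede start(), or IllegalThreadStateException
        poller.start();
        TimeUnit.MILLISECONDS.sleep(200);
        // main (the last non-daemon thread) ends here, so the JVM kills the
        // daemon and exits. Without setDaemon(true) the program would never end.
    }
}
```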

In fact, almost all the code we write runs multithreaded; the containers have simply done the work for us. Even local AWT and Swing programs handle events through asynchronous mechanisms, though that asynchrony is small and more can be written into the program. Custom multithreading is generally used in background processing that runs independently of a container. Why shouldn't the front end of a web application manage its own threads? Partly because avoiding it keeps programs smaller and reduces bugs: multithreading makes programmers care about things they should not have to, and demands a high skill level. Still, to become a good programmer you must understand multithreading. Let us start with a few questions and then work through them:

Suppose a system does clock-based and trigger-based processing and may be distributed: how should it be written? Also, what is most frustrating when writing multithreaded code, and hardest to consider when patching it: what is each thread doing right now? My threads must not die; if one dies, how do I find out, and how do I handle it (automatically, manually, or by restart)?

With these questions in mind, let us move on to the topics below.

 

2. What should you pay attention to when using multiple threads?

Multithreading is easy to use, but not so pleasant when something goes wrong; it may be the topic that puzzles you most. Don't be afraid of it, though: if you fear it you will never conquer it, and once you know its temperament you always can.

◆ What are the states of a thread, and what are the rules for transitions between them?

◆ Under what circumstances does a thread hang ("false death") or die?

◆ How can a hung or dead thread be detected and handled?

These are only the questions. Before answering them, let me first mention some extended knowledge; the sections below will address the problems one by one.

A good open-source choice for multithreaded task scheduling is Quartz; a document introducing it can be downloaded from http://wenku.baidu.com/view/3220792eb4daa58da0114a01.html, and it also describes how to use the framework. Because the framework has many layers of encapsulation, much of the underlying implementation is not obvious, and thread-pool management is essentially transparent; you can only reach that level through other means.

After learning the framework's features, we can look at how to encapsulate it further to get what best fits our own project.

In addition, multithreading involves many data-structure choices. We will discuss selecting data structures for concurrent resource sharing, because there are real techniques here: JDK 1.6 and later introduced many concurrent data structures that, much like Oracle's multi-version principle, use in-memory copies and atomic updates to keep reads and writes consistent while greatly reducing lock contention. The optimistic locking mechanism is likewise an important body of knowledge in high-performance multithreaded design.
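A small illustration of the optimistic (compare-and-set) update pattern using AtomicInteger from java.util.concurrent; the class and helper names are mine:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class OptimisticCounter {
    // Optimistic update: read without locking, compute, then compareAndSet;
    // retry only when another thread changed the value in between.
    static int addOne(AtomicInteger counter) {
        int seen, next;
        do {
            seen = counter.get();   // optimistic read: no lock taken
            next = seen + 1;
        } while (!counter.compareAndSet(seen, next)); // lost the race: retry
        return next;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger c = new AtomicInteger(0);
        Thread[] ts = new Thread[4];
        for (int i = 0; i < ts.length; i++) {
            ts[i] = new Thread(() -> { for (int j = 0; j < 1000; j++) addOne(c); });
            ts[i].start();
        }
        for (Thread t : ts) t.join();
        System.out.println(c.get()); // 4000 with no synchronized anywhere
    }
}
```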

 

3. Thread state transitions and control: how to resolve deadlocks.

3.1. What are the states of a default Java thread? (that is, a thread whose state handling you have not overridden)

NEW: the thread has been created but nothing has happened yet; start() has not been called.

BLOCKED: the thread is blocked, typically waiting to acquire a monitor lock (or stalled for network reasons), and may look hung.

WAITING: waiting for a lock, that is, for a notify on some resource. This state is usually tied to a shared resource wrapped with the synchronized keyword: when obj.wait() is called, the current thread waits for a notify on the obj object. That object may be this; if it is, you will usually find synchronized on the method body.

TIMED_WAITING: a time-bounded wait. A thread that calls sleep() enters this state and returns to runnable when the time is up.

RUNNABLE: the thread is running, or ready to run as soon as it is scheduled.

TERMINATED: the thread has finished; its isAlive() now returns false.

These are the default thread states. Some containers and frameworks encapsulate threads further, so thread names and reported states can vary a great deal, but once you find the underlying principle it never departs from this essence.
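A minimal sketch of observing these states through Thread.getState() (class and variable names are mine; the WAITING observation depends on the short sleep giving the thread time to park):

```java
public class StateDemo {
    public static void main(String[] args) throws InterruptedException {
        final Object lock = new Object();
        Thread t = new Thread(() -> {
            synchronized (lock) {
                try { lock.wait(); }               // releases the lock, waits for notify
                catch (InterruptedException e) { }
            }
        });
        System.out.println(t.getState());          // NEW: start() not yet called
        t.start();
        Thread.sleep(200);
        System.out.println(t.getState());          // usually WAITING by now: parked in lock.wait()
        synchronized (lock) { lock.notify(); }     // hand the lock opportunity back
        t.join();
        System.out.println(t.getState());          // TERMINATED
        System.out.println(t.isAlive());           // false
    }
}
```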

3.2. Under what circumstances does a thread usually die?

Locks acquired in crossing order, eventually leading to deadlock. This may come from the program itself — writes to a shared cache, or custom threads escaping the container's thread-pool management (be careful here) — or from distributed shared files or distributed database locks.
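The crossed-lock deadlock and its standard cure, a consistent global lock order, can be sketched as follows (the broken variant is left in comments rather than run, since running it could hang; names are mine):

```java
public class LockOrder {
    static final Object A = new Object(), B = new Object();

    // Deadlock arises from CROSSED acquisition: thread 1 takes A then B while
    // thread 2 takes B then A, and each waits on the other forever.
    // The fix: every thread acquires the locks in the same global order.
    static String safeWork(String name) {
        synchronized (A) {       // always A first...
            synchronized (B) {   // ...then B, in every thread
                return name + " got both locks";
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread(() -> System.out.println(safeWork("t1")));
        Thread t2 = new Thread(() -> System.out.println(safeWork("t2")));
        t1.start(); t2.start();
        t1.join(); t2.join(); // terminates precisely because the order is consistent
    }
}
```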

Network stalls. A network is not feared for being absent, nor for being fast, but for being fast one moment and slow the next. This happens often in practice: when threads communicate over the network (and database interaction usually goes through the network), a stall can easily leave a thread apparently hung, and it is hard to determine what the thread is doing unless a monitoring plan was in place beforehand.

Will threads die in other circumstances? In my personal experience I have never seen it, though it is not absolutely impossible. Within a single JVM, I believe a thread dies spontaneously only if the whole process crashes; otherwise Sun's Java virtual machine would be far too untrustworthy. At least from this we can conclude that in most cases thread blocking comes from the two sources above.

Having raised the problem and analyzed the great majority of its causes, how do we go on to solve it, or at least reduce its probability to a very low level? (Without a true high-availability environment we can only aim as close to perfect as possible; even Amazon's cloud has had surprising amounts of downtime.)

3.3. How can a hung or dead thread be detected and handled?

Speaking of capture, the first thing a Java learner thinks of is try/catch. But when a thread hangs, no exception is thrown, so how do we know the thread is dead?

This must start at the design level. We can use the thread pools Java provides, but much complicated thread management we must still design ourselves. To see how to detect failures, let us borrow an example from daily life and map it back to real system design in the next section.

First, how do we know the state of a multithreaded system when a thread dies — how do we tell that a person has passed out drunk or fainted? Two practical ideas: one, give each person an understudy; two, have a leader above supervising a group, who notices when someone below goes missing and sets out to find a replacement.

Consider the first idea. Keep an understudy who does nothing in normal times but shadows the worker; the moment the worker falls, the understudy takes over the work. This is the master/slave redundancy commonly used in system architecture, possibly one master with multiple slaves. Cloud computing builds on it with remote failover and dynamic resource scheduling: when there are fewer tasks, idle people can do other things or rest, while others on standby help out; and even if one site suffers an earthquake, flood, or other disaster, an institution elsewhere takes over the same work, so external service continues without interruption — the legendary 24x7 high-availability service. But this much redundancy is costly.

Now the second idea. A boss glances periodically at what each underling is doing — is anyone struggling? — and busies himself dynamically reallocating resources. Sounds good, but with too many underlings the boss himself is overwhelmed, because allocating resources requires knowing their details in advance, or the leader is no good leader. So use several bosses, each leading a small team; resources are balanced between teams, each boss controls everything within his team, and the boss's boss cares only about what the bosses do, not about the underlings. Everyone's load is now about equal. Then the problem recurs: an underling's trouble is transparently visible, but what if a boss, or even the top boss, has an accident? Combine the two ideas: give understudies only to the bosses. Since leaf nodes are the most numerous, sparing them understudies saves a great deal of cost; only the upper nodes need shadows. To save the most, configure one or more understudies only for the master node, though recovery costs then rise, because recovery information must be rebuilt layer by layer. Generally there is no need to cut costs further than this.

These are ideas from practice. How to combine them into a computer system architecture, and bring them back to the multithreaded design of this article, is discussed together in Chapter 4.

 

4. How to design a scalable multi-thread processor.

In fact, Chapter 3 already drew many solutions from how life is organized, which is my personal way of attacking problems: I believe no complicated mathematical algorithm is as complex as human organization itself, and borrowing from it can work wonders in computing.

Suppose we use no open-source technology and build a multithreaded processing framework from scratch. Following the analysis above, we generally decompose such a system into at least two layers: a main thread directing multiple worker threads. We then need to solve the following problems:

A) The work done by the threads is similar. How do we control concurrent data acquisition, or reduce the granularity of concurrency hot spots?

Method 1: Hashing. Hashing is an excellent principle: data is decomposed by its features, with each bucket's data assigned according to a hash rule. Hashing is easy to program and traverse, and computing the bucket is so fast that locating it costs almost nothing. Expanding the structure is troublesome, but that problem rarely arises in multithreading design.

Method 2: Range distribution. The manager knows the approximate distribution of the data in advance and hands each worker thread a roughly equal range to process; the ranges do not overlap, and the scheme is scalable and can be extended at will. If the number of partitions is uncontrolled, too many of them slows down locating a range, but this too rarely matters in multithreading design, because the program itself controls it.

Method 3: Bitmap distribution. The data follows a bitmap rule, typically a state flag. After distributing data by bitmap, each thread takes one bitmap value and operates on its own bitmap's data, with no further coordination needed. But the number of bitmap values is usually limited while the data volume is large: one thread can own all the data under one value, and if several threads share one value the original contention returns.

Each of the three methods has its own advantages and disadvantages, so real systems often adopt a combined mode. There is no perfect architecture, only the one best suited to the current application environment, which is why so many foresight issues must be considered before design. Data distribution of this kind is used even more at the architecture level, but architecture grows out of the same program-design ideas; architecture and data-storage distribution will be discussed separately later.
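As a toy illustration of Method 1, a hash rule for routing keys to worker threads (the class name, modulus choice, and sample keys are mine):

```java
public class HashPartition {
    // Route a key to one of `workers` threads by a hash rule, so threads never
    // contend over the same key and the concurrency hot spot shrinks.
    static int partitionFor(String key, int workers) {
        return Math.floorMod(key.hashCode(), workers); // floorMod: hashCode() may be negative
    }

    public static void main(String[] args) {
        int workers = 4;
        for (String key : new String[]{"order-1", "order-2", "user-9"}) {
            System.out.println(key + " -> worker " + partitionFor(key, workers));
        }
    }
}
```

The same routing rule is stable: a given key always lands on the same worker, which is what removes cross-thread contention.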

B) How do we discover, and handle, the death of a thread?

Besides the worker threads, the management thread has 1 to N understudies; how many depends on the situation, but at least one, so it can be replaced immediately if it fails. There should also be a watchdog thread that periodically checks the workers' running states; since that is its only job it stays very simple, and any thread in the group can replace another and spawn a fresh thread as a substitute. The detection cycle need not be too fast or too slow: as long as the application can accept a brief stall when something crashes, it is fine.

When we find a thread blocked for several consecutive checks (the count is usually configurable), we can basically conclude something is wrong and call interrupt() on it. Interruption makes blocking calls inside the thread throw an exception, so when you write the worker — especially around network operations — wrap the work in try/catch with the blocking part inside the try. If the thread is falsely dead or stuck and interrupt() is called from outside, execution jumps to the catch without contending for resources, quickly runs the rollback logic, and the thread is considered complete, releasing its resources. The stop() method is not recommended: it does not release resources and causes many problems.
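A sketch of that interrupt-based recovery path (the long sleep stands in for a blocked network call; names are mine):

```java
public class InterruptDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(60_000); // stands in for a blocked ("falsely dead") operation
            } catch (InterruptedException e) {
                // interrupt() lands here: roll back quickly, then run() exits
                // and the thread's resources are released
                System.out.println("interrupted: rolling back");
            }
        });
        worker.start();
        Thread.sleep(200);
        worker.interrupt(); // the watchdog decided the thread hung; never use stop()
        worker.join();      // returns promptly because the catch block ends run()
    }
}
```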

In addition, if network operations are involved, understand the networking code you use thoroughly before writing your own. With sockets, for instance, a connect attempt to a peer may by default take several minutes before reporting a timeout, so set an explicit connect timeout in advance to verify that the network can communicate before proceeding. (Note that a socket has a second timeout that applies after the connection is established. The connect timeout should be short, generally around 2 seconds: if the network cannot connect within 2 seconds, it likely cannot connect at all. The post-connect interaction, by contrast, may legitimately run for hours, though asynchronous interaction is recommended for stable operation.)
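The two socket timeouts in a short sketch (the host, ports, and timeout values are illustrative):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class SocketTimeouts {
    // Try to connect with a short connect timeout; report whether it succeeded.
    static boolean connectWithTimeout(String host, int port, int connectMs) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), connectMs); // pre-connect timeout: fail fast
            s.setSoTimeout(30_000); // post-connect read timeout: interactions may run far longer
            return true;
        } catch (IOException e) {
            return false; // timeout or refusal detected in seconds, not minutes
        }
    }

    public static void main(String[] args) {
        // A connect timeout around 2 s is usually enough: if the network cannot
        // connect within that, it most likely cannot connect at all.
        System.out.println(connectWithTimeout("example.com", 80, 2_000));
    }
}
```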

C) How do we start and manage the second-level management thread group?

A main thread above controls startup and shutdown. Before start(), we can call setDaemon(true) on these threads to make them background (daemon) threads: once the main thread finishes and releases its resources, the threads it created automatically release theirs and die. A thread set as a daemon passes daemon status on to any child threads it creates inside its run() method (this is not the case for threads created in its constructor).

The management thread can in turn manage sub-nodes like another level of threads, as long as you are not afraid of the code growing complex. It takes very good code and heavy testing to run stably, but once it works, the framework is beautiful, stable, and highly available.

 

5. Multithreading extended across multiple hosts: clusters

We have already touched on some distributed knowledge above, which can also be called data-partitioning knowledge: using networked PCs to implement the same partitioning mode as on a single host, with data held in distributed storage.

The cluster discussed here differs somewhat from that. The concept of a cluster is contained within distributed architecture, but in common usage a cluster divides into the app cluster and the database cluster.

A cluster generally means multiple nodes under the same unit (several nodes may even be deployed on one machine) doing almost the same thing, or similar things — exactly as threads do; it is multithreading realized across a group of hosts. An app cluster therefore usually has a management node that deliberately does very little, because although it does little, what it does is vital: the proxy or distribution nodes obtain each node's deployment, configuration, status, and other information from it and perform distribution only under the management node's control, and session consistency must of course be ensured.

Another multithread-like trait of clusters: if one node hangs, the rest take over without the whole cluster dying. A cluster group is like a large thread group, with containment-style management; groups can fail over to each other, and different services or tools are split into different cluster groups — just like the multiple thread groups in the three-layer thread design above, each with its own private and shared attributes.

Database clusters are more complex than app clusters. Vertical scaling for apps is limited almost only by the distribution node's capability, which can be adjusted, so scaling up is very convenient. A database cluster is different: it must ensure transaction consistency, achieve transaction-level failover and a degree of grid computing, and coordinate memory — data read into the memory of multiple hosts must behave as one memory (kept coherent by heartbeat) — all while remaining dynamically scalable. This is one reason scalability under database clusters has developed slowly.

Is the app side really not as difficult as the database? It is easier, because the granularity is smaller. App clusters generally need not consider transactions: absent a failure, a user's session need not be replicated — the user keeps accessing one designated machine — so nodes barely need to communicate. The coupling granularity lies in application design. Some systems initialize by injecting content into memory, or into a local file of the app used as a file cache; when the data changes, they first update the database and then modify or invalidate their own memory. The database stays consistent through the cluster's heartbeat, but each app node fixes only its own in-memory information, not that of other machines, so machines serving the other copies return inconsistent content. Solutions depend on the project: inter-node communication, a shared buffer (which reintroduces the locks of resource contention in a shared pool), or other means.

Finally, in large-scale system architectures — data distribution, centralized management, distributed storage and computing, horizontal business cutting, vertical separation of apps within one business, hash + range + bitmap data-distribution structures, remote failover and disaster tolerance, standby units and resource allocation — everything derives from applying the multithreaded design architecture across distributed units.

6. Multithreading and persistent-connection principles of web applications

In web applications, specially customized servers are built for certain business services — systems with high concurrent access, or even instantaneous bursts of it (often a system does not fear sustained high concurrency so much as the instantaneous kind). Their accesses are usually simple, focused on transaction processing and data consistency: heavy computation is kept off the database end and done in the app, with the database limited to storage, retrieval, and transactional consistency. These are the classic OLTP systems. The other big class has modest concurrency but large data and computation per request: OLAP systems, whose data source is generally OLTP. OLAP may process very large volumes each time, typically dumping, collecting, and summarizing OLTP data — extracting and organizing it into effective information stored elsewhere according to the business's query and retrieval patterns. That place may be a database, but it may not be (although the database is the strongest computing engine over data, in practice it is the slowest component: guaranteeing transaction consistency and locking, the overhead of parsing and optimization, and the network round-trips of application interaction all cost a great deal, so much data need not live there). These two types differ greatly in design and architecture; ordinary systems have traits of both without the extremes, so they need not over-engineer. But one very special class remains: systems that push data in real time under high concurrency. I personally do not know which category to merge it into; it is indeed a special type of system.

Such systems feature high-concurrency access where every client must obtain the platform's data in real time. Such a site cannot have clients fetch large amounts of content in one request; everything must be completed through many asynchronous interactions, briefly described below.

Web asynchronous-interaction frameworks are all built on Ajax, and similar frameworks rest on the same base. How can Ajax obtain nearly real-time content? By having every client constantly refresh the same URL? With many clients — a large website — the servers would soon be down, unless you pay a server cost many times the normal one, and even then more servers may need reworking to reach that performance (in server architecture 1 + 1 is always less than 2; more servers mean more overhead).

The other way of thinking is to push data from the server to the client. The question is how. Pushing rests on a persistent-connection mechanism: as long as the connection opened when the client talks to the back end via Ajax has not been disconnected, the feedback path can be treated as a persistent connection. Many systems instead communicate with the server over sockets; Ajax can also be used, but it requires a lot of extra handling.

The server must adopt a matching strategy. Today that is usually Java NIO — slightly lower raw performance per connection than BIO, but very good overall. When a user request arrives, the server does not immediately dedicate a thread to it; the request is queued (the granularity of queuing is controllable), and threads from a pool are allocated as the queue drains — that is, the server's response to clients is asynchronous. (Note this is not Ajax's pure asynchronous interaction but the server's asynchronous response to requests; many requests are not answered immediately. When data changes, the server takes the client session list from the request list and writes output to it, effectively pushing data to clients.) The advantage is that the server does not allocate a new thread per client, which under high concurrency would exhaust memory through excessive resource allocation. One more problem remains: while a thread is processing a task it stays occupied until the task completes (cutting large tasks into small ones makes threads much faster). The server thread may quickly finish computing the data to push, yet the client may be unable to receive it promptly over the network, so the thread wastes time on output. Should a resumable-transfer cache sit in the middle? Such a cache both stands in for the application server when resume data is needed and writes output to the client asynchronously, so the application server spends nearly all its time on data and business processing rather than tying up resources on network output. There are many ways to use a network cache; I will discuss it with you another time.
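A compact sketch of the NIO idea: one selector thread multiplexes all clients, requests are queued as selection events rather than each getting a thread, and the server writes (pushes) when a channel is ready. The class name, port, and payload are my own invention; a real server would add read handling, error handling, and backpressure:

```java
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class NioPushServer {
    // Bind a non-blocking server and run the selector loop for `rounds` passes.
    static int serve(int port, int rounds) throws Exception {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(port));
        server.configureBlocking(false);                 // no thread-per-connection
        server.register(selector, SelectionKey.OP_ACCEPT);
        int done = 0;
        for (int i = 0; i < rounds; i++, done++) {
            selector.select(200);                        // wait briefly for events
            for (SelectionKey key : selector.selectedKeys()) {
                if (key.isAcceptable()) {
                    SocketChannel client = server.accept(); // request is queued, not given a thread
                    if (client == null) continue;
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_WRITE);
                } else if (key.isWritable()) {
                    SocketChannel client = (SocketChannel) key.channel();
                    client.write(ByteBuffer.wrap("push\n".getBytes())); // server-side push
                    client.close();
                }
            }
            selector.selectedKeys().clear();
        }
        server.close();
        selector.close();
        return done;
    }

    public static void main(String[] args) throws Exception {
        serve(9090, 50); // demo: serve for roughly 10 seconds on port 9090
    }
}
```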
