Build your own Java-based supercomputer
(Reposted from IBM developerWorks)

If you have ever wanted to build your own supercomputer but found parallel programming in C daunting, the pseudo remote thread can help. This award-winning Java programming model greatly simplifies parallel programming on a cluster and takes supercomputing out of the lab, putting it within reach of every Java programmer.

Over the past three years, parallel clusters have been changing the face of supercomputing. Once dominated by single multimillion-dollar machines, supercomputing is quickly turning to parallel clusters as its platform of choice. As you might imagine, the enthusiasm in the open source community has produced hundreds, if not thousands, of parallel cluster projects. Beowulf is the first and best-known open source cluster system. Sponsored by NASA and launched by Thomas Sterling and Donald Becker in 1994, Beowulf started as a 16-node demonstration cluster. Today there are hundreds of Beowulf implementations, from the Stone SouperComputer at the Oak Ridge National Laboratory to the custom-built commercial clusters of Aspen Systems (see Resources).

The drawback for Java programmers is that most cluster systems are built around C-based message passing APIs, such as the Message Passing Interface (MPI) or Parallel Virtual Machine (PVM). Parallel programming in C is not easy, so I designed an alternative. This article describes how to use Java threads and Java Remote Method Invocation (RMI) to create your own Java-based supercomputer. Note that the article assumes you are familiar with Java threads and RMI.

What is in a supercomputer?

A supercomputer, in this context, is a cluster of eight or more nodes that works as a single high-performance machine. A Java-based supercomputer consists of a Job scheduler and any number of running servers (also known as hosts). The Job scheduler spawns multiple threads, each containing the code for a different subtask. Each thread migrates its code to a different running server. Each running server then executes the code migrated to it and returns the result to the Job scheduler. Finally, the Job scheduler combines the results from all the threads. This parallel cluster system is called a pseudo remote thread because the threads are scheduled on the Job scheduler, but the code inside them executes on remote computers.

What components does the system have?

The term "component" here refers to the logical modules that make up the pseudo remote thread parallel cluster system. The system includes the following components:

Job dispatcher is the machine that controls execution. It spawns the threads, each of which contains one subtask of the main task to be processed by the cluster. The code in each thread is sent to a remote computer for execution. The threads are scheduled on the Job scheduler, so in principle this machine should not be used to execute any subtasks itself.

Subtask is a user-defined class that defines a data-independent or function-independent part of the main task. You can define different classes for different parts of the main task. The name "Subtask" is only an example; you can give a subtask class any name, but the name should describe the subtask assigned to it. When defining a subtask class, you must implement the JobCodeInt interface and its jobCode() method, which are described below.
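The article never lists the JobCodeInt interface itself. Based on the description that follows, it is presumably something along the lines of this minimal sketch; the Serializable bound, the MySubtask class, and its members are illustrative assumptions rather than part of the original package:

    // Presumed shape of the JobCodeInt interface: a single jobCode() method
    // that returns the result of the remotely executed work. Extending
    // java.io.Serializable (an assumption) lets subtask instances be passed
    // to a running server as RMI parameters.
    public interface JobCodeInt extends java.io.Serializable {
        Object jobCode();
    }

    // Skeleton of a hypothetical subtask class (all names are placeholders).
    public class MySubtask implements JobCodeInt {

        private final int input;            // local data, initialized on the Job scheduler

        public MySubtask(int input) {       // local resources are set up in the constructor
            this.input = input;
        }

        public Object jobCode() {           // this code runs on a remote running server
            return Integer.valueOf(input * 2);
        }
    }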
JobCodeInt is a Java interface. You must implement this interface, and its jobCode() method, in each class that defines a subtask. The jobCode() method contains the code that will be executed remotely. If you want to use a local resource remotely, you must initialize that resource outside the jobCode() method. For example, to send a group of images for remote processing, you must initialize the image objects outside the jobCode() method. You can call classes from the standard Java library inside this method, because those libraries exist on the remote computer.

RunServer is a Java object whose methods can be invoked through remote procedure calls. One of its methods takes an object that implements the JobCodeInt interface as a parameter. The RunServer executes the code in that object on the computer it is running on (the running server) and returns the result of the computation as an instance of the Object class, the root of the Java class hierarchy.

PseudoRemThr is a Java class that encapsulates a thread instance and accepts a given subtask object. It selects a remote host and sends the subtask instance to that host for execution. If you want to use specific resources (such as a database or a printer) available on a particular host, you can specify that host explicitly.

HostSelector is the module that the PseudoRemThr class calls to select a host when you do not specify one. If no idle host exists, the HostSelector returns the remote computer with the least load. If a remote computer is a multiprocessor system, the HostSelector may return that host name more than once. Currently, the HostSelector cannot select a host based on the complexity of a given task.

How a pseudo remote thread works

To use pseudo remote threads, you must implement the Job scheduler and the running servers. This section describes how to implement each part.

Implement the Job scheduler

1. Divide the main task into subtasks with independent data or functions.
2. For each subtask, define a class that implements the JobCodeInt interface (and therefore the jobCode() method). In the jobCode() method, define the code to be executed for that subtask. Note that you cannot use user-defined local resources of the Job scheduler inside this method; initialize all such resources outside it, for example in the subtask class constructor.
3. Create several instances of the PseudoRemThr class and pass a subtask instance to each of them. To specify a remote host, call the alternative constructor of the PseudoRemThr class.
4. Wait until these threads have finished.
5. Call the getResult() method to obtain the execution result from each PseudoRemThr instance. If the computation is incomplete, a Boolean object with the value false is returned; otherwise, an instance of the Object class containing the result is returned, which you must cast to the expected type.
6. Combine the results of all subtasks into the final result.

Implement the running server

Implementing the running server is straightforward:

1. Start the RMI registry.
2. Start the RunServer. When the running server starts, it connects to the Job scheduler and notifies the Job scheduler that it is ready to accept tasks for execution. A rough sketch of this side of the system follows.
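The RunServer ships as part of the pseudo remote thread package, and its source is not shown in the article. The following is only a rough sketch of what that side might look like: the RunServerInt interface name, its execute() method, and the "RunServer" registry binding are assumptions made here; it reuses the JobCodeInt sketch from earlier; and it simply binds the server in a local RMI registry instead of registering with the Job scheduler as the real system does.

    // RunServerInt.java
    import java.rmi.Remote;
    import java.rmi.RemoteException;

    // Presumed remote interface of a running server: it accepts a subtask
    // object and returns the result of executing its jobCode() method.
    public interface RunServerInt extends Remote {
        Object execute(JobCodeInt subtask) throws RemoteException;
    }

    // RunServer.java
    import java.rmi.Naming;
    import java.rmi.RemoteException;
    import java.rmi.registry.LocateRegistry;
    import java.rmi.server.UnicastRemoteObject;

    // Minimal running-server implementation (a sketch, not the original code).
    // Note: the subtask classes (or an RMI codebase) must be available to this JVM.
    public class RunServer extends UnicastRemoteObject implements RunServerInt {

        public RunServer() throws RemoteException {
            super();
        }

        public Object execute(JobCodeInt subtask) throws RemoteException {
            // Execute the migrated code on this machine and return its result.
            return subtask.jobCode();
        }

        public static void main(String[] args) throws Exception {
            // Equivalent to starting the rmiregistry tool on port 3333 by hand.
            LocateRegistry.createRegistry(3333);
            Naming.rebind("//localhost:3333/RunServer", new RunServer());
            System.out.println("Running server ready.");
        }
    }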
A computing example

Now it is time to test the model. The following example uses two computers running in parallel: a 333 MHz Pentium II running Windows 98 and a 500 MHz Pentium III running Windows 2000 Professional.

To calculate the sum of the square roots of all integers from 1 to 10^9, I created the SQRT class, which computes the sum of the square roots of all integers between dblStart and dblEnd. SQRT implements the JobCodeInt interface, and therefore the jobCode() method, in which I defined the code that performs the calculation. The constructor is used to pass data to the SQRT class and to initialize all local resources on the Job scheduler; you pass the start and end of the integer range whose square roots are to be summed to the constructor. Listing 1 shows the definition of the SQRT class.

Listing 1. Defining the SQRT class

    // The SQRT class calculates the sum of the square roots of all integers
    // between dblStart and dblEnd. The calculation is done in the jobCode()
    // method: this class implements the JobCodeInt interface, and the working
    // code lives in jobCode(). Data is passed to the class in the constructor,
    // which is also where local resources on the Job scheduler are initialized.
    // In this example, the start and end of the integer range whose square
    // roots are to be summed are passed to the SQRT class.
    public class SQRT implements JobCodeInt {

        double dblStart, dblEnd, dblPartialSum;

        public SQRT(double start, double end) {
            dblStart = start;
            dblEnd = end;
        }

        public Object jobCode() {
            dblPartialSum = 0;
            for (double i = dblStart; i <= dblEnd; i++)
                // You can call standard Java functions and objects here.
                dblPartialSum += Math.sqrt(i);
            // The result is returned as a standard Java class object.
            return new Double(dblPartialSum);
        }
    }

The JobDispatcher class creates two SQRT instances. It breaks the main task in two, assigning one subtask to one SQRT object (sqrt1) and the remaining subtask to the other (sqrt2). Next, JobDispatcher creates two objects of the PseudoRemThr class and passes the SQRT objects to them as parameters. It then waits for the threads to finish. Once they have finished, the partial result can be obtained from each PseudoRemThr instance, and the partial results are combined into the final result, as shown in Listing 2.

Listing 2. JobDispatcher at work

    // This class can be given any name you choose; JobDispatcher is used here
    // for convenience.
    public class JobDispatcher {

        public static void main(String[] args) {
            double fin = 1000000000;      // 10^9
            double finByTen = fin / 10;   // represents 10^8

            long nlStartTime = System.currentTimeMillis();

            // Range from 1 to 3*10^8
            SQRT sqrt1 = new SQRT(1, finByTen * 3);
            // Range from (3*10^8) + 1 to 10^9
            SQRT sqrt2 = new SQRT((finByTen * 3) + 1, fin);

            // Create two instances of the PseudoRemThr class.
            // The constructor parameters are:
            // the first is an instance of a subtask class,
            // the second is the remote host that executes this subtask,
            // the third is a descriptive name for the PseudoRemThr instance.
            PseudoRemThr psr1 = new PseudoRemThr(sqrt1, "//192.168.1.1:3333/", "Win98");
            PseudoRemThr psr2 = new PseudoRemThr(sqrt2, "//192.168.1.2:3333/", "Win2k");

            // Wait until execution ends.
            psr1.waitForResult();
            psr2.waitForResult();

            // Obtain the results of each thread.
            Double res1 = (Double) psr1.getResult();
            Double res2 = (Double) psr2.getResult();
            double finalRes = res1.doubleValue() + res2.doubleValue();

            long nlEndTime = System.currentTimeMillis();
            System.out.println("Total time taken: " + (nlEndTime - nlStartTime));
            System.out.println("Sum: " + finalRes);
        }
    }
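The PseudoRemThr class itself is also part of the package and is not listed in the article. Judging from how it is used in Listing 2 and from the component description above, its core might look roughly like the sketch below; the RunServerInt interface and the "RunServer" binding name carry over from the earlier running-server sketch, the HostSelector logic is omitted, and error handling is reduced to a stack trace.

    // PseudoRemThr.java (a speculative sketch, not the original source)
    import java.rmi.Naming;

    public class PseudoRemThr implements Runnable {

        private final JobCodeInt subtask;
        private final String host;                       // e.g. "//192.168.1.1:3333/"
        private final Thread thread;
        private volatile Object result = Boolean.FALSE;  // stays false until the computation finishes

        public PseudoRemThr(JobCodeInt subtask, String host, String name) {
            this.subtask = subtask;
            this.host = host;
            this.thread = new Thread(this, name);
            this.thread.start();                         // the thread itself runs locally ...
        }

        public void run() {
            try {
                // ... but the subtask's code is executed on the remote running server.
                RunServerInt server = (RunServerInt) Naming.lookup(host + "RunServer");
                result = server.execute(subtask);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }

        public void waitForResult() {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        public Object getResult() {
            return result;
        }
    }

Initializing result to Boolean.FALSE matches the getResult() behavior described earlier: a Boolean false is returned until the remote computation has completed.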
println ("sum:" + finalres ); } } Performance Evaluation The total execution time of this calculation is between 120,000 ms and 128,000 Ms. If you run the same task locally without breaking down the task, the execution time will be between 183,241 and 237,641 milliseconds. Initially, the Director included calculating the sum of the square root of All integers from 1 to 10 ^ 7. To test the performance, I expanded the computing scope to 10 ^ 8, and finally to 10 ^ 9. As the number of tasks increases, the time difference between remote parallel execution and local execution becomes more and more obvious. This means that remote parallel execution takes less time to execute large tasks. Remote parallel execution is not suitable for small tasks, because the system overhead of Inter-machine communication cannot be ignored. As the number of tasks increases, the overhead of Inter-machine communication becomes insignificant compared with the overhead of executing all tasks on a single machine. Therefore, I come to the following conclusion: the pseudo remote thread system can well complete the tasks that require a lot of computing. What are the advantages of using pseudo remote threads? The pseudo remote thread is a Java-based system that can be used to implement clusters that contain multiple operating systems or heterogeneous clusters. By using a pseudo remote thread, you can avoid the trouble of converting the original C/C ++ code, and use the java standard library and various extension libraries. In addition, the pseudo-remote thread frees you from caring about memory management. Of course, its disadvantage is that the system performance is directly related to the JRE performance. Development Direction Currently, a considerable number of commercial applications are created on the Java platform, and considering the practical difficulties in converting the original C/C ++ code to use parallelism, it may be time for Java-based supercomputing to enter the business field. It is a good start to consider parallelism and load balancing when creating Java-based applications. The Internet is a good example of heterogeneous clusters. Therefore, pseudo remote threads can be deployed on the Internet, convert the Web into a single, Java-based supercomputer (for details about this concept, see references ). However, from the actual application, you should note that the best results will be obtained in a homogeneous cluster dedicated to executing a single task. Finally, starting from the daily application, the pseudo remote thread makes it quite easy to convert a LAN (LAN), such as a campus network and a home network, into a micro-supercomputer. This is what the Beowulf system has created. With pseudo remote threads, Java programmers can also create their own supercomputers. Reference resources "Linux clustering cornucopia" (developerworks, September May 2000) helps you understand the available open source code cluster solutions and secure source code cluster solutions on Linux. For more information about distributed operating systems, see Andrew S. Tanenbaum's modern operating systems (Prentice Hall, February 1992 ). For more information about parallel programming, see practical parallel programming of Gregory V. Wilson (Massachusetts Institute of Technology Press, December 1995 ). For more information about clusters, see cluster cookbook of Scalable Computing laboratory. For more information about how to use Java and web for supercomputing, see create your own supercomputer with Java? 
" (Javaworld, January 1, January 1997 ). The Linux document. tion project hosts the Beowulf howto document. Visit the Beowulf website to learn more about the Beowulf Project. For more information, see the famous Stone soupercomputer at the Oak Ridge National Laboratory. Aspen systems is currently one of the few vendors that provide customized cluster solutions.