Build Your Own Java-Based Supercomputer

Source: Internet
Author: User

If you have ever thought about building your own supercomputer but were daunted by parallel programming in C, pseudo remote threads can help. This award-winning Java programming model greatly simplifies parallel programming on clusters and takes supercomputing out of the lab, putting it within reach of every Java programmer.

In the past three years, parallel clusters have been changing the face of supercomputing. In a field once dominated by multimillion-dollar machines, parallel clusters are fast becoming the supercomputer of choice. As you might imagine, enthusiasm in the open source community has led to hundreds, if not thousands, of parallel cluster projects. The first and most famous open source cluster system is Beowulf. Released by Thomas Sterling and Donald Becker in 1994, Beowulf started as a 16-node demo cluster under NASA sponsorship. Today, Beowulf has hundreds of implementations, from the Stone SouperComputer at Oak Ridge National Laboratory to custom-built commercial clusters from Aspen Systems (see Resources).

Unfortunately for Java programmers, most cluster systems are built around C-based message-passing APIs such as the Message Passing Interface (MPI) or Parallel Virtual Machine (PVM). Parallel programming in C is not easy, so I designed an alternative. This article shows you how to use Java threads and Java Remote Method Invocation (RMI) to create your own Java-based supercomputer.

Note that this article assumes a working knowledge of Java threads and RMI.

What's inside the supercomputer?

In this article, a supercomputer is defined as a cluster of eight or more nodes that work together as a single high-performance machine. A Java-based supercomputer contains a job scheduler and any number of running servers (also known as hosts). The job scheduler spawns multiple threads, each containing the code for a different subtask. Each thread migrates its code to a different running server. Each running server then executes the code migrated to it and returns the result to the job scheduler. Finally, the job scheduler combines the results of the individual threads.

This parallel cluster system is called a pseudo remote thread system because each thread is scheduled on the job scheduler, but the code inside the thread executes on a remote computer.
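To make the idea of migrating work concrete, here is a minimal sketch under stated assumptions, not the article's implementation. The names `Job`, `SumJob`, and `MigrationSketch` are hypothetical. The sketch mimics in a single JVM what RMI does when it passes an argument by value: the subtask object is serialized, rebuilt on the receiving side, and executed there. Note that Java serialization carries the object's fields, not its bytecode, so the subtask class must also be loadable on the receiving machine.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a subtask: serializable state plus a method to run elsewhere.
interface Job extends Serializable {
    Object run();
}

// Hypothetical subtask that sums the integers in [from, to].
class SumJob implements Job {
    private static final long serialVersionUID = 1L;
    private final int from;
    private final int to;

    SumJob(int from, int to) {
        this.from = from;
        this.to = to;
    }

    @Override
    public Object run() {
        long sum = 0;
        for (int i = from; i <= to; i++) {
            sum += i;
        }
        return sum; // autoboxed to Long
    }
}

class MigrationSketch {
    // Serializes the job, as the sending side would before shipping it.
    static byte[] ship(Job job) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(job);
        }
        return bytes.toByteArray();
    }

    // Rebuilds the job from its bytes and executes it, as the receiving side would.
    static Object receiveAndRun(byte[] wire) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(wire))) {
            Job job = (Job) in.readObject();
            return job.run();
        }
    }

    public static void main(String[] args) throws Exception {
        Object result = receiveAndRun(ship(new SumJob(1, 100)));
        System.out.println(result); // sum of the integers 1 through 100
    }
}
```

In a real cluster the byte array would travel over the network and `receiveAndRun` would execute on a different machine; RMI handles both steps when a serializable object is passed to a remote method.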

What are the components of the system?

Here, the term component refers to a logical module of a parallel cluster system built on pseudo remote threads. The system contains the following components:

Job Dispatcher (the job scheduler) is the machine that controls execution. It spawns several threads, each containing a subtask of the primary task that the cluster will handle. The code within each thread is sent to a remote computer for execution. Because the threads themselves are scheduled on the job scheduler, this machine should ideally not execute any subtasks itself.

SubTask is a user-defined class that defines a data-independent or functionally independent part of the primary task. You can define different classes for different parts of the primary task. The class name SubTask is only an example. You can give a subtask class any name, but the name should describe the subtask assigned to it. When defining the SubTask class, you must implement the Jobcodeint interface and its Jobcode() method, as described below.

Jobcodeint is a Java interface. You must implement this interface and its Jobcode() method in every class that defines a subtask. The Jobcode() method contains the code that will be executed remotely. If you intend to use a local resource remotely, you must initialize the resource outside of the Jobcode() method. For example, if you want to send a set of images to a remote host, you must initialize the image objects outside the Jobcode() method. You can call classes from the standard Java library inside this method because those libraries also exist on the remote computer.

Runserver is a Java object whose methods can be called by remote programs. One of its methods takes an object that implements the Jobcodeint interface as an argument. Runserver executes the code within that object on the computer where Runserver is running (the running server) and returns the result of the calculation as an instance of the Object class. Object is the class at the top of the Java class hierarchy.

Pseudoremthr is a Java class that encapsulates a thread and accepts an instance of a given subtask class. It selects a remote host and sends the SubTask instance to this host for execution. You can specify a host if you want to take advantage of specific resources (such as databases or printers) that are available on a host computer.

Hostselector is a module. If you do not specify a remote host, the Pseudoremthr class invokes the Hostselector module to select a specific host. If there are no idle hosts, Hostselector will return the remote computer with the least load. If a remote computer is a multiprocessor system, Hostselector may return the host name more than once. Currently, Hostselector cannot select a host based on the complexity of a given task.
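The selection rule described for Hostselector can be sketched as follows. This is an illustration only: the source does not say how load is measured, so the sketch assumes load is the number of assigned tasks divided by the number of processors. The names `HostInfo`, `HostSelectorSketch`, `register`, `select`, and `release` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical record of what a selector might track for each running server.
class HostInfo {
    final String name;
    final int processors; // assumed to be reported when the host registers
    int activeTasks;      // tasks currently assigned to this host

    HostInfo(String name, int processors) {
        if (processors < 1) {
            throw new IllegalArgumentException("processors must be at least 1");
        }
        this.name = name;
        this.processors = processors;
    }

    // Assumed load metric: assigned tasks per processor.
    double load() {
        return (double) activeTasks / processors;
    }
}

class HostSelectorSketch {
    private final List<HostInfo> hosts = new ArrayList<>();

    synchronized void register(String name, int processors) {
        hosts.add(new HostInfo(name, processors));
    }

    // Returns the least-loaded host and records the new assignment.
    // An idle host has load 0, so it is always preferred over a busy one.
    // On a tie, the host that registered first wins.
    synchronized String select() {
        if (hosts.isEmpty()) {
            throw new IllegalStateException("no running servers registered");
        }
        HostInfo best = hosts.get(0);
        for (HostInfo host : hosts) {
            if (host.load() < best.load()) {
                best = host;
            }
        }
        best.activeTasks++;
        return best.name;
    }

    // Marks one task on the named host as finished.
    synchronized void release(String name) {
        for (HostInfo host : hosts) {
            if (host.name.equals(name) && host.activeTasks > 0) {
                host.activeTasks--;
                return;
            }
        }
    }
}
```

With a single-processor host and a dual-processor host registered in that order, three consecutive calls to `select()` return the first host once and the second host twice, which matches the remark that a multiprocessor host may be returned more than once.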

How pseudo remote threads work

To use pseudo remote threads, you must implement the job scheduler and the running server. This section shows you how to implement each one.

Implementing the Job Scheduler

First, decompose the primary task into data-independent or functionally independent subtasks. For each subtask, define a class that implements the Jobcodeint interface (and therefore the Jobcode() method). In the Jobcode() method, define the code to execute for that subtask.

Note that code inside the Jobcode() method cannot access user-defined resources that are local to the job scheduler, because the method runs on a remote machine. Initialize all such resources outside of the method. For example, you can initialize this type of resource in the constructor of the SubTask class.

Next, create several instances of the Pseudoremthr class and pass a subtask instance to each one. If you want to specify a remote host explicitly, you can do so by calling another constructor of the Pseudoremthr class.

Wait for these threads to complete. Call the GetResult() method to get the result of each Pseudoremthr instance. If the calculation has not completed, GetResult() returns a Boolean object with the value false; otherwise, it returns an instance of the Object class that contains the result of the calculation. You must cast this instance to the class type that you expect. Finally, combine all subtask results into the final result.
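The steps above can be sketched as follows. This is a simplified illustration, not the article's actual classes: `JobCodeInt`, `jobCode()`, and `getResult()` are idiomatically cased stand-ins for the names in the text, their signatures are assumptions, and `Host`, `SliceSum`, `PseudoRemThrSketch`, and `DispatcherSketch` are hypothetical. The `Host` interface stands in for a remote running server, so the sketch runs in a single JVM.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Assumed shape of the subtask interface named in the text.
interface JobCodeInt extends Serializable {
    Object jobCode();
}

// Hypothetical stand-in for a running server; a real one would be an RMI stub.
interface Host {
    Object execute(JobCodeInt job) throws Exception;
}

// Subtask that sums one slice of an array.
// The data is set up in the constructor, outside jobCode().
class SliceSum implements JobCodeInt {
    private static final long serialVersionUID = 1L;
    private final int[] slice;

    SliceSum(int[] data, int from, int to) {
        this.slice = Arrays.copyOfRange(data, from, to);
    }

    @Override
    public Object jobCode() {
        long sum = 0;
        for (int value : slice) {
            sum += value;
        }
        return sum; // autoboxed to Long
    }
}

// Thread that hands its subtask to a host and stores the answer.
class PseudoRemThrSketch extends Thread {
    private final JobCodeInt job;
    private final Host host;
    private volatile Object result = Boolean.FALSE; // FALSE until a result arrives

    PseudoRemThrSketch(JobCodeInt job, Host host) {
        this.job = job;
        this.host = host;
    }

    @Override
    public void run() {
        try {
            result = host.execute(job);
        } catch (Exception e) {
            // This sketch leaves the result as FALSE on failure;
            // a fuller version might retry on another host.
        }
    }

    Object getResult() {
        return result;
    }
}

class DispatcherSketch {
    // Splits the array into roughly equal slices, runs one thread per slice,
    // waits for all of them, and combines the partial sums.
    static long sum(int[] data, int parts, Host host) throws InterruptedException {
        List<PseudoRemThrSketch> threads = new ArrayList<>();
        int chunk = (data.length + parts - 1) / parts;
        for (int from = 0; from < data.length; from += chunk) {
            int to = Math.min(from + chunk, data.length);
            PseudoRemThrSketch thread = new PseudoRemThrSketch(new SliceSum(data, from, to), host);
            threads.add(thread);
            thread.start();
        }
        long total = 0;
        for (PseudoRemThrSketch thread : threads) {
            thread.join();
            Object result = thread.getResult();
            if (!(result instanceof Long)) {
                throw new IllegalStateException("subtask did not complete");
            }
            total += (Long) result;
        }
        return total;
    }
}
```

For a local trial, a method reference such as `JobCodeInt::jobCode` can serve as the `Host`. One consequence of using false as the "not finished" marker is that a subtask whose legitimate result is false cannot be told apart from an unfinished one, so subtasks should return some other type.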

Implementing a Running Server

Implementing a running server is a simple task:

Start the RMI registry.

Start Runserver.

The running server connects to the job scheduler at startup and notifies the job scheduler that it is ready to accept the task to be performed.
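A running server along these lines could look like the sketch below. It is an assumption-laden illustration rather than the article's code: the interface and method names (`JobCodeInt`, `jobCode()`, `RunServer`, `execute`) and the binding name "RunServer" are hypothetical, and the step that notifies the job scheduler is omitted because the source does not describe that protocol. The sketch starts the registry inside the same JVM with `LocateRegistry.createRegistry` instead of launching the separate `rmiregistry` tool.

```java
import java.io.Serializable;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

// Assumed shape of the subtask interface named in the text.
interface JobCodeInt extends Serializable {
    Object jobCode();
}

// Assumed remote interface: one method that takes a job and returns its result.
interface RunServer extends Remote {
    Object execute(JobCodeInt job) throws RemoteException;
}

class RunServerImpl implements RunServer {
    @Override
    public Object execute(JobCodeInt job) throws RemoteException {
        // Runs on the machine that hosts this object.
        return job.jobCode();
    }
}

class RunServerMain {
    // Static references keep the registry and the server object from being garbage collected.
    private static Registry registry;
    private static RunServerImpl server;

    public static void main(String[] args) throws Exception {
        int port = Registry.REGISTRY_PORT; // 1099, the default registry port

        // Step 1: start an RMI registry inside this JVM.
        registry = LocateRegistry.createRegistry(port);

        // Step 2: export the server object and bind its stub under a well-known name.
        server = new RunServerImpl();
        RunServer stub = (RunServer) UnicastRemoteObject.exportObject(server, 0);
        registry.rebind("RunServer", stub);

        System.out.println("RunServer bound in registry on port " + port);
    }
}
```

A job scheduler could then obtain the stub with `LocateRegistry.getRegistry(host, port).lookup("RunServer")` and call `execute` on it. For that to work, the subtask classes must be available on the running server's classpath.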
