Resource scheduling is never a glamorous topic. It is one of those things that must be done, yet it is often complex, frustrating for users, and a constant source of work for system administrators. The most common complaint is "Why is my job not running?" The answer usually involves explaining some of the scheduling rules, pointing out that the system is simply overloaded, or, in rare cases, showing that the user's own program is causing the problem.
If you don't know what a resource scheduler is, read the next few paragraphs. The idea is this: you have a pool of resources and a queue of jobs, and you need to assign jobs to resources so the system runs as efficiently as possible. Common resource schedulers include Sun Grid Engine, Torque/Maui, Moab, PBS, Platform LSF, and Platform Lava. A cluster is the canonical setting for resource scheduling. Consider a 128-node cluster where each compute node has eight cores. Most user programs need 1-16 cores, but some need 256. The problem: given a list of jobs, what is the best way to keep the cluster busy?
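The core resource-matching step can be sketched with a toy greedy "first fit" scheduler. This is only an illustration of the packing problem, not how SGE or Torque/Maui actually work (real schedulers also weigh priorities, fairness, runtimes, and multi-node jobs); the function and job names are my own.

```python
# Toy first-fit scheduler: place each queued job on the first node with
# enough free cores; jobs that don't fit anywhere stay in the queue.
# Single-node placement only -- a 256-core job would need multiple nodes.

def first_fit(jobs, nodes, cores_per_node):
    """jobs: list of (job_id, cores_needed). Returns (placed, waiting)."""
    free = [cores_per_node] * nodes          # free cores on each node
    placed, waiting = [], []
    for job_id, need in jobs:
        for n in range(nodes):
            if free[n] >= need:              # first node with room wins
                free[n] -= need
                placed.append((job_id, n))
                break
        else:
            waiting.append(job_id)           # "why is my job not running?"
    return placed, waiting

# Two 8-core nodes: job "a" and "b" fit, "c" must wait for free cores.
placed, waiting = first_fit([("a", 4), ("b", 8), ("c", 6)],
                            nodes=2, cores_per_node=8)
print(placed, waiting)                       # [('a', 0), ('b', 1)] ['c']
```

Even in this tiny example you can see the administrator's dilemma: job "c" is waiting not because of any error, but because no single node currently has six free cores.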
When a user submits a "job," a command such as qsub (queue submit) inserts it into the scheduler's queue, and the user can then check on his or her program with a command such as qstat (queue status). These tools print status information, but none of them can really answer "why is my job not running?" (they do provide clues, though sending a message to the system administrator often seems like the easiest route).
To make the scheduling problem trickier, in many cases we don't know how long an application will run, and jobs may need other resources as well (memory capacity, storage, processor type, and so on). Resource scheduling is therefore not a simple task, but it is essential for cluster utilization. In fact, the arrival of multi-core has made kernel-level scheduling more important (and, of course, harder) than ever: the kernel must schedule processes and migrate tasks between cores with caches in mind. Interestingly, high-level resource scheduling now reaches down to the CPU itself, and controlling which cores a job lands on is essential for getting the best performance.
So why has resource scheduling become a new, cool corner of high-performance computing? Not because of a new front-end GUI or some other shiny feature. The real reason is cloud computing. That does not mean the cloud will be everywhere overnight; rather, resource scheduling is what will put the cloud in its proper place.
Recently, I heard David Perel of the New Jersey Institute of Technology describe an experiment in dynamic resource allocation using Sun Grid Engine (SGE) and Apache Hadoop, and then dug into articles about the latest SGE update. The new version has two attractive additions: the first is cloud computing, and the second is Hadoop, which is something of a cloud for massive data.
Most notably, the new version of SGE allows jobs to spill over into a cloud such as Amazon's EC2, with SGE managing the connection. For EC2, the user needs to build an AMI image for the application and supply EC2 account credentials. Once that is done, the user can submit a job to the queue and, when local resources run out, it "cloud bursts" to EC2.
The other new feature is integration with Hadoop. If you don't know what Hadoop is, Google it. (Standing up a Hadoop cluster is not easy, by the way.) Briefly, it is a powerful MapReduce-style processing framework that does not rely on a single central database: a map step farms work out to many servers, each operating on the data stored on its own local hard drive, and a reduce step combines the results. SGE has been enhanced so that Hadoop jobs can now be submitted directly.
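The map/reduce pattern itself is simple enough to sketch in a few lines. This is a minimal in-memory illustration of the idea (a word count), not Hadoop's API; Hadoop's value is in distributing these same two steps across many nodes, keeping each map task close to the disk that holds its chunk of data.

```python
# Minimal map/reduce sketch: map each input chunk to (key, value) pairs,
# then reduce the values grouped by key. In Hadoop the chunks would be
# HDFS blocks and the phases would run on many servers in parallel.
from collections import defaultdict

def map_phase(chunk):
    # Emit one ("word", 1) pair per word in this chunk of text.
    return [(word, 1) for word in chunk.split()]

def reduce_phase(pairs):
    # Sum the values for each key.
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)

chunks = ["the cat", "the dog"]            # stands in for distributed data
pairs = [p for c in chunks for p in map_phase(c)]
print(reduce_phase(pairs))                 # {'the': 2, 'cat': 1, 'dog': 1}
```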
At this point, high-performance computing in the cloud is a mixed blessing. Unless you use a cloud specifically designed for HPC, like Penguin Computing's POD service, the I/O resources that are critical to HPC performance can vary widely. This may change as single servers pack in more cores. HPC application surveys show that 57% of HPC users run on 32 or fewer cores, which agrees with a ClusterMonkey.net survey figure of 55%. When clouds start using 48-core servers, that may eliminate some server-to-server communication problems for these jobs.
High-performance computing may therefore take a different path to cloud computing by way of dense multi-core servers. Users could submit jobs to SGE from the desktop, and the scheduler would direct them to local or cloud resources capable of running the virtual machine. This style of resource scheduling could make HPC genuinely useful from the desktop. It sounds like grid computing, but simpler.