Building a Cost-Effective Distributed Computing Cluster (1)

Source: Internet
Author: User

As you can see, your machine is idle most of the time. If you watch the CPU with Task Manager on Windows, or with tools such as top or xload on Linux, you will see that CPU usage is usually only 1 to 2%. The more computers you have, the greater this waste: in a department with 300 computers, the amount of idle CPU time is astonishing. Yet the same departments still need powerful servers for compilation or simulation. The situation gets worse as the number of users grows. Even when an eight-CPU server is fully loaded, its tasks cannot be handed to another idle server, because users rarely change their habits and log on to a different machine. Being able to use the computing resources you already have, to put idle CPUs to work, or to migrate load off a busy server intelligently is your reason to build a cluster.
The basic unit of a cluster is a single computer, called a node. A cluster is scalable: it grows by adding computers to it. There is no strict definition of a cluster. It is often described as many computers joined by high-speed connections that together offer high computing capacity behind a single user interface, but that describes the surface appearance rather than defining the term. The nodes in a cluster should be as similar as possible; a cluster built from differing hardware is called a heterogeneous cluster. Heterogeneity does not change the nature of the cluster, but a heterogeneous cluster may spend extra time dealing with the latency that the differences between nodes introduce. On the other hand, this freedom is also an advantage of clusters: any other multi-CPU system strictly requires identical CPUs, whereas nodes can be added to or removed from a cluster without being restricted by type.
This article uses MOSIX as the cluster solution, with diskless nodes. Building a cluster is not as complex as you might think, and you can follow the steps below one at a time. The whole solution is inexpensive and easy to expand. Red Hat is used because it is the most widely recognized distribution in China, and both beginners and professionals can customize Red Hat Linux to their needs. The LTSP terminal server is simple in structure and easy to expand. The hardware can be upgraded: for example, dual-CPU motherboards with Xeon processors would pack 48 CPUs into a single cabinet, compared with the 24 CPUs in this example. Besides the CPUs, you could use gigabit or fiber NICs and a fiber switch, and raise total memory to 48 GB. However, such upgrades raise the overall price sharply while performance does not improve in proportion, so the configuration in this example is the better balance of cost and performance.
MOSIX patches the kernel source code, adding functions that support clustering at the kernel level. Booting the compiled kernel gives something similar to an SMP multiprocessor system: from the outside there appears to be one large machine with many processors, although internally it is built from many machines. The MOSIX cluster is transparent to users, and existing applications can distribute their computation across the cluster without code changes.
In some cases you do not need a cluster system at all. For work such as 3D animation rendering, scripts in the style of grid computing are enough. Such computation can run on dissimilar nodes: it needs neither symmetric hardware nor a consistent operating system (although the application must have versions for each operating system involved). You only need to split the rendering job into segments, one per processor, run each segment on its node, and then merge the results from all nodes. This works because the computed data is discrete, even though the result looks visually continuous.
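The splitting step described above can be sketched in a few lines of shell. This is only an illustration under assumptions: the function name split_frames and the "node start end" output format are invented here, and the article does not say how its rendering jobs were actually divided.

```shell
#!/bin/sh
# split_frames FIRST LAST NODES  (hypothetical helper, not from the article)
# Divide the inclusive frame range FIRST..LAST into NODES contiguous,
# nearly equal segments and print one "node start end" line per segment.
split_frames() {
    first=$1; last=$2; nodes=$3
    total=$((last - first + 1))
    base=$((total / nodes))      # frames that every node receives
    extra=$((total % nodes))     # the first $extra nodes receive one more
    start=$first
    n=1
    while [ "$n" -le "$nodes" ]; do
        count=$base
        if [ "$n" -le "$extra" ]; then
            count=$((count + 1))
        fi
        if [ "$count" -gt 0 ]; then
            end=$((start + count - 1))
            echo "$n $start $end"
            start=$((end + 1))
        fi
        n=$((n + 1))
    done
}
```

Each output line could then be passed to whatever renderer runs on that node, and the rendered frames collected afterwards. Nodes that would receive no frames are simply omitted from the output.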
This article will show you how to prepare the hardware and software environment. First, plan your computing environment.
Hardware environment:
One 49U standard cabinet, holding one 1U switch and 24 2U rack-mounted chassis.
One 1U 24-port MB switch (a switch, not a hub).
Intel P4 2 GHz CPU, 1 GB DDR memory, Intel 845D motherboard, floppy drive, NVIDIA graphics card (able to start X Window), 2U rack-mounted chassis, and 3C905B 10/100 Mb auto-sensing NIC. The gateway must also have an optical drive, an 80 GB hard drive, and two NICs.
24 Category 5 network cables, 3 m each, crimped to 693A.
A display, keyboard, and mouse, used only for installation.
Software environment and required installation packages (unless specially noted, these are not necessarily applicable to later versions):
Red Hat 7.2 installation discs CD1 and CD2 (www.redhat.com)
dhcp-3.0.1rc9-1.i386.rpm: the DHCP version that supports kernel boot requests (www.redhat.com)
dhcpcd-1.3.22pl1-7.i386.rpm: the DHCP client daemon (www.redhat.com)
mknbi-1.2-6.noarch.rpm: required for building the kernel image that clients boot from (www.redhat.com)
MOSIX-1.6.0.tar.gz: the original MOSIX package; the latest version is 1.7.0 (www.mosix.com)
MOSKRN-1.6.0.tar.gz: the original MOSIX kernel package; the latest version is 1.7.0 (www.mosix.com)
openmosix-kernel-2.4.18-openmosix2.i386.rpm: the generic kernel of the openMosix branch; the latest version is 2.4.19 (www.openmosix.org)
openmosix-kernel-2.4.18-openmosix2.i686.rpm: the kernel for newer processors from the openMosix branch; the latest version is 2.4.19 (www.openmosix.org)
openmosix-kernel-smp-2.4.18-openmosix2.i686.rpm: the multiprocessor kernel of the openMosix branch; the latest version is 2.4.19 (www.openmosix.org)
openmosix-kernel-2.4.18-openmosix2.src.rpm: the source code of the openMosix branch (www.openmosix.org)
openmosix-tools-0.2.2-1.i386.rpm: the client tools of the openMosix branch (www.openmosix.org)
ltsp_core-3.0-11.i386.rpm: the LTSP core package (www.ltsp.org)
ltsp_kernel-3.0-3.i386.rpm: the LTSP kernel (www.ltsp.org)
ltsp_floppyd-3.0.0-2.i386.rpm: the LTSP floppy disk tool (www.ltsp.org)
ltsp_initrd_kit-3.0.1-i386.tgz: the kit for building the LTSP boot image (www.ltsp.org)
linux-2.4.18.tar.gz: the kernel source code to be compiled (www.kernel.org)
Network environment:
Configure the gateway's external NIC as 10.193.15.169 with subnet mask 255.255.255.0. This interface sits on the intranet and is used for logging in and submitting processes. Configure the gateway's internal NIC as 192.168.0.254 with subnet mask 255.255.255.0. The cluster uses this interface as its DHCP server, NFS server, and LTSP server.
The gateway provides DHCP on the internal NIC, allocating addresses from 192.168.0.100 to 192.168.0.253 with subnet mask 255.255.255.0.
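As a rough illustration of the address plan above, the following sketch prints a subnet block in ISC dhcpd.conf syntax. The helper name dhcp_subnet_block is invented for this example, and using the gateway's internal address as the router is an assumption rather than something the article states. A diskless setup will also need boot-related options that are not covered here.

```shell
#!/bin/sh
# dhcp_subnet_block SUBNET NETMASK RANGE_START RANGE_END ROUTER
# (hypothetical helper) Print a dhcpd.conf subnet declaration to stdout.
dhcp_subnet_block() {
    printf 'subnet %s netmask %s {\n' "$1" "$2"
    printf '    range %s %s;\n' "$3" "$4"
    # Assumption: the gateway's internal NIC acts as the nodes' router.
    printf '    option routers %s;\n' "$5"
    printf '}\n'
}

# Values taken from the network environment described in the article.
dhcp_subnet_block 192.168.0.0 255.255.255.0 \
    192.168.0.100 192.168.0.253 192.168.0.254
```

Printing to standard output instead of writing /etc/dhcpd.conf directly lets you review the block before merging it into the real configuration.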
When all the preceding conditions are met, you can start to build a MOSIX cluster.
1. Install all the hardware. Make sure the gateway server can boot from CD and the nodes can boot from floppy disk; check the BIOS boot settings and confirm that each system starts correctly. Mount all nodes and the switch in the cabinet, and connect the nodes to the switch with network cables. The gateway needs one additional network cable to connect to the LAN. Because a cluster computing environment can be called a computing farm, the cluster's own network is called the computing network, as distinct from the LAN. After connecting the power, attach the two sets of display, keyboard, and mouse to one node and to the gateway server respectively.
2. Install Red Hat 7.2 on the gateway server, which has the two NICs. Automatic partitioning is the simplest choice. (This article does not discuss other Linux issues.) Choose a custom installation, but do not install every package: in addition to the defaults, select the software development and kernel development package groups. If you did not select them during installation, you can install the packages needed to compile the kernel after the system starts normally. During network configuration, set the IP addresses according to the network environment, set DNS for the external NIC, and use lowercase mosix as the host name. After the installation, verify that the gateway server boots properly and adjust the system to your preferences. We recommend text mode, because graphical mode consumes a large amount of resources. You should also connect to the Red Hat website and update defective packages to reduce system vulnerabilities. Be sure not to upgrade the kernel: the system may fail to start after a kernel upgrade, and this article compiles its own kernel in any case. After the updates are complete, restart and confirm again that the system has no errors.
3. Installing the MOSIX package (openMosix, a separate branch, is installed differently) requires many steps. Follow each step carefully:
A. Upload all downloaded packages to the /usr/src/tmp directory on the server, then confirm that each package is complete by checking that its MD5 checksum matches the published value:
su -
cd /usr/src/
mkdir tmp
md5sum package_file_name
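The checksum comparison in step A can be automated with a small wrapper around md5sum. This is a minimal sketch: the function name verify_md5 is invented here, and the expected checksum must come from wherever you downloaded the package.

```shell
#!/bin/sh
# verify_md5 FILE EXPECTED_HEX  (hypothetical helper)
# Returns 0 only when the MD5 digest of FILE equals EXPECTED_HEX.
verify_md5() {
    # md5sum prints "<digest>  <file>"; keep only the digest field.
    actual=$(md5sum "$1" | awk '{print $1}')
    if [ "$actual" = "$2" ]; then
        echo "OK: $1"
    else
        echo "MISMATCH: $1 (got $actual)" >&2
        return 1
    fi
}
```

A call would look like `verify_md5 MOSIX-1.6.0.tar.gz <published checksum>`, repeated for each downloaded file. A missing file also counts as a mismatch, because its digest comes back empty.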
B. Insert Red Hat 7.2 CD2 into the optical drive, then run the following commands to make sure the packages required for kernel compilation are installed:
mount /dev/cdrom /mnt/cdrom
cd /mnt/cdrom/RedHat/RPMS
rpm -Uvh kernel-headers*
rpm -Uvh kernel-source*
rpm -Uvh kernel-doc*
rpm -Uvh dev86*
rpm -Uvh make-*
rpm -Uvh glibc-devel*
rpm -Uvh cpp*
rpm -Uvh ncurses-devel*
rpm -Uvh binutils*
rpm -Uvh gcc-2*
rpm -Uvh tftp*
cd /usr/src/tmp
umount /mnt/cdrom
C. Unpack all the tar.gz packages you uploaded:
tar xvfz MOSIX-1.6.0.tar.gz
tar xvfz MOSKRN-1.6.0.tar.gz
tar xvfz linux-2.4.18.tar.gz
D. If the archives unpacked without errors, move each unpacked directory to its correct location:
mv MOSIX-1.6.0 /usr/src/
mv MOSKRN-1.6.0 /usr/src/
mv linux /usr/src/linux-2.4.18
E. To avoid errors in the older MOSIX scripts, perform the following steps:
chmod goa+x /usr/src/MOSIX-1.6.0/inst/add_kernel_to_grub
mkdir /usr/local/man
F. Now come the really interesting steps. First, create a directory for kernel configuration files. This is a good habit, because the configuration is not necessarily the same for every compilation; related problems are also mentioned later under troubleshooting:
cd /usr/src
mkdir config.backup
cd /usr/src/linux-2.4.7-10/configs
cp kernel-2.4.7-i686.config /usr/src/config.backup/kernel-2.4.18.config
G. Copy the configuration file into the directory of the kernel to be compiled:
cd /usr/src/
cp config.backup/kernel-2.4.18.config linux-2.4.18/.config
H. Modify the EXTRAVERSION line of the Makefile to suit your situation. The original value is 18. You can change it to mosix to mark the compiled kernel version and to distinguish its modules from those of other kernels.
cd /usr/src/linux-2.4.18
vi Makefile
EXTRAVERSION = 18
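If you would rather not edit the Makefile by hand, the change in step H can be sketched as a sed substitution. The helper name set_extraversion is invented for this example, and the value -mosix is an assumption inferred from the kernel name vmlinuz-2.4.18-mosix that appears later in the article.

```shell
#!/bin/sh
# set_extraversion MAKEFILE VALUE  (hypothetical helper)
# Rewrite the "EXTRAVERSION = ..." line of a kernel Makefile, whatever its
# current value, and keep the untouched file as MAKEFILE.orig.
set_extraversion() {
    makefile=$1; value=$2
    cp "$makefile" "$makefile.orig" || return 1
    # VALUE must not contain "/" or "&", which are special to sed here.
    sed "s/^EXTRAVERSION[[:space:]]*=.*/EXTRAVERSION = $value/" \
        "$makefile.orig" > "$makefile"
}
```

For example, `set_extraversion /usr/src/linux-2.4.18/Makefile -mosix` would leave the other version lines alone and change only EXTRAVERSION.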
I. Start installing MOSIX 1.6.0:
cd /usr/src/MOSIX-1.6.0
./mosix.install

J. Once the installation starts, it asks a series of questions. Answer G to the question of whether to add the new kernel to LILO or GRUB; for every other question, press Enter to accept the default, which is shown in uppercase. In order, the questions cover: the path to the kernel source; whether to add a boot entry for the new kernel to the boot loader; the links to the library include files in the kernel source; the run levels of the MOSIX service; the MFS mount directory; whether to start the kernel configuration menu; whether to show the detailed kernel compilation output; and whether to show the detailed user-level compilation output. The installer then patches the kernel source and opens the kernel configuration menu.


Figure 1
K. In the kernel configuration menu, you can see the newly added MOSIX option.

 


Figure 2
L. Select the MOSIX option and add Direct File-System Access and MOSIX File-System. Press the ESC key to exit the current menu.


Figure 3
M. Under Block devices, add RAM disk support and Initial RAM disk (initrd) support.



Figure 4
N. Under Networking options, add IP: kernel level autoconfiguration, IP: DHCP support, and IP: BOOTP support.


Figure 5
O. We recommend that you remove SCSI support. This example uses no SCSI devices, and removing it avoids problems during kernel compilation. We also recommend that you remove sound card support.


Figure 6



Figure 7
P. Under Network File Systems, add NFS support for the root file system (Root file system on NFS).



Figure 8
Q. Press the ESC key once more. When the system prompts you to save the configuration, select Yes. The system then compiles the kernel, compiles the modules, installs the kernel, and installs the modules. This process may produce some warnings. As long as the compilation does not abort, it has finished normally once you are returned to the prompt.


Figure 9


Figure 10
R. Do not restart yet. First edit the MOSIX kernel path line in /boot/grub/grub.conf: change the original path /boot/vmlinuz-2.4.18-mosix to /vmlinuz-2.4.18-mosix. After making the change, type reboot to restart.



Figure 11
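The path change in step R can also be scripted. The sketch below uses an invented helper, fix_grub_kernel_path, that touches only the matching kernel line. The article does not explain why the /boot prefix must go; presumably automatic partitioning placed /boot on its own partition, so GRUB resolves paths relative to it, but that is an assumption.

```shell
#!/bin/sh
# fix_grub_kernel_path GRUB_CONF KERNEL_NAME  (hypothetical helper)
# Change "kernel /boot/KERNEL_NAME ..." to "kernel /KERNEL_NAME ..." and
# keep the untouched file as GRUB_CONF.orig. Other lines are left alone.
fix_grub_kernel_path() {
    conf=$1; kernel=$2
    cp "$conf" "$conf.orig" || return 1
    sed "s|^\([[:space:]]*kernel[[:space:]]*\)/boot/$kernel|\1/$kernel|" \
        "$conf.orig" > "$conf"
}
```

For example, `fix_grub_kernel_path /boot/grub/grub.conf vmlinuz-2.4.18-mosix`. Compare the result with the .orig copy before rebooting.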
S. After the restart, Mosix 1.6.0 (2.4.18) appears in the boot menu. Select this entry to start the MOSIX system.



Figure 12
T. Several errors may appear during system startup. The first one in the figure occurs because mfs does not exist yet. The second occurs because MOSIX has changed the permissions of the sshd service, so the MOSIX sshd is not started. The first time the MOSIX system starts, you are asked to configure the mosix.map file. Press Enter to choose the default editor and edit the file.


Figure 13
U. After you have edited the configuration file, the system reminds you to update the node numbers in mosix.map whenever node IP addresses change. Because the gateway server is 192.168.0.254, it is defined as node 1. The other nodes, 253 in total, start at 192.168.0.1, and their node numbers start from 2.



Figure 14
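The numbering described in step U can be written out as map entries. This sketch assumes that each mosix.map line has three fields: the first node number of a range, the first IP address of the range, and the number of nodes in the range. Check that layout against the documentation shipped with your MOSIX version. The helper name make_mosix_map is invented for this example.

```shell
#!/bin/sh
# make_mosix_map  (hypothetical helper)
# Print map entries for the layout described in step U.
# Assumed line format: first-node-number  first-IP  number-of-nodes
make_mosix_map() {
    echo "1 192.168.0.254 1"    # the gateway server is node 1
    echo "2 192.168.0.1 253"    # 253 nodes from 192.168.0.1, numbered from 2
}

make_mosix_map
```

Describing the nodes as one contiguous range keeps the file to two lines, and a node's number follows directly from its IP address.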

