Add hard disks to the Hadoop cluster.

Source: Internet
Author: User


Expanding hard disk space on Hadoop worker nodes

My boss handed me a task: the Hadoop cluster was running out of disk space, so one more machine was to be added to the cluster, and every machine was to get an extra 2 TB hard disk on top of what it already had. The boss is generous. Haha.

I will summarize the steps for completing this task and the problems and solutions I have encountered and share them with you.

1. First, the basic commands and configuration used. If you are short on time, you can skip this part and go straight to section 2, which covers mounting the new hard disk.

(1) The fdisk command

Syntax:

fdisk [-b sectorsize] device

fdisk -l [-u] [device...]

fdisk -s partition...

fdisk -v

Note:

-b <sectorsize> specifies the sector size of the disk, such as 512, 1024, or 2048. To partition a disk interactively, run fdisk device (for example, fdisk /dev/sdb) and answer the prompts.

-l lists the partition tables of the specified devices. If you run fdisk -l without a device, the system lists the partitions it knows about.

-u is used together with -l. It reports the start address of each partition in sectors instead of cylinders.

-s <partition> prints the size of the specified partition to standard output, in blocks.

-v displays the fdisk version.

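(2) The mkfs command

Syntax: mkfs [-V] [-t fstype] [fs-options] filesys

Note:

-V produces verbose output, including the file-system-specific commands that are executed.

-t <fstype> specifies the type of file system to create, such as ext3 or ext4.

fs-options are the options passed to the file-system-specific builder when the file system is created.

filesys is the device (for example, /dev/sdb) on which the file system is created.

When -V is the only argument, mkfs displays its version information instead.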

(3) The mount command

Syntax:

mount [-afFnrsvw] [-t vfstype] [-L label] [-o options] device dir

mount [-lhV]

Note:

-a mounts all the devices listed in /etc/fstab.

-f does not actually mount the device. You can combine it with options such as -v to see what mount would do.

-F must be used together with -a. The devices listed in /etc/fstab are mounted in parallel, which can speed things up.

-t vfstype specifies the type of the file system to mount, such as ext3 or ext4.

-L label mounts the partition that carries the specified label.

-l displays partition labels in the listing.

-h displays help information.

-V displays the mount version.

device is the partition or file to mount. If it is a file, the -o loop option must be added when mounting.

dir is the mount point of the partition.

(4) fstab configuration

There are 6 columns in /etc/fstab:

File system: the device name of the file system to mount (for example, /dev/sdb). You can also use a UUID. The blkid command shows the UUID of a device (for example, blkid /dev/sdb).

Mount point: the directory to mount on. Create the directory manually, then mount the partition on it.

Type: the type of the file system, for example ext3, ext4, or ntfs.

Options: the mount options, separated by commas. defaults selects the default set.

Dump: 0 means the file system is not backed up by dump; 1 means the entire content of <file system> is backed up. We recommend that you set this to 0.

Pass: specifies how fsck checks the disk at boot. 0 means no check is performed. If the mount point is / (the root partition), pass must be 1, and no other mount point may use 1. Mount points with a pass greater than 1 are checked after the root partition, in ascending order of pass, and mount points with the same value are checked together. For example, if the pass of /home and /boot is 2 and the pass of /devdata is 3, the system checks the root partition first, then /boot and /home together, and finally /devdata.
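The ordering described for the pass column can be sketched in shell. This is only an illustration: fsck_order is a hypothetical helper name, and it reads an fstab-format file without running fsck itself.

```shell
# List "pass mount-point" pairs in ascending pass order.
# Comment lines and entries with pass 0 (never checked) are skipped.
fsck_order() {
    # $1: path to an fstab-format file
    awk '!/^[ \t]*#/ && NF >= 6 && $6 > 0 { print $6, $2 }' "$1" |
        LC_ALL=C sort -k1,1n -k2,2
}

# Usage on a real machine:
#   fsck_order /etc/fstab
```

Entries that share a pass value appear next to each other in the output, which matches the groups that are checked together.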

 

2. How to mount a new hard disk (for more information, see section 3, "Detailed steps")

(1) Run fdisk -lu. The output includes "Disk /dev/sdb doesn't contain a valid partition table", which indicates that sdb is the newly added hard disk, as shown in Figure 1. The following steps operate on this disk.

 

Figure 1
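Picking the new disk out of the fdisk listing can also be scripted. The sketch below assumes the older fdisk wording quoted in step (1); find_unpartitioned_disks is a hypothetical helper that only filters text on standard input.

```shell
# Print the device names that fdisk reports as having no partition table.
find_unpartitioned_disks() {
    # Matching lines look like:
    #   Disk /dev/sdb doesn't contain a valid partition table
    sed -n "s|^Disk \(/dev/[^ ]*\) doesn't contain a valid partition table.*|\1|p"
}

# Usage on a real machine (needs root; 2>&1 in case the message goes to stderr):
#   sudo fdisk -lu 2>&1 | find_unpartitioned_disks
```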

(2) Run fdisk /dev/sdb to partition sdb, as shown in Figure 2. Follow the prompts.

 

Figure 2

Enter n as prompted to add a partition to the new hard disk. When Command action is displayed, enter e to create an extended partition. When Partition number (1-4) is displayed, enter 1, since only one partition is needed.

Specify the cylinder range to complete the partitioning, as shown in Figure 3.

 

Figure 3

(3) Enter p to print the partition table of the new hard disk, as shown in Figure 4.

 

Figure 4

(4) Enter w at the Command (m for help) prompt to save the partition table. The system reports "The partition table has been altered!", as shown in Figure 5.

 

Figure 5

(5) Now run fdisk -lu again. The result is shown in Figure 6.

 

Figure 6

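(6) Format the new disk: sudo mkfs -t ext4 /dev/sdb, as shown in Figure 7. (We recommend that you use ext4. For more information, see section 3, "Detailed steps".) Note that /dev/sdb names the whole disk; to format a single partition instead, pass the partition device, for example /dev/sdb1.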

 

Figure 7

(7) Mount the disk: sudo mount -t ext4 /dev/sdb /devdata (devdata is a directory I created myself. You can specify any directory. The directory I actually mounted on is the one specified by dfs.data.dir). The essential steps are now complete, and what follows is finishing work. You can run sudo df -h to check the result, as shown in Figure 8.

 

Figure 8
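Besides reading the df -h output by eye, the mount can be checked from a script. A minimal sketch, assuming a Linux system where /proc/mounts lists the mounted file systems; is_mounted is a hypothetical helper name.

```shell
# Succeed if the given directory appears as a mount point in the mounts table.
is_mounted() {
    # $1: mount point; $2: mounts table (defaults to /proc/mounts)
    awk -v d="$1" '$2 == d { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Usage:
#   is_mounted /devdata && echo "/devdata is mounted"
```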

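(8) If you are willing to mount manually after every reboot, run the mount command again each time (mount -a mounts everything listed in /etc/fstab). To have the system mount the disk automatically at boot, add an entry to /etc/fstab, as shown in Figure 9.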

 

Figure 9
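Since the figure is not reproduced here, the following sketch shows one way such an entry could be added. The helper name add_fstab_entry and the values defaults, 0, and 2 for the options, dump, and pass columns are assumptions, not the entry from the original screenshot.

```shell
# Append a six-column fstab entry unless the mount point is already listed.
add_fstab_entry() {
    # $1: fstab file, $2: device, $3: mount point, $4: file system type
    if awk -v d="$3" '!/^[ \t]*#/ && $2 == d { found = 1 } END { exit !found }' "$1"; then
        echo "$3 is already listed in $1" >&2
    else
        # Assumed values: options=defaults, dump=0, pass=2 (non-root file system)
        printf '%s\t%s\t%s\tdefaults\t0\t2\n' "$2" "$3" "$4" >> "$1"
    fi
}

# Usage (try it on a copy before touching the real file):
#   cp /etc/fstab /tmp/fstab.test
#   add_fstab_entry /tmp/fstab.test /dev/sdb /devdata ext4
```

Checking for an existing entry first makes the helper safe to run more than once.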

The mounting process is now complete.
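The format and mount commands from steps (6) and (7) can be collected into a dry-run helper that prints them for review instead of executing them. This is a sketch: print_disk_plan is a hypothetical name, and the mkdir -p line is an added assumption for the case where the mount point does not exist yet.

```shell
# Print the commands for formatting and mounting a new disk, without running them.
print_disk_plan() {
    # $1: device, $2: mount point, $3: file system type
    printf 'sudo mkfs -t %s %s\n' "$3" "$1"
    printf 'sudo mkdir -p %s\n' "$2"
    printf 'sudo mount -t %s %s %s\n' "$3" "$1" "$2"
    printf 'sudo df -h %s\n' "$2"
}

print_disk_plan /dev/sdb /devdata ext4
```

Printing the plan first is a cheap safeguard, because mkfs destroys whatever is on the device it is given.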

For a Hadoop cluster, you also need to change the ownership of the directory that dfs.data.dir points to on every worker node. Run: chown -R dm:dm /usr/local/hadoop/data/ (this is the directory that dfs.data.dir points to), and then format the namenode. I forgot to change the ownership of the dfs.data.dir directory at first, which caused the datanode to fail to start.
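A quick check before starting the datanode can catch the permission problem described above. A minimal sketch, meant to be run as the user that starts the datanode (dm in this article); check_data_dir is a hypothetical helper name.

```shell
# Report whether the current user can use the given directory as a data directory.
check_data_dir() {
    # $1: directory that dfs.data.dir points to
    if [ ! -d "$1" ]; then
        echo "$1: not a directory" >&2
        return 1
    fi
    if [ ! -w "$1" ] || [ ! -x "$1" ]; then
        echo "$1: not writable by uid $(id -u)" >&2
        return 1
    fi
    echo "$1: ok"
}

# Usage:
#   check_data_dir /usr/local/hadoop/data
```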

3. Detailed steps

(1) About the name sdb in 2 (1): Linux has its own rules for naming devices. Storage devices are distinguished by interface type, and the identifier is assigned according to the interface number that the device occupies. IDE storage devices (parallel interface) use the identifier hd and are named hda, hdb, hdc, and so on, according to the interface they use. SCSI and SATA devices (serial interface) use the identifier sd and are likewise named sda, sdb, and so on, according to the interface number.
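The naming rule can be illustrated with a small shell function. It covers only the hd and sd prefixes discussed here, and describe_device is a hypothetical helper name.

```shell
# Describe a device name such as sdb or hda1 according to the hd/sd naming rule.
describe_device() {
    # $1: device name without the /dev/ prefix
    case $1 in
        hd[a-z]*) bus="IDE" ;;
        sd[a-z]*) bus="SCSI/SATA" ;;
        *) echo "$1: unrecognised name" >&2; return 1 ;;
    esac
    # The third character is the letter assigned to the disk on that bus.
    letter=$(printf '%s' "$1" | cut -c3)
    echo "$bus disk $letter"
}

describe_device sdb    # prints: SCSI/SATA disk b
```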

(2) About the partitioning in 2 (2): before a hard disk can store data, it has to be partitioned. Partitions come in three types: primary, extended, and logical.

A primary partition is the most basic type. It can be mounted and used to store data directly. A hard disk (with an MBR partition table) can have at most four primary partitions, which Linux identifies with the numbers 1, 2, 3, and 4. For example, sda1 is the first primary partition on the hard disk sda, and the other three are sda2, sda3, and sda4.

An extended partition is a special kind of primary partition. To store data in it, you must first divide it into logical partitions (logical partitions are built on top of the extended partition). If you want more than four partitions on one hard disk, you must use an extended partition. Because an extended partition is itself a primary partition, it also takes up one primary partition number. Inside it you can create multiple logical partitions, which can be mounted and used to store data directly. Logical partition numbers start from 5, for example sda5 or sda6.

In Linux, hard disk partitions are named with these identifiers and appear as block device files under /dev. To store data on a partition, you mount its block device file on a directory. Mounting gives users an interface and a path for reading and writing data on that partition.

The reason I created an extended partition rather than a primary partition is that, as I understand it, a primary partition carries extra boot information used to start the system, whereas this disk only adds capacity for storing data. If the extended partition is divided directly into logical partitions, that extra information is not needed, so the space on the new hard disk can be used more fully. I am not sure this explanation is correct. If it is wrong, please correct me.
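The numbering rule (1 to 4 for primary or extended partitions, 5 and up for logical ones) can be written down as a short shell function. It assumes the MBR scheme described above, and partition_kind is a hypothetical helper name.

```shell
# Classify a partition device name by its trailing number (MBR numbering).
partition_kind() {
    # $1: device name such as sda1, sda5, or sdb
    num=${1##*[!0-9]}    # keep only the digits after the last non-digit
    if [ -z "$num" ]; then
        echo "whole disk"
    elif [ "$num" -le 4 ]; then
        echo "primary or extended"
    else
        echo "logical"
    fi
}

partition_kind sda5    # prints: logical
```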

(3) For the ext4 file system mentioned in 2 (6), see http://baike.baidu.com/view/266589.htm#7

(4) For the fstab settings in 2 (8), see http://baike.baidu.com/view/5499388.htm.

 

URL: http://aofengblog.blog.163.com/blog/static/6317021201101502540117/
