HDFS: Add/Delete Nodes and Perform HDFS Balance
Method 1: Statically add a DataNode, stopping the NameNode
1. Stop the NameNode.
2. Modify the slaves file and push the update to every node.
3. Start the NameNode.
4. Run the Hadoop balance command. (This rebalances the cluster and is not required if you are only adding a node.)
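After the NameNode is back up, a quick way to confirm the new DataNode has registered (standard HDFS admin command, in the same style the article uses later; nothing here is specific to this cluster):
hadoop dfsadmin -report   # lists live/dead DataNodes with per-node capacity and usage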
-----------------------------------------
Method 2: Dynamically add a DataNode, keeping the NameNode running
For example, if the IP address of the newly added node is 192.168.1.xxx: add the hosts entry "192.168.1.xxx datanode-xxx" on all NN and DN nodes; on xxx, create the user with useradd hadoop -s /bin/bash -m; copy all the files in ~/.ssh from another DN to /home/hadoop/.ssh on xxx so passwordless SSH works; and install the JDK with apt-get install sun-java6-jdk.
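A sketch of how the new node is then brought online in this dynamic mode, assuming the Hadoop 1.x-era layout this article implies (the paths and the MRv1 daemon are assumptions):
# On the new node xxx, after hosts, user, SSH keys, and JDK are in place,
# copy the Hadoop installation and conf directory from an existing DataNode, then:
hadoop-daemon.sh start datanode
hadoop-daemon.sh start tasktracker   # MRv1 only; omit under YARN
# The NameNode needs no restart; the new DataNode registers itself.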
performance will be better. This is why the previous article recommended a configuration of X 1TB disks rather than X 3TB disks. The space constraints inside a blade server tend to limit the possibility of adding more hard drives. From this we can better see why Hadoop is said to run on standalone commodity servers, with its deliberately shared-nothing architecture: tasks independent, I/O independent.
Standard hardware configuration for a Hadoop cluster
When selecting hardware, we often need to weigh application performance against expenditure. To this end, we must find a balance between meeting actual needs and staying economically feasible. The following uses a Hadoop cluster application as an example to illustrate this.
Deploy HBase in the Hadoop cluster and enable Kerberos
System: LXC-CentOS6.3 x86_64
Hadoop version: cdh5.0.1 (manual installation, cloudera-manager not installed)
Existing cluster environment: 6 nodes; jdk1.7.0_55; ZooKeeper, HDFS (HA), YARN, HistoryServer, and HttpFS installed, with Kerberos enabled (the KDC is...
Compact a single column family within a region:
  hbase> major_compact 'r1', 'c1'
Compact a single column family within a table:
  hbase> major_compact 't1', 'c1'
Configuration management and node restart
1) Modify the HDFS configuration
HDFS configuration location: /etc/hadoop/conf
# Sync the HDFS configuration
cat /home/hadoop/slaves|xargs -i -t scp /etc/hadoop/conf/hdfs-site.x
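The command above is truncated; a plausible full form, assuming the destination is the same config path on every slave (the target path and file name are an assumption):
# Push hdfs-site.xml to every host listed in the slaves file
cat /home/hadoop/slaves | xargs -i -t scp /etc/hadoop/conf/hdfs-site.xml {}:/etc/hadoop/conf/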
In day-to-day Hadoop cluster operations and maintenance, Hadoop's balance tool is usually used to even out the distribution of file blocks across the DataNodes, to avoid some DataNodes' disks reaching high usage (a problem that can also drive those nodes' CPU usage higher than other servers').
1) Usage of the balancer tool:
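A minimal sketch of typical balancer usage (the threshold value is illustrative; it is the tolerated deviation, in percent, of each DataNode's utilization from the cluster average):
start-balancer.sh -threshold 10
# or run it in the foreground:
hadoop balancer -threshold 10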
Introduction to Spark basics, cluster build, and Spark Shell. This section mainly uses Spark-based slides, combined with hands-on practice, to strengthen understanding of the concepts. Spark installation and deployment: with the theory mostly covered, move on to the hands-on experiment. Exercise 1: use Spark Shell (local mode) to complete a word count. First step: load data from a file: scala> val rdd1 = sc.textFile("file:///tmp
GB in this iteration...
Solution:
1. Increase the available bandwidth of the Balancer.
We wondered whether the Balancer's default bandwidth was too small, making it inefficient, so we tried raising the Balancer's bandwidth to 500 MB/s:
hadoop dfsadmin -setBalancerBandwidth 524288000
However, this did not noticeably improve the situation.
2. Forcibly decommission the node.
We found that when decommissioning is performed on some nodes, although the da...
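For reference, the standard graceful decommission flow looks like this (the excludes file path is an assumption; it must match what dfs.hosts.exclude points to in hdfs-site.xml):
# Add the node's hostname to the excludes file
echo datanode-xxx >> /etc/hadoop/conf/excludes
# Tell the NameNode to re-read its host lists and start decommissioning
hadoop dfsadmin -refreshNodes
# Watch progress in the NameNode web UI or via:
hadoop dfsadmin -report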
All DataNodes in the Hadoop cluster fail to start (solution)
A DataNode fails to start only in the following situations:
1. The configuration file of the master was modified;
2. The bad habit of running hadoop namenode -format multiple times.
Generally, an error like this occurs:
java.io.IOException: Cannot lock storage /usr/had
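This error typically means a stale lock file or mismatched storage IDs left behind by repeated formatting. A common diagnostic, with illustrative paths (use your configured dfs.name.dir and dfs.data.dir):
# On the NameNode: note the namespaceID
cat /path/to/dfs/name/current/VERSION
# On each failing DataNode: compare its namespaceID
cat /path/to/dfs/data/current/VERSION
# If they differ, either edit the DataNode's VERSION to match the NameNode,
# or (destructive: that node's blocks are lost) clear its data dir and restart the DataNode.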
Address: http://blog.cloudera.com/blog/2013/04/how-to-use-vagrant-to-set-up-a-virtual-hadoop-cluster/
Vagrant is a very useful tool for programmatically managing multiple virtual machines (VMs) on a single physical machine. It natively supports VirtualBox and provides plug-ins for VMware Fusion and Amazon EC2 virtual machine clusters.
Vagrant provides an easy-to-use Ruby-based internal DSL that allows users to define the VMs in a cluster.
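Once a Vagrantfile describing the cluster exists, the day-to-day workflow is just a few standard Vagrant commands (the VM name "node1" is illustrative):
vagrant up          # create and boot every VM defined in the Vagrantfile
vagrant status      # list the VMs and their current states
vagrant ssh node1   # open a shell on one VM
vagrant destroy -f  # tear the whole cluster down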
Yesterday, a large number of DataNodes went offline. The preliminary judgment was that the dfs.datanode.max.transfer.threads parameter was set too small, so the hdfs-site.xml configuration files on all DataNode nodes were adjusted. After restarting the cluster, to verify the change, we ran a job and checked its configuration in JobHistory. Surprisingly, it still showed the old value; that is, the job was still running with the old
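For reference, the adjustment described above is a single property in hdfs-site.xml on each DataNode (the value 8192 is only an illustrative choice, not the article's):
<property>
  <name>dfs.datanode.max.transfer.threads</name>
  <value>8192</value>
</property>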
Physically stored: you can see that null values are not stored, so querying "contents:html" at timestamp T8 returns NULL, and a query at timestamp T9 for the "anchor:my.lock.ca" item likewise returns NULL. If no timestamp is specified, the most recent data for the specified column is returned; the newest values are found first in the table because they are sorted by time. Therefore, if you query "contents" without specifying a timestamp, you will get the T6 data, which ha
: mkdir -p /hd/sdb1, then mount /dev/sdb1 /hd/sdb1, and mount the other partitions in the same way.
5. Modify /etc/fstab. If it is not modified, you have to repeat step 4 by hand after every boot, which is troublesome. Open the fstab file and add the 5 new partitions following an existing entry; the last two fields of each entry are 0 0.
IV. Expanding HDFS
I add all of the above 5 partitions to HDFS. First create a new subdirectory /dfs/dn under each partition's mount directory, e.g. mkdir -p /hd/sdb1/dfs/dn, and then modif
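A condensed sketch of steps 4-5 and the HDFS expansion above, using the article's example partition (the ext4 filesystem type in the fstab line is an assumption):
mkdir -p /hd/sdb1
mount /dev/sdb1 /hd/sdb1
# Persist the mount across reboots; last two fields are 0 0 as noted above
echo '/dev/sdb1  /hd/sdb1  ext4  defaults  0 0' >> /etc/fstab
mkdir -p /hd/sdb1/dfs/dn
# Finally, append /hd/sdb1/dfs/dn to the comma-separated dfs.datanode.data.dir
# (dfs.data.dir on older versions) in hdfs-site.xml and restart the DataNode.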
Overview: the Hadoop cluster has 1 NameNode, 1 SecondaryNameNode, 1 JobTracker, and several DataNodes. There are plenty of installation guides on the Internet; the following is just my own experimental environment setup and the problems solved along the way.
1. Configure the IP-to-hostname mapping. In /etc/hosts, configure the NameNode and DataNodes, like this:
192.168.1.1 Namenode
192.168.1.2 Seco
Masters:
host61
6) Configure slaves:
host62
host63
5. Configure host62 and host63 in the same way.
6. Format the distributed file system: /usr/local/hadoop/bin/hadoop namenode -format
7. Run Hadoop:
1) /usr/local/hadoop/sbin/start-dfs.sh
2) /usr/local/hadoop/sbin/start-yarn.sh
8. Check:
# jps
4532 ResourceMa
Add hard disks to the Hadoop cluster.
Hadoop worker nodes: expanding hard disk space
I received a task from the boss: the hard disk space in the Hadoop cluster is insufficient, and a machine needs to be added to the Hadoop
1. Install JDK
a) Download the JDK installation file jdk-6u30-linux-i586.bin for Linux from here.
b) Copy the JDK installation file to a local directory; here the /opt directory is chosen.
c) Execute:
sudo sh jdk-6u30-linux-i586.bin (if it cannot be executed, run chmod +x jdk-6u30-linux-i586.bin first)
d) After installat