Compact a single column family within a region:
hbase> major_compact 'r1', 'c1'
Compact a single column family within a table:
hbase> major_compact 't1', 'c1'
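For completeness, a whole table can also be major-compacted without naming a column family; a minimal sketch reusing the same hypothetical table name:
hbase> major_compact 't1'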
Configuration Management and Node Restart
1) Modify the HDFS configuration
HDFS configuration location: /etc/hadoop/conf
# Sync the HDFS configuration to the slave nodes
cat /home/hadoop/slaves|xargs -i -t scp /etc/hadoop/conf/hdfs-site.x
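A hedged sketch of what the full synchronization command typically looks like (the destination path mirroring the source directory is an assumption):
# push hdfs-site.xml to every host listed in the slaves file
cat /home/hadoop/slaves | xargs -i -t scp /etc/hadoop/conf/hdfs-site.xml {}:/etc/hadoop/conf/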
Enter the hduser user environment
A. su - hduser
B. tar -zxf hadoop-2.2.0.tar.gz
C. ln -s hadoop-2.2.0/
Edit environment variables: vim ~/.bashrc
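A minimal sketch of the variables usually appended to ~/.bashrc for this setup (the /home/hduser/hadoop install path is an assumption based on the symlink above):
# point HADOOP_HOME at the symlinked install and put its scripts on PATH
export HADOOP_HOME=/home/hduser/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin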
Modify system parameters
A. Turn off the firewall:
service iptables stop
chkconfig iptables off
vim /etc/selinux/config   (change SELINUX to disabled)
setenforce 0
service iptables status
B. Modify the maximum number of open files:
1) vim /etc/security/limits.conf
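A hedged sketch of the limits.conf entries commonly added for this purpose (the hduser name follows the environment above; the 65536 value is an assumption):
hduser soft nofile 65536
hduser hard nofile 65536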
Hadoop Cluster Build
Assume the master machine's IP address is 192.168.1.2, slaves2 is 192.168.1.1, and slaves1 is 192.168.1.3.
The user on each machine is Redmap, and the Hadoop root directory is: /
The environment for this configuration is Hadoop 1.2.1. Hadoop 2.0 was introduced in 2013; it was modified on the basis of the Hadoop 1.0 release to improve the efficiency of task scheduling, resource allocation, and fault handling in Hadoop clusters. Building on Hadoop 1.0, Hadoop 2.0 first made changes to HDFS; in Hadoop 1.0, HD
HDFS: Add and Remove Nodes, and Perform HDFS Balance
Mode 1: Statically add a DataNode (with the NameNode stopped)
1. Stop the NameNode.
2. Modify the slaves file and push the update to each node.
3. Start the NameNode.
4. Run the Hadoop balance command; see the command sketch below. (This is used to balance the cluster and is not required if you are only adding a node.)
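A minimal command sketch for the four steps above (the Hadoop 1.x script paths and the 10 percent threshold are assumptions):
# 1. stop the NameNode
bin/hadoop-daemon.sh stop namenode
# 2. edit conf/slaves and copy it to every node
# 3. start the NameNode again
bin/hadoop-daemon.sh start namenode
# 4. optionally rebalance the cluster
bin/start-balancer.sh -threshold 10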
-----------------------------------------
Mode 2: Dynamically add a DataNode, keeping the NameNode running
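A hedged sketch of the dynamic approach: bring the new DataNode up directly against the running NameNode (the script path is an assumption; the tasktracker line applies to Hadoop 1.x only):
# on the new node, after copying the Hadoop installation and configuration
bin/hadoop-daemon.sh start datanode
bin/hadoop-daemon.sh start tasktracker
# then add the host to the slaves file so future cluster-wide scripts include it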
Most Hadoop clusters adopt Kerberos as the authentication protocol.
Installing the KDC
Enabling Kerberos authentication requires installing the KDC server and the necessary software. The KDC installation command can be executed on any machine.
yum -y install krb5-server krb5-libs krb5-auth-dialog krb5-workstation
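A hedged sketch of the usual steps right after installing the KDC packages (the EXAMPLE.COM realm and the admin principal name are assumptions):
kdb5_util create -s -r EXAMPLE.COM            # create the KDC database
kadmin.local -q "addprinc admin/admin"        # add an administrative principal
service krb5kdc start
service kadmin start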
Next,
Introduction to Spark Basics, Cluster Build, and the Spark Shell
Mainly based on the Spark PPT, combined with hands-on practice to reinforce understanding of the concepts.
Spark Installation and Deployment
With the theory mostly covered, move on to the hands-on exercises:
Exercise 1: Use the Spark shell (native/local mode) to complete WordCount
Enter spark-shell in native (local) mode
Step one: import data from a file
scala> val rdd1 = sc.textFile("file:///tmp
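For reference, a hedged sketch of launching the Spark shell in local mode before running the scala> steps above (the Spark install path and the local[2] master setting are assumptions):
cd /usr/local/spark
./bin/spark-shell --master local[2]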
performance will be better. This is why the configuration proposed in the previous article, with X 1TB disks rather than X 3TB disks, works out better. The space constraints inside a blade server tend to limit the possibility of adding more hard drives. From here we can better see why Hadoop is said to run on standalone commodity servers, with its deliberately shared-nothing architecture: tasks are independent and I/O is independent for
physically stored: You can see that null values are not stored, so a query for "contents:html" with timestamp T8 returns null, and a query with timestamp T9 for the "anchor:my.lock.ca" item likewise returns null. If no timestamp is specified, the most recent data for the specified column is returned; the newest values are found first in the table because entries are sorted by time. Therefore, a query for "contents" without a timestamp returns the T6 data, which ha
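A hedged sketch of the corresponding HBase shell queries (the 'webtable' table, 'com.cnn.www' row, and the numeric timestamp are illustrative assumptions; TIMESTAMP is standard get syntax):
hbase> get 'webtable', 'com.cnn.www', {COLUMN => 'contents:html', TIMESTAMP => 1400000000008}   # no cell at T8, returns nothing
hbase> get 'webtable', 'com.cnn.www', {COLUMN => 'contents:html'}                               # no timestamp, returns the newest (T6) value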
During online Hadoop cluster O&M, Hadoop's balance tool is usually used to balance the distribution of file blocks across the DataNodes in the cluster, to avoid excessively high disk usage on some DataNodes (a problem that may also cause those nodes' CPU usage to be higher than that of other servers).
1) usage of the
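A hedged sketch of the balancer's basic usage (the 10 percent threshold is an assumption; it means each DataNode's usage may deviate from the cluster average by at most that percentage):
hadoop balancer -threshold 10
# or, via the start script
start-balancer.sh -threshold 10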
-scripts/ifcfg-eth0
(4) Restart the virtual machine for the change to take effect
4. Use the Xshell client to access the virtual machine
Xshell is a particularly useful Linux remote client, with many convenient features that make it far easier than typing commands directly in the virtual machine.
(1) Download and install Xshell
(2) Click the menu bar -- New, enter the name and IP address of the virtual machine, and confirm
(3) Accept and save
(4) Enter the user name and password (
GB in this iteration...
Solution:
1. Increase the Balancer's available bandwidth.
We suspected that the Balancer's default bandwidth was too small, making it inefficient, so we tried increasing the Balancer's bandwidth to 500 MB/s:
hadoop dfsadmin -setBalancerBandwidth 524288000
However, the problem has not been significantly improved.
2. Forcibly decommission the nodes.
We found that when Decommission is performed on some nodes, although the da
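A hedged sketch of forcing a node out via the exclude file (the file path follows a common dfs.hosts.exclude convention and is an assumption, as is the host name):
echo 'datanode-host-01' >> /etc/hadoop/conf/dfs.exclude
hadoop dfsadmin -refreshNodes       # the NameNode starts decommissioning the listed host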
Masters
host61
6) Configure slaves
host62
host63
5. Configure host62 and host63 in the same way
6. Format the distributed file system: /usr/local/hadoop/bin/hadoop namenode -format
7. Run Hadoop
1) /usr/local/hadoop/sbin/start-dfs.sh
2) /usr/local/hadoop/sbin/start-yarn.sh
8. Check:
[root@host61 sbin]# jps
4532 ResourceMa
Source: http://suxain.iteye.com/blog/1748356
Hadoop is a distributed system that runs on Linux. As a developer with limited resources, I had to use terminal-only virtual machines to run Hadoop clusters. In that environment, however, development and debugging become very difficult. So, is there a way to develop and debug on Windows? The answer is yes.
Hadoop
Hadoop cluster: all DataNodes fail to start (solution)
The DataNode fails to start only in the following situations:
1. The master's configuration file was modified first;
2. The bad habit of running hadoop namenode -format multiple times.
Generally, an error occurs:
java.io.IOException: Cannot lock storage /usr/had
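A hedged sketch of the usual recovery when repeated formats have left stale locks or mismatched namespaceIDs (the dfs.data.dir path and script location are assumptions):
# on each affected DataNode
bin/hadoop-daemon.sh stop datanode
rm -rf /usr/hadoop/tmp/dfs/data/*     # clears the old storage; block data on this node is lost
bin/hadoop-daemon.sh start datanode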
elasticsearch-0.90.5/config/elasticsearch.yml
Delete the comment before cluster.name and modify the cluster name:
cluster.name: es_cluster
Delete the comment before node.name and modify the node name; if you do not modify it, the system will generate a node name automatically at startup.
node.name: "elastic_inst1"
node.master: true    (sets this node as the master node)
On 192.168.0.2, edit the file:
vi elasticsearch-0.90.5/config/elasticsearch.yml
Delete Cluster.name p
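For reference, a hedged sketch of the settings typically applied on this second node (the elastic_inst2 name and the node.master value are assumptions; the cluster name must match the first node):
cluster.name: es_cluster
node.name: "elastic_inst2"
node.master: false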
Java interface operations on a Hadoop cluster
Start with a configured Hadoop cluster
I implemented this in a test class of a project I built with the SSM framework.
1. Configure the environment variable under Windows: download the file and unzip it to the C drive or another directory.
Link:
Yesterday the DataNodes went offline on a large scale; the preliminary judgment was that the dfs.datanode.max.transfer.threads parameter was set too small. The hdfs-site.xml configuration files of all DataNode nodes were then adjusted. After restarting the cluster, we tried running a job to verify the change and looked at the job's configuration in JobHistory; surprisingly, it still showed the old value, that is, the job was still running with the old
Hadoop advanced
1. Configure passwordless SSH
(1) Modify the slaves file
Switch to the master machine; everything in this section is done on master.
Enter the /usr/hadoop/etc/hadoop directory, locate the slaves file, and modify it:
slave1
slave2
slave3
(2) Send the public key
Enter the .ssh directory under root's home directory:
Generate the public/private key pair
ssh-keygen -t rsa
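A hedged sketch of the remaining key-distribution steps (the slave host names follow the slaves file above; using the root account follows the .ssh location mentioned earlier):
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
scp ~/.ssh/authorized_keys root@slave1:~/.ssh/
scp ~/.ssh/authorized_keys root@slave2:~/.ssh/
scp ~/.ssh/authorized_keys root@slave3:~/.ssh/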