Questions:Using HDFS client to locally connect to Hadoop deployed on Alibaba Cloud server, an exception occurred during operation of HDFs: could only is replicated to 0 nodes instead of minreplication (=1). There is 1 Datanode (s) running and 1 node (s) is excluded in this operation. And, on the Administration Web page to view the file file size is all 0;
Reaso
Describe:If A large directory is deleted and Namenode was immediately restarted, there is a lot of blocks that does not belong to any File. This results in a log:2014-11-08 03:11:45,584 INFO Blockstatechange (BlockManager.java:processReport (1901))-block* Processreport:blk_ 1074250282_509532 on 172.31.44.17:1019 size 6 does no belong to any file.This log is printed within Fsnamsystem lock. This can cause Namenode to take a long time in coming out of SafeMode.One solution is to downgrade the logg
Recently, I am looking for an overall storage and analysis solution. We need to consider massive storage, analysis, and scalability. When I got to hadoop, I just started to position it to HDFS for storage. The more I see it, the more I get excited.
First, perform the HDFS operation test.CodeThe complete eclipse + Tomcat project uses the Tomcat plug-in and
1. View HelpHadoop fs-help 2. UploadPaths on files > such as: Hadoop fs-put test.log/3. View the contents of the filePaths on Hadoop fs-cat such as: Hadoop fs-cat/test.log4. View File listHadoop Fs-ls/5. Download the filePaths on Hadoop fs-get 6, execution jar: such as the implementation of the WordCount
Tag:ar use sp file divart bsadef The call file system (FS) shell command should use the form Bin/hadoop FS. All FS shell commands use the URI path as the parameter. The URI format is Scheme://authority/path. The scheme for HDFs is HDFs, the scheme is file for the local filesystem. The scheme and authority parameters are optional, and if not specified, the default
1. Cd/usr/local/hadoop/tmp/dfs/name/current can see the key files edits and fsimage2.cd/usr/local/hadoop/conf can see the key configuration files:Core-site.xml:The Dfs.name.dir property of Hdfs-site.xmlThe Dfs.replication property of Hdfs-site.xmlFor more information, please open the source with Eclipse to view!Reading
In the Hadoop installation configuration process, the HDFS format
$ HDFs Namenode-format
An error occurred;
Java.net.UnknownHostException:centos0
As follows:
View Machine Name
$ hostname
Solution Method:
Modifying the hosts mapping file
Vi/etc/hostsModify to the following configuration, Centos0 is the machine name,
127.0.0.1
Distributed File System HDFS-namenode architecture namenode
Is the management node of the entire file system.
It maintains the file directory tree of the entire file system [to make retrieval faster, this directory tree is stored in memory],
The metadata of the file/directory and the data block list corresponding to each file.
Receives user operation requests.
Hadoop ensures the robustness of namenode and i
The architecture of HadoopHadoop is not only a distributed file system for distributed storage, but a framework designed to perform distributed applications on large clusters of common computing devices.HDFs and MapReduce are the two most basic, most important members of Hadoop, providing complementary services or higher-level services at the core level.Pig Chukwa Hive HBaseMapReduce HDFS ZookeeperCore Avro
The client needs to specify the NS name, node configuration, Configuredfailoverproxyprovider and other information.code example:Package Cn.itacst.hadoop.hdfs;import Java.io.fileinputstream;import java.io.inputstream;import Java.io.outputstream;import Java.net.uri;import Org.apache.hadoop.conf.configuration;import Org.apache.hadoop.fs.filesystem;import Org.apache.hadoop.fs.path;import org.apache.hadoop.io.IOUtils; Public classHdfs_ha { Public Static voidMain (string[] args) throws Exception {Conf
size of a data Block, it does not occupy the space of the entire data Block.
Write1), the Client initiates a file write request to the NameNode.2) according to the file size and file block configuration, NameNode returns the information of the DataNode managed by the Client.30. The Client divides the file into multiple blocks and writes them to each DataNode Block in sequence based on the DataNode address information.
Read1), the Client initiates a File Read Request to the NameNode.2). NameNode
1. Start Hadoop. Then Netstat-nltp|grep 50070, if the process is not found, the port modification without configuring the Web interface is hdfs-site,xml with the following configurationIf you use the hostname: port number, go first to check the hostname under/etc/hosts IP, whether configured and your current IP is the same, and then restart Hadoop2. Now in the virtual machine to try to access hadoop002:5007
First, the preparation conditions:1. Four Linux virtual machines (1 namenode nodes, 1 secondary nodes (secondary and 1 datanode shared), plus 2 datanode)2. Download the Hadoop version, this example uses the Hadoop-2.5.2 versionSecond, install Java JDKBest installed, JDK 1.7 is best for JDK 1.7 compatibility-IVH jdk-7u79-linux-/root/. Bash_profilejava_home=/usr/java/jdk1. 7 . 0_79path= $PATH: $JAVA _home/bin
The following subsections complement each other, and you will find a lot of interesting places to combine. Reprint please indicate source address: http://blog.csdn.net/lastsweetop/article/details/9065667
1. Topological distancesHere is a brief account of the computing distance of the network topology of Hadoop in a large number of scenarios, bandwidth is scarce resources, how to make full use of bandwidth, perfect computational cost and limiting fac
$handler.run (Server.java:1754)At this point, you can see the directory that holds the synchronization files/hadop-cdh-data/jddfs/nn/journalhdfs1 not found, SSH remote connection to the node to see that there is no such directory. Here, basically can be fixed to the problem, there are 2 ways to solve: one is to initialize the directory through the relevant command (I think this method is the correct way to solve the problem), and the second is to directly copy the normal Journalnode files over.
Hadoop is now a very hot big data running framework and platform, for this amazing big guy I am not clear, the previous time to ignore it to run HADOOP, look at its operation record storage part (Operation log), IMAGE records all the platform's file operation records, such as creating files, Delete files, rename and so on, here are some of my little observations.Formatting----InitializationThis is the initi
The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion;
products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the
content of the page makes you feel confusing, please write us an email, we will handle the problem
within 5 days after receiving your email.
If you find any instances of plagiarism from the community, please send an email to:
info-contact@alibabacloud.com
and provide relevant evidence. A staff member will contact you within 5 working days.