Hadoop is mainly deployed and used in Linux environments, but my own Linux skills are limited, and my working environment cannot be moved to Linux completely (admittedly there is also a bit of selfishness: many easy-to-use Windows programs are hard to replace on Linux, media players for example). So I tried to use Eclipse on Windows to connect to Hadoop remotely.
Usage: hadoop fs -mkdir [-p] <paths>
Creates directories at the given URI paths.
Parameters:
-p: behaves like Linux mkdir -p; parent directories along the path are created if they do not exist.
Example:
[root@localhost bin]# hadoop fs -mkdir file:///home/a1 file:///home/a2
[root@localhost bin]# ll /home/
total 20
drwxr-xr-x 2 root root 4096 Aug 8 09:45 a1
drwxr-xr-x 2 root root 4096 Aug 8 09:45 a2
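The example above uses file:// URIs, so the directories are created on the local file system rather than in HDFS. A minimal sketch of the same command against HDFS (the /user/hadoop paths here are hypothetical examples):

# -p creates any missing parent directories on HDFS
$ hadoop fs -mkdir -p /user/hadoop/dir1/dir2
# verify the result
$ hadoop fs -ls /user/hadoop/dir1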
...including file copying, deletion, and viewing; if these operations work, the file system is healthy. However, one problem still stood in front of me: in the freshly installed Hadoop cluster, the HDFS file system was empty, so I first had to put some material into it. In other words, seen from the Linux side, the most basic operation is to copy files from the Linux file system into HDFS.
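A hedged sketch of those basic operations (the file names below are hypothetical; -put, -cat, and -rm are standard hadoop fs subcommands):

# copy a file from the Linux file system into HDFS
$ hadoop fs -put /home/test.txt /user/hadoop/test.txt
# view its contents
$ hadoop fs -cat /user/hadoop/test.txt
# delete it again
$ hadoop fs -rm /user/hadoop/test.txt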
Exception Analysis
1. "cocould only be replicated to 0 nodes, instead of 1" Exception
(1) Exception description
The configuration above is correct and the following steps have been completed:
[root@localhost hadoop-0.20.0]# bin/hadoop namenode -format
[root@localhost hadoop-0.20.0]# bin/start-all.sh
At this point, we should be able to see the five processes: JobTracker, TaskTracker, NameNode, DataNode, and SecondaryNameNode.
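A quick way to confirm this is the jps command; a sketch of the check (process IDs in the real output will differ):

# list the running JVM processes; the five daemons should appear, each with a PID
[root@localhost hadoop-0.20.0]# jps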
Some Hadoop facts that programmers must know
Programmers must know a few facts about Hadoop. By now, hardly anyone has not heard of Apache Hadoop: Doug Cutting, a Yahoo search engineer, developed this open-source software to create a distributed computing environment...
1. What is a distributed file system?
A file system that manages storage across a network of multiple machines is called a distributed file system.
2. Why do we need a distributed file system?
The reason is simple: when a dataset outgrows the storage capacity of a single physical machine, it becomes necessary to partition it and store it on several separate machines.
3. Distributed file systems are more complex than traditional file systems
Because a distributed file system is built on top of a network, it has to cope with all the complications of network programming, which makes it more complex than an ordinary disk file system.
Learn Linux or Hadoop first?
What is your motivation for learning Hadoop: just for fun, or do you want to work in this field? If it is the latter, it would be a joke not to learn Linux. Hadoop, simply put, coordinates multiple hosts to act as one storage system or database; if you do not even know how to configure the various environments in Linux...
cat id_rsa.pub >> authorized_keys: the local public key also needs to be appended to authorized_keys. Passwordless login by hostname and by IP may not be interchangeable; try both until one succeeds.
Note: scp remote copy: scp -r /usr/jdk1.8.0 user@host:/usr/ (-r copies the folder contents as well). Watch out for "Permission denied": writing as an ordinary user directly to a path such as /usr, where the user has no write permission, raises an error. The solution is to write as root, or to write under /home/<user> instead.
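A minimal sketch of the passwordless-SSH setup described above (the user and host names are placeholders):

# generate an RSA key pair (press Enter through the prompts)
$ ssh-keygen -t rsa
# append the local public key to the local authorized list
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
# install the public key on a remote host as well
$ ssh-copy-id user@remote-host
# remote copy example; -r copies a directory recursively
$ scp -r /usr/jdk1.8.0 user@remote-host:/usr/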
...is required: the dfs.replication value is set to 1, and no other operations are required.
Test:
Go to the $HADOOP_HOME directory and run the following commands to test whether the installation succeeded.
$ mkdir input
$ cp conf/*.xml input
$ bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+'
$ cat output/*
Output: 1 dfsadmin
After the above steps, if there is no error, the installation is successful.
Add the master, slave1, and other IP-to-hostname mappings to the hosts file under C:\Windows\System32\drivers\etc.
1) Browse the web interfaces of the NameNode and JobTracker; by default their addresses are: NameNode - http://node1:50070/ and JobTracker - http://node2:50030/.
3) Use netstat -nat to see whether ports 49000 and 49001 are in use.
4) Use jps to view processes. To check whether the daemons are running, use the jps command (the ps utility for JVM processes); it lists the five daemons and their process identifiers.
5) Copy the input files to the distributed file system.
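The checks in steps 3) and 4) look roughly like this (the ports follow the defaults mentioned above):

# check whether ports 49000 and 49001 are in use
$ netstat -nat | grep -E '49000|49001'
# list the JVM processes; the five daemons should appear
$ jps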
Standalone mode reads local data, while pseudo-distributed mode reads data on HDFS.
To use HDFS, you must first create a user directory in HDFS:
# ./bin/hdfs dfs -mkdir -p /user/hadoop
# ./bin/hadoop fs -ls /user/hadoop
Found 1 items
drwxr-xr-x   - hadoop supergroup          0 /user/hadoop/input
Next, the XML files in /etc/hadoop need to be configured:
<configuration>
    <property>
        <name>mapred.job.tracker</name>
        <value>localhost:9001</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
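With the configuration in place, the usual next steps (as in the classic quick start, matching the commands used earlier in this article) are to format the NameNode and start the daemons:

# format a new distributed file system (first run only)
$ bin/hadoop namenode -format
# start all daemons
$ bin/start-all.sh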
After launch, the NameNode and JobTracker status can be viewed via their web pages: NameNode - http://localhost:50070/ and JobTracker - http://localhost:50030/.
Test: copy files to the distributed file system:
$ bin/hadoop fs -put ...
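The arguments were cut off in the original; in the classic quick start this step copies the conf directory into an input directory on HDFS, and the rest of the test runs the example job and fetches the output back (a sketch; the jar name matches the one used earlier in this article):

# copy the configuration files to HDFS as test input
$ bin/hadoop fs -put conf input
# run the grep example against the copied input
$ bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+'
# copy the result from HDFS back to the local file system and view it
$ bin/hadoop fs -get output output
$ cat output/*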
The plugin package can be downloaded from ftp://ftp1.bkjia.com (username: ftp1.bkjia.com, password: www.bkjia.com), in the directory 2014 \ December \ "Hadoop uses Eclipse in Windows 7 to build a Hadoop development environment".
The jar package name is hadoop-eclipse-plugin-2.3.0, and it can be applied...
The previous several posts covered mainly Spark RDD fundamentals and used textFile to operate on files on the local machine. In practical applications there are few opportunities to manipulate ordinary local files; far more often you manipulate Kafka streams and files on Hadoop.
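Once a Hadoop environment is running, Spark can read HDFS files through an hdfs:// URI just as textFile reads local paths; a hedged sketch (the host, port, and path are assumptions based on a default pseudo-distributed setup):

# start the interactive Spark shell from the Spark installation directory
$ ./bin/spark-shell
# inside the shell (Scala), something like:
#   val rdd = sc.textFile("hdfs://localhost:9000/user/hadoop/input")
#   rdd.count()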
So let's first build a Hadoop environment on the local machine.
1. Install and configure Hadoop