Hadoop Core components: Four steps to knowing HDFS


The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on general-purpose hardware. It provides high-throughput access to application data and is well suited to applications with very large data sets. So how do we use it in practice?

One, HDFS operation modes:

1. Command-line operations

– FsShell:
$ hdfs dfs

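A few representative FsShell subcommands (non-exhaustive; running hdfs dfs with no arguments, or hdfs dfs -help, prints the full list):

$ hdfs dfs -ls <path>          # list a directory
$ hdfs dfs -put <local> <dst>  # copy a local file into HDFS
$ hdfs dfs -get <src> <local>  # copy an HDFS file to the local disk
$ hdfs dfs -cat <file>         # print a file's contents
$ hdfs dfs -mkdir <dir>        # create a directory
$ hdfs dfs -rm -r <dir>        # delete a directory recursively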

2. Other computational frameworks, such as Spark

A Spark program accesses HDFS through a URI such as hdfs://nnhost:port/file: the hdfs:// scheme selects the HDFS protocol, nnhost and port identify the NameNode (or an externally provided nameservice), and the remainder is the file path.

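A minimal sketch of this in a spark-shell session; the NameNode host nnhost, port 8020, and the input path are illustrative placeholders:

$ spark-shell
scala> val lines = sc.textFile("hdfs://nnhost:8020/user/fred/foo.txt")
scala> lines.count()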

3. Other programs:

(1) Java API: other computational frameworks and analysis tools can access HDFS through its Java API. For example, Sqoop loads database tables into HDFS, Flume loads log data into HDFS, and Impala runs queries against data stored in HDFS (see the Sqoop sketch after this list).

(2) REST API: access HDFS over HTTP, for example via WebHDFS (see the curl sketch after this list).

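Hedged sketches of both approaches; the database, host names, ports, and paths below are illustrative placeholders. Loading a relational table into HDFS with Sqoop:

$ sqoop import --connect jdbc:mysql://dbhost/shop --username dbuser \
    --table orders --target-dir /etl/orders

Reading a file and listing a directory over the WebHDFS REST API (the NameNode HTTP port is typically 9870 on Hadoop 3, 50070 on Hadoop 2):

$ curl -i -L "http://nnhost:9870/webhdfs/v1/user/fred/bar.txt?op=OPEN"
$ curl -i "http://nnhost:9870/webhdfs/v1/user/fred?op=LISTSTATUS"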

Two, a closer look at the HDFS command line:

(1) Copy the local file foo.txt to the user's home directory in HDFS

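With FsShell, this can be done as follows (a relative destination path resolves against the user's home directory):

$ hdfs dfs -put foo.txt foo.txt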

– The file will be copied to /user/username/foo.txt

(2) List the contents of the user's home directory

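With no path argument, -ls defaults to the user's home directory:

$ hdfs dfs -ls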

(3) List the root directory of HDFS

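Passing / lists the top level of the HDFS namespace:

$ hdfs dfs -ls /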

(4) Display the contents of the HDFS file /user/fred/bar.txt

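The -cat subcommand prints the file to standard output:

$ hdfs dfs -cat /user/fred/bar.txt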

(5) Copy the file to the local disk, naming it baz.txt

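The -get subcommand (also available as -copyToLocal) does this:

$ hdfs dfs -get /user/fred/bar.txt baz.txt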

(6) Create the input directory in the user's home directory

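A relative path again resolves against the user's home directory:

$ hdfs dfs -mkdir input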

(7) Delete the input_old directory and all of its contents

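The -r flag removes the directory recursively (add -skipTrash to bypass the trash if it is enabled):

$ hdfs dfs -rm -r input_old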

Three, operating HDFS through Hue:

Hue's File Browser application lets you browse and manage the contents of HDFS: you can create, move, rename, modify, upload, download, and delete directories and files, and view the contents of files.


Four, recommended HDFS directory layout:

HDFS is the repository for all of the cluster's data, so its directories (such as log and data directories) should be properly planned and organized. The best practice is to define a standard directory structure and keep temporary stage data separate. An example layout:

(1) /user – user home directories, storing data and configuration information belonging to individual users

(2) /etl – data from the ETL stages

(3) /tmp – temporary data generated by users and shared among users

(4) /data – data sets used by the entire organization for analysis and processing

(5) /app – non-data files, such as configuration files, JAR files, and SQL files
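A sketch of creating this layout, assuming an account with HDFS superuser rights (here via sudo -u hdfs; adjust to your cluster):

$ sudo -u hdfs hdfs dfs -mkdir -p /etl /tmp /data /app
$ sudo -u hdfs hdfs dfs -chmod 1777 /tmp    # world-writable with the sticky bit, like local /tmp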

Mastering the above four steps plays an important role in applying HDFS, but you should proceed step by step according to your own situation and focus on practice in order to keep improving. I like to work through case studies to sharpen my skills, which is what I appreciate about the "Big Data CN" service platform. Still, real understanding comes from practice; only by studying and absorbing the experience of others can you go higher and farther. I also follow the subscription account "Big Data times Learning Center", and the experience shared there by data experts has been of extraordinary significance for my own technical growth.
