Note
- This article is based on CentOS 6.x + CDH 5.x
What is HttpFS for? It does two things:
- With HttpFS you can manage files on HDFS from your browser
- HttpFS also provides a set of RESTful APIs that can be used to manage HDFS
It is a very simple thing, but very practical.
Install HttpFS
Find a machine in the cluster that can access HDFS and install HttpFS on it:
$ sudo yum install hadoop-httpfs
Configuration
Edit /etc/hadoop/conf/core-site.xml:
<property>
  <name>hadoop.proxyuser.httpfs.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.httpfs.groups</name>
  <value>*</value>
</property>
These two properties define which hosts and user groups may act through the httpfs proxy user; writing * means no restriction. Restart Hadoop after changing the configuration.
Start HttpFS
$ sudo service hadoop-httpfs start
Using HttpFS
Open a browser and visit http://host2:14000/webhdfs/v1?op=LISTSTATUS&user.name=httpfs and you will see:
{"FileStatuses":{"FileStatus":[
  {"pathSuffix":"hbase","type":"DIRECTORY","length":0,"owner":"hbase","group":"hadoop","permission":"755","accessTime":0,"modificationTime":1423446940595,"blockSize":0,"replication":0},
  {"pathSuffix":"tmp","type":"DIRECTORY","length":0,"owner":"hdfs","group":"hadoop","permission":"1777","accessTime":0,"modificationTime":1423122488037,"blockSize":0,"replication":0},
  {"pathSuffix":"user","type":"DIRECTORY","length":0,"owner":"hdfs","group":"hadoop","permission":"755","accessTime":0,"modificationTime":1423529997937,"blockSize":0,"replication":0},
  {"pathSuffix":"var","type":"DIRECTORY","length":0,"owner":"hdfs","group":"hadoop","permission":"755","accessTime":0,"modificationTime":1422945036465,"blockSize":0,"replication":0}
]}}
The &user.name=httpfs parameter means the request is made as the default user httpfs; this default user has no password.
/webhdfs/v1 is the root path of the HttpFS service.
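The URL pattern above is easy to build programmatically. As a minimal sketch with Python's standard library (the host host2 and port 14000 come from the example above; adjust them to your cluster), here is a URL builder plus a helper that pulls the entry names out of a LISTSTATUS response:

```python
import json
from urllib.parse import urlencode
# from urllib.request import urlopen  # would be used to actually issue the request

def webhdfs_url(host, path, op, user, port=14000):
    """Build an HttpFS/WebHDFS REST URL; path is '' for the root or starts with '/'."""
    query = urlencode({"op": op, "user.name": user})
    return f"http://{host}:{port}/webhdfs/v1{path}?{query}"

def list_names(liststatus_body):
    """Extract the entry names from a LISTSTATUS JSON response body."""
    statuses = json.loads(liststatus_body)["FileStatuses"]["FileStatus"]
    return [s["pathSuffix"] for s in statuses]

# Reproduces the browser URL from the article (no request is made here):
url = webhdfs_url("host2", "", "LISTSTATUS", "httpfs")
# To actually call the API:
#   with urlopen(url) as resp:
#       names = list_names(resp.read())
```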
Visit http://host2:14000/webhdfs/v1/user?op=LISTSTATUS&user.name=httpfs to see:
{"FileStatuses":{"FileStatus":[
  {"pathSuffix":"cloudera","type":"DIRECTORY","length":0,"owner":"root","group":"hadoop","permission":"755","accessTime":0,"modificationTime":1423472508868,"blockSize":0,"replication":0},
  {"pathSuffix":"hdfs","type":"DIRECTORY","length":0,"owner":"hdfs","group":"hadoop","permission":"700","accessTime":0,"modificationTime":1422947019504,"blockSize":0,"replication":0},
  {"pathSuffix":"history","type":"DIRECTORY","length":0,"owner":"mapred","group":"hadoop","permission":"1777","accessTime":0,"modificationTime":1422945692887,"blockSize":0,"replication":0},
  {"pathSuffix":"hive","type":"DIRECTORY","length":0,"owner":"hive","group":"hadoop","permission":"755","accessTime":0,"modificationTime":1423123187569,"blockSize":0,"replication":0},
  {"pathSuffix":"hive_people","type":"DIRECTORY","length":0,"owner":"root","group":"hadoop","permission":"755","accessTime":0,"modificationTime":1423216966453,"blockSize":0,"replication":0},
  {"pathSuffix":"hive_people2","type":"DIRECTORY","length":0,"owner":"root","group":"hadoop","permission":"755","accessTime":0,"modificationTime":1423222237254,"blockSize":0,"replication":0},
  {"pathSuffix":"impala","type":"DIRECTORY","length":0,"owner":"root","group":"hadoop","permission":"755","accessTime":0,"modificationTime":1423475272189,"blockSize":0,"replication":0},
  {"pathSuffix":"root","type":"DIRECTORY","length":0,"owner":"root","group":"hadoop","permission":"700","accessTime":0,"modificationTime":1423221719835,"blockSize":0,"replication":0},
  {"pathSuffix":"spark","type":"DIRECTORY","length":0,"owner":"spark","group":"spark","permission":"755","accessTime":0,"modificationTime":1423530243396,"blockSize":0,"replication":0},
  {"pathSuffix":"sqoop","type":"DIRECTORY","length":0,"owner":"hdfs","group":"hadoop","permission":"755","accessTime":0,"modificationTime":1423127462911,"blockSize":0,"replication":0},
  {"pathSuffix":"test_hive","type":"DIRECTORY","length":0,"owner":"root","group":"hadoop","permission":"755","accessTime":0,"modificationTime":1423215687891,"blockSize":0,"replication":0}
]}}
Oddly enough, documentation for HttpFS itself is scarce; for the specific commands you have to look at the WebHDFS documentation instead. The WebHDFS REST API supports the following commands:
Operations
- HTTP GET
  - OPEN (see FileSystem.open)
  - GETFILESTATUS (see FileSystem.getFileStatus)
  - LISTSTATUS (see FileSystem.listStatus)
  - GETCONTENTSUMMARY (see FileSystem.getContentSummary)
  - GETFILECHECKSUM (see FileSystem.getFileChecksum)
  - GETHOMEDIRECTORY (see FileSystem.getHomeDirectory)
  - GETDELEGATIONTOKEN (see FileSystem.getDelegationToken)
- HTTP PUT
  - CREATE (see FileSystem.create)
  - MKDIRS (see FileSystem.mkdirs)
  - RENAME (see FileSystem.rename)
  - SETREPLICATION (see FileSystem.setReplication)
  - SETOWNER (see FileSystem.setOwner)
  - SETPERMISSION (see FileSystem.setPermission)
  - SETTIMES (see FileSystem.setTimes)
  - RENEWDELEGATIONTOKEN (see DistributedFileSystem.renewDelegationToken)
  - CANCELDELEGATIONTOKEN (see DistributedFileSystem.cancelDelegationToken)
- HTTP POST
  - APPEND (see FileSystem.append)
- HTTP DELETE
  - DELETE (see FileSystem.delete)
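When writing a generic client, the operation-to-HTTP-verb mapping above is worth capturing in one lookup table. A small sketch (it covers only the operations listed here):

```python
# HTTP verb required for each WebHDFS operation, from the list above.
WEBHDFS_METHODS = {
    "GET": ["OPEN", "GETFILESTATUS", "LISTSTATUS", "GETCONTENTSUMMARY",
            "GETFILECHECKSUM", "GETHOMEDIRECTORY", "GETDELEGATIONTOKEN"],
    "PUT": ["CREATE", "MKDIRS", "RENAME", "SETREPLICATION", "SETOWNER",
            "SETPERMISSION", "SETTIMES", "RENEWDELEGATIONTOKEN",
            "CANCELDELEGATIONTOKEN"],
    "POST": ["APPEND"],
    "DELETE": ["DELETE"],
}

def http_method(op):
    """Return the HTTP verb for a given WebHDFS op, e.g. MKDIRS -> PUT."""
    for method, ops in WEBHDFS_METHODS.items():
        if op.upper() in ops:
            return method
    raise ValueError(f"unknown WebHDFS operation: {op}")
```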
Create a folder
Let's try to create a folder called abc:
# curl -i -X PUT "http://xmseapp03:14000/webhdfs/v1/user/abc?op=MKDIRS&user.name=httpfs"
HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Set-Cookie: hadoop.auth="u=httpfs&p=httpfs&t=simple&e=1423573951025&s=ab44ha1slg1f4xcrk+x4r/s1emy="; Path=/; Expires=Tue, 10-Feb-2015 13:12:31 GMT; HttpOnly
Content-Type: application/json
Transfer-Encoding: chunked
Date: Tue, 10 Feb 2015 03:12:36 GMT

{"boolean":true}
Then use the hdfs dfs -ls command on the server to check the result:
# hdfs dfs -ls /user
Found 12 items
drwxr-xr-x   - httpfs hadoop          0 2015-02-10 11:12 /user/abc
drwxr-xr-x   - root   hadoop          0 2015-02-09 17:01 /user/cloudera
drwx------   - hdfs   hadoop          0 2015-02-03 15:03 /user/hdfs
drwxrwxrwt   - mapred hadoop          0 2015-02-03 14:41 /user/history
drwxr-xr-x   - hive   hadoop          0 2015-02-05 15:59 /user/hive
drwxr-xr-x   - root   hadoop          0 2015-02-06 18:02 /user/hive_people
drwxr-xr-x   - root   hadoop          0 2015-02-06 19:30 /user/hive_people2
drwxr-xr-x   - root   hadoop          0 2015-02-09 17:47 /user/impala
drwx------   - root   hadoop          0 2015-02-06 19:21 /user/root
drwxr-xr-x   - spark  spark           0 2015-02-10 09:04 /user/spark
drwxr-xr-x   - hdfs   hadoop          0 2015-02-05 17:11 /user/sqoop
drwxr-xr-x   - root   hadoop          0 2015-02-06 17:41 /user/test_hive
You can see that a folder abc owned by user httpfs has been created.
Open a file
Upload a text file test.txt from the server side to the /user/abc directory; its content is
Hello world!
Now read it via HttpFS:
# curl -i -X GET "http://xmseapp03:14000/webhdfs/v1/user/abc/test.txt?op=OPEN&user.name=httpfs"
HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Set-Cookie: hadoop.auth="u=httpfs&p=httpfs&t=simple&e=1423574166943&s=jtxqijusblvbehvuts6jcv2ubbs="; Path=/; Expires=Tue, 10-Feb-2015 13:16:06 GMT; HttpOnly
Content-Type: application/octet-stream
Content-Length: 13
Date: Tue, 10 Feb 2015 03:16:07 GMT

Hello world!
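The two curl calls above can also be reproduced from code. A sketch using only Python's standard library (the host xmseapp03 and user httpfs come from the examples above, so adjust them to your cluster; the actual network calls are left commented out):

```python
import json
from urllib.parse import urlencode, quote
from urllib.request import Request, urlopen  # urlopen would issue the call

def webhdfs_request(host, path, op, method, user="httpfs", port=14000):
    """Build the urllib Request for one HttpFS call, e.g. MKDIRS via PUT."""
    url = (f"http://{host}:{port}/webhdfs/v1{quote(path)}"
           f"?{urlencode({'op': op, 'user.name': user})}")
    return Request(url, method=method)

# Create /user/abc, mirroring the curl PUT above:
mkdirs = webhdfs_request("xmseapp03", "/user/abc", "MKDIRS", "PUT")
# json.load(urlopen(mkdirs))         # {"boolean": True} on success
# Read /user/abc/test.txt, mirroring the curl GET above:
openreq = webhdfs_request("xmseapp03", "/user/abc/test.txt", "OPEN", "GET")
# urlopen(openreq).read().decode()   # the file content, "Hello world!"
```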
Alex's Hadoop Tutorial for Beginners: Lesson 18, Accessing HDFS over HTTP with HttpFS