An explanation of common Java APIs in HDFS

Source: Internet
Author: User

Reposted from: http://blog.csdn.net/michaelwubo/article/details/50879832

First, using the Hadoop URL to read data

package hadoop;

import java.io.InputStream;
import java.net.URL;

import org.apache.hadoop.fs.FsUrlStreamHandlerFactory;
import org.apache.hadoop.io.IOUtils;

public class URLCat {

    static {
        // Register the handler so the JVM understands hdfs:// URLs
        // (this factory may only be set once per JVM).
        URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory());
    }

    public static void readHdfs(String url) throws Exception {
        InputStream in = null;
        try {
            in = new URL(url).openStream();
            IOUtils.copyBytes(in, System.out, 4096, false);
        } finally {
            IOUtils.closeStream(in);
        }
    }

    public static void main(String[] args) throws Exception {
        readHdfs("hdfs://192.168.49.131:9000/user/hadoopuser/input20120828/file01");
    }
}

The jar packages I use here are:

The version of hadoop-core must be consistent with the version of Hadoop installed in the distributed environment; otherwise an error like the following will occur:

12/09/11 14:18:59 INFO security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/thirdparty/guava/common/collect/LinkedListMultimap
        at org.apache.hadoop.hdfs.SocketCache.<init>(SocketCache.java:48)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:240)


The Hadoop version installed in the distributed environment is as follows:

Run the main method. The output is "Hello World bye", which is consistent with the file content stored in HDFS:

Second, using the FileSystem API to read data

package hadoop;

import java.io.IOException;
import java.io.InputStream;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class FileSystemCat {

    public static void readHdfs(String url) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(url), conf);
        InputStream in = null;
        try {
            in = fs.open(new Path(url));
            IOUtils.copyBytes(in, System.out, 4096, false);
        } finally {
            IOUtils.closeStream(in);
        }
    }

    public static void main(String[] args) throws IOException {
        readHdfs("hdfs://192.168.49.131:9000/user/hadoopuser/output20120828/part-00000");
    }
}

Execution output:

Bye 2
Hadoop 2
Hello 2
World 2

Third, creating a directory

3.1 Writing data. public boolean mkdirs(Path f) throws IOException creates the directory f, along with any parent directories that do not already exist, in a single client call.
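As a minimal sketch of mkdirs() (the class name MakeDir and the directory newdir/subdir below are illustrative placeholders, not from the original article; the NameNode address is the one used throughout this article):

package hadoop;

import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class MakeDir {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        // Illustrative target directory; any missing parents are created as well.
        String dir = "hdfs://192.168.49.131:9000/user/hadoopuser/newdir/subdir";
        FileSystem fs = FileSystem.get(URI.create(dir), conf);
        boolean created = fs.mkdirs(new Path(dir));
        System.out.println("created: " + created);
    }
}

The example below goes one step further and writes a local file into HDFS while reporting progress.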

package hadoop;

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.util.Progressable;

public class FileCopyWithProgress {

    public static void fileCopy(String localFile, String hdfsFile) throws IOException {
        InputStream in = new BufferedInputStream(new FileInputStream(localFile));
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(hdfsFile), conf);
        // progress() is called back as data packets are written to HDFS.
        OutputStream out = fs.create(new Path(hdfsFile), new Progressable() {
            public void progress() {
                System.out.println("*");
            }
        });
        IOUtils.copyBytes(in, out, 4096, true);
    }

    public static void main(String[] args) throws IOException {
        fileCopy("D://heat2.txt", "hdfs://192.168.49.131:9000/user/hadoopuser/output20120911/");
    }
}

After execution, the following error will be given:

Exception in thread "main" org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException: Permission denied: user=libininfo, access=WRITE, inode="/user/hadoopuser":hadoopuser:supergroup:drwxr-xr-x

This is because the local user (libininfo) does not have write permission on the /user/hadoopuser directory in HDFS.

Workaround: disable the permission check in HDFS.

Modify the Hadoop configuration file conf/hdfs-site.xml on the server: locate (or add) the dfs.permissions property and set its value to false.
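The entry should look roughly like this (a sketch of the hdfs-site.xml property; note that disabling permission checks this way is only advisable on a test cluster):

<property>
  <name>dfs.permissions</name>
  <value>false</value>
</property>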

Run it again; you may then encounter the following error:

Exception in thread "main" org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create file /user/hadoopuser/output20120911. Name node is in safe mode.
The reported blocks 6 have reached the threshold 0.9990 of total blocks 6. Safe mode is turned off automatically in 5 seconds.

The Hadoop NameNode is in safe mode. So what is Hadoop's safe mode?
When the distributed file system starts, it begins in safe mode. While the file system is in safe mode, its contents cannot be modified or deleted until safe mode ends. Safe mode exists so that, at startup, the system can check the validity of the data blocks on each DataNode and copy or delete blocks as needed according to its policy. Safe mode can also be entered at run time via a command. In practice, modifying or deleting files right after the system starts will also produce the "safe mode does not allow modification" error, and usually you just need to wait a while.
Now that the problem is clear, can I take Hadoop out of safe mode directly instead of waiting?
The answer is yes. Simply run the following from the Hadoop installation directory:
bin/hadoop dfsadmin -safemode leave
This turns off Hadoop's safe mode and solves the problem. Alternatively, wait a few seconds and run the program again; it then executes normally and produces the following output:

*
*
*
*
*
"*", that is, upload progress, did not write 64KB is output a "*"
Then look at the directory of HDFs to find the file already exists.
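The upload can also be verified programmatically with FileSystem.exists(). A minimal sketch, assuming the same target path as above (the class name CheckExists is an illustrative placeholder):

package hadoop;

import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CheckExists {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        String file = "hdfs://192.168.49.131:9000/user/hadoopuser/output20120911/";
        FileSystem fs = FileSystem.get(URI.create(file), conf);
        // exists() returns true if the path (file or directory) is present in HDFS.
        System.out.println("exists: " + fs.exists(new Path(file)));
    }
}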

3.2 File system queries: listing directory file information

package hadoop;

import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

public class ListStatus {

    public static void readStatus(String url) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(url), conf);
        Path[] paths = new Path[1];
        paths[0] = new Path(url);
        FileStatus[] status = fs.listStatus(paths);
        Path[] listedPaths = FileUtil.stat2Paths(status);
        for (Path p : listedPaths) {
            System.out.println(p);
        }
    }

    public static void main(String[] args) throws IOException {
        readStatus("hdfs://192.168.49.131:9000/user/hadoopuser/output20120828/");
    }
}

Output:

hdfs://192.168.49.131:9000/user/hadoopuser/output20120828/_SUCCESS
hdfs://192.168.49.131:9000/user/hadoopuser/output20120828/_logs
hdfs://192.168.49.131:9000/user/hadoopuser/output20120828/part-00000
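If you need more than just the paths, each FileStatus also carries the file metadata (length, modification time, permissions, and so on). A minimal sketch, assuming the same output directory as above (the class name ListStatusDetails is an illustrative placeholder):

package hadoop;

import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListStatusDetails {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        String dir = "hdfs://192.168.49.131:9000/user/hadoopuser/output20120828/";
        FileSystem fs = FileSystem.get(URI.create(dir), conf);
        // listStatus() also accepts a single Path; each FileStatus holds the metadata.
        for (FileStatus s : fs.listStatus(new Path(dir))) {
            System.out.println(s.getPath()
                    + "  size=" + s.getLen()
                    + "  isDir=" + s.isDir()
                    + "  modified=" + s.getModificationTime()
                    + "  perms=" + s.getPermission());
        }
    }
}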
