1. Reading Data with a Hadoop URL
package hadoop;

import java.io.InputStream;
import java.net.URL;

import org.apache.hadoop.fs.FsUrlStreamHandlerFactory;
import org.apache.hadoop.io.IOUtils;

public class URLCat {

    static {
        // URL.setURLStreamHandlerFactory may be called at most once per JVM,
        // so the hdfs:// handler is registered in a static initializer.
        URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory());
    }

    public static void readHdfs(String url) throws Exception {
        InputStream in = null;
        try {
            in = new URL(url).openStream();
            // Copy the file contents to standard output, 4 KB at a time;
            // "false" means the streams are not closed by copyBytes itself.
            IOUtils.copyBytes(in, System.out, 4096, false);
        } finally {
            IOUtils.closeStream(in);
        }
    }

    public static void main(String[] args) throws Exception {
        readHdfs("hdfs://192.168.49.131:9000/user/hadoopuser/input20120828/file01");
    }
}
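Once the FsUrlStreamHandlerFactory is registered, an hdfs:// URL behaves like any other java.net.URL, so standard java.io readers can be used on it as well. Below is a minimal sketch of reading the same file line by line; the class name URLLineCat is my own and not part of the original example.

package hadoop;

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

import org.apache.hadoop.fs.FsUrlStreamHandlerFactory;

public class URLLineCat {

    static {
        // Same one-time registration as in URLCat above.
        URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory());
    }

    public static void main(String[] args) throws Exception {
        // Read the same HDFS file as above, but line by line instead of copying raw bytes.
        BufferedReader reader = new BufferedReader(new InputStreamReader(
                new URL("hdfs://192.168.49.131:9000/user/hadoopuser/input20120828/file01").openStream()));
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        } finally {
            reader.close();
        }
    }
}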
The jar packages I used are:
The version of hadoop-core must match the Hadoop version installed in the distributed environment; otherwise you will get an error like the following:
12/09/11 14:18:59 INFO security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/thirdparty/guava/common/collect/LinkedListMultimap
at org.apache.hadoop.hdfs.SocketCache.<init>(SocketCache.java:48)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:240)
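One way to narrow down this kind of mismatch is to print the Hadoop version that the client-side jar on the classpath actually reports, then compare it with the cluster. A minimal sketch using org.apache.hadoop.util.VersionInfo follows; the class name VersionCheck is my own, not from the original post.

package hadoop;

import org.apache.hadoop.util.VersionInfo;

public class VersionCheck {
    public static void main(String[] args) {
        // Reports the version of the hadoop-core jar found on the local classpath,
        // which should match the version installed on the cluster.
        System.out.println("Hadoop version on classpath: " + VersionInfo.getVersion());
        System.out.println("Built from revision: " + VersionInfo.getRevision());
    }
}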
The Hadoop version installed in the distributed environment is as follows: