A. Write the program in Eclipse, then build the jar package.
Program code:
package tju.chc;

import java.io.File;
import java.io.IOException;
import java.util.Scanner;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ResultFilter {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        // hdfs and local are the HDFS instance and the local file system instance, respectively
        FileSystem hdfs = FileSystem.get(conf);
        FileSystem local = FileSystem.getLocal(conf);
        Path inputDir, localFile;
        FileStatus[] inputFiles;
        FSDataOutputStream out = null;
        FSDataInputStream in = null;
        Scanner scan;
        String str;
        byte[] buf;
        int singleFileLines;
        int numLines, numFiles, i;

        if (args.length != 4) {
            // not enough arguments: print the expected format and exit
            System.out.println("Usage: resultfilter <dfs path> <local path> <match str> <single file lines>");
            return;
        }

        inputDir = new Path(args[0]);
        singleFileLines = Integer.parseInt(args[3]);
        inputFiles = hdfs.listStatus(inputDir);    // list all files under the HDFS input directory
        numLines = 0;
        numFiles = 1;
        localFile = new Path(args[1]);
        if (local.exists(localFile)) {             // delete any previous output directory
            local.delete(localFile, true);
        }

        for (i = 0; i < inputFiles.length; i++) {
            if (inputFiles[i].isDirectory()) {     // skip subdirectories
                continue;
            }
            System.out.println(inputFiles[i].getPath().getName());
            in = hdfs.open(inputFiles[i].getPath());
            scan = new Scanner(in);
            while (scan.hasNext()) {
                str = scan.nextLine();
                if (str.indexOf(args[2]) == -1) {  // keep only lines containing the match string
                    continue;
                }
                numLines++;
                if (numLines == 1) {               // first line of a new chunk: open a new local file
                    localFile = new Path(args[1] + File.separator + numFiles);
                    out = local.create(localFile);
                    numFiles++;
                }
                buf = (str + "\n").getBytes();
                out.write(buf, 0, buf.length);
                if (numLines == singleFileLines) { // current file is full: close it and start over
                    out.close();
                    numLines = 0;
                }
            }
            scan.close();
            in.close();
        }
        if (out != null) {                         // close the last, possibly partial, output file
            out.close();
        }
    }
}
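The core filter-and-split logic above can be exercised without a Hadoop cluster. The sketch below is my own minimal reconstruction using only the standard library (FilterSplitDemo and filterAndSplit are hypothetical names, not part of the original program): it mirrors the numLines/numFiles bookkeeping, collecting matching lines into chunks of at most singleFileLines lines instead of writing local files.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class FilterSplitDemo {
    // Splits the lines containing `match` into chunks of at most `singleFileLines`
    // lines each, mirroring the numLines/numFiles bookkeeping in ResultFilter.
    static List<List<String>> filterAndSplit(String text, String match, int singleFileLines)
            throws IOException {
        List<List<String>> files = new ArrayList<>();
        List<String> current = null;
        int numLines = 0;
        BufferedReader reader = new BufferedReader(new StringReader(text));
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.indexOf(match) == -1) {
                continue;                      // skip non-matching lines
            }
            numLines++;
            if (numLines == 1) {               // start a new "output file"
                current = new ArrayList<>();
                files.add(current);
            }
            current.add(line);
            if (numLines == singleFileLines) { // chunk is full: reset the counter
                numLines = 0;
            }
        }
        return files;
    }

    public static void main(String[] args) throws IOException {
        String text = "a txt 1\nskip me\na txt 2\na txt 3\nalso skip\na txt 4\n";
        List<List<String>> out = filterAndSplit(text, "txt", 2);
        System.out.println(out.size()); // number of chunks ("output files")
        System.out.println(out);
    }
}
```

With the sample input above, the four matching lines are split into two chunks of two lines each, just as ResultFilter would produce two local output files named 1 and 2.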
For the packaging steps, the following can be used as a reference (adapted from http://www.cnblogs.com/mq0036/p/3885407.html):
1. First, confirm that your program compiles and runs without errors.
2. My first attempt exported a Web project and did not succeed, so it is best to create a plain Java project.
Packaging steps:
1. Right-click the project and select Export.
2. In the dialog that opens, select "JAR file" under "Java".
3. Select the project, verify that the necessary files are checked, and choose the path where the jar package will be saved.
4. After completing step 3, click Next.
5. Click Next again on the following screen.
6. Click Finish; the jar package is now complete.
Note:
1. You can adjust the export options according to your own needs.
2. Open the META-INF\MANIFEST.MF file inside the jar package and check that the information in it is correct.
Class-Path: lists the other jar packages that this project depends on.
Main-Class: the fully qualified name of the class containing the main() method; the line must end with a newline. (In the referenced example, the main() method is in the Postgressqlsync class under the test package.)
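The Main-Class/Class-Path handling described above can be illustrated with the standard java.util.jar API. This is my own sketch, not part of the original post: ManifestDemo and roundTripMainClass are hypothetical names, and the Class-Path value is a made-up example. It writes a manifest into a temporary jar, then reads the Main-Class attribute back the same way `java -jar` does.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

public class ManifestDemo {
    // Writes a manifest with the given Main-Class into a temporary jar,
    // then reads the attribute back via JarFile.
    static String roundTripMainClass(String mainClass) throws IOException {
        Manifest mf = new Manifest();
        Attributes attrs = mf.getMainAttributes();
        // Manifest-Version must be present, otherwise no main attributes are written
        attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        attrs.put(Attributes.Name.MAIN_CLASS, mainClass);
        attrs.put(Attributes.Name.CLASS_PATH, "lib/example-dep.jar"); // hypothetical dependency

        File jar = File.createTempFile("demo", ".jar");
        try (JarOutputStream jos = new JarOutputStream(new FileOutputStream(jar), mf)) {
            // no class entries are needed for this demonstration
        }
        try (JarFile jf = new JarFile(jar)) {
            return jf.getManifest().getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
        } finally {
            jar.delete();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Main-Class: " + roundTripMainClass("tju.chc.ResultFilter"));
    }
}
```

Note that the API inserts the space after the colon for you; when editing MANIFEST.MF by hand, that space is exactly what is easy to forget.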
Run the jar package under Windows to verify that the exported jar is correct:
1. Open a DOS window, switch to the project path with cd, and enter: java -jar xxx.jar
If, when you run it, the same information appears that the project printed to the console, the jar package is complete.
Here are some problems I have encountered:
1. If the DOS command reports an error about the manifest, it means that Main-Class has not been configured in the META-INF\MANIFEST.MF file.
2. Another error occurs when no space was entered after "Class-Path:" or "Main-Class:".
B. Enter the Hadoop root directory and run:
bin/hadoop jar ~/local/chc/resultfilter.jar tju.chc.ResultFilter hdfs://master:9000/test/input/ file:///home/hadoopuser1/local/chc/tmp/ txt 50
Notes:
1. Make sure HDFS is started before running the command. Start command: bin/start-dfs.sh (or sbin/start-dfs.sh), or sbin/start-all.sh (on Hadoop 2.6.4).
2. You may encounter permission issues while the program runs.
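One way to catch local-side permission problems early is to pre-check the output directory before launching the job. The sketch below is my own addition under stated assumptions (PermissionCheckDemo and canWriteOutput are hypothetical names; ResultFilter itself performs no such check): it verifies that the local output directory exists, or can be created, and is writable.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class PermissionCheckDemo {
    // Returns true if `dir` exists (or can be created) and is writable by this process.
    static boolean canWriteOutput(Path dir) {
        try {
            Files.createDirectories(dir); // no-op if the directory already exists
        } catch (IOException e) {
            return false;                 // e.g. the parent directory is not writable
        }
        return Files.isWritable(dir);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempDirectory("chc-out");
        System.out.println(canWriteOutput(tmp)); // a fresh temp directory should be writable
        Files.delete(tmp);
    }
}
```

HDFS-side permission errors are a separate matter; those are typically resolved with hdfs dfs -chmod/-chown or by running the job as the owning user.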