Java Learning: Iterator as a Data Processing Plant

Source: Internet
Author: User

Some time ago I attended a lecture by an expert who brought up Iterator. Its methods look simple, but chained together they work like a "data processing plant".

Here is a brief introduction to the three main methods of Iterator:

next() returns the next element in the sequence.

hasNext() checks whether the sequence has more elements.

remove() removes the element most recently returned by the iterator.
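
As a quick illustration of the three methods, here is a minimal sketch that drops every line containing a keyword from a list. The class name IteratorBasics and the method dropLinesContaining are hypothetical names chosen for this example only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical example class, not part of the original article.
class IteratorBasics {
    // Returns a copy of lines without the entries that contain keyword,
    // using only hasNext(), next() and remove().
    static List<String> dropLinesContaining(List<String> lines, String keyword) {
        List<String> copy = new ArrayList<>(lines);
        Iterator<String> it = copy.iterator();
        while (it.hasNext()) {           // is there another element?
            String line = it.next();     // fetch it and advance
            if (line.contains(keyword)) {
                it.remove();             // delete the element next() just returned
            }
        }
        return copy;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("ok", "an Exception", "fine");
        System.out.println(dropLinesContaining(lines, "Exception")); // prints [ok, fine]
    }
}
```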

The data processing plant mainly uses two of these methods, next() and hasNext(). The following question illustrates the idea.

<question>

Linux has many commands for manipulating text. For example, cat filename prints the entire contents of a file to the console.

grep keyword filename prints the lines of the file that contain keyword.

wc -l filename counts the number of lines in the file.

| is the pipe: the output of the command on its left becomes the input of the command on its right. For example, cat filename | grep Exception | wc -l counts the lines in a file that contain Exception.

Implement a function that interprets a Linux command line (containing only the three commands and parameters above, combined with pipes; nothing else needs to be considered), such as the following examples:

cat xx.txt

cat xx.txt | grep xml

wc -l xx.txt

cat xx.txt | grep xml | wc -l

</question>
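
Whichever approach is taken, the command line first has to be split into stages at each pipe, and each stage into its arguments. The following is a minimal sketch using only the standard library; CommandParser and parse are hypothetical names, and quoting or escaping is deliberately ignored because the question rules it out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, assumed for illustration only.
class CommandParser {
    // Splits "cat xx.txt | grep xml" into [[cat, xx.txt], [grep, xml]].
    static List<List<String>> parse(String commandLine) {
        List<List<String>> stages = new ArrayList<>();
        for (String stage : commandLine.split("\\|")) {
            String trimmed = stage.trim();
            if (trimmed.isEmpty()) {
                continue; // skip empty stages such as a trailing pipe
            }
            // Split the stage on runs of whitespace to get its arguments.
            stages.add(Arrays.asList(trimmed.split("\\s+")));
        }
        return stages;
    }

    public static void main(String[] args) {
        System.out.println(parse("cat xx.txt | grep xml | wc -l"));
    }
}
```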

The problem above amounts to applying several different processing steps to one data source.

The first approach is roughly as follows:

First, split the input into individual commands and write a function module for each command. Then traverse the command list: run each command on the data, record its result, and pass that result as the input of the next command, until all commands have run.

Advantages: It is flexible, since commands can easily be combined in different ways, and it is extensible, since adding a command only requires writing a new function module with almost no change to the main flow.

Disadvantages: Intermediate results must be kept, so the approach breaks down when the data source is too large to hold in memory. In addition, if processing fails partway through, no results are produced at all, even if most of the work was already done.
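
The first approach can be sketched as follows. This is an illustration under assumptions, not the author's code: EagerPipeline, Stage, grep, countLines and run are hypothetical names, and an in-memory list stands in for the file contents. Note how every stage receives and returns a complete list, which is exactly the intermediate result that has to be kept.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the eager, stage-by-stage approach.
class EagerPipeline {
    // One function module: consumes a full intermediate result, returns a new one.
    interface Stage {
        List<String> apply(List<String> input);
    }

    // Keeps only the lines that contain keyword (like grep keyword).
    static Stage grep(String keyword) {
        return input -> {
            List<String> out = new ArrayList<>();
            for (String line : input) {
                if (line.contains(keyword)) {
                    out.add(line);
                }
            }
            return out;
        };
    }

    // Replaces the input with a single line holding its line count (like wc -l).
    static Stage countLines() {
        return input -> Arrays.asList(String.valueOf(input.size()));
    }

    // Runs the stages in order; each result becomes the next stage's input.
    static List<String> run(List<String> source, List<Stage> stages) {
        List<String> current = source;
        for (Stage stage : stages) {
            current = stage.apply(current); // the whole intermediate list lives in memory
        }
        return current;
    }
}
```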

The second approach is the assembly-line factory, built on Iterator. (hasNext() determines whether there is another piece of data and next() fetches it; every iterator mentioned below relies on these two methods.)

Step 1: Wrap the data source (the file, in the problem above) in an iterator.

Step 2: Write an iterator for each command. Each one implements hasNext() and next(), takes an iterator as its input, and returns an iterator.

Step 3: Chain the iterators together according to the input commands to obtain the final result (also an iterator).

Step 4: Output the result.

Steps 1 to 3 do nothing to the data source; they only assemble the processing logic. Step 4 appears merely to output the result, but it is in this step that the data is actually processed.
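
The claim that assembling the chain touches no data can be checked with a small experiment. The sketch below is an assumption-laden illustration: LazyPipelineDemo, CountingSource and Grep are hypothetical names, and an in-memory list stands in for the file. The source counts how many elements have been pulled from it, so the count stays at zero until the final iterator is consumed.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical demo classes, not the author's implementation.
class LazyPipelineDemo {
    // Data source that records how many elements have been pulled from it.
    static class CountingSource implements Iterator<String> {
        private final Iterator<String> inner;
        int pulled = 0;

        CountingSource(List<String> lines) {
            this.inner = lines.iterator();
        }

        @Override
        public boolean hasNext() {
            return inner.hasNext();
        }

        @Override
        public String next() {
            pulled++;
            return inner.next();
        }
    }

    // Filter stage: takes an iterator, is itself an iterator.
    static class Grep implements Iterator<String> {
        private final Iterator<String> parent;
        private final String keyword;
        private String buffered; // next matching line, if already found

        Grep(Iterator<String> parent, String keyword) {
            this.parent = parent;
            this.keyword = keyword;
        }

        @Override
        public boolean hasNext() {
            // Pull from the parent only until one matching line is buffered.
            while (buffered == null && parent.hasNext()) {
                String candidate = parent.next();
                if (candidate.contains(keyword)) {
                    buffered = candidate;
                }
            }
            return buffered != null;
        }

        @Override
        public String next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            String result = buffered;
            buffered = null;
            return result;
        }
    }

    public static void main(String[] args) {
        CountingSource source = new CountingSource(Arrays.asList("a xml", "b", "c xml"));
        Iterator<String> pipeline = new Grep(source, "xml");
        System.out.println(source.pulled); // prints 0: nothing has been read yet
        while (pipeline.hasNext()) {
            System.out.println(pipeline.next());
        }
        System.out.println(source.pulled); // prints 3: all reading happened during output
    }
}
```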

Advantages: It keeps all the advantages of the first approach and removes its drawbacks. The data source can be arbitrarily large, because only the line currently being processed is held in memory, and if something goes wrong partway through, the data before the failure has already been processed and output normally.

Disadvantages: The read handle on the data source stays open for the whole run, so it is held longer than in the first approach, though not by much.

Time the handle is held in the first approach: data length * (IO time + time for one processing step)

Time the handle is held in the second approach: data length * (IO time + time for all processing steps)

IO time is generally much larger than processing time, so the handle is held only slightly longer.

Here is the code for the second approach:

The main flow, which builds the chain of processing logic from the commands:

// result always holds the most recently built iterator
Iterator<String> result = null;
for (String command : commands) {
    List<String> args = splitter.omitEmptyStrings().splitToList(command);
    if (result == null) {
        // On the first pass result is null, so the source file
        // (the last argument of the command) is wrapped in an iterator
        result = new FileIterator(new File(args.get(args.size() - 1)));
    }
    // Initialize the iterator that corresponds to the command name
    result = commandMap.getObject(args.get(0)).init(args, result);
}
return result;

// The file iterator, FileIterator
private BufferedReader reader;
private String line = null;

public FileIterator(File file) throws FileNotFoundException {
    reader = Files.newReader(file, Charsets.UTF_8);
}

@Override
public boolean hasNext() {
    try {
        // Read one line ahead unless a line is already buffered
        line = (line == null) ? reader.readLine() : line;
    } catch (IOException e) {
        logger.error("readLine() error", e);
    }
    if (line == null) {
        // End of file: release the read handle
        IOHandler.closeIgnoreException(reader);
    }
    return line != null;
}

@Override
public String next() {
    try {
        // Use the buffered line if hasNext() already read one
        String temp = (line == null) ? reader.readLine() : line;
        line = null;
        if (temp == null) {
            IOHandler.closeIgnoreException(reader);
        }
        return temp;
    } catch (IOException e) {
        logger.error("readLine() error", e);
    }
    return null;
}

// The iterator for the grep command (the others are similar and are not shown)
private Iterator<String> parent; // upstream iterator, set in init()
private String filter;
private String line = null;

// Advances the parent until a line containing the filter is buffered
private void getLine() {
    String temp;
    while (line == null) {
        if (!parent.hasNext()) {
            break;
        }
        temp = parent.next();
        if (temp.contains(filter)) {
            line = temp;
            break;
        }
    }
}

@Override
public boolean hasNext() {
    getLine();
    return line != null;
}

@Override
public String next() {
    getLine();
    String result = line;
    line = null;
    return result;
}

@Override
public Iterator<String> init(List<String> args, Iterator<String> parent) {
    this.parent = parent;
    filter = args.get(1);
    return this;
}

Note: The code above is only an excerpt, intended to convey the idea; the remaining commands can be written by following the same pattern.
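
The article does not show the wc -l stage. One way it might look is sketched below; WcLineCount is a hypothetical name, and the constructor-based wiring is an assumption made to keep the example self-contained rather than the init(args, parent) style used in the excerpt. The stage emits exactly one element, the number of lines its parent produced, and it drains the parent only when that element is requested.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical sketch of a "wc -l" stage, not the author's implementation.
class WcLineCount implements Iterator<String> {
    private final Iterator<String> parent;
    private boolean emitted = false; // true once the single count line was returned

    WcLineCount(Iterator<String> parent) {
        this.parent = parent;
    }

    @Override
    public boolean hasNext() {
        // Exactly one output line exists, even for empty input (the count 0).
        return !emitted;
    }

    @Override
    public String next() {
        if (emitted) {
            throw new NoSuchElementException();
        }
        long count = 0;
        // Drain the parent one line at a time; no line is retained.
        while (parent.hasNext()) {
            parent.next();
            count++;
        }
        emitted = true;
        return String.valueOf(count);
    }

    public static void main(String[] args) {
        Iterator<String> lines = Arrays.asList("a", "b", "c").iterator();
        System.out.println(new WcLineCount(lines).next()); // prints 3
    }
}
```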
