Using java.util.Scanner

Source: Internet
Author: User
Tags: Scanner class in Java


Java 5 added the java.util.Scanner class, a new utility for scanning input text. It is a kind of marriage between the earlier StringTokenizer and Matcher classes. In the previous section, Matcher was useful for searching within a string for data that matches a given pattern, but it is limited to matching a single pattern: any data must be retrieved through capturing groups of that same pattern, or by using index positions to extract parts of the text. With Scanner, you can instead combine regular expressions with methods that retrieve specific types of data items from an input stream. Thus, in addition to supporting regular expressions, the Scanner class can parse tokens as strings or as primitive types such as int and double. With Scanner, you can write a custom parser for any text content that you want to work with.
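As a minimal sketch of this idea, the following reads a word, an int, and a double from a single string. The class name ScannerBasics, the method describe, and the sample input are made up for illustration. The locale is set explicitly because Scanner parses numbers according to its locale, and try-with-resources assumes Java 7 or later.

```java
import java.util.Locale;
import java.util.Scanner;

class ScannerBasics {
    // Reads "<word> <int> <double>" from the input and returns a summary string.
    static String describe(String input) {
        // Locale.US makes '.' the decimal separator regardless of the platform default.
        try (Scanner scanner = new Scanner(input).useLocale(Locale.US)) {
            String word = scanner.next();        // next whitespace-delimited token
            int count = scanner.nextInt();       // token parsed directly as an int
            double ratio = scanner.nextDouble(); // token parsed directly as a double
            return word + ":" + (count + 1) + ":" + (ratio * 2);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("widgets 41 0.25"));
    }
}
```

The point of the sketch is that the numeric values arrive as an int and a double, ready for arithmetic, without a separate conversion step.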


Below, we use the Scanner class to read an input source and pick out data items from the text. As an example, we read data in the file format used by the U.S. Census Bureau. This data summarizes the statistical distribution of the most common names, covering 90% of the population in the 1990 census, with first names and surnames listed separately so that individuals cannot be identified. The Census Bureau made these lists public for use by genealogists and statisticians. The data consists of three separate files: surnames (dist.all.last), female first names (dist.female.first), and male first names (dist.male.first). Each file contains lines of text in which the following fields are separated by whitespace:


Name


Frequency expressed in percent


Cumulative frequency expressed in percent


Rank


The first two lines of text in the surname file (dist.all.last) are given below:


SMITH 1.006 1.006 1


JOHNSON 0.810 1.816 2


This shows that SMITH accounts for 1.006% of the surnames in the population and JOHNSON for 0.810%. This is a simple file structure: you could read each line of the data using Matcher and a regular expression with capturing groups, as follows:


(\S+)\s+(\S+)\s+(\S+)\s+(\S+)
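To make the Matcher approach concrete, here is a small sketch that applies a pattern with four capturing groups to one line and then converts two of the captured strings to numbers. The class name CensusLineMatcher and the method parse are hypothetical names chosen for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CensusLineMatcher {
    // Four whitespace-separated fields, each captured as its own group.
    private static final Pattern LINE =
        Pattern.compile("(\\S+)\\s+(\\S+)\\s+(\\S+)\\s+(\\S+)");

    // Returns the four fields of one line, or null if the line does not match.
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3), m.group(4) };
    }

    public static void main(String[] args) {
        String[] fields = parse("SMITH 1.006 1.006 1");
        // The captured groups are strings, so numeric fields need explicit conversion.
        float frequency = Float.parseFloat(fields[1]);
        int rank = Integer.parseInt(fields[3]);
        System.out.println(fields[0] + " " + frequency + " " + rank);
    }
}
```

Note that \S (uppercase) matches a non-whitespace character, while \s (lowercase) matches whitespace; the groups must use the former.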


Alternatively, you can use the split method of String:


// For each line of text, assume it is in a variable called line
String[] dataArray = line.split("\\s+");
String name = dataArray[0];
String frequency = dataArray[1];
String cumulativeFrequency = dataArray[2];
String rank = dataArray[3];



If necessary, you can also convert the string for each data item to a float or an int. When every line has the same structure, a single regex can process each line, and the data is easy to read with either Matcher or the String split method (see the relevant sections of this chapter for details). But using split in this last example has one disadvantage: it creates an unnecessary string array for every line you process. Using Matcher on the entire input text is more efficient, but doing so first requires buffering the entire data stream into a string. The Scanner class lets you do several things at once: read data from any valid input stream, parse each line of text efficiently, scan with multiple regular expressions, and place the retrieved data elements directly into variables of the required primitive types. The following code uses the Scanner class to read the surname data file (the first-name files have the same structure and can be read in the same way).


import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Scanner;

public class SurnameReader {

    public ArrayList<String> getNames() throws IOException {
        ArrayList<String> surnames = new ArrayList<String>();
        FileReader fileReader =
            new FileReader("/census/dist.all.last");
        // Create a scanner from the data file
        Scanner scanner = new Scanner(fileReader);
        // Repeat while there is a next item to be scanned
        while (scanner.hasNext()) {
            // Retrieve each data element
            String name = scanner.next();
            float frequency = scanner.nextFloat();
            float cumulativeFrequency = scanner.nextFloat();
            int rank = scanner.nextInt();
            surnames.add(name);
        }
        scanner.close(); // Also closes the FileReader
        return surnames;
    }

    // Declares IOException because getNames() can throw it
    public SurnameReader() throws IOException {
        for (String s : getNames()) {
            System.out.println(s);
        }
    }
}
  




Scanner uses whitespace as the default delimiter, and you can easily change the delimiter with the useDelimiter method. The default serves us well here. The hasNext method in the while loop above checks whether the input has another token to process. Apart from whitespace, each line contains exactly four items. Because the items are consumed in sequence, you can assume that each time the next method retrieves a name, the scanner has moved on to the next line of input. Be careful here: if the file does not match what the code expects (for example, a missing field or a value of the wrong type), Scanner throws an exception such as InputMismatchException or NoSuchElementException.
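The two points above, changing the delimiter and guarding against unexpected input, can be sketched as follows. The class DelimiterDemo, the method readInts, and the comma-separated input format are assumptions made for this example; they are not part of the census data. The sketch checks each token with hasNextInt before reading it, so a bad token is skipped instead of causing an exception.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class DelimiterDemo {
    // Reads comma-separated integers, skipping tokens that are not integers.
    static List<Integer> readInts(String input) {
        List<Integer> values = new ArrayList<Integer>();
        try (Scanner scanner = new Scanner(input)) {
            // Replace the default whitespace delimiter with a comma
            // surrounded by optional whitespace.
            scanner.useDelimiter("\\s*,\\s*");
            while (scanner.hasNext()) {
                if (scanner.hasNextInt()) {
                    values.add(scanner.nextInt());
                } else {
                    scanner.next(); // discard the unexpected token
                }
            }
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(readInts("1, 2, oops, 4"));
    }
}
```

Whether to skip a bad token, as here, or to let the exception propagate depends on how strictly the input format should be enforced.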


In the example above, each pass through the loop adds a new name to the ArrayList. The surname and first-name data are useful for building test databases from realistic data (a link to the data can be found on this book's Web site). Once the data is in the ArrayList, you can select names from the list at random. For information about how to select data from a list at random, refer to the "Generate Random Text" section.


In the previous section, we used the Java 5 Scanner class to read a data file. That file was very simple because every line had the same structure. What if you want to read a data file in which the structure differs from line to line? Matcher is not well suited to this task because it works with only a single regex. The Scanner class is, because it can apply regular expressions to the input text to look ahead for the patterns that come next. And because you read the input one token at a time, you can use it to write a custom parser for any type of text. The following example uses a made-up file format for a building security event log. Each line of the log file has the following structure:


EventType Year Month Day Time Type-dependent-data


The structure of the last part of each line depends on the event type. For such a structure, you need logic that reads the correct tokens according to the event type. The following is a simple file with three event types: entering the building (entry), leaving the building (exit), and an alarm (alarm). Here is a sample file:


Entry 1043 Meeting Smith, John


Exit 1204 Smith, John


Entry 1300 Work Eubanks, Brian


Exit 2120 Eubanks, Brian


Alarm 2301 Fire This is a drill


Each event type requires a different structure to be read. In the first line of this file, John Smith enters the building at 10:43 to attend a meeting. He leaves the building at 12:04. After that, Brian Eubanks enters the building at 13:00 for work and leaves at 21:20. Then a fire alarm is raised at 23:01, with a comment stating "This is a drill". We can use Scanner to read this file, as shown in the following code:


Scanner scanner = new Scanner(new FileReader("LogFile.txt"));
while (scanner.hasNext()) {
    String type = scanner.next();
    int year = scanner.nextInt();
    int month = scanner.nextInt();
    int day = scanner.nextInt();
    int time = scanner.nextInt();
    // Compare case-insensitively so that "Entry" and "entry" are both accepted
    if (type.equalsIgnoreCase("entry")) {
        String purpose = scanner.next(); // purpose of visit
        // Get the rest of the line and move to the start of the next line
        String restOfLine = scanner.nextLine();
    } else if (type.equalsIgnoreCase("exit")) {
        String exitName = scanner.nextLine(); // rest of the line
    } else if (type.equalsIgnoreCase("alarm")) {
        String alarmType = scanner.next();
        String comment = scanner.nextLine(); // rest of the line
    }
}
scanner.close();
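For experimenting without a log file, the same branching logic can be run against an in-memory string. This is only a sketch: the class LogParserDemo, the method summarize, and the summary format are hypothetical, and the date fields in the sample data are invented for this example so that every line follows the EventType Year Month Day Time layout.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class LogParserDemo {
    // Hypothetical sample data; the year, month, and day values are invented.
    static final String SAMPLE =
          "entry 2024 01 15 1043 meeting Smith, John\n"
        + "exit 2024 01 15 1204 Smith, John\n"
        + "alarm 2024 01 15 2301 fire This is a drill\n";

    // Returns one summary string per event in the log text.
    static List<String> summarize(String log) {
        List<String> events = new ArrayList<String>();
        try (Scanner scanner = new Scanner(log)) {
            while (scanner.hasNext()) {
                // Fields common to every event type
                String type = scanner.next();
                int year = scanner.nextInt();
                int month = scanner.nextInt();
                int day = scanner.nextInt();
                int time = scanner.nextInt();
                String detail;
                if (type.equals("entry") || type.equals("alarm")) {
                    // One extra token (purpose or alarm type), then free text
                    String extra = scanner.next();
                    String rest = scanner.nextLine().trim();
                    detail = extra + " | " + rest;
                } else {
                    // Everything up to the end of the line
                    detail = scanner.nextLine().trim();
                }
                events.add(type + " " + year + "-" + month + "-" + day
                    + " " + time + " " + detail);
            }
        }
        return events;
    }

    public static void main(String[] args) {
        for (String event : summarize(SAMPLE)) {
            System.out.println(event);
        }
    }
}
```

Calling nextLine after the token reads is what keeps the scanner aligned with line boundaries: it consumes the remainder of the current line, including the line separator, so the next call to next starts on the following line.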




Another technique available with the Scanner class is the findInLine method, which looks ahead for a pattern within the current line. A similar method, findWithinHorizon, searches for a pattern in the input stream beyond the current line. This kind of parsing requires that we understand the syntax, or grammatical structure, of the file being processed. The code just written actually has a grammar hidden inside it: it is a parser for this "log" language. For a more complex syntax, such as that of a scripting language, writing your own parser from scratch is likely to introduce logic errors. Therefore, for larger and more complex grammars, it is much better to describe the syntax itself in a grammar description language, such as the one provided by JavaCC (refer to the section "Creating a parser using JavaCC" in Chapter 3). Parser generators use such a grammar metalanguage to produce parser classes that can process the syntax.
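The difference between findInLine and findWithinHorizon mentioned above can be seen in a short sketch. The class FindDemo, its method names, and the HH:MM time pattern are assumptions made for this example. Both methods ignore the scanner's delimiters and return null when nothing is found; a horizon of 0 means the search is not bounded.

```java
import java.util.Scanner;

class FindDemo {
    // Returns the first HH:MM time on the first line, or null if that line has none.
    static String firstTimeOnLine(String text) {
        try (Scanner scanner = new Scanner(text)) {
            // Searches only up to the next line separator
            return scanner.findInLine("\\d{2}:\\d{2}");
        }
    }

    // Returns the first HH:MM time anywhere in the text, or null if there is none.
    static String firstTimeAnywhere(String text) {
        try (Scanner scanner = new Scanner(text)) {
            // A horizon of 0 lets the search continue past line separators
            return scanner.findWithinHorizon("\\d{2}:\\d{2}", 0);
        }
    }

    public static void main(String[] args) {
        String text = "door opened\nalarm at 23:01\n";
        System.out.println(firstTimeOnLine(text));
        System.out.println(firstTimeAnywhere(text));
    }
}
```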


Resources:


The details of parsing and compiler theory are outside the scope of this book. To learn more, refer to the classic book on compilers, Compilers: Principles, Techniques, and Tools, by Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman (Addison-Wesley, 1986).