esProc Assists Java in Processing Structured Text: Data Reading

Source: Internet
Author: User

Java provides only the most basic functions for reading text data, such as splitting on a specified delimiter. Other common functions must be implemented by hand at a low level, for example: reading specified columns by column name, reading columns in a specified order, reading columns as specified data types, and reading data that has no delimiter. None of this is difficult in Java, but the code is cumbersome and error-prone.

esProc (the "set calculator") can assist Java programming here, so that you do not need to hand-write this low-level code. The following examples illustrate the practice.

data.txt is a tab-separated text file with 30 columns. The first row holds column names with business meaning. You need to read these columns by name: ID, x1Shift, x2Shift, and ratio, and compute a new column with the business formula (x1Shift + x2Shift) / 2 * ratio. The first few rows of the file are as follows:

In Java, we must split all 30 columns and reference specific columns by subscript in the computation. If the formula is long and the calculation complicated, mistakes are likely. To reduce the chance of such slips, the only option is to store each row in an object, give each field its business name, and then write the formula in terms of those names.
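The plain-Java approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, the sample rows are made up, and the header is used to map column names to subscripts so that the formula can be written against names rather than raw indexes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ByNameColumns {

    /** Maps each column name in the tab-separated header row to its subscript. */
    static Map<String, Integer> headerIndex(String headerLine) {
        Map<String, Integer> index = new HashMap<>();
        String[] names = headerLine.split("\t", -1);
        for (int i = 0; i < names.length; i++) {
            index.put(names[i].trim(), i);
        }
        return index;
    }

    /** Looks up a column subscript and fails loudly if the name is missing. */
    static int column(Map<String, Integer> index, String name) {
        Integer i = index.get(name);
        if (i == null) {
            throw new IllegalArgumentException("No such column: " + name);
        }
        return i;
    }

    /** Computes (x1 + x2) / 2 * ratio for every data row after the header. */
    static List<Double> computeValues(List<String> lines,
                                      String x1Name, String x2Name, String ratioName) {
        Map<String, Integer> index = headerIndex(lines.get(0));
        int x1 = column(index, x1Name);
        int x2 = column(index, x2Name);
        int ratio = column(index, ratioName);
        List<Double> values = new ArrayList<>();
        for (String line : lines.subList(1, lines.size())) {
            String[] f = line.split("\t", -1); // every column is still split
            values.add((Double.parseDouble(f[x1]) + Double.parseDouble(f[x2])) / 2
                    * Double.parseDouble(f[ratio]));
        }
        return values;
    }

    public static void main(String[] args) {
        // Made-up sample with 4 columns instead of 30; a real program would
        // load the lines from data.txt, e.g. with Files.readAllLines.
        List<String> lines = Arrays.asList(
                "ID\tx1Shift\tx2Shift\tratio",
                "A001\t1.0\t3.0\t0.5",
                "A002\t2.0\t4.0\t2.0");
        System.out.println(computeValues(lines, "x1Shift", "x2Shift", "ratio")); // prints [1.0, 6.0]
    }
}
```

Even in this small form, the header parsing, the lookups, and the type conversion all have to be written and maintained by hand.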

esProc helps Java avoid these troubles. The code is as follows:

A1: The import function reads the file. Rather than reading all 30 columns into memory, it reads only the specified columns by column name. The option @t indicates that the first row is read as the column names. The result of this step is as follows:

A2: The calculation result is as follows:

In practice, the calculation result above is sometimes written to a file. Code of the form =file("E:\... can be used for this purpose; the file content is as follows:

If you want to return the calculation result to Java for further use, you only need to write this line in the esProc script: result A2.new(ID, value). It returns the ID and value columns to Java through the JDBC interface as a ResultSet. The Java code then simply calls the esProc script through JDBC to obtain the result. The code is as follows.

// Establish an esProc JDBC connection
Class.forName("com.esproc.jdbc.InternalDriver");
Connection con = DriverManager.getConnection("jdbc:esproc:local://");
// Call esProc, where test is the script file name
com.esproc.jdbc.InternalCStatement st = (com.esproc.jdbc.InternalCStatement) con.prepareCall("call test()");
st.execute(); // execute the esProc stored procedure
ResultSet set = st.getResultSet(); // obtain the result set

When reading data, you sometimes need to specify the column order so that the data is easier to work with. For example, to read the same file data.txt in the new order xShift, yShift, ratio, ID, you can specify the order directly in the code: =file("E:\data.txt").import@t(xShift, yShift, ratio, ID).

The calculation result is as follows:
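For comparison, a plain-Java version of reading columns in a chosen order might look like the sketch below. It is a minimal illustration, not esProc's implementation: the class name ColumnOrder, the method select, and the sample rows are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ColumnOrder {

    /** Returns the data rows reduced to the named columns, in the order the names are given. */
    static List<String[]> select(List<String> lines, String... names) {
        List<String> header = Arrays.asList(lines.get(0).split("\t", -1));
        int[] idx = new int[names.length];
        for (int i = 0; i < names.length; i++) {
            idx[i] = header.indexOf(names[i]);
            if (idx[i] < 0) {
                throw new IllegalArgumentException("No such column: " + names[i]);
            }
        }
        List<String[]> rows = new ArrayList<>();
        for (String line : lines.subList(1, lines.size())) {
            String[] fields = line.split("\t", -1);
            String[] row = new String[names.length];
            for (int i = 0; i < names.length; i++) {
                row[i] = fields[idx[i]]; // copy in the requested order
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Made-up sample data; the file on disk stores ID first.
        List<String> lines = Arrays.asList(
                "ID\txShift\tyShift\tratio",
                "A001\t1.0\t3.0\t0.5");
        for (String[] row : select(lines, "xShift", "yShift", "ratio", "ID")) {
            System.out.println(String.join(", ", row)); // prints 1.0, 3.0, 0.5, A001
        }
    }
}
```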

In the code above, esProc automatically chooses a suitable data type for each column; for example, xShift and yShift are read as float. Sometimes, however, the data type must be specified. For example, the ID looks like an integer but is actually a string. To extract the first four characters of the ID, you can use the following code:

A1: Forced type conversion. The ID column is read as a string. The result is as follows:

Note: In its IDE, esProc displays strings left-aligned and numbers right-aligned, as shown above.

A2: Extracts the first four characters of the ID. The result is as follows:
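Why the type matters can be illustrated with a short Java sketch. The ID value below is invented for illustration and the class name is hypothetical; the point is only that an ID parsed as a number loses any leading zeros, while an ID kept as text can be sliced safely.

```java
final class IdPrefix {

    /** Returns the first four characters of an ID kept as text (the whole ID if it is shorter). */
    static String prefix(String id) {
        return id.substring(0, Math.min(4, id.length()));
    }

    public static void main(String[] args) {
        String raw = "00123456"; // made-up ID that merely looks like an integer
        System.out.println(prefix(raw));                                   // prints 0012
        System.out.println(prefix(String.valueOf(Integer.parseInt(raw)))); // prints 1234
    }
}
```

The second line shows the failure mode: once the ID has passed through a numeric type, the leading zeros are gone and the prefix is wrong.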

Sometimes the data to be read has no column delimiter. For example, data2.txt has 20 columns, and part of its data is as follows:

As you can see, data2.txt has no column delimiter, and some of its rows are useless blank lines. You can use the following code to read the data correctly:

A1: Reads the data into a single-column table sequence whose column name defaults to "_1". The function option @s indicates that lines are read as they are, without being split into fields. The result is as follows:

A2: =A1.select(trim(_1)!="") keeps only the non-empty rows. The select function can reference a field by name or by serial number. The result is as follows:

A3: =A2.new(mid(_1,1,1), mid(_1,2,1), mid(_1,3,1), mid(_1,4,1), mid(_1,5,1), mid(_1,6,1), mid(_1,7,1), mid(_1,8,1), mid(_1,9,1), mid(_1,10,1), mid(_1,11,1), mid(_1,12,1), mid(_1,13,1), mid(_1,14,1),
mid(_1,15,1), mid(_1,16,1), mid(_1,17,1), mid(_1,18,1), mid(_1,19,1), mid(_1,20,1))

This long expression splits each line of data into 20 fields. The mid function takes three parameters: the field to split, the start position, and the length to extract. The result of the split is as follows:

A3 is the calculation result we need.
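The same two steps, dropping blank lines and cutting each remaining line into one-character fields, could be written in plain Java roughly as follows. This is a sketch under assumptions: the class name FixedWidth and the sample lines are hypothetical, and the choice to return an empty string for positions past the end of a short line is this sketch's own, not a statement about how esProc's mid behaves.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class FixedWidth {

    /** Drops blank lines, then splits each remaining line into the given number of one-character fields. */
    static List<String[]> split(List<String> lines, int columns) {
        List<String[]> rows = new ArrayList<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // skip useless blank rows
            }
            String[] row = new String[columns];
            for (int i = 0; i < columns; i++) {
                // One character starting at 1-based position i + 1;
                // this sketch yields "" when the line is too short.
                row[i] = i < line.length() ? line.substring(i, i + 1) : "";
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Made-up sample: 5 columns instead of 20, with blank lines mixed in.
        List<String> lines = Arrays.asList("10110", "", "   ", "01001");
        for (String[] row : split(lines, 5)) {
            System.out.println(String.join(" ", row));
        }
    }
}
```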

The code in A3 is too long, which makes it hard to check for errors and to maintain. You can use esProc's dynamic code to simplify it as follows:

A4: =20.loops(~~+"mid(_1,"+string(~)+",1),")
A5: =exp=left(A4,len(A4)-1)
A6: =eval("A2.new("+A5+")")

In A4, the loops function performs an iterative computation and generates a regular string, namely "mid(_1,1,1),mid(_1,2,1),mid(_1,3,1),mid(_1,4,1),mid(_1,5,1),mid(_1,6,1),mid(_1,7,1),mid(_1,8,1),mid(_1,9,1),mid(_1,10,1),
mid(_1,11,1),mid(_1,12,1),mid(_1,13,1),mid(_1,14,1),mid(_1,15,1),mid(_1,16,1),mid(_1,17,1),mid(_1,18,1),mid(_1,19,1),mid(_1,20,1),"
The string in A4 ends with an extra comma. The code in A5 removes it.

A6: Executes the dynamic script. The eval function parses a string into an expression at run time. For example, eval("2+3") is equivalent to the expression 2+3 and has the value 5. The expression evaluated in A6 is therefore exactly the same as the one in A3, and the calculation results are naturally the same.
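The string-building half of this technique can be mirrored in Java, as in the sketch below. It only generates the expression text; plain Java has no built-in counterpart to eval for esProc expressions, so the result would still have to be handed to esProc for execution. The class and method names are hypothetical.

```java
import java.util.StringJoiner;

final class MidExpression {

    /** Builds "mid(_1,1,1),mid(_1,2,1),...,mid(_1,n,1)" with no trailing comma. */
    static String build(int n) {
        // StringJoiner places the separator only between items,
        // so the extra step of trimming a final comma is not needed.
        StringJoiner joiner = new StringJoiner(",");
        for (int i = 1; i <= n; i++) {
            joiner.add("mid(_1," + i + ",1)");
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // Prints the full expression text for 20 fields.
        System.out.println("A2.new(" + build(20) + ")");
    }
}
```

For example, build(3) returns mid(_1,1,1),mid(_1,2,1),mid(_1,3,1).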
