Using tContextLoad to load variables in Talend, and other precautions

Source: Internet
Author: User

These notes are for my own reference and are not limited to expressions.

I. Using tContextLoad

For example:

1. Add a tFileInputDelimited component. It reads a text file; create a file a.txt on a drive with the following content:

myname;along
sex;man

Add the key and value fields to the component's schema.

2. Add a tContextLoad component and connect the previous component to it. Before running the job you must define the context variables, and the two variable names must be context.myname and context.sex. If you tick "Print operations", the loaded values are printed when the job runs.
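Conceptually, each (key, value) row ends up as the value of the context variable with that name. The following is a minimal plain-Java sketch of that idea, not Talend's generated code: the class ContextLoadSketch and the method loadContext are hypothetical, and a Map stands in for the context object.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: a Map stands in for the Talend context.
class ContextLoadSketch {

    // Parses "key;value" lines, as in a.txt, into a key -> value map.
    static Map<String, String> loadContext(String fileContent) {
        Map<String, String> context = new LinkedHashMap<>();
        for (String line : fileContent.split("\\R")) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            // Split on the first ';' only, so a value may itself contain ';'.
            String[] parts = line.split(";", 2);
            String key = parts[0].trim();
            String value = parts.length > 1 ? parts[1].trim() : "";
            context.put(key, value);
        }
        return context;
    }

    public static void main(String[] args) {
        Map<String, String> context = loadContext("myname;along\nsex;man\n");
        System.out.println(context);
    }
}
```

In the real job the lookup is by variable name, which is why a key with no matching context variable cannot be loaded.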

The preceding section describes how to load context variables from a text file. The following describes how to load them from a database:

1. Add a tOracleInput component. The table must have two fields: the first is the key (that is, the variable name) and the second is the value. Note: every key returned by tOracleInput must already be defined as a variable in the Contexts view.

If the table has multiple rows, each row holds one key and one value, just like the text file above. Of course, the key and value columns can also appear in the reverse order.

2. Add a tContextLoad component.


Therefore, a common Talend workflow is:

tFileInputDelimited reads the TXT content
---> tContextLoad writes the global (context) variables (of course, you can also use custom static variables instead of global variables)
---> normal processing, with the context variables as reference values
---> extract from the database the keys and values to be written to the text file
---> tFileOutputDelimited writes the data to the TXT file, which ends up in the same key;value form as the TXT file above.

 

II. Date issues

Because Talend is built on Java, a date converted to a string looks like 200-3-12 11:11:11.0, with a trailing ".0".

The next step is to use substring to cut off that suffix in the where clause of a custom select statement, for example:

"select * from Table1 where riqi > to_date('" + context.mykey.substring(0, context.mykey.length() - 2) + "', 'yyyy-mm-dd hh24:mi:ss')"

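To see what the substring call does in isolation, here is a small plain-Java sketch. The sample date, the class DateFilterSketch, and the helper buildQuery are assumptions for illustration; only the substring expression and the query text come from the example above.

```java
import java.sql.Timestamp;

class DateFilterSketch {

    // Drops the last two characters (the ".0" suffix) and embeds the rest in the query.
    static String buildQuery(String mykey) {
        String trimmed = mykey.substring(0, mykey.length() - 2);
        return "select * from Table1 where riqi > to_date('" + trimmed
                + "', 'yyyy-mm-dd hh24:mi:ss')";
    }

    public static void main(String[] args) {
        // Timestamp.toString() always appends a fractional part, ".0" when there is none.
        String mykey = Timestamp.valueOf("2024-03-12 11:11:11").toString();
        System.out.println(mykey);
        System.out.println(buildQuery(mykey));
    }
}
```

Cutting exactly two characters is only correct when the fractional part is ".0"; a value carrying real milliseconds would need a different cut.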
 

III. Custom schema content

A key component in Talend is tMap, which normally takes all the schema fields from tOracleInput. If you want custom fields, such as select max(Shijian) from, click the Guess schema button. If the result happens to be a time value, you also need to set the column type to Date.

 

If tOracleOutput performs updates rather than inserts, you need to set a key for the updated table in tMap. If the table has only one column, add another column with a constant value and use it as the key.

 

IV. Exchange stamp

 

Some people use the local server's timestamp as the data exchange stamp: they record the time of the last exchange and, at the next exchange, select rows whose time is greater than the recorded time. The problem is that your server's clock cannot be guaranteed to match the database server's clock to the second. For example, if your server's current time is 9:00:00 and the database server's time is 8:59:58, this method records the last exchange time as 9:00:00, so the next exchange misses rows that the database stamps at 8:59:59.

Therefore, a better approach is to record, after each exchange, the max(time) of the data actually transferred, and at the next exchange to select rows later than that time. (In fact this method is also flawed: several records may share the last second of an exchange, and if only one of them had been transferred at that moment, the others in the same second are skipped. The possibility is small, but in theory the fix is to select rows greater than or equal to the recorded time rather than strictly greater. That has a cost: rows already transferred are selected again, so the output must update rather than insert, which is slower and requires a key to be set.)

One question I still need to research: when getting the maximum timestamp already transferred to the database, I do not know whether max(time) restricted by a time range or a plain max(time) is more efficient.

Of course, I still suggest using a numeric primary key as the exchange stamp, because it never repeats.
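The numeric-key variant can be sketched in plain Java as follows. This is an in-memory illustration under assumed names (ExchangeStampSketch, Row, nextBatch, newStamp are all hypothetical); in a real job the filter would be a where clause and the stamp would be stored between runs.

```java
import java.util.ArrayList;
import java.util.List;

class ExchangeStampSketch {

    // A source row with a unique, increasing numeric key (hypothetical shape).
    static final class Row {
        final long id;
        final String payload;

        Row(long id, String payload) {
            this.id = id;
            this.payload = payload;
        }
    }

    // Selects the rows not yet transferred: those whose id is above the stored stamp.
    static List<Row> nextBatch(List<Row> source, long lastId) {
        List<Row> batch = new ArrayList<>();
        for (Row row : source) {
            if (row.id > lastId) {
                batch.add(row);
            }
        }
        return batch;
    }

    // The new stamp is the largest id actually transferred, not a clock reading.
    static long newStamp(List<Row> batch, long lastId) {
        long max = lastId;
        for (Row row : batch) {
            max = Math.max(max, row.id);
        }
        return max;
    }

    public static void main(String[] args) {
        List<Row> source = new ArrayList<>();
        source.add(new Row(1, "a"));
        source.add(new Row(2, "b"));
        long stamp = newStamp(nextBatch(source, 0), 0);
        source.add(new Row(3, "c"));
        System.out.println(nextBatch(source, stamp).size() + " new row(s) after stamp " + stamp);
    }
}
```

Because the key is unique, a strict "greater than" comparison is enough here, which avoids the same-second problem described for timestamps.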

 

V. Local text file problems

Update on the content of this section below: as my understanding of Talend has grown, I have found that XML is not very convenient to use; TXT is simpler and more direct.

Originally I advocated using XML instead of TXT, and Talend also has a dedicated chapter on how to work with XML.

The XPath Recommendation lists a number of axes, which contain nodes related to the currently selected node (also known as the context node). To avoid verbosity, abbreviations are defined for some common axes. The following table shows these abbreviations and their equivalent axes.

Abbreviation    Axis
.               self::node()
..              parent::node()
//              /descendant-or-self::node()/
@               attribute::

Another fact is that the default axis for each location step in a path expression is the child:: axis. Therefore, /bk:books/bk:book is actually equivalent to /child::bk:books/child::bk:book, and the short form is much easier to type.

 

For example, I wrote an XML file:

<?xml version="1.0" encoding="UTF-8"?>
<Hello Si="XX">
  <Ren sex="male">along</Ren>
  <Ren sex="female">Yanzi</Ren>
</Hello>

In Talend's Loop XPath query, write: /Hello/Ren.

In the Mapping table, enter "." as the XPath query of the column that should receive the element's text.
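The relationship between the loop query /Hello/Ren and the per-column query "." can be tried outside Talend with the standard javax.xml.xpath API. This is a sketch, not what Talend generates: the class XPathLoopSketch and the method extract are hypothetical, and the "@sex" column query is an added illustration of the attribute abbreviation.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathLoopSketch {

    static final String XML =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
            + "<Hello Si=\"XX\">"
            + "<Ren sex=\"male\">along</Ren>"
            + "<Ren sex=\"female\">Yanzi</Ren>"
            + "</Hello>";

    // Evaluates loopPath once, then columnPath relative to each node it returns.
    static List<String> extract(String xml, String loopPath, String columnPath) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    loopPath, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                // The loop node is the context node for the column query.
                values.add(xpath.evaluate(columnPath, nodes.item(i)));
            }
            return values;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath or XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(extract(XML, "/Hello/Ren", "."));    // element text of each Ren
        System.out.println(extract(XML, "/Hello/Ren", "@sex")); // sex attribute of each Ren
    }
}
```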

 

Then there is the relative path problem. If you write just "ABC.XML" as the file path, then during testing the file is actually resolved against the Talend installation directory rather than a specific project. After you export the job, ABC.XML is not exported with it; you need to copy ABC.XML into the same folder as the batch file.

 

 

VI. Ternary expressions

Expressions in Talend's tMap are Java, so if you want to write Java conditional logic in an expression, my current method is to write a ternary expression:

row1.abc > context.XY ? row1.abc : context.XY
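The same expression shape can be checked in plain Java. In this sketch the class TernarySketch and the method larger are hypothetical, and both values are assumed to be int; the actual column and context types are not stated in the text.

```java
class TernarySketch {

    // Same shape as the tMap expression: keep whichever of the two values is larger.
    static int larger(int rowValue, int contextValue) {
        return rowValue > contextValue ? rowValue : contextValue;
    }

    public static void main(String[] args) {
        System.out.println(larger(7, 5)); // row value wins
        System.out.println(larger(3, 5)); // context value wins
    }
}
```

A ternary is an expression and yields a value, which is why it fits in a tMap expression field where an if statement does not.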

 

VII. Zero-padding a serial number

String.format("%05d", ++row2.id)

This first adds 1 to the ID passed in from table 2, then pads it with leading zeros if it has fewer than five digits.

If you want to record the maximum value in another table, you can work that out along the same lines.
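A quick plain-Java check of the formatting behaviour follows. The class SerialNumberSketch and the method nextSerial are hypothetical, and the id is assumed to be a primitive int.

```java
class SerialNumberSketch {

    // Increments the id, then left-pads it with zeros to at least five digits.
    // Note: this increments a local copy; in the tMap expression, ++row2.id changes the row field itself.
    static String nextSerial(int id) {
        return String.format("%05d", ++id);
    }

    public static void main(String[] args) {
        System.out.println(nextSerial(41));     // 00042
        System.out.println(nextSerial(123456)); // 123457, already longer than five digits
    }
}
```

The width 5 is a minimum, so longer numbers are printed in full rather than truncated.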

 

VIII. User-defined functions

I tried to use an anonymous function to define an expression for a variable (for example, String myfunction() {if .....;}), but it never worked. It is probably not supported.

In the Repository on the left side of Talend there is a "Code" node. Right-click "Routines" and select "Create routine" to define your own function.

Note that the blue comment above the function is also required, especially the Category entry, which determines whether your function is shown in the custom function list.

In addition, if you write public static String xiao = "hello"; you can then write along.xiao (along is the class name you entered when creating the function) in place of a context variable.
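As a rough sketch, such a routine class might look like the following. The method greet is hypothetical, and the tag layout in the comment follows my understanding of the template Talend generates for a new routine, so compare it with the generated file before relying on it. Talend also generates the package declaration and makes the class public; both are left out here to keep the sketch self-contained.

```java
// Sketch of a user routine; "along" is the class name used in the text above.
class along {

    // A static field can be referenced as along.xiao, for example in a tMap expression.
    public static String xiao = "hello";

    /**
     * greet: returns the shared greeting followed by a name.
     *
     * {talendTypes} String
     *
     * {Category} User Defined
     *
     * {param} string("world") name: the name to greet.
     *
     * {example} greet("world") # hello world
     */
    public static String greet(String name) {
        return xiao + " " + name;
    }
}
```

Unlike a context variable, a static field like this is compiled into the job, so changing it means rebuilding the job rather than editing a file or a table row.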

 

IX. Handling errors

If an error occurs while transferring data between databases, you need to record an error log and raise an alarm. Talend has a "Logs & Errors" component group. Personally, I have no use for tDie and tWarn at present; the important ones are tLogCatcher and tAssertCatcher. There is no big difference between them: both can catch Java exceptions, but the former captures a few more items. To read the tLogCatcher schema, you need to know:

moment: the time when the error occurred.

project: the project name.

job: the specific job in the project.

type: the error type. If you choose to catch only Java errors, this is always a Java exception.

origin: the component in which the error occurred, for example tOracleInput_1. This name may not appear in your job (because you renamed the component); it is the name Talend recorded for the component when it was first added. To find it, click the component and look at the "Code" view. This is important for locating where an error occurred.

message: the error content. This is the most important part.

code: the error code.

Finally, write the error content above to another database so that it can be recorded and used for alarms.

 

X. Other considerations

1. In Java, strings are generally compared not with == but with equals, for example str.equals("dfdf").

2. For a parameter that may be null, write the method as follows:

public static void AA(String B) {
    if (B == null) { B = "anonymous"; }
    // ... rest of the method
}
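Both points can be seen in a short plain-Java sketch. The class NullSafeSketch and its methods are hypothetical names chosen for illustration.

```java
class NullSafeSketch {

    // Replaces a null argument with a default before using it.
    static String nameOrDefault(String name) {
        if (name == null) {
            name = "anonymous";
        }
        return name;
    }

    // Calling equals on the literal avoids a NullPointerException when value is null.
    static boolean isDfdf(String value) {
        return "dfdf".equals(value);
    }

    public static void main(String[] args) {
        String built = new String("dfdf");   // a distinct object with the same characters
        System.out.println(built == "dfdf"); // false: == compares references
        System.out.println(isDfdf(built));   // true: equals compares contents
        System.out.println(nameOrDefault(null));
    }
}
```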
