Use shell to generate derivative script instances for TXT-based data exchange

Source: Internet
Author: User

For example, most insurance industry associations and insurance companies in different regions communicate with each other through text,

This method is very flexible, and it is very easy to post-process with a pack! However, if I give you an MDB or xls, it's not that easy to handle,

These data may sometimes be millions of records. Can you imagine what XLS will do? Besides, you need to import the data into your database,

Then you have to develop anotherProgramIt's not a good deal!

Maybe you want to avoid text development? Maybe, when we get used to the win platform, we immediately think of UE,

The invincible column editing mode (Alt + C), but does it mean that the field length is not exactly the same?

In fact, with the help of powerful shell, we don't need to develop anything at all. We just write some simple scripts, and we can also get home from scratch.

The following is a real case. For data security, I have slightly processed the original data!

Let's take a look at the basic structure of data (downloads) from industry associations.

 
$ Tail tp_claim.txt
Xxxx02 |, | 7802410222009000261 |, | 80xxxx020000000000115156175n |, | 6802410222009000039a |, | 23xxxx029000000100915307e |, | 84xx020000000000426295906e
Xxxx02 |, | 7802410222009000263 |, | 80xxxx020000000000115170337n |, | 6802410222009000126 |, | 23xxxx02900000000000000100918119e |, |

From the data point of view, there are a total of 6 fields with the separator "|, |", and the sixth field may be blank

Here we will briefly introduce the target. We will use the second column as the primary key to update the two fields in a table, which are the third column and the sixth example.

Obviously, it is most appropriate to use a powerful awk in this case. Let's take a look at the script first.

 

Awk script

  #  ! /Bin/awk-F  
Begin {
# FS = "|, | ";
}

{
Printf " Update t_upd_bj_buss_claim set claim_code = '% s', c_state = '1' " , $ 2 ;
Printf " Where c_comm_typ = '0' and c_rpt_no = '% s'; \ n " , $ 3 ;

If ( Length ($ 6 ) ! = 0 )
{
Printf " Update t_upd_bj_buss_claim set c_clm_cnfrm_cde = '% s', c_state = '1' " , $ 6 ;
Printf " Where c_comm_typ = '2' and c_rpt_no = '% s'; \ n " , $ 3 ;
}
Print "" ;
}
End {
# Print "exit ";

This script is very simple and is discussing how to output SQL statements,

The three functions print, printf, and length of awk are used. You don't need to parse them much. It's a bit like C.

Note that printf does not output line breaks, '% s' in strings, and formatted characters.

In the third row, if the last column is not empty, update one more field.

Then, let's execute it. We can first modify the script to be an executable file, then print the text, and use the pipeline technology to connect the field separator '|,

And replace the delimiter '|, |' with the tr command and use

We processed the script just now and output the processing result set as tp_claim. SQL

 

  $ Chmod     +  X tp_claim .  Awk

$ Cat Tp_claim . Txt | Tr ' |, | ' ' ' | ./ Tp_claim . Awk > Tp_claim . SQL

 

The rest is that we use sqlplus to execute this script.

 
SQL> @ tp_claim. SQL

In fact, we need to write a script above, but in fact, if the requirement is simpler, we only need one command.

 
Sed-E's/|, | // G' tp_claim.txt | awk '{print "Update t_upd_bj_claim set claim_code = \ 047 $3 \ 047 where c_rpt_no = \ 047 $2 \ 047 ;"}'

 

To explain, we have used SED, which is equally powerful, to replace. S stands for replacement. |, | is replaced with a space,

We often use this in VI, for example:

: % S/|, | // G

In the above text, we use the tr command

And connect them with pipelines. \ 047 in the awk command represents single quotes, although it is a bit awkward,

It is a little troublesome to output single quotes. I don't know if anyone has a better solution.

Awk 'in in {print "\ 047 "}'

Awk "begin {print \"'\"}"

Summary:

Shell's text capabilities are incredibly powerful, and can even join,

If you are considering the format of data exchange, consider the TXT file. This is king.

Use xmlless. The most important thing is to use a binary format (such as. xls and. mdb,

Of course, most people may writeCodeAfter all, it is the choice of most people, so we should install cygwin,

Play a shell and give you a new experience

 

Technorati tags: shell, Linux, derivative, awk, data exchange, cygwin
Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.