Use the BizTalk flat file disassembler to parse CSV files

Last Update: 2018-12-07 Source: Internet

Author: User

Tags: biztalk


CSV files, as a simple data file exchange standard, are widely used in commercial activities, especially B2B.

In BizTalk, you can use the flat file disassembler to parse files of this type. The following example uses BizTalk Server 2006 R2 as the development environment.

The CSV sample file is as follows:

Itemnumber,description,issuedate,issueuser,status
00-011-003,Microsoft office2007,2008-9-27,uid01,
00-022-234,Vista flagship edition,-1,uid02,d
00-101-001,"Xbox 360,""Elite"" edition",2006-7-5,uid03,

The procedure is as follows:

Step 1: When adding a new item to the project, select Flat File Schema Wizard to start the wizard.

Step 2: Specify the sample file so that the wizard can analyze the structure of the flat file. Enter the root node name and the namespace of the flat file schema to be generated, and select the correct code page so that the flat file is decoded correctly.
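
The effect of the code page can be seen outside BizTalk as well. The following Python sketch is not BizTalk code; the sample text and the choice of code pages 936 and 1252 are assumptions made only for illustration. It encodes one field with code page 936 and then decodes the bytes with the matching and with a non-matching code page:

```python
# Illustration only: "Vista " followed by three Chinese characters,
# written with escapes so the snippet itself stays ASCII.
original = "Vista \u65d7\u8230\u7248"

# Bytes as they would be stored in a file saved with code page 936 (GBK).
raw = original.encode("cp936")

# Decoding with the same code page recovers the text.
decoded_ok = raw.decode("cp936")

# Decoding with a different code page (Windows-1252 here) garbles the
# non-ASCII part. errors="replace" avoids an exception on unmapped bytes.
decoded_wrong = raw.decode("cp1252", errors="replace")

print(decoded_ok == original)     # True
print(decoded_wrong == original)  # False
```

The ASCII part of the field survives either way, which is why a wrong code page often goes unnoticed until a field with non-ASCII text arrives.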

Step 3: Select the first two lines of data for structure analysis. Note that the selection must include the last two characters of each line, the carriage return and line feed, because they are needed to analyze the row structure.

The wizard analyzes the file at two levels: first the rows, then the fields. Field analysis is covered later.

Step 4: Choose whether to split the rows by delimiter or by relative position. Here, select the delimiter mode.
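
The difference between the two modes can be sketched in a few lines of Python. This is only an illustration of the idea, not how BizTalk implements it; the positional row and its field widths are made up for the example:

```python
# Delimited record: fields are located by searching for a separator.
delimited_row = "00-011-003,uid01,2008-9-27"
by_delimiter = delimited_row.split(",")

# Positional record: fields are located by fixed character offsets.
# The widths below are hypothetical.
positional_row = "00-011-003uid01 2008-09-27"
widths = [10, 6, 10]

by_position = []
offset = 0
for width in widths:
    # Cut out the next fixed-width slice and drop the padding spaces.
    by_position.append(positional_row[offset:offset + width].strip())
    offset += width

print(by_delimiter)
print(by_position)
```

A CSV file has variable-length fields, so only the delimiter mode fits it.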

Step 5: Select the delimiter that separates the rows. Because the first two lines of text were selected, the wizard treats the whole selection as a single string in which each line is a child (or cell) separated by a delimiter. To split that string into its two lines of text, use the standard Windows line break {CR}{LF}, that is, a carriage return followed by a line feed.

Step 6: Set the element type of each child produced by the split. The file header occurs only once, so it uses the Record type, while the data rows use the Repeating record type to indicate records that can repeat.
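
Steps 5 and 6 amount to splitting the selected text on {CR}{LF} and treating the first row differently from the rest. A minimal Python sketch of that idea, using two lines of the sample file written without spaces after the commas (the variable names are illustrative and not part of BizTalk):

```python
sample = (
    "Itemnumber,description,issuedate,issueuser,status\r\n"
    "00-011-003,Microsoft office2007,2008-9-27,uid01,\r\n"
)

# Split on the row delimiter {CR}{LF}. The delimiter after the last row
# leaves an empty trailing item, which is filtered out.
rows = [row for row in sample.split("\r\n") if row]

header = rows[0]    # occurs once: plays the role of the Record type
records = rows[1:]  # may repeat: plays the role of the Repeating record type

print(header)
print(records)
```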

Step 7: Two child nodes, header and record, are now defined under the root node. Select header to set up the fields of the file header.

Step 8: In the header field settings, select the text of the first line of the sample. Do not include the carriage return and line feed characters in the selection this time.

Step 9: Choose to split the fields by delimiter.

Step 10: Here, the children are the fields of the first line. According to the CSV standard, fields are separated by commas (,), so select the comma as the delimiter.

For structured flat files that do not follow the CSV standard, you can split fields on other symbols, such as tabs.

Step 11: Using the sample content as a guide, set the name of each child of the header node in the flat file schema to be generated, together with its element type (element or attribute) and its XML data type.

Step 12: Set up the record node in the same way.

Step 13: Finish the wizard to generate the flat file schema.

Step 14: This step is critical. According to the CSV standard, the double quotation mark is a special character. Text enclosed in double quotation marks, including any comma separators it contains, is treated as field content, and a double quotation mark inside such text must be escaped by doubling it.

For example, in the sample file, the second field of the fourth row is "Xbox 360,""Elite"" edition", which should be parsed as Xbox 360,"Elite" edition.

In the generated flat file schema, select the Schema node and set its Default Wrap Character property to the double quotation mark in the Properties window.

This makes the double quotation mark the default wrap character for the entire schema.

However, the setting takes effect for a field only after you also set the Wrap Character Type property of that field to use the default wrap character.

(Setting this on every field is tedious, but the flat file schema is not designed only for CSV files that need a wrap character.)
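
The quoting rule described in this step can be checked with Python's standard csv module, where quotechar plays a role comparable to the wrap character. This is a sketch of the CSV rule itself, not of BizTalk's parser; the row is the fourth line of the sample file written without spaces after the commas:

```python
import csv
import io

line = '00-101-001,"Xbox 360,""Elite"" edition",2006-7-5,uid03,\r\n'

# A plain split ignores the quotes and cuts the description in two.
naive = line.strip().split(",")

# quotechar is the wrap character; doublequote=True means that two
# consecutive quote characters inside a quoted field stand for one quote.
reader = csv.reader(
    io.StringIO(line, newline=""),
    delimiter=",",
    quotechar='"',
    doublequote=True,
)
fields = next(reader)

print(len(naive))   # 6
print(len(fields))  # 5
print(fields[1])    # Xbox 360,"Elite" edition
```

The naive split yields six pieces for a five-column row, which is the kind of error that appears when the wrap character is not configured.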

You can test whether the generated XML file is correct by setting the Generate Instance Output Type property of the schema file to "Native".
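
When checking the result, it helps to know roughly what shape of XML to expect. The Python sketch below builds a comparable document from two data rows of the sample. It is only an approximation under stated assumptions: the root name Items is hypothetical, header and record follow the node names used in the steps above, and the real output depends on the names, namespace, and element types chosen in the wizard.

```python
import csv
import io
import xml.etree.ElementTree as ET

flat = (
    "Itemnumber,description,issuedate,issueuser,status\r\n"
    "00-011-003,Microsoft office2007,2008-9-27,uid01,\r\n"
    '00-101-001,"Xbox 360,""Elite"" edition",2006-7-5,uid03,\r\n'
)

# Parse the flat text: the first row is the header, the rest are records.
rows = list(csv.reader(io.StringIO(flat, newline="")))
names, data_rows = rows[0], rows[1:]

# Hypothetical root name; header and record mirror the schema nodes.
root = ET.Element("Items")
header = ET.SubElement(root, "header")
for name in names:
    ET.SubElement(header, name).text = name

# One record element per data row, one child element per field.
for row in data_rows:
    record = ET.SubElement(root, "record")
    for name, value in zip(names, row):
        ET.SubElement(record, name).text = value

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```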

Step 15: Create a custom pipeline and drag the Flat file disassembler component from the toolbox to the Disassemble stage. Then set the Document schema property of the Flat file disassembler component to the CSV schema you created.

You have now created a custom pipeline that can parse the CSV flat file. After deploying it to a BizTalk application, you only need to select this pipeline on the receive location to use it.

