CSV is a simple data exchange format that is widely used in business, especially in B2B scenarios.
In BizTalk, you can use the Flat File Disassembler to parse files of this type. The following example uses BizTalk Server 2006 R2 as the development environment.
The CSV sample file is as follows:
Itemnumber, description, issuedate, issueuser, status
00-011-003, Microsoft office2007, 2008-9-27, uid01,
00-022-234, Vista flagship edition,-1, uid02, d
00-101-001, "Xbox 360,""Elite"" edition", 2006-7-5, uid03,
The procedure is as follows:
Step 1: In the project's Add New Item dialog, select Flat File Schema Wizard to start the wizard.
Step 2: Specify the sample file that the wizard will use to analyze the structure of the flat file. Enter the root node name and the namespace of the flat file schema to be generated,
and select the correct code page so that the flat file is decoded correctly.
Step 3: Select the first two lines of data for structural analysis. Note that the selection must include the two characters at the end of each line, the carriage return and the line feed, because they are needed to analyze the row structure.
The wizard analyzes the file at two levels: rows and fields. Field analysis is covered later.
Step 4: Choose whether to split the rows by delimiter symbol or by relative positions. Here, select the delimiter option.
Step 5: Select the delimiter that separates the rows. Because the first two lines of text were selected, the wizard treats the whole selection as a single string,
and each line is a child element (or cell) separated by the delimiter. To split this string into two lines of text, use the standard Windows line break {CR}{LF}, that is, a carriage return followed by a line feed.
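The row-level split in Step 5 can be pictured with a few lines of Python. This is only a sketch of the idea, not how the wizard or the disassembler is implemented; the function name split_rows and the two-row selection are assumptions made for illustration.

```python
def split_rows(text, delimiter="\r\n"):
    """Split text into rows, treating the delimiter as a row terminator."""
    rows = text.split(delimiter)
    # Every row ends with the delimiter, so the last piece is empty: drop it.
    if rows and rows[-1] == "":
        rows.pop()
    return rows


# Hypothetical selection: the header row plus the first data row,
# each including its trailing CR LF, as required in Step 3.
selection = (
    "Itemnumber,description,issuedate,issueuser,status\r\n"
    "00-011-003,Microsoft office2007,2008-9-27,uid01,\r\n"
)

for row in split_rows(selection):
    print(repr(row))
```

If the CR LF characters were left out of the selection, there would be nothing to split on and the whole selection would come back as a single row.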
Step 6: Set the element type of each child produced by the split. The file header occurs only once, so it uses the Record type, while the content rows use the Repeating record type to indicate that they can repeat.
Step 7: Two child nodes, Header and Record, are now defined under the root node. Select Header to define the fields of the file header.
Step 8: For the header fields, select the text of the first line of the sample extracted above. Do not include the carriage return and line feed characters this time.
Step 9: Choose to split the fields by delimiter symbol.
Step 10: Here, the child elements are the fields of the first line, which the CSV standard requires to be separated by commas (,).
For structured flat files that do not follow the CSV standard, you can split fields on other symbols, such as tabs.
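As a rough illustration of Step 10, the field-level split differs from the row-level split only in the delimiter. The sketch below is an assumption-laden example: split_fields is a hypothetical helper, the tab-delimited header is invented for the demonstration, and trimming the spaces around each field is a choice made here because the sample header has a space after each comma.

```python
def split_fields(row, delimiter=","):
    """Split one row into fields on the given delimiter (no quoting support)."""
    # Stripping surrounding whitespace is an assumption of this sketch.
    return [field.strip() for field in row.split(delimiter)]


# Header line of the sample file, comma-delimited.
header = "Itemnumber, description, issuedate, issueuser, status"
print(split_fields(header))

# A hypothetical tab-delimited variant of the same header.
tab_header = "Itemnumber\tdescription\tissuedate\tissueuser\tstatus"
print(split_fields(tab_header, delimiter="\t"))
```

A plain split like this is enough for the header row, but it does not handle fields wrapped in double quotation marks.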
Step 11: Using the sample content as a guide, set the names of the Header node's fields in the flat file schema to be generated, together with each field's element type (element or attribute) and XML data type.
Step 12: Define the Record node in the same way.
Step 13: Finish the wizard to generate the flat file schema.
Step 14: This step is critical. According to the CSV standard, double quotation marks are special characters. Text enclosed in double quotation marks, including any comma separators, is treated as text content, and a double quotation mark inside such text must be escaped by doubling it.
For example, in the sample document, the second cell of the fourth row is: "Xbox 360,""Elite"" edition"
It should be parsed as: Xbox 360,"Elite" edition
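The quoting rule can be checked outside BizTalk. The sketch below uses Python's standard csv module as a stand-in, since its default dialect applies the same wrap-and-double convention; it says nothing about how the Flat File Disassembler works internally. The helper name parse_csv_line is an assumption, and the line is written without spaces after the commas.

```python
import csv
import io


def parse_csv_line(line):
    """Parse one CSV line, honoring double-quote wrapping and "" escapes."""
    return next(csv.reader(io.StringIO(line)))


line = '00-101-001,"Xbox 360,""Elite"" edition",2006-7-5,uid03,'

# A naive split cuts the quoted cell in two at its embedded comma.
naive = line.split(",")
print(len(naive))  # 6

# A quote-aware parse keeps the cell whole and unescapes the doubled quotes.
fields = parse_csv_line(line)
print(len(fields))  # 5
print(fields[1])  # Xbox 360,"Elite" edition
```

The difference between the two results is what the wrap character settings in the schema are meant to take care of.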
In the generated flat file schema, select the Schema node and set its Default Wrap Character property to the double quotation mark in the Properties window.
This makes the double quotation mark the default wrap character for the entire schema.
However, you also need to set the Wrap Character Type property of each field so that it uses the default wrap character; only then is it applied to each cell.
(This setting is tedious, but after all, flat file schemas are not designed only for CSV files that need a wrap character.)
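You can test the schema in Visual Studio: set the schema file's Input Instance Filename property to the sample CSV file and its Validate Instance Input Type property to "Native", then run Validate Instance and check that the XML it produces is correct. (Setting Generate Instance Output Type to "Native" works in the other direction: Generate Instance then writes a sample flat file from the schema.)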
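When checking the result, it helps to know roughly what XML to expect: one Header element followed by one Record element per data row, each with one child per field. The Python sketch below builds that shape from a shortened sample. It is an approximation only: the root name Items and the field names are assumptions, the namespace is omitted, and the real output depends on the names and namespace chosen in the wizard.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Shortened sample without spaces after the commas (an assumption of this sketch).
SAMPLE = (
    "ItemNumber,Description,IssueDate,IssueUser,Status\r\n"
    "00-011-003,Microsoft Office2007,2008-9-27,uid01,\r\n"
    '00-101-001,"Xbox 360,""Elite"" edition",2006-7-5,uid03,\r\n'
)


def csv_to_xml(text, root_name="Items"):
    """Build a Header element plus one Record element per data row."""
    rows = list(csv.reader(io.StringIO(text, newline="")))
    names, data = rows[0], rows[1:]
    root = ET.Element(root_name)  # hypothetical root node name
    header = ET.SubElement(root, "Header")
    for name in names:
        ET.SubElement(header, name).text = name
    for row in data:
        record = ET.SubElement(root, "Record")
        for name, value in zip(names, row):
            ET.SubElement(record, name).text = value
    return root


root = csv_to_xml(SAMPLE)
print(ET.tostring(root, encoding="unicode"))
```

In the second Record, the Description element should contain the comma and the inner quotation marks as ordinary text, which is a quick way to confirm that the wrap character settings took effect.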
Step 15: Create a custom pipeline and drag the Flat File Disassembler component from the toolbox onto the Disassemble stage,
then set the Document schema property of the Flat File Disassembler component to the CSV schema you created.
You now have a custom pipeline that can parse the CSV flat file.
After deploying it to a BizTalk application, you only need to select this pipeline on the receive location to use it.