CSV, short for comma-separated values, is a comma-delimited data file format. It is a staple of data exchange and is commonly used for data integration.
I recently watched a project team argue over the CSV specification for an interface file, and I was really worried for them. The points under discussion:
- Whether the file has a header row: one party insists on having it, and the other insists that it must not be there.
- Line delimiter: one party insists on the Unix-style 0x0A character, while the other insists on the Windows/DOS-style 0x0D 0x0A (that is, \r\n), a carriage return plus a line feed, two characters.
- Column delimiter: one party insists on the invisible character 0x05, to prevent conflicts with the content strings, and the other insists on 0x1B (the Esc character).
- How to handle a newline inside a string: there was no agreed opinion.
Not googling first is really scary. This is very simple: first see whether there is a standard, and if there is, follow it strictly. If there is no standard, see whether there is a common practice (a de facto standard). Google the keyword "CSV", and the first result is Wikipedia's explanation:
An official standard for the CSV file format does not exist, but RFC 4180 provides a de facto standard for many aspects of it.
Jiger: CSV does not have a formal standard, but the Internet Engineering Task Force (IETF) describes the structure of CSV files in RFC 4180, an informational document that serves as the recommended reference.
The following is a common configuration:
- MS-DOS-style lines that end with (CR/LF) characters (optional for the last line).
Jiger: {Use a carriage return plus line feed (two characters) as the line delimiter; the last row of data may omit these two characters.}
- An optional header record (there is no sure way to detect whether it is present, so care is required).
Jiger: {Whether a header row is required must be agreed explicitly by both parties.}
- Each record "should" contain the same number of comma-separated fields.
Jiger: {Every record should have the same number of fields, separated by commas. The comma is the default delimiter, and the parties can agree on something else.}
- Any field may be quoted (with double quotes).
Jiger: {The value of any field may be enclosed in double quotation marks. For simplicity, you can require that every field be double-quoted.}
- Fields containing a line break, double quote, and/or comma should be quoted. (If they are not, the file will likely be impossible to process correctly.)
Jiger: {If a field value contains line breaks, double quotes, or commas, it must be enclosed in double quotation marks. This is mandatory.}
- A (double) quote character in a field must be represented by two (double) quote characters.
Jiger: {If a value contains a double quote, use a pair of double quotes to represent the original double quote.}
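The rules listed above can be tried out with Python's standard csv module. This is only a minimal sketch: the sample rows and the helper names to_csv and from_csv are made up for illustration, and the writer settings are one reasonable way to match the conventions (comma delimiter, CR/LF line ends, quoting only where needed, doubled quotes inside quoted fields).

```python
import csv
import io

# Hypothetical sample data: a header record plus fields that need quoting.
rows = [
    ["id", "comment"],
    ["1", "plain text"],
    ["2", "contains, a comma"],
    ["3", 'says "hello"'],
    ["4", "first line\nsecond line"],
]


def to_csv(records):
    """Serialize records using the conventions listed above."""
    buf = io.StringIO(newline="")  # do not translate line endings
    writer = csv.writer(
        buf,
        delimiter=",",
        quotechar='"',
        quoting=csv.QUOTE_MINIMAL,  # quote only fields that need it
        doublequote=True,           # a quote inside a field becomes ""
        lineterminator="\r\n",      # CR/LF after every record
    )
    writer.writerows(records)
    return buf.getvalue()


def from_csv(text):
    """Parse text written with the same conventions back into lists."""
    stream = io.StringIO(text, newline="")
    return list(csv.reader(stream, delimiter=",", quotechar='"', doublequote=True))


text = to_csv(rows)
print(repr(text))
print(from_csv(text) == rows)
```

The field with an embedded line break survives the round trip because it is quoted, which settles the "newline in the string" argument without inventing a special delimiter.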
If you adopt the recommendations above, you can cut down the time spent debating the format.
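One way to make such an agreement concrete is to write it down once as code that both sides share. The sketch below assumes Python's csv module; the class name AgreedDialect and the function read_records are hypothetical, and the has_header flag stands for whatever the two parties decided about the header row.

```python
import csv
import io


class AgreedDialect(csv.Dialect):
    """Hypothetical interface agreement based on the common conventions."""
    delimiter = ","            # default; the parties may agree on another
    quotechar = '"'
    doublequote = True         # "" inside a quoted field means one quote
    skipinitialspace = False
    lineterminator = "\r\n"    # used when writing
    quoting = csv.QUOTE_MINIMAL


def read_records(text, has_header):
    """Parse text and check that every record has the same field count.

    has_header is decided by agreement, because the file itself cannot
    reliably tell us whether its first record is a header.
    """
    stream = io.StringIO(text, newline="")
    rows = list(csv.reader(stream, dialect=AgreedDialect))
    if not rows:
        return None, []
    width = len(rows[0])
    for number, row in enumerate(rows, start=1):
        if len(row) != width:
            raise ValueError(
                f"record {number} has {len(row)} fields, expected {width}"
            )
    if has_header:
        return rows[0], rows[1:]
    return None, rows


header, data = read_records('name,city\r\nAnn,"Oslo, NO"\r\n', has_header=True)
print(header, data)
```

If the parties really do want a different column delimiter, only the delimiter attribute changes; the quoting rules stay the same.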
Specifications for CSV files