Specifications for CSV files

Source: Internet
Author: User
Tags: rfc

CSV (comma-separated values) is a plain-text data format in which fields are separated by commas. It is a standard tool for data exchange and is widely used in data integration.

I recently watched a project team discuss the CSV specification for an interface file, and I was genuinely worried for them. The points under debate were:

    1. Whether the file has a header row: one party insists on having one, and the other insists that it must not.
    2. Line delimiter: one party insists on the Unix-style 0x0A character (\n, line feed), and the other insists on the Windows/DOS-style two-character sequence 0x0D 0x0A (\r\n, carriage return plus line feed).
    3. Column delimiter: one party insists on the invisible character 0x05 to avoid conflicts with field content, and the other insists on 0x1B (the Esc character).
    4. How to handle a newline inside a string value: no agreement was reached.
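Point 4 is where ad hoc formats tend to break. A minimal Python sketch, using made-up sample data and the standard library's csv module, shows how splitting on line breaks miscounts records when a quoted field contains a newline:

```python
import csv
import io

# Hypothetical sample: one header record and one data record whose
# second field contains an embedded line break inside double quotes.
raw = 'id,comment\r\n1,"line one\nline two"\r\n'

# Naive approach: split on line breaks, then on commas.
# The embedded newline is mistaken for a record boundary.
naive_rows = [line.split(",") for line in raw.splitlines()]

# CSV-aware approach: the parser keeps the quoted newline inside the field.
parsed_rows = list(csv.reader(io.StringIO(raw, newline="")))

print(len(naive_rows))   # 3 "records" instead of 2
print(parsed_rows)       # [['id', 'comment'], ['1', 'line one\nline two']]
```

The naive split also leaves stray quote characters in the data, so a quoting rule needs a parser that understands it.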

Not searching first is what is really scary. The matter is simple: first check whether a standard exists, and if one does, follow it strictly. If there is no standard, check whether there is a common practice (a de facto standard). Search Google for the keyword "CSV", and the first result is Wikipedia's explanation.

An official standard for the CSV file format does not exist, but RFC 4180 provides a de facto standard for many aspects of it.
Jiger: CSV has no formal standard, but the Internet Engineering Task Force (IETF) describes the structure of CSV files in RFC 4180, an informational document that serves as the recommended reference.

The following is a common configuration:

  • MS-DOS-style lines that end with (CR/LF) characters (optional for the last line).
    Jiger: {Use carriage return plus line feed (two characters) as the line delimiter; the last row of data may omit them.}
  • An optional header record (there is no sure way to detect whether it is present, so care is required when importing).
    Jiger: {The header row is optional, so both parties must explicitly agree on whether it is present.}
  • Each record "should" contain the same number of comma-separated fields.
    Jiger: {Every record must have the same number of fields, separated by commas. The comma is the default delimiter; the parties may agree on a different one.}
  • Any field may be quoted (with double quotes).
    Jiger: {The value of any field may be enclosed in double quotes. For simplicity, you can require that every field be double-quoted.}
  • Fields containing a line break, double quote, and/or commas should be quoted. (If they are not, the file will likely be impossible to process correctly.)
    Jiger: {If a field value contains a line break, double quote, or comma, it must be enclosed in double quotes. This is mandatory.}
  • A (double) quote character in a field must be represented by two (double) quote characters.
    Jiger: {A double quote inside a value is written as two consecutive double quotes.}
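The rules above can be sketched with Python's standard csv module. The sample rows and column names are invented for illustration, and the writer settings (comma delimiter, double-quote quoting, CR/LF line terminator) are one way to follow the configuration listed above, not the only one.

```python
import csv
import io

# Hypothetical sample data: a header record plus two data records that
# exercise the comma, double-quote, and line-break rules.
rows = [
    ["id", "name", "note"],
    ["1", "Smith, John", 'said "hi"'],
    ["2", "Lee", "first line\nsecond line"],
]

# Write with comma delimiter, CR/LF record terminator, and quoting only
# where a field contains a delimiter, quote, or line-break character.
buffer = io.StringIO(newline="")
writer = csv.writer(
    buffer,
    delimiter=",",
    quotechar='"',
    quoting=csv.QUOTE_MINIMAL,
    lineterminator="\r\n",
)
writer.writerows(rows)
text = buffer.getvalue()

# text now holds three records:
#   id,name,note\r\n
#   1,"Smith, John","said ""hi"""\r\n
#   2,Lee,"first line\nsecond line"\r\n

# Reading the text back recovers the original values unchanged.
round_trip = list(csv.reader(io.StringIO(text, newline="")))
print(round_trip == rows)  # True
```

Switching quoting to csv.QUOTE_ALL would quote every field, which matches the "for simplicity, quote everything" option mentioned above.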

If you adopt the recommendations above, you can cut down the time spent debating the design.
