Reading and writing CSV files (especially fields with double quotes and commas)

I. A CSV file can be read and written the same way as a TXT file

/*=================== Read file =======================*/
InputStreamReader read = new InputStreamReader(new FileInputStream(file), encoding); // take the encoding into account
BufferedReader bufferedReader = new BufferedReader(read);
String lineTxt;
while ((lineTxt = bufferedReader.readLine()) != null) {
    if (!lineTxt.trim().equals("")) {
        String[] temp = lineTxt.split("\t", -1); // -1 keeps trailing empty fields
        txtList.add(temp);
    }
}
read.close();

/*================== Write file ========================*/
File txtFile = new File(filePath);
// the second FileOutputStream argument (true) opens the file in append mode
BufferedWriter bw = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(txtFile, true), encoding));
if (content != null && content.size() > 0) {
    for (int i = 0; i < content.size(); i++) {
        String[] element = content.get(i); // the fields of one line
        if (element != null && element.length > 0) {
            for (int j = 0; j < element.length; j++) {
                String temp = element[j];
                if (j == element.length - 1) {
                    bw.write(temp + "\n"); // last field ends the line
                } else {
                    bw.write(temp + separator);
                }
            }
        }
    }
}
bw.close();
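
The two fragments above rely on variables declared elsewhere. As a minimal self-contained sketch of the same idea, assuming Java 8 or later, the version below uses try-with-resources so the streams are closed even when an exception is thrown. The class name `DelimitedTextIo` and its method names are hypothetical and do not come from the original code.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
class DelimitedTextIo {

    // Reads every non-blank line and splits it on the separator.
    // The -1 limit keeps trailing empty fields.
    static List<String[]> readRows(Path file, Charset encoding, String separator) throws IOException {
        List<String[]> rows = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(file, encoding)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.trim().isEmpty()) {
                    // Pattern.quote makes separators such as "|" safe to use in split().
                    rows.add(line.split(Pattern.quote(separator), -1));
                }
            }
        }
        return rows;
    }

    // Appends the rows to the file (creating it if needed), one line per row.
    static void appendRows(Path file, Charset encoding, String separator, List<String[]> rows)
            throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(file, encoding,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
            for (String[] row : rows) {
                if (row == null || row.length == 0) {
                    continue; // skip empty rows, as the original fragment does
                }
                writer.write(String.join(separator, row));
                writer.write("\n");
            }
        }
    }
}
```

Passing the `Charset` explicitly on both sides is what keeps non-ASCII text such as Chinese intact. Like the fragments above, this sketch does no quoting, so it is only safe when the fields never contain the separator.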

Java can also write CSV through the javacsv and opencsv libraries, but note that javacsv cannot append to a file although it handles Chinese correctly, while opencsv can append but garbles Chinese. Both can write fields that contain commas.

II. When writing a TXT file in CSV form, commas and double quotes in the content need special handling (see the referenced article):

CSV stands for Comma-Separated Values or Character-Separated Values.

Although the first reading is more common, I think the second is more accurate, because other characters can also be used as delimiters.

A comma-delimited CSV file has one benefit: you can open it directly with Excel. If you use a different delimiter, such as "|", you must tell Excel that the file uses "|" as the delimiter before Excel can import it.

A few days ago, the project manager told me: we have many CSV files that all use "|" as the delimiter, which makes them troublesome to open in Excel, and the users want to open them in Excel directly. That requires using "," as the delimiter. However, some fields in the files contain commas, so they must be handled specially to make sure the CSV file opens correctly in Excel. He gave me an algorithm:

If a field contains a comma (,), enclose the field in double quotes (");

if a field contains double quotes, replace each double quote with two double quotes, and then enclose the field in double quotes.

I felt rather embarrassed after hearing the project manager's requirements: after more than 10 years in the IT industry, I had not known how to deal with commas in CSV.
A few examples:

Field before processing    Field after processing
abc,d2                     "abc,d2"
ab"c,d2                    "ab""c,d2"
"abc                       """abc"
""                         """"""

If a field has two double quotes next to each other, for example aaa""ccc, no special handling is needed. I only thought of this case while writing this post and quickly tested it in Excel. Fortunately the file opens normally without special handling; otherwise I would have had to change the program. Of course, if the field is processed into "aaa""""ccc", Excel also opens it normally. It pays to summarize afterwards.

In addition, if the first character of a field is a space and the second is a double quote, no special handling is needed either. When the project manager described the algorithm to me, he also gave an example of a field that consisted of a single double quote, but he accidentally put a space in front of the double quote, which hid the problem.

Therefore, when writing a CSV file, you need to check each field for commas and double quotes and process it accordingly:

boolean quoteFlag = false; // marks whether the field has already been wrapped in double quotes
// 20161214: if the field contains a comma or double quotes, wrap the string in double quotes
String value = element[j];
if (value.contains("\"")) { // double quotes found: replace each one with two, then wrap the field in double quotes
    value = value.replaceAll("\"", "\"\"");
    value = "\"" + value + "\"";
    quoteFlag = true;
}
if (value.contains(",") && !quoteFlag) { // comma found and not yet wrapped: wrap the field in double quotes
    value = "\"" + value + "\"";
}

if (j == fileHeader.size() - 1) {
    bw.write(value + "\n");
} else {
    bw.write(value + separator);
}
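
The fragment above depends on its surrounding loop. As a minimal sketch, the same two rules can be put into a standalone method and checked against the example fields listed earlier. The class name `CsvFieldEscaper` and the method name `escape` are hypothetical. The sketch covers only commas and double quotes, as in the text; fields with embedded line breaks are not handled.

```java
// Hypothetical helper; the class and method names are illustrative only.
class CsvFieldEscaper {

    // Doubles every double quote, then wraps the field in double quotes
    // if it contained a comma or a double quote. Other fields are returned unchanged.
    static String escape(String field) {
        boolean hasQuote = field.contains("\"");
        boolean hasComma = field.contains(",");
        if (!hasQuote && !hasComma) {
            return field;
        }
        // String.replace works on literal text, so no regex escaping is needed.
        String doubled = field.replace("\"", "\"\"");
        return "\"" + doubled + "\"";
    }

    public static void main(String[] args) {
        // The four example fields from the text.
        String[] samples = {"abc,d2", "ab\"c,d2", "\"abc", "\"\""};
        for (String sample : samples) {
            System.out.println(sample + " -> " + escape(sample));
        }
    }
}
```

Checking both conditions first and wrapping once makes the `quoteFlag` bookkeeping unnecessary.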

When reading such a CSV file, similar processing is required:

// Remove the enclosing double quotes that were added for CSV output
public static String clearStartAndEndQuote(String str) {
    if (str != null && str.length() >= 2) {
        if (str.indexOf("\"") == 0) str = str.substring(1, str.length()); // remove the first "
        if (str.lastIndexOf("\"") == (str.length() - 1)) str = str.substring(0, str.length() - 1); // remove the last "

        str = str.replaceAll("\"\"", "\""); // convert two double quotes into one double quote
    }
    return str;
}

String tempValue = ClassUtil.clearStartAndEndQuote(temp[j]);
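
The method above strips a leading and a trailing quote independently and collapses doubled quotes in every field. A slightly stricter variant, shown here only as a sketch, touches a field only when it both starts and ends with a double quote, so an unquoted field such as aaa""ccc is returned unchanged. The names `CsvFieldUnescaper` and `unescape` are hypothetical.

```java
// Hypothetical helper; the class and method names are illustrative only.
class CsvFieldUnescaper {

    // Reverses the quoting rule: if the field is wrapped in double quotes,
    // remove the wrapper and turn each pair of double quotes back into one.
    static String unescape(String field) {
        if (field == null || field.length() < 2) {
            return field; // too short to be a quoted field
        }
        if (field.startsWith("\"") && field.endsWith("\"")) {
            String inner = field.substring(1, field.length() - 1);
            return inner.replace("\"\"", "\"");
        }
        return field; // unquoted field: leave it untouched
    }

    public static void main(String[] args) {
        String[] samples = {"\"abc,d2\"", "\"\"\"abc\"", "\"\"\"\"\"\"", "aaa\"\"ccc"};
        for (String sample : samples) {
            System.out.println(sample + " -> " + unescape(sample));
        }
    }
}
```

Which behavior is right depends on how the file was written. If every field containing a double quote was wrapped on output, as in the writing code above, both versions give the same result for those fields.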
III. Splitting comma-delimited text (see the referenced article)

When splitting such text, you usually run into the following cases.
1: The data to split looks like this:

"123", "Jack Jona", "Computer Science"
"234", "Joen Dan", "philosophy"

Data like this can be split directly on commas.

2: The data to split looks like this:

"123", "Jack Jona", "New York, NY"
"234", "Lee Jack", "Fort Myers, FL"

Data like this can be split with

String[] strArr = str.trim().split(",(?=([^\\\"]*\\\"[^\\\"]*\\\")*[^\\\"]*$)", -1); // split on commas outside double quotes; commas inside double quotes are not split

The advantage is that only commas outside the quotation marks act as split points; commas inside quotation marks are left alone.
Simply put, the result of the split is:
123|Jack Jona|New York, NY
234|Lee Jack|Fort Myers, FL
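
To make the behavior of this regular expression concrete, here is a small runnable sketch. The lookahead accepts a comma only if an even number of double quotes follows it up to the end of the line, which means the comma is not inside a quoted field. The class name `QuotedCommaSplitter` is hypothetical, and trimming each field is an added assumption to cope with spaces after the commas.

```java
import java.util.regex.Pattern;

// Hypothetical helper; the class and method names are illustrative only.
class QuotedCommaSplitter {

    // A comma followed by an even number of double quotes up to the end of the line.
    private static final Pattern COMMA_OUTSIDE_QUOTES =
            Pattern.compile(",(?=([^\"]*\"[^\"]*\")*[^\"]*$)");

    // Splits one line on commas that are outside double quotes.
    // The -1 limit keeps trailing empty fields.
    static String[] split(String line) {
        String[] parts = COMMA_OUTSIDE_QUOTES.split(line.trim(), -1);
        for (int i = 0; i < parts.length; i++) {
            parts[i] = parts[i].trim(); // drop spaces around each field
        }
        return parts;
    }

    public static void main(String[] args) {
        String line = "\"123\", \"Jack Jona\", \"New York, NY\"";
        System.out.println(String.join("|", split(line)));
    }
}
```

Note that the split keeps the surrounding double quotes on each field; removing them is the job of a method such as `clearStartAndEndQuote` shown earlier. Because the lookahead rescans the rest of the line for every comma, this approach is best suited to short lines.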
