Recently, while processing text strings, I ran into rows of data whose fields are separated by commas. Each field value is usually enclosed in double quotation marks, but some field values themselves contain commas, and some fields have no double quotation marks at all, which makes the rows troublesome to split.
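To make the problem concrete, here is a small sketch of what happens when such a row is split on every comma with the standard `String.split`. The class name `NaiveSplitDemo` and the helper `naiveSplit` are made up for this illustration, and the sample row is just one example of the kind of data described above.

```java
class NaiveSplitDemo {

    // Hypothetical helper: splits on every comma and pays no attention to quotes.
    static String[] naiveSplit(String row) {
        return row.split(",");
    }

    public static void main(String[] args) {
        // Eight logical fields, two of which contain a comma inside the quotes.
        String row = "101,\"a\",\"China,Jiangsu\",\"b\",\"China,Beijing\",1,0,\"c\"";
        String[] parts = naiveSplit(row);

        // The quoted fields are cut in two, so there are more pieces than fields,
        // and the double quotes are still attached to the values.
        System.out.println(parts.length + " pieces");
        for (String part : parts) {
            System.out.println("(" + part + ")");
        }
    }
}
```

With this row the naive split yields 10 pieces instead of 8, because each comma inside a quoted value is treated as a separator.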
Here is my solution. If anyone has a better way, you are welcome to join the discussion O(∩_∩)O~
import java.util.ArrayList;
import java.util.List;

/**
 * Java comma-split parsing method for strings.
 * Designed for rows where a double-quoted field contains commas, or where
 * some fields are not quoted at all.
 * For example, take the comma-separated string
 *   String sss = "101,\"a\",\"China,Jiangsu\",\"b\",\"China,Beijing\",1,0,\"c\"";
 * The correct split is (101) (a) (China,Jiangsu) (b) (China,Beijing) (1) (0) (c).
 * With Java's built-in split method, a value such as (China,Beijing) is cut
 * into two fields, which is wrong.
 * Note also that 101, 1 and 0 above are not double-quoted. Ideally every
 * field value would be double-quoted, but real data looks like the example,
 * and that is what this method was written for.
 * The approach is covered in university data-structures textbooks; this is an
 * implementation in Java. I have not yet tested how efficiently it runs.
 *
 * @author Hsuchan
 * @version 2014-11-30 22:30
 * @param sss the row to split
 * @return String[] the field values, without the surrounding double quotes
 */
public String[] commaDivider(String sss) {
    // Flag: an opening double quote has been seen
    int qutationStart = 0;
    // Flag: a closing double quote has been seen
    int qutationEnd = 0;
    char[] charStr = sss.toCharArray();
    // Collects the characters of the current field value
    StringBuffer sbf = new StringBuffer();
    // Result list
    List<String> list = new ArrayList<String>();
    // Process the string one character at a time
    for (int i = 0; i < charStr.length; i++) {
        if (qutationStart == 0 && charStr[i] == '\"') {
            // No open quote so far and the current character is ": a quoted field starts
            qutationStart = 1;
            qutationEnd = 0;
            continue;
        } else if (qutationStart == 1 && charStr[i] == '\"') {
            // A quote is open and the current character is ": the quoted field is complete
            qutationStart = 0;
            qutationEnd = 1;
            // If this quote is the last character, the loop will not run again, so save the field here
            if (i == charStr.length - 1 && sbf.length() != 0) {
                list.add(sbf.toString());
                sbf.setLength(0);
            }
            continue;
        } else if (qutationStart == 1 && charStr[i] == ',' && qutationEnd == 0) {
            // A comma inside quotes, as in "China,Beijing", is part of the value
            sbf.append(charStr[i]);
            continue;
        } else if (charStr[i] == ',') {
            // A comma outside quotes ends the field: store the collected value in the list
            list.add(sbf.toString());
            sbf.setLength(0);
            continue;
        }
        // Any other character belongs to the current field value
        sbf.append(charStr[i]);
        // Save the last field when the string ends with an ordinary character
        if (i == charStr.length - 1 && sbf.length() != 0) {
            list.add(sbf.toString());
            sbf.setLength(0);
        }
    }
    return list.toArray(new String[list.size()]);
}
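For comparison, the same character-by-character idea can be written with a single boolean flag instead of two integer flags. This is only a rough sketch of a variant, not the method above: the names `CommaDividerSketch` and `splitRow` are invented for the example. It assumes quotes are always balanced and never escaped, and unlike the method above it keeps empty fields at the end of a row.

```java
import java.util.ArrayList;
import java.util.List;

class CommaDividerSketch {

    // Hypothetical helper: a simplified variant of the quote-aware split.
    static List<String> splitRow(String row) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false; // true between an opening and a closing double quote

        for (int i = 0; i < row.length(); i++) {
            char c = row.charAt(i);
            if (c == '"') {
                // A quote only toggles the state; it is not copied into the value.
                inQuotes = !inQuotes;
            } else if (c == ',' && !inQuotes) {
                // A comma outside quotes ends the current field.
                fields.add(current.toString());
                current.setLength(0);
            } else {
                // Ordinary character, or a comma inside quotes.
                current.append(c);
            }
        }
        // The last field is not followed by a comma, so add it after the loop.
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        String row = "101,\"a\",\"China,Jiangsu\",\"b\",\"China,Beijing\",1,0,\"c\"";
        // Prints the 8 fields, with the quoted commas kept inside their values.
        System.out.println(splitRow(row));
    }
}
```

Adding the last field after the loop removes the need for the two "is this the last character" checks. The trade-off is that an empty input produces one empty field rather than an empty result.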
This article is 100% original. If you share it, please credit the source. Thank you.
Comma delimiter: a Java implementation for parsing fields that contain commas, and other cases