CSV table processing (part 2): importing CSV with pure JS parsing

Source: Internet
Author: User
Tags: hex code

A few days ago, the previous article introduced CSV tables and a way to fill in a form by parsing the table with JS combined with back-end PHP. Converting the CSV into a two-dimensional array involves fairly complex logic with many pitfalls. Fortunately PHP has a wealth of library functions for that. Parsing with JS is not so fortunate: everything has to be done by hand →_→ or by bringing in a library.

JS CSV import: reading the text

Can JS read files on the front end? Previously a local file could only be read through IE's ActiveXObject or Flash. With the advent of H5 (HTML5), there is a general solution to this problem. Talk is cheap, show you the code:

$.fn.csv2arr = function () {
    var files = $(this)[0].files;
    if (typeof FileReader !== 'undefined') { // H5
        var reader = new FileReader();
        reader.readAsText(files[0]); // read as text
        reader.onload = function (evt) {
            var data = evt.target.result; // the data that was read
            console.log(data);
        };
    } else {
        alert("IE9 and below are not supported, please use Chrome or Firefox");
    }
};

// Calling the method
$("#startBtn").click(function () {
    $("#csvInput").csv2arr();
});

The key here is FileReader, the standard way to read files in H5, supported by IE10 and above as well as Chrome, Firefox and Safari. Calling it is fairly simple: take the file from the DOM element of the file input box, choose a read method, and bind a callback function. The method used here is readAsText(), which reads the file as text. According to the MDN documentation there are also other read methods, such as base64 (data URL) and binary, which you can try yourself. A UTF8 text file is read correctly this way.

Note: readAsText() automatically removes the BOM header (if any) of a UTF8 file. With the other read methods it has to be removed manually.
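As a rough illustration of the manual removal mentioned in the note, here is a minimal sketch. Both function names are hypothetical (they are not from the article). It assumes the content arrives either as an already decoded string, where the BOM appears as the single character U+FEFF, or as a binary string with one character per byte, where the UTF8 BOM is the three bytes EF BB BF.

```javascript
// Hypothetical helpers (not from the original article).

// For an already decoded string: the BOM shows up as U+FEFF.
function stripBomFromText(text) {
    return text.charCodeAt(0) === 0xFEFF ? text.slice(1) : text;
}

// For a binary string (one character per byte, e.g. the output of atob()):
// the UTF8 BOM is the three bytes EF BB BF.
function stripBomFromBinaryString(bin) {
    return bin.slice(0, 3) === "\xEF\xBB\xBF" ? bin.slice(3) : bin;
}

console.log(stripBomFromText("\uFEFFa,b"));               // a,b
console.log(stripBomFromBinaryString("\xEF\xBB\xBFa,b")); // a,b
```

Strings without a BOM pass through both helpers unchanged, so they can be applied unconditionally.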

Digression: why does H5 have an API that reads local files directly? Isn't that a big security threat? In fact it hardly increases the threat to browser users. Think about it: before this file API existed, could a website get at local files? Of course it could, through the same input type="file": bind an onchange event that submits via Ajax, and the user's file is quietly uploaded to the website's back end. This problem still comes down to raising users' security awareness. Just as a phishing site can fool people with a fake QQ login page, a site can fake a download button and dialog box to trick users into uploading important confidential documents.

JS CSV import: text parsing plugin

JS, unlike PHP, has no built-in CSV processing function, and as the previous article said, there are many complex cases to handle. So the smartest (sneakiest) approach is of course to find a plugin. One of the most widely used CSV plugins is papaparse.js. The classic usage is as follows:

// Parse a local CSV file
$("#csvBtn").click(function () {
    var file = $("input[name=csv]")[0].files[0];
    Papa.parse(file, {
        // called once parsing has finished
        complete: function (results) {
            console.log("Finished:", results.data);
        }
    });
});

This plugin is fairly powerful and parsing basically works without major problems, but it is still not perfect. The problems are as follows:

    1. The empty line at the end of the file is not removed automatically, which may cause the form to be filled with some extra empty data;
    2. It cannot automatically tell UTF8 from GBK, so Chinese text may come out garbled;
    3. With UTF8 encoding, \r\n mixed with \n line endings can cause problems for papaparse.

JS CSV import: automatic encoding detection

The second point is a big problem if the table contains Chinese. Web pages are generally encoded in UTF8, so an exported table is also UTF8, and if it is uploaded unmodified it is still UTF8. If it is modified, however, the common spreadsheet software on Windows, including Office and WPS, converts it to GBK. If the program does not detect the encoding automatically, garbled text is very likely.

Conversely, if the web page serves the download in GBK, there is still no guarantee that the file the user uploads is GBK: Mac systems use UTF8, so a file that was originally GBK may well have become UTF8 after being edited.

Alternatively, you could offer a drop-down list and let the user pick the encoding manually, but then you have to teach users what an encoding is and how to check it, which is not something they will easily accept. So how can the encoding be detected automatically? Do UTF8 and GBK have clear-cut signatures that tell them apart? Sorry, they really don't. So what now? Find a wheel. Where? For JS wheels there is a good CDN library site. Its descriptions are all in English, but things are still easy to find. We are looking for encoding and decoding, so press Ctrl+F and search for "encod" (the first letters of both "encode" and "encoding"). Reading through the descriptions does turn one up, named jschardet.
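For comparison only, and not the approach this article takes (the article uses jschardet), one common heuristic is to attempt a strict UTF8 decode of the raw bytes and fall back to GBK if it fails. The sketch below relies on the standard TextDecoder with fatal: true. The function name is hypothetical, and it assumes UTF8 and GBK are the only two candidates. Because some GBK byte sequences happen to be valid UTF8, this remains a heuristic rather than a guarantee.

```javascript
// Hypothetical helper, not the article's approach: guess between UTF-8 and GBK
// by attempting a strict UTF-8 decode of the raw bytes.
function guessUtf8OrGbk(bytes) {
    try {
        // fatal: true makes decode() throw on malformed UTF-8
        new TextDecoder("utf-8", { fatal: true }).decode(bytes);
        return "UTF-8";
    } catch (e) {
        return "GBK"; // assumption: the only other candidate is GBK
    }
}

// The two characters "中文" encoded as UTF-8 and as GBK
var utf8Bytes = new Uint8Array([0xE4, 0xB8, 0xAD, 0xE6, 0x96, 0x87]);
var gbkBytes = new Uint8Array([0xD6, 0xD0, 0xCE, 0xC4]);

console.log(guessUtf8OrGbk(utf8Bytes)); // UTF-8
console.log(guessUtf8OrGbk(gbkBytes));  // GBK
```

In a browser the bytes could come from FileReader's readAsArrayBuffer(), wrapped as new Uint8Array(evt.target.result). Pure ASCII content is reported as UTF-8, which is harmless since ASCII decodes identically either way.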

Click through (the steps are not detailed here) to reach its GitHub page. Look at the sample code:

// "àíàçã" in UTF-8
jschardet.detect("\xc3\xa0\xc3\xad\xc3\xa0\xc3\xa7\xc3\xa3")
// { encoding: "UTF-8", confidence: 0.9690625 }

// "次常用國字標準字體表" in Big5
jschardet.detect("\xa6\xb8\xb1\x60\xa5\xce\xb0\xea\xa6\x72\xbc\xd0\xb7\xc7\xa6\x72\xc5\xe9\xaa\xed")
// { encoding: "Big5", confidence: 0.99 }

What the heck? That doesn't look like an ordinary string. It looks like hex-escaped bytes. In practice, ordinary strings passed in are all detected as ASCII, which makes things a bit awkward. What to do?
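To make the difference concrete: the sample input is a "binary string", in which every character stands for one byte of the encoded file, whereas an ordinary JS string holds already decoded characters. The sketch below shows how the two relate, using the standard TextEncoder. The helper name is hypothetical and only illustrates the idea.

```javascript
// Hypothetical helper: encode a JS string as UTF-8 and return a "binary string"
// in which each character's code is one byte (0-255).
function toUtf8BinaryString(text) {
    var bytes = new TextEncoder().encode(text); // Uint8Array of UTF-8 bytes
    var out = "";
    for (var i = 0; i < bytes.length; i++) {
        out += String.fromCharCode(bytes[i]);
    }
    return out;
}

// "\u00e0" is the character à, which is the two bytes C3 A0 in UTF-8
console.log(toUtf8BinaryString("\u00e0") === "\xc3\xa0"); // true

// Two Chinese characters become six one-byte characters
console.log(toUtf8BinaryString("\u4e2d\u6587").length); // 6
```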

Don't panic. Aren't we reading a local file in order to parse it? Look at the MDN documentation again: besides readAsText(), which reads the file as a string, there is readAsBinaryString(), but it is not a standard H5 read method and some browsers may not support it. Then there is readAsDataURL(). Try it and see. It returns something like this:

data:text/csv;base64,niywzczywpvm2kgkslikmyzn0lddtvlluagkx+0kocy2xc3+oas3xrdcxmkk

After changing the file and trying several more times, the pattern becomes clear. In Firefox the result starts with the fixed string data:text/csv;base64, while in Chrome and IE it starts with data:;base64, and the string that follows is the file content, Base64 encoded. Now decode that trailing string and take a look: IE>=10, Firefox and Chrome all have the native Base64 decoding function atob(). The result is a string in which English is normal and Chinese is all garbled, and the garbled part looks like neither UTF8 nor GBK. So it is most likely the raw byte (hex) string. Run it through jschardet: success!
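The decoding step just described can be sketched as a small function. The name dataUrlToBinaryString is hypothetical. It splits on the ";base64," marker, so it does not depend on which MIME prefix the browser puts in front.

```javascript
// Hypothetical helper: turn a base64 data URL into a binary string
// (one character per byte), the form used in jschardet's sample code.
function dataUrlToBinaryString(dataUrl) {
    var marker = ";base64,";
    var pos = dataUrl.indexOf(marker);
    if (pos === -1) {
        throw new Error("not a base64 data URL");
    }
    // Everything after the marker is the base64 payload
    return atob(dataUrl.slice(pos + marker.length));
}

// The character "中" is E4 B8 AD in UTF-8, which is "5Lit" in base64
var bin = dataUrlToBinaryString("data:text/csv;base64,5Lit");
console.log(bin.length);             // 3
console.log(bin === "\xe4\xb8\xad"); // true
```

The three-character result is why the decoded text looks garbled on screen: each byte of the original file is shown as a separate character instead of being decoded as UTF8 or GBK.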

Summary and wrap-up

At this point, third-party JS has solved the two biggest problems: encoding detection and CSV parsing. So let's put them together and wrap them in a method that is more convenient to call.

/**
 * CSV file to 2D array
 **/
$.fn.csv2arr = function (callback) {
    if (typeof FileReader == 'undefined') { // no H5 support
        alert("IE9 and below are not supported. Your browser is too old, please use Chrome or Firefox");
        return false;
    }
    if (!$(this)[0].files[0]) {
        alert("Please select a file");
        return false;
    }
    var fReader = new FileReader();
    fReader.readAsDataURL($(this)[0].files[0]);
    var $fileDOM = $(this);
    fReader.onload = function (evt) {
        var data = evt.target.result;
        // console.log(data);
        var encoding = checkEncoding(data);
        // console.log(encoding);
        // Convert to a two-dimensional array; requires papaparse.js
        Papa.parse($($fileDOM)[0].files[0], {
            encoding: encoding,
            complete: function (results) {
                // console.log(results);
                var res = results.data;
                if (res[res.length - 1] == "") { // remove the last empty line
                    res.pop();
                }
                callback && callback(res);
            }
        });
    };
    fReader.onerror = function (evt) {
        // console.log(evt);
        alert("The file has changed, please select it again (Firefox)");
    };

    // Check the encoding; requires jschardet
    function checkEncoding(base64Str) {
        // atob() yields a binary string, which is the format jschardet needs
        var str = atob(base64Str.split(";base64,")[1]);
        // console.log(str);
        var encoding = jschardet.detect(str);
        encoding = encoding.encoding;
        // console.log(encoding);
        if (encoding == "windows-1252") { // sometimes misdetected (e.g. Chinese text in UTF8)
            encoding = "ANSI";
        }
        return encoding;
    }
};
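A note on the empty-line check in the code above: `res[res.length - 1] == ""` works because loose equality coerces a row such as [""] to the empty string, and it removes at most one row. As a possible variation, the sketch below drops every trailing empty row explicitly. Both function names are hypothetical, and it assumes each row is an array of cell values, as in results.data.

```javascript
// Hypothetical helpers: drop every trailing row that is empty
// or contains only empty cells. Rows in the middle are left alone.
function isEmptyRow(row) {
    return row.every(function (cell) {
        return cell === "" || cell == null;
    });
}

function dropTrailingEmptyRows(rows) {
    var end = rows.length;
    while (end > 0 && isEmptyRow(rows[end - 1])) {
        end--;
    }
    return rows.slice(0, end); // returns a copy; the input is not modified
}

var parsed = [["name", "age"], ["Tom", "3"], [""], [""]];
console.log(dropTrailingEmptyRows(parsed).length); // 2
```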

Usage example

<input type="file" name="csvfile" />
<input type="button" onclick="csv2()" value="JS conversion" />
<script src="__pjs__/jquery.js"></script>
<script src="__pjs__/papaparse.js"></script>
<script src="__pjs__/jschardet.js"></script>
<script>
function csv2() {
    $("input[name=csvfile]").csv2arr(function (res) {
        alerttips("Press F12 to open the browser console and see the result");
        console.log(res);
    });
}
</script>

Download and updates

A zip package download and a GitHub address are available. Stars are welcome, and please report bugs so they can be fixed.

DRY: Don't Repeat Yourself. Don't build a bug-filled wheel.
