Node.js file encoding format conversion

Source: Internet
Author: User
Tags: lua

Many Lua files in the project are not in UTF-8 format and are displayed as ASCII when viewed with EditPlus. Some of them also carry a BOM. Files with a BOM are easy to deal with: I have written about them before, and they follow a regular pattern.
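The regular pattern of a UTF-8 BOM is that the file starts with the three bytes EF BB BF. A minimal sketch of checking for and removing it, assuming only Node's built-in `Buffer` (the names `hasUtf8Bom` and `stripUtf8Bom` are hypothetical and not part of any library):

```javascript
// UTF-8 byte order mark: EF BB BF (239 187 191 in decimal).
const UTF8_BOM = [0xef, 0xbb, 0xbf];

// Hypothetical helper: true when the buffer starts with the three BOM bytes.
function hasUtf8Bom(buf) {
  return buf.length >= 3 &&
    buf[0] === UTF8_BOM[0] &&
    buf[1] === UTF8_BOM[1] &&
    buf[2] === UTF8_BOM[2];
}

// Hypothetical helper: returns the buffer without its BOM,
// or the same buffer unchanged when there is no BOM.
function stripUtf8Bom(buf) {
  return hasUtf8Bom(buf) ? buf.subarray(3) : buf;
}

const withBom = Buffer.from([0xef, 0xbb, 0xbf, 0x2d, 0x2d]); // BOM followed by "--"
console.log(hasUtf8Bom(withBom));              // prints: true
console.log(stripUtf8Bom(withBom).toString()); // prints: --
```

Comparing the byte values directly avoids converting each byte to a hex string, and the length check keeps the helper safe on files shorter than three bytes.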

The files reported as ASCII are the painful ones. After searching online resources and repeated testing and comparison, I arrived at the fairly reliable method below. (Some files that EditPlus displays as UTF-8 are reported as a different encoding by the Node.js libraries >_<)

The only way to confirm that the changes are correct is to commit through SVN after modifying the files: browse the commit list and double-click any file to be committed. If the dialog box is displayed, the change succeeded, and others will see Chinese text instead of garbled characters.

    var fs = require('fs');
    var chardet = require('chardet');
    var jschardet = require('jschardet');
    var encoding = require('encoding');

    var path = "Lua Directory";

    function readDirectory(dirPath) {
        if (fs.existsSync(dirPath)) {
            var files = fs.readdirSync(dirPath);
            files.forEach(function (file) {
                var filePath = dirPath + "/" + file;
                var stats = fs.statSync(filePath);
                if (stats.isDirectory()) {
                    console.log('\nRead directory:\n', filePath, "\n");
                    readDirectory(filePath);
                } else if (stats.isFile() && /\.lua$/.test(filePath)) {
                    var buff = fs.readFileSync(filePath);
                    // UTF-8 BOM: EF BB BF (239 187 191)
                    if (buff.length >= 3 && buff[0] === 0xEF && buff[1] === 0xBB && buff[2] === 0xBF) {
                        console.log('\nBOM file found:', filePath, "\n");
                        buff = buff.slice(3);
                        // Write synchronously so later writes to the same file cannot race with this one.
                        fs.writeFileSync(filePath, buff);
                    }
                    // jschardet.detect returns an object such as { encoding: 'UTF-8', confidence: 0.99 }
                    var charset = chardet.detectFileSync(filePath);
                    var info = jschardet.detect(buff);
                    // Compare case-insensitively, since the casing of the reported name can vary.
                    var detected = (info.encoding || "").toUpperCase();
                    if (detected == "GB2312" || detected == "ASCII") {
                        var resultBuffer = encoding.convert(buff, "UTF-8", info.encoding);
                        fs.writeFileSync(filePath, resultBuffer);
                    } else if (detected != "UTF-8" && charset != "UTF-8") {
                        // Neither library reports UTF-8: treat files with Windows line endings as GBK.
                        if (buff.toString().indexOf("\r\n") > -1) {
                            var resultBuffer = encoding.convert(buff, "UTF-8", "GBK");
                            fs.writeFileSync(filePath, resultBuffer);
                        }
                    }
                }
            });
        } else {
            console.log('Not Found Path:', dirPath);
        }
    }

    readDirectory(path);

Note the checks in the code above. First, if the detected encoding is explicitly GB2312 or ASCII, the file is converted from that encoding directly to UTF-8. If some other encoding is returned, the code checks whether the file contains the Windows line ending (\r\n); if it does, the file is treated as GBK.
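These rules can be isolated into a pure function, which makes them easier to test and adjust. The following is only a sketch under stated assumptions: `pickSourceEncoding` and its parameters are hypothetical names, `primary` and `secondary` stand for the encoding names reported by the two detector libraries, and the names are compared case-insensitively because the exact casing a detector returns may differ.

```javascript
// Hypothetical helper: picks the encoding to convert from, or null to leave the file alone.
// `primary` is the name reported by the first detector, `secondary` by the second,
// and `hasCrlf` says whether the raw content contains "\r\n".
function pickSourceEncoding(primary, secondary, hasCrlf) {
  const p = (primary || "").toUpperCase();
  const s = (secondary || "").toUpperCase();

  // Rule 1: an explicit GB2312 or ASCII verdict is trusted as-is.
  if (p === "GB2312" || p === "ASCII") return p;

  // Rule 2: neither detector says UTF-8, so fall back on the line-ending hint.
  if (p !== "UTF-8" && s !== "UTF-8") return hasCrlf ? "GBK" : null;

  // Otherwise at least one detector reports UTF-8: do not convert.
  return null;
}

console.log(pickSourceEncoding("GB2312", "GB18030", false));         // prints: GB2312
console.log(pickSourceEncoding("windows-1252", "ISO-8859-1", true)); // prints: GBK
console.log(pickSourceEncoding("windows-1252", "UTF-8", true));      // prints: null
```

The line-ending rule is a project-specific heuristic: it assumes that files with Windows line endings were saved by a Chinese-locale Windows editor, which will not hold for every codebase.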

The overall idea is fairly simple; the difficulty lies in how to determine a file's encoding. This is genuinely hard >_<. Once you know the original encoding, calling encoding.convert(buff, target encoding, original encoding) gives you the content in the required encoding. If you have the time and interest, you can download the Notepad++ source code and see how it determines the encoding of a file.
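One check that does not rely on statistical guessing is to test whether the bytes are valid UTF-8 at all. The sketch below uses the standard `TextDecoder` with `fatal: true`, which is a different technique from the detector libraries used in this article; `isValidUtf8` is a hypothetical name, and a standard Node.js build with ICU support is assumed. It can only say that a file is not UTF-8, not which legacy encoding it uses.

```javascript
// Hypothetical helper: true when the bytes decode as strict UTF-8.
// With `fatal: true`, decode() throws on any invalid byte sequence.
function isValidUtf8(buf) {
  try {
    new TextDecoder("utf-8", { fatal: true }).decode(buf);
    return true;
  } catch (err) {
    return false;
  }
}

const utf8Bytes = Buffer.from("你好", "utf8");            // E4 BD A0 E5 A5 BD
const gbkBytes = Buffer.from([0xc4, 0xe3, 0xba, 0xc3]);  // the same text encoded as GBK

console.log(isValidUtf8(utf8Bytes)); // prints: true
console.log(isValidUtf8(gbkBytes));  // prints: false
```

Pure ASCII content also passes this check, because every ASCII file is already valid UTF-8 and needs no conversion.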

Note: the files modified by the method above match the list of files that need to be committed on the Mac, so it at least solves the problem I am currently facing. If you run into special cases, the code above can be modified accordingly.

Third-party libraries used:

    • encoding https://github.com/andris9/encoding
    • jschardet https://github.com/aadsm/jschardet
    • node-chardet https://github.com/runk/node-chardet

For the basics of character encoding, you can refer to this article by Ruan Yifeng: http://www.ruanyifeng.com/blog/2007/10/ascii_unicode_and_utf-8.html

Wikipedia and similar materials are too specialized and say little about ASCII encoding, so they are not listed here.
