Many of the project's Lua files are not in UTF-8 and show up as ASCII when opened in EditPlus. Some of them also carry a BOM; files with a BOM need to be handled before writing, but fortunately they follow a recognizable pattern.
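For reference, the UTF-8 BOM is the three-byte sequence EF BB BF. A minimal helper to strip it from a Node.js Buffer (the name `stripBom` is mine, for illustration; the full script below inlines this check) might look like:

```javascript
// Strip a leading UTF-8 BOM (bytes EF BB BF) from a Buffer, if present.
function stripBom(buff) {
    if (buff.length >= 3 && buff[0] === 0xEF && buff[1] === 0xBB && buff[2] === 0xBF) {
        return buff.slice(3); // drop the three BOM bytes
    }
    return buff; // no BOM: return the buffer unchanged
}
```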
The ASCII case is comparatively painful. After searching online and repeatedly testing and comparing, I ended up with the fairly reliable method below (some files that EditPlus displays as UTF-8 are reported as a different encoding by the Node.js libraries >_<).
To verify that a conversion worked: after modifying the files, commit through SVN, browse the commit list, and double-click any file to be committed. If the diff dialog shows readable Chinese instead of mojibake, the change succeeded, and others will see the Chinese correctly too.
```javascript
var fs = require('fs');
var chardet = require('chardet');
var jschardet = require('jschardet');
var encoding = require('encoding');

var path = 'Lua directory'; // root directory to scan

function readDirectory(dirPath) {
    if (fs.existsSync(dirPath)) {
        var files = fs.readdirSync(dirPath);
        files.forEach(function (file) {
            var filePath = dirPath + '/' + file;
            var stats = fs.statSync(filePath);
            if (stats.isDirectory()) {
                console.log('\nRead directory:\n', filePath, '\n');
                readDirectory(filePath); // recurse into subdirectories
            } else if (stats.isFile() && /\.lua$/.test(filePath)) {
                var buff = fs.readFileSync(filePath);
                // UTF-8 BOM: EF BB BF (decimal 239 187 191)
                if (buff.length >= 3 &&
                    buff[0].toString(16).toLowerCase() === 'ef' &&
                    buff[1].toString(16).toLowerCase() === 'bb' &&
                    buff[2].toString(16).toLowerCase() === 'bf') {
                    console.log('\nBOM file found:', filePath, '\n');
                    buff = buff.slice(3); // drop the BOM
                    fs.writeFileSync(filePath, buff.toString(), 'utf8');
                }
                var charset = chardet.detectFileSync(filePath);
                // jschardet.detect returns e.g. { encoding: 'UTF-8', confidence: 0.99 }
                var info = jschardet.detect(buff);
                if (info.encoding === 'GB2312' || info.encoding === 'ascii') {
                    var resultBuffer = encoding.convert(buff, 'UTF-8', info.encoding);
                    fs.writeFileSync(filePath, resultBuffer, 'utf8');
                } else if (info.encoding !== 'UTF-8' && charset !== 'UTF-8') {
                    // Windows-style line endings suggest the file came from a PC: assume GBK
                    if (buff.toString().indexOf('\r\n') > -1) {
                        var converted = encoding.convert(buff, 'UTF-8', 'GBK');
                        fs.writeFileSync(filePath, converted, 'utf8');
                    }
                }
            }
        });
    } else {
        console.log('Not found path:', dirPath);
    }
}

readDirectory(path);
```
Note the order of the checks above: if the detected encoding is explicitly GB2312 or ASCII, convert directly from that encoding to UTF-8. If detection returns some other format, first check whether the file contains Windows-style line endings (`\r\n`); if it does, treat it as GBK.
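That decision rule can be isolated as a small pure function (the names here are mine, for illustration; `jschardetEnc` and `chardetEnc` stand for the encodings reported by jschardet and chardet, and `hasCrlf` for whether the file contains `\r\n`):

```javascript
// Which source encoding to convert from, or null to leave the file untouched.
function pickSourceEncoding(jschardetEnc, chardetEnc, hasCrlf) {
    if (jschardetEnc === 'GB2312' || jschardetEnc === 'ascii') {
        return jschardetEnc; // explicit result: convert from it directly
    }
    if (jschardetEnc !== 'UTF-8' && chardetEnc !== 'UTF-8' && hasCrlf) {
        return 'GBK'; // unknown encoding + Windows line endings: assume GBK
    }
    return null; // already UTF-8 (or too unsure): skip
}
```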
The overall idea is simple; the hard part is determining the file's encoding in the first place. That is genuinely difficult >_<. Once you have the original encoding, call `encoding.convert(buff, targetEncoding, originalEncoding)` to get the bytes in the encoding you need. If you have time and interest, download the Notepad++ source code and see how it detects file encodings.
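As a quick sanity check of a GBK-to-UTF-8 conversion without any third-party dependency, recent Node.js builds (shipping with full ICU, the default since Node 13) can decode GBK via the global `TextDecoder`. This is an alternative I'm suggesting, not the article's `encoding.convert` approach:

```javascript
// Decode GBK bytes with Node's built-in TextDecoder (needs a full-ICU Node build),
// then re-encode the string as UTF-8 with Buffer.from.
// C4 E3 BA C3 is the GBK encoding of "你好".
const gbkBytes = Buffer.from([0xC4, 0xE3, 0xBA, 0xC3]);
const text = new TextDecoder('gbk').decode(gbkBytes); // "你好"
const utf8Bytes = Buffer.from(text, 'utf8');          // E4 BD A0 E5 A5 BD
```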
Note: the files converted by the method above match the list of files that need to be committed on the Mac, which at least solves the problem I was facing. If you hit a special case, the code above can be adapted.
Third-party libraries used:
- encoding: https://github.com/andris9/encoding
- jschardet: https://github.com/aadsm/jschardet
- node-chardet: https://github.com/runk/node-chardet
For background on character encodings, see this article by Ruan Yifeng: http://www.ruanyifeng.com/blog/2007/10/ascii_unicode_and_utf-8.html
Wikipedia and similar materials are rather technical and don't say much about ASCII encoding anyway, so I won't list them.