Source: http://blog.darkthread.net/post-2015-08-07-big5-utf8-source-code-batch-converter.aspx
Not long after switching to VS2015, I found that it handles BIG5 (ANSI) encoded source code differently from earlier versions (presumably because the compiler now uses Roslyn). As a result, some older files saved in BIG5 encoding fail to compile when they contain "許功蓋"-type characters (BIG5 characters whose second byte is 0x5C, the backslash).
Two days after I posted about it, a colleague told me that they had been going around in circles for quite a while after moving to VS2015 and had finally found their way back to my article by searching. XD Later we chatted about writing a program to convert every BIG5-encoded source file to UTF8 once and for all. My colleague said there were only a few files, so re-saving them by hand would do, and there was no need to bring out the heavy machinery.
Over the last two days I received messages asking whether VS2015 will fix this problem. One reader mentioned that their project has thousands of .cs files: fix one BIG5 file and there are countless more BIG5 files waiting, so they had to say goodbye to VS2015.
First of all, I am not sure whether VS2015 will be revised for this issue. In my opinion, though, BIG5 encoding has been obsolete for many years. Besides being incompatible with VS2015, it needs extra handling whenever rare Chinese characters or text in other languages show up (something that should have been dealt with back in the Visual Studio .NET 2003 days). So regardless of whether VS2015 gets fixed, converting the source code to UTF8 is the right direction.
Here is the problem, as the reader said: if the project has tens of thousands of .cs files, re-saving each one as UTF8 by hand is a chilling amount of work. I love building this kind of time- and labor-saving tunnel-boring machine, so let's write a batch conversion program!
Here I assume that the project's .cs files fall into two categories: some are already UTF8 or Unicode encoded, and the rest are BIG5 encoded (I assume they are all BIG5 and rule out other ANSI encodings such as Simplified Chinese or Japanese). The program therefore uses Directory.GetFiles() to search the given directory (including subdirectories) for the specified file types (for example cs and js), checks each file to rule out those already encoded as UTF8 or Unicode, reads the remaining BIG5 files with the BIG5 encoding, and writes them back as UTF8 to complete the conversion. The original file is kept as a backup with .big5.bak appended to its name.
    using System;
    using System.IO;
    using System.Linq;
    using System.Text;

    namespace B2UBatchConverter
    {
        class Program
        {
            public class AnalyzeResult
            {
                public string Content;
                public Encoding Encoding;
            }

            // REF: http://goo.gl/jAJgIr by Rick Strahl
            public static AnalyzeResult AnalyzeFile(string srcFile)
            {
                // Default to BIG5 (code page 950). On .NET Core / .NET 5+ this needs
                // Encoding.RegisterProvider(CodePagesEncodingProvider.Instance) first.
                Encoding enc = Encoding.GetEncoding(950);
                // UTF8, Unicode, UTF32, etc. are identified by the first bytes (BOM);
                // everything else is treated as BIG5
                byte[] buffer = new byte[5];
                using (FileStream file = new FileStream(srcFile, FileMode.Open))
                {
                    file.Read(buffer, 0, 5);
                }
                if (buffer[0] == 0xEF && buffer[1] == 0xBB && buffer[2] == 0xBF)
                    enc = Encoding.UTF8;
                // UTF-32 LE must be tested before UTF-16 LE: both start with FF FE
                else if (buffer[0] == 0xFF && buffer[1] == 0xFE &&
                         buffer[2] == 0 && buffer[3] == 0)
                    enc = Encoding.UTF32;
                else if (buffer[0] == 0xFF && buffer[1] == 0xFE)
                    enc = Encoding.Unicode;            // UTF-16 little-endian
                else if (buffer[0] == 0xFE && buffer[1] == 0xFF)
                    enc = Encoding.BigEndianUnicode;   // UTF-16 big-endian
                else if (buffer[0] == 0 && buffer[1] == 0 &&
                         buffer[2] == 0xFE && buffer[3] == 0xFF)
                    enc = new UTF32Encoding(true, true); // UTF-32 big-endian
                else if (buffer[0] == 0x2b && buffer[1] == 0x2f && buffer[2] == 0x76)
                    enc = Encoding.UTF7;
                // Read the content with the chosen encoding
                return new AnalyzeResult()
                {
                    Content = File.ReadAllText(srcFile, enc),
                    Encoding = enc
                };
            }

            static void Main(string[] args)
            {
                //args = new string[] { "D:\\lab\\l805\\conapp" };
                string path = args[0];
                // File extensions to scan
                var scanFileTypes = "cs,js".Split(',');
                // Folder names to skip
                var skipFolders = "bin,obj".Split(',');
                foreach (var file in
                    // List every file under the directory
                    Directory.GetFiles(path, "*.*", SearchOption.AllDirectories))
                {
                    // Get the file extension
                    var ext = Path.GetExtension(file).TrimStart('.').ToLower();
                    // Skip extensions that were not requested
                    if (!scanFileTypes.Contains(ext)) continue;
                    // Files under \bin\* and \obj\* are skipped as well
                    if (skipFolders.Any(o =>
                        file.Contains(Path.DirectorySeparatorChar + o +
                                      Path.DirectorySeparatorChar)))
                        continue;
                    // Read the file and determine its encoding
                    var analysis = AnalyzeFile(file);
                    if (analysis.Encoding.CodePage == 950) // BIG5: needs conversion
                    {
                        Console.Write("Process file {0}...", file);
                        // Rename the original file to *.big5.bak
                        var bakFile = file + ".big5.bak";
                        if (File.Exists(bakFile)) File.Delete(bakFile);
                        File.Move(file, bakFile);
                        // Write the content back as UTF8
                        File.WriteAllText(file, analysis.Content, Encoding.UTF8);
                        Console.WriteLine("done!");
                    }
                    else
                    {
                        Console.WriteLine("Skip file {0} / {1}",
                            file, analysis.Encoding.EncodingName);
                    }
                }
            }
        }
    }
Compile the project as a console application and pass the directory path as an argument. The program scans every .cs and .js file under that directory and converts the BIG5-encoded ones to UTF8.
[Reminder] Be sure to back up your files before running the batch conversion to avoid data loss.
[2015-08-07 Update] Thanks to readers Huang Wensheng and Norton Lin, who left comments on the FB page pointing out two off-the-shelf tools, ConvertZ and UTFCast, that can also do batch encoding conversion. In addition, ChrisTorng shared a batch converter that can integrate with TFS and mentioned that the StreamReader constructor's detectEncodingFromByteOrderMarks parameter can be used to detect the encoding.
Thank you for your feedback.
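The StreamReader approach mentioned in the update could look roughly like the sketch below. It is only an illustration: the class name BomSniffer and the method Detect are hypothetical names, not taken from ChrisTorng's tool, and the fallback encoding is whatever the caller assumes for files without a BOM.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper (names are illustrative, not from the original post).
static class BomSniffer
{
    // Returns the encoding StreamReader settles on after looking at the BOM.
    // Files without a BOM keep the fallback encoding passed in by the caller.
    public static Encoding Detect(string path, Encoding fallback)
    {
        using (var reader = new StreamReader(path, fallback,
                   detectEncodingFromByteOrderMarks: true))
        {
            // CurrentEncoding is only updated after the first read attempt.
            reader.Peek();
            return reader.CurrentEncoding;
        }
    }
}
```

Under these assumptions, a call such as BomSniffer.Detect(file, Encoding.GetEncoding(950)) would play the same role as the hand-written BOM checks in the converter: a result with code page 950 means no BOM was found and the file is treated as BIG5.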
[2015-08-12 Update] A quick fix for the VS2015 BIG5 source file compatibility problem: modify the csproj/vbproj
vs2015 Compile error CS1003 Syntax error