Converting C++ source file encoding from GBK to UTF-8
When developing a Cocos program in Visual Studio (VS), source files are saved as GBK by default, but UTF-8 is more convenient when porting, compiling, and debugging. The C++ source files therefore need to be converted to UTF-8.
For a single file, you can use the Advanced Save Options when saving it and select the encoding there.
With many project files, however, doing this by hand is impractical, so we need a way to convert them in batch.
Linux has dedicated commands for this (iconv, for example).
But what about Windows?
Naturally, C++ itself can solve this easily. After searching for information, here is my solution. First, create a VC console application project.
Then create a source file named convert. The code is as follows:
// convert.cpp : Defines the entry point of the console application.
//
#include "stdafx.h"
#include <afxwin.h>   // CWinApp, CString, CFile, CFileFind
#include <string>

#ifdef _DEBUG
#define new DEBUG_NEW
#endif

// The one and only application object
CWinApp theApp;

void recursiveFile(CString strFileType);
void convertGBToUTF8(CString strWritePath, const char* gb2312);

int _tmain(int argc, TCHAR* argv[], TCHAR* envp[])
{
    int nRetCode = 0;

    // Initialize MFC and report an error on failure
    if (!AfxWinInit(::GetModuleHandle(NULL), NULL, ::GetCommandLine(), 0))
    {
        // TODO: change the error code to suit your needs
        _tprintf(_T("error: MFC initialization failed\n"));
        nRetCode = 1;
    }
    else if (argc < 2)
    {
        // The root directory is required; argv[1] must not be read without it
        _tprintf(_T("usage: convert.exe Dir\n"));
        nRetCode = 1;
    }
    else
    {
        /*for (int i = 0; i < argc; i++)
        {
            MessageBox(NULL, argv[i], L"Arglist contents", MB_OK);
        }*/

        // The single argument is the root directory of the source code
        TCHAR* lpszDirName = argv[1];
        CString strFileType;
        strFileType.Format(_T("%s\\*.*"), lpszDirName);

        // Recursively visit .h and .cpp files and convert those not yet marked as UTF-8
        recursiveFile(strFileType);
    }
    return nRetCode;
}

void recursiveFile(CString strFileType)
{
    CFileFind finder;
    BOOL isFinded = finder.FindFile(strFileType);   // start the search
    while (isFinded)
    {
        isFinded = finder.FindNextFile();           // move to the next entry
        if (finder.IsDots())                        // skip "." and ".."
            continue;

        CString strFoundFile = finder.GetFilePath();
        if (finder.IsDirectory())                   // recurse into subdirectories
        {
            CString strNextFileType;
            strNextFileType.Format(_T("%s\\*.*"), (LPCTSTR)strFoundFile);
            recursiveFile(strNextFileType);
            continue;
        }

        // Only header files and .cpp files are converted
        if (strFoundFile.Right(4) != _T(".cpp") && strFoundFile.Right(2) != _T(".h"))
            continue;

        CFile fileReader(strFoundFile, CFile::modeRead);

        // Skip files that already start with the UTF-8 BOM
        unsigned char head[3] = { 0, 0, 0 };
        fileReader.Read(head, 3);
        if (head[0] == 0xEF && head[1] == 0xBB && head[2] == 0xBF)
        {
            fileReader.Close();
            continue;
        }
        fileReader.SeekToBegin();

        // Read the whole file into memory
        char buf[256];
        UINT nReadLength;
        std::string strContent;
        while ((nReadLength = fileReader.Read(buf, sizeof(buf))) > 0)
        {
            strContent.append(buf, nReadLength);
        }
        fileReader.Close();

        convertGBToUTF8(strFoundFile, strContent.c_str());

        // Paths containing Chinese characters may not print correctly;
        // this output can be removed so that the program runs silently
        _tprintf(_T("%s has been converted to UTF-8 encoding\n"), (LPCTSTR)strFoundFile);
    }
    finder.Close();
}

void convertGBToUTF8(CString strWritePath, const char* gb2312)
{
    // CP_ACP is the system ANSI code page; this assumes it is GBK (code page 936)
    int len = MultiByteToWideChar(CP_ACP, 0, gb2312, -1, NULL, 0);
    wchar_t* wstr = new wchar_t[len + 1];
    memset(wstr, 0, (len + 1) * sizeof(wchar_t));
    MultiByteToWideChar(CP_ACP, 0, gb2312, -1, wstr, len);

    len = WideCharToMultiByte(CP_UTF8, 0, wstr, -1, NULL, 0, NULL, NULL);
    char* str = new char[len + 1];
    memset(str, 0, len + 1);
    len = WideCharToMultiByte(CP_UTF8, 0, wstr, -1, str, len, NULL, NULL);
    delete[] wstr;

    // Overwrite the file only if the conversion produced something
    CFile fp;
    if (len > 0 && fp.Open(strWritePath, CFile::modeCreate | CFile::modeWrite | CFile::typeBinary, NULL))
    {
        const unsigned char aryBOM[] = { 0xEF, 0xBB, 0xBF };
        fp.Write(aryBOM, sizeof(aryBOM));
        // len counts the terminating NUL, which must not be written to the file
        if (len > 1)
            fp.Write(str, len - 1);
        fp.Close();
    }
    delete[] str;
}
If an error occurs during compilation, open Project -- Properties -- General and set "Use of MFC" to "Use MFC in a Shared DLL".
Put the successfully compiled .exe file in the project directory.
After selecting the project, simply add the source code directory to be converted after the .exe. The files are still saved as GBK while you write them, but before the project is compiled, all C++ source files under the specified directory are converted to UTF-8.
You can also run the .exe directly in cmd, followed by the directory to be converted. Format: convert.exe Dir
Of course, you could also write a Python script to do the conversion. There are many other ways to solve this problem.
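As one such alternative for the directory-walking part, here is a rough sketch that uses std::filesystem from C++17 instead of MFC's CFileFind. The function names isCppSource and collectSources are hypothetical and chosen only for illustration. The sketch only finds the files; the GBK to UTF-8 conversion itself would still be done as in the program above.

```cpp
#include <algorithm>
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: true for the two file types converted in this article.
bool isCppSource(const fs::path& p)
{
    const std::string ext = p.extension().string();
    return ext == ".cpp" || ext == ".h";
}

// Hypothetical helper: collect every .cpp/.h file under root, recursively.
// Returns an empty list if root does not exist or is not a directory.
std::vector<fs::path> collectSources(const fs::path& root)
{
    std::vector<fs::path> found;
    if (!fs::is_directory(root))
        return found;

    for (const fs::directory_entry& entry : fs::recursive_directory_iterator(root)) {
        if (entry.is_regular_file() && isCppSource(entry.path()))
            found.push_back(entry.path());
    }

    // Directory iteration order is unspecified, so sort for stable output
    std::sort(found.begin(), found.end());
    return found;
}
```

A caller would pass the same directory that convert.exe receives as its argument and then loop over the returned paths. Like the original program, the extension comparison here is case-sensitive, so a file named Foo.CPP would be skipped.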
That's all.
I hope this helps.