Batch-converting UTF-8 files to GB2312 on Linux
UTF-8 and GB2312 are different encodings, and importing a UTF-8 encoded SQL script with SQL*Plus can produce garbled text and errors. One fix is to convert the files from UTF-8 to GB2312, but converting them one at a time is tedious. This article shows how to batch-convert UTF-8 files to GB2312 on Linux.
Background
I was using Oracle's SQL*Plus to bulk-import UTF-8 encoded SQL scripts. I did not know how to make SQL*Plus recognize UTF-8, so the import produced garbled text, broken lines, and other errors, and the work could not continue. Searching Google did not help, so I had to find a way to convert the encoding of the files instead.
Because there were many files, converting them by hand was too tedious, so I decided to batch-convert them with a script. Fortunately there are plenty of related scripts online; the only tricky part was handling the UTF-8 BOM.
Content:
The code is as follows:
#!/bin/bash
# Convert every UTF-8 *.sql file under the current directory to GB2312.
# Note: the for loop splits on whitespace, so file names must not contain spaces.
for loop in $(find . -type f -name "*.sql" -print)
do
    echo "$loop"
    mv -f "$loop" "$loop.tmp"
    dos2unix "$loop.tmp"
    file_check_utf8='file_check_utf8.log'
    # Dump line 1 with non-printable bytes written as octal escapes.
    LC_ALL=C sed -n '1l' "$loop.tmp" > "$file_check_utf8"
    if grep '^\\357\\273\\277' "$file_check_utf8" > /dev/null 2>&1
    then
        echo 'UTF-8 BOM'
        # LC_ALL=C makes each "." match one byte, so exactly the 3 BOM bytes are removed.
        LC_ALL=C sed -n -e '1s/^...//' -e 'w intermediate.txt' "$loop.tmp"
        iconv -f UTF-8 -t GB2312 -o "$loop" intermediate.txt
        rm -f intermediate.txt
        rm -f "$loop.tmp"
    elif iconv -f UTF-8 -t GB2312 "$loop.tmp" > /dev/null 2>&1
    then
        echo 'UTF-8'
        iconv -f UTF-8 -t GB2312 -o "$loop" "$loop.tmp"
        rm -f "$loop.tmp"
    else
        echo 'ANSI'
        mv -f "$loop.tmp" "$loop"
    fi
    rm -f "$file_check_utf8"
    # Simulate unix2dos; the last line of the text file must end with a newline.
    sed -n -e 's/$/\r/g' -e "w $loop.tmp" "$loop"
    mv -f "$loop.tmp" "$loop"
done
Explanation
1. I did not find a neat way to deal with the UTF-8 BOM, so in the end I detect it with sed plus grep. If the first three bytes of the file are \357\273\277 (octal), the file must be UTF-8; sed removes those three bytes and the rest is converted.
2. To avoid converting a file twice or missing one, the script uses iconv to attempt a conversion on files without a BOM. If the conversion succeeds, the file is UTF-8; otherwise it is treated as ANSI, which here means it is already GB2312.
3. The final sed command is there because my system had no unix2dos command. I simulated it so that the files are easier to view and edit under Windows.
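The detection logic in points 1 and 2 can also be sketched without the sed plus grep step. The following is a minimal sketch, not the method used in the script: it reads the first three bytes with head and od, and strips them with tail. The function names detect_encoding and strip_bom are hypothetical, chosen only for this illustration, and the sketch assumes the usual head, od, tail, tr, and iconv tools are installed.

```shell
#!/bin/bash
# Hypothetical helpers for illustration; not part of the original script.

# Print 'UTF-8 BOM', 'UTF-8', or 'ANSI' for the given file,
# following the same three-way decision as the script.
detect_encoding() {
    local file=$1
    local head3
    # First three bytes as lowercase hex with no separators, e.g. "efbbbf".
    head3=$(head -c 3 "$file" | od -An -tx1 | tr -d ' \n')
    if [ "$head3" = "efbbbf" ]; then
        echo 'UTF-8 BOM'
    elif iconv -f UTF-8 -t GB2312 "$file" > /dev/null 2>&1; then
        # A trial conversion succeeded, so the file is valid UTF-8.
        echo 'UTF-8'
    else
        # Assumed to be GB2312 already, as in the script.
        echo 'ANSI'
    fi
}

# Write the file to stdout without its first three bytes (the BOM).
strip_bom() {
    tail -c +4 "$1"
}
```

For example, `strip_bom script.sql | iconv -f UTF-8 -t GB2312 > script.gb.sql` would convert a file that detect_encoding reported as 'UTF-8 BOM'. Working on bytes with head and tail avoids depending on how sed counts characters in the current locale.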
That covers batch-converting UTF-8 files to GB2312 on Linux. Once the files are converted, the garbled text problem in SQL*Plus goes away, and the script lets you convert a whole directory tree of SQL files in one run.