It has been a while since Erlang R17 was released. Our company planned to migrate its projects from an older Erlang release to R17, but we ran into a lot of trouble, and one of the problems involved Chinese characters.
The problem is easy to reproduce: create a new file t.erl with the following content and save it as UTF-8 without a BOM.
    -module(t).
    -export([test/0]).

    test() -> ["我", <<"我">>].
In the old Erlang version, the code above works as expected. The result is:
    Eshell V5.9.1 (abort with ^G)
    1> c(t).
    {ok,t}
    2> t:test().
    [[230,136,145],<<230,136,145>>]
After compiling with R17, the result is:
    Eshell V6.0 (abort with ^G)
    1> c(t).
    {ok,t}
    2> t:test().
    [[25105],<<17>>]
The Erlang documentation explains why: starting with R17, the default encoding of source files changed from Latin-1 to UTF-8. The three bytes in the file are now decoded as the single code point 25105, and the binary literal keeps only its low byte, 17.
"In Erlang/OTP 17.0, the encoding default for Erlang source files was switched to UTF-8 and in Erlang/OTP 18.0 Erlang will support atoms in the full Unicode range, meaning full Unicode function and module names."
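To see where the values 25105 and 17 come from, here is a minimal sketch that works on the raw bytes instead of on a source literal, so it does not depend on how the file itself is encoded. The module name enc_demo and its function names are made up for this illustration.

```erlang
-module(enc_demo).
-export([utf8_list/0, truncated/0, utf8_binary/0]).

%% The three bytes stored between the quotes in the UTF-8 file.
-define(BYTES, <<230,136,145>>).

%% Decoding the bytes as UTF-8 yields one code point.
utf8_list() ->
    unicode:characters_to_list(?BYTES, utf8).

%% An integer segment is 8 bits wide by default, so only the
%% low eight bits of the code point are kept.
truncated() ->
    [CodePoint] = utf8_list(),
    <<CodePoint>>.

%% Asking for UTF-8 encoding explicitly gives the three bytes back.
utf8_binary() ->
    [CodePoint] = utf8_list(),
    <<CodePoint/utf8>>.
```

In the same spirit, a binary literal in an R17 source file can carry the /utf8 type specifier after the string, which keeps all three bytes.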
To make R17 compile UTF-8 files without a BOM the way older releases did, add the comment "%% coding: latin-1" at the top of the file (it has to be on one of the first two lines). The code then looks like this:
    %% coding: latin-1
    -module(t).
    -export([test/0]).

    test() -> ["我", <<"我">>].
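The reason a Latin-1 declaration restores the old result is that Latin-1 maps every byte to exactly one character, so the bytes of the file pass through unchanged. A small sketch of that idea, again on raw bytes and with a hypothetical module name latin1_demo:

```erlang
-module(latin1_demo).
-export([latin1_list/0, roundtrip/0]).

%% The three bytes stored between the quotes in the source file.
-define(BYTES, <<230,136,145>>).

%% Read as Latin-1, each byte becomes its own character.
latin1_list() ->
    unicode:characters_to_list(?BYTES, latin1).

%% Turning that list back into a binary reproduces the original bytes.
roundtrip() ->
    list_to_binary(latin1_list()) =:= ?BYTES.
```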
The catch: writing a script that adds the header to existing source files is easy, but every newly created file needs the header too, which is easy to forget. file:consult/1 is affected as well, so files read with it also need the header declaration.
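A script for the existing files could look roughly like the sketch below. The module name add_coding_header and its functions are invented for this example, and it only handles the .erl files directly inside one directory. It relies on epp:read_encoding/1 to skip files that already declare an encoding.

```erlang
-module(add_coding_header).
-export([fix_dir/1, fix_file/1]).

-define(HEADER, <<"%% coding: latin-1\n">>).

%% Process every .erl file directly inside Dir (not recursive).
fix_dir(Dir) ->
    Files = filelib:wildcard(filename:join(Dir, "*.erl")),
    [{File, fix_file(File)} || File <- Files].

%% Prepend the header unless the file already declares an encoding.
fix_file(File) ->
    case epp:read_encoding(File) of
        none ->
            {ok, Bin} = file:read_file(File),
            ok = file:write_file(File, [?HEADER, Bin]),
            added;
        _Encoding ->
            unchanged
    end.
```

Calling add_coding_header:fix_dir("src") would return a list of {File, added | unchanged} pairs. Because files with an existing declaration are skipped, running it twice should be harmless. Back up the sources or use version control before trying it.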
Erlang does not provide a startup flag that restores the old Latin-1 default. I tried erl +pc latin1, but it did not solve the problem. That flag only selects which characters the shell treats as printable, so this is probably not a bug.
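So instead we write our own compile function in Erlang that parses source files with Latin-1 as the default encoding. The code is as follows:
    -module(test).
    -compile(export_all).

    compile(FileName) ->
        compile(FileName, [verbose, report_errors, report_warnings]).

    compile(FileName, Options) ->
        %% Strip the directory and the ".erl" extension: "src/t.erl" -> "t".
        Module = filename:basename(FileName, ".erl"),
        %% Parse the source with Latin-1 instead of the R17 default, UTF-8.
        {ok, Forms} = epp:parse_file(FileName, [{default_encoding, latin1}]),
        {ok, Mod, Code} = compile:forms(Forms, Options),
        {ok, Cwd} = file:get_cwd(),
        %% Load the new code, then write the .beam file to the current directory.
        code:load_binary(Mod, FileName, Code),
        file:write_file(filename:join(Cwd, Module ++ ".beam"), Code, [write, binary]).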
Note that the code above does not work on Erlang releases before R17, because some of the interfaces it uses are not available there. The result is as follows:
    14> c(test).
    {ok,test}
    15> test:compile("t.erl").
    ok
    16> t:test().
    [[230,136,145],<<230,136,145>>]
In addition, file:consult/1 is re-implemented as follows:
    consult(File) ->
        case file:open(File, [read]) of
            {ok, Fd} ->
                R = consult_stream(Fd),
                _ = file:close(Fd),
                R;
            Error ->
                Error
        end.

    consult_stream(Fd) ->
        %% Force Latin-1 instead of relying on the default encoding.
        _ = epp:set_encoding(Fd, latin1),
        consult_stream(Fd, 1, []).

    consult_stream(Fd, Line, Acc) ->
        case io:read(Fd, '', Line) of
            {ok, Term, EndLine} ->
                consult_stream(Fd, EndLine, [Term | Acc]);
            {error, Error, _Line} ->
                {error, Error};
            {eof, _Line} ->
                {ok, lists:reverse(Acc)}
        end.
Erlang R17 does not handle Chinese characters in such files the way older releases did, but hopefully a later release will provide an option for Latin-1 compatibility.
Reference: http://blog.csdn.net/mycwq/article/details/40718281
Solving the problem of Erlang R17 not recognizing Chinese characters