Continuing yesterday's topic: I mentioned the indirect effect of io:format on data sharing. The following two cases are, I'm afraid, even more likely to become "pitfalls". And yes, I happened to run into both of them.
If the test code is as follows, what will happen? Guess!
s2() ->
    L = [1,2,3,4,5,6],
    L2 = [L,L,L,L],
    erlang:display({{erts_debug:size(L),erts_debug:flat_size(L)},{erts_debug:size(L2),erts_debug:flat_size(L2)}}).

The result is:

5> d:s2().
{{12,12},{56,56}}
After seeing this result, I spent five minutes doubting my life. Why is it different from what I expected? With L shared, erts_debug:size(L2) should be 20 (12 words for L plus 8 words for the four cons cells of L2), not 56. Is it because I am using the latest version (17.2)? Was the implementation changed without the documentation being updated? Out of curiosity, I generated the Core Erlang (to_core) file, following the same steps I used when exploring the earlier problem. The truth is:
's2'/0 =
    %% Line 11
    fun () ->
        let <_cor5> =
            %% Line 14
            call 'erts_debug':'size'
                ([1|[2|[3|[4|[5|[6]]]]]])
        in let <_cor4> =
            %% Line 14
            call 'erts_debug':'flat_size'
                ([1|[2|[3|[4|[5|[6]]]]]])
        in let <_cor3> =
            %% Line 14
            call 'erts_debug':'size'
                ([[1|[2|[3|[4|[5|[6]]]]]]|[[1|[2|[3|[4|[5|[6]]]]]]|[[1|[2|[3|[4|[5|[6]]]]]]|[[1|[2|[3|[4|[5|[6]]]]]]]]]])
        in let <_cor2> =
            %% Line 14
            call 'erts_debug':'flat_size'
                ([[1|[2|[3|[4|[5|[6]]]]]]|[[1|[2|[3|[4|[5|[6]]]]]]|[[1|[2|[3|[4|[5|[6]]]]]]|[[1|[2|[3|[4|[5|[6]]]]]]]]]])
        in
            %% Line 14
            call 'erlang':'display'
                ({{_cor5,_cor4},{_cor3,_cor2}})
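For readers who want to reproduce a listing like the one above, here is a minimal sketch of one way to produce Core Erlang from inside Erlang. It relies on the to_core option of compile:file/2. The module name core_dump and the function to_core/1 are hypothetical names chosen for this illustration, and the variable names in the generated file may differ between OTP releases.

```erlang
-module(core_dump).
-export([to_core/1]).

%% Hypothetical helper for illustration only, not part of OTP.
%% Compiles SourceFile only as far as Core Erlang and writes the result
%% to a .core file (in the current directory unless outdir is given).
to_core(SourceFile) ->
    compile:file(SourceFile, [to_core]).
```

Assuming the test module lives in d.erl, calling core_dump:to_core("d") should leave a d.core file that can be read as plain text.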
Modify the code:

s3(L) ->
    L2 = [L,L,L,L],
    {{erts_debug:size(L),erts_debug:flat_size(L)},{erts_debug:size(L2),erts_debug:flat_size(L2)}}.

The corresponding s3 code is:

's3'/1 =
    %% Line 18
    fun (_cor0) ->
        let <L2> =
            %% Line 19
            [_cor0|[_cor0|[_cor0|[_cor0|[]]]]]
        in let <_cor5> =
            %% Line 20
            call 'erts_debug':'size'
                (_cor0)
        in let <_cor4> =
            %% Line 20
            call 'erts_debug':'flat_size'
                (_cor0)
        in let <_cor3> =
            %% Line 20
            call 'erts_debug':'size'
                (L2)
        in let <_cor2> =
            %% Line 20
            call 'erts_debug':'flat_size'
                (L2)
        in
            %% Line 20
            {{_cor5,_cor4},{_cor3,_cor2}}
In other words, the constant data in s2 was expanded at compile time, so size and flat_size return the same value for L2. I ran this test so that constant expansion would not skew the tests that follow.
How can this be avoided? Besides passing the list in as a parameter, as in s3, you can build it with a function call:
s4() ->
    L = lists:seq(1,6),
    L2 = [L,L,L,L],
    erlang:display({{erts_debug:size(L),erts_debug:flat_size(L)},{erts_debug:size(L2),erts_debug:flat_size(L2)}}).

The corresponding code is:

's4'/0 =
    %% Line 24
    fun () ->
        let <L> =
            %% Line 25
            call 'lists':'seq'
                (1, 6)
        in let <L2> =
            %% Line 26
            [L|[L|[L|[L|[]]]]]
        in let <_cor5> =
            %% Line 27
            call 'erts_debug':'size'
                (L)
        in let <_cor4> =
            %% Line 27
            call 'erts_debug':'flat_size'
                (L)
        in let <_cor3> =
            %% Line 27
            call 'erts_debug':'size'
                (L2)
        in let <_cor2> =
            %% Line 27
            call 'erts_debug':'flat_size'
                (L2)
        in
            %% Line 27
            call 'erlang':'display'
                ({{_cor5,_cor4},{_cor3,_cor2}})
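To make the difference between the two measurements concrete, here is a small sketch in the spirit of s4. The module name share_demo and the function sizes/1 are hypothetical and exist only for this illustration. The list is built at run time, so the compiler has no literal to expand.

```erlang
-module(share_demo).
-export([sizes/1]).

%% Hypothetical helper for illustration only.
%% Builds an N-element list at run time (so the compiler cannot turn it
%% into an expanded literal), references it four times, and returns
%% {SizeWithSharing, FlatSize} in heap words.
sizes(N) ->
    L = lists:seq(1, N),
    L2 = [L, L, L, L],
    {erts_debug:size(L2), erts_debug:flat_size(L2)}.
```

With N = 6, each cons cell takes 2 words, so L is 12 words and the four cons cells of L2 add 8 more. Counting L once gives 20, counting it four times gives 56, so share_demo:sizes(6) should return {20,56}.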
Don't underestimate this problem. In extreme cases, constant expansion can deliver a "big surprise". This article provides an example:
show_compiler_crashes() ->
    L0 = [0],
    L1 = [L0, L0, L0, L0, L0, L0, L0, L0, L0, L0],
    L2 = [L1, L1, L1, L1, L1, L1, L1, L1, L1, L1],
    L3 = [L2, L2, L2, L2, L2, L2, L2, L2, L2, L2],
    L4 = [L3, L3, L3, L3, L3, L3, L3, L3, L3, L3],
    L5 = [L4, L4, L4, L4, L4, L4, L4, L4, L4, L4],
    L6 = [L5, L5, L5, L5, L5, L5, L5, L5, L5, L5],
    L7 = [L6, L6, L6, L6, L6, L6, L6, L6, L6, L6],
    L8 = [L7, L7, L7, L7, L7, L7, L7, L7, L7, L7],
    L9 = [L8, L8, L8, L8, L8, L8, L8, L8, L8, L8],
    L = [L9, L9, L9, L9, L9, L9, L9, L9, L9, L9],
    L.
What is the impact? After a bit more than 45 minutes of struggling, the compiler tries to allocate 3.7 GB of memory and gives up:
$ erlc demo.erl
Crash dump was written to: erl_crash.dump
eheap_alloc: Cannot allocate 3716993744 bytes of memory (of type "heap_frag").
Abort
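The same shape can be built at run time instead, which keeps the compiler out of the picture. The following is only a sketch under that assumption: the module name nested_demo and the function build/1 are hypothetical, and lists:duplicate/2 is used here as a convenient way to reference one term ten times.

```erlang
-module(nested_demo).
-export([build/1]).

%% Hypothetical helper for illustration only.
%% Starts from [0] and, Depth times, wraps the previous level in a
%% 10-element list. lists:duplicate/2 repeats the same term, so each
%% level adds only 10 cons cells (20 words) to the heap.
build(Depth) ->
    lists:foldl(fun(_Level, Acc) -> lists:duplicate(10, Acc) end,
                [0],
                lists:seq(1, Depth)).
```

For a small depth the numbers are easy to check: erts_debug:size(nested_demo:build(3)) should be 62 words (2 for [0] plus 20 per level), while erts_debug:flat_size/1 on the same term should be 4220. At depth 10, the depth of show_compiler_crashes/0, the shared representation is still only 2 + 10 * 20 = 202 words, but the flattened term has 10^10 leaves, so anything that flattens it (printing, copying to another process, flat_size) is best avoided.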
Well, I brought this on myself. Because of the annoying problem above, I decided to run the remaining tests in the shell, and promptly stepped into a new trap:
Trap 2: shell! shell!
Eshell V6.0 (abort with ^G)
1> L=[1,2,3,4,5,6,7,8,9,10].
[1,2,3,4,5,6,7,8,9,10]
2> L2=[L,L,L,L,L,L].
[[1,2,3,4,5,6,7,8,9,10],
 [1,2,3,4,5,6,7,8,9,10],
 [1,2,3,4,5,6,7,8,9,10],
 [1,2,3,4,5,6,7,8,9,10],
 [1,2,3,4,5,6,7,8,9,10],
 [1,2,3,4,5,6,7,8,9,10]]
3> erts_debug:size(L2).
32
4> erts_debug:flat_size(L2).
132
5> io:format("~p",[L2]).
[[1,2,3,4,5,6,7,8,9,10],
 [1,2,3,4,5,6,7,8,9,10],
 [1,2,3,4,5,6,7,8,9,10],
 [1,2,3,4,5,6,7,8,9,10],
 [1,2,3,4,5,6,7,8,9,10],
 [1,2,3,4,5,6,7,8,9,10]]ok
6> erts_debug:size(L2).
32
7> erts_debug:flat_size(L2).
132
In the session below, the shell PID is <0.33.0> at startup. Midway through, we deliberately call a nonexistent function, fake:fake(). The shell is then restarted and its PID changes to <0.40.0>. Running erts_debug:size(L2) again now returns 132 instead of 32. In other words, the L2 data has been expanded.
Eshell V6.0 (abort with ^G)
1> self().
<0.33.0>
2> L=[1,2,3,4,5,6,7,8,9,10].
[1,2,3,4,5,6,7,8,9,10]
3> L2=[L,L,L,L,L,L].
[[1,2,3,4,5,6,7,8,9,10],
 [1,2,3,4,5,6,7,8,9,10],
 [1,2,3,4,5,6,7,8,9,10],
 [1,2,3,4,5,6,7,8,9,10],
 [1,2,3,4,5,6,7,8,9,10],
 [1,2,3,4,5,6,7,8,9,10]]
4> erts_debug:size(L2).
32
5> erts_debug:flat_size(L2).
132
6> fake:fake().
** exception error: undefined function fake:fake/0
7> self().
<0.40.0>
8> erts_debug:size(L2).
132
9> erts_debug:flat_size(L2).
132
10>
Why does this trigger data expansion (flattening)? See the following code. When the shell is restarted, the previously bound variables (Bs in the code) are captured by the fun passed to spawn_link, which starts the new shell evaluator.
erl6.2\lib\stdlib-2.2\src

start_eval(Bs, RT, Ds) ->
    Self = self(),
    Eval = spawn_link(fun() -> evaluator(Self, Bs, RT, Ds) end),
    put(evaluator, Eval),
    Eval.
In other words, when a process is created with spawn in Erlang, its arguments (including the variables captured by a fun closure) have to be copied to the heap of the new process, and, as the session above shows, that copy expands shared terms. So the size of the arguments must be taken into account when creating a process.
OK, that is about it for this problem. Time for a break.
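The copy can be observed directly with a small experiment. This is only a sketch: the module name spawn_copy_demo and the function run/0 are hypothetical, and the 5-second timeout is an arbitrary choice. The list is built with lists:seq/2 so that it lives on the process heap rather than in a compile-time literal.

```erlang
-module(spawn_copy_demo).
-export([run/0]).

%% Hypothetical helper for illustration only.
%% Measures a shared term in the parent, then lets a spawned process
%% measure its own copy of the same term (captured in the closure).
%% Returns {SizeInParent, SizeInChild} in heap words.
run() ->
    L = lists:seq(1, 10),
    L2 = [L, L, L, L, L, L],
    Parent = self(),
    Before = erts_debug:size(L2),
    %% L2 is part of the fun's environment, so it is copied to the
    %% heap of the new process when the process is created.
    spawn_link(fun() -> Parent ! {child_size, erts_debug:size(L2)} end),
    receive
        {child_size, After} -> {Before, After}
    after 5000 ->
        timeout
    end.
```

The first element should be 32 (20 words for L plus 12 for the six cons cells of L2). On an emulator that copies terms without preserving sharing, as in the shell sessions above, the second element should be 132, the same as the flat size.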
[Erlang 0128] term sharing in Erlang/OTP