Erlang ETS -- something about cache, continued

Last time I wrote about the basic idea of implementing a simple cache (http://www.cnblogs.com/--00/p/erlang_ets_something_about_cache.html), and at the end of that article I touched on estimating the memory consumption of a single record. This time, let's continue with the memory usage of Erlang data terms.

The Erlang Efficiency Guide clearly documents the memory consumption of the different data types in the Erlang system. Here is a short excerpt:

Small integer: 1 word.
    On 32-bit architectures: -134217729 < i < 134217728
    On 64-bit architectures: -576460752303423489 < i < 576460752303423488
List: 1 word + 1 word per element + the size of each element
Atom: 1 word. Note: an atom refers into an atom table, which also consumes memory. The atom text is stored once for each unique atom in this table. The atom table is not garbage-collected.
String (is the same as a list of integers): 1 word + 2 words per character

From the documentation, you can see that a small integer occupies 1 word, an atom occupies 1 word, and the space used by a list depends mainly on the number of elements and the size of each element.

For example:

["123", "234"] the amount of memory occupied is calculated 1 + (1 + (1 + 2 * 3)) + (1 + (1 + 2 * 3)) = 17 is 17 bytes.

Tips:

Note that an atom occupies only 1 word in the Erlang system, which is very helpful when constructing messages.

The only operation that can be performed on an atom in Erlang is comparison. In Erlang, module names and function names are atoms, and atoms are commonly used to tag messages. Comparing atoms takes constant time, regardless of the atom's length (if you use a binary as a tag instead, the comparison time is linear in its length). Atoms are designed for comparison; do not use them for anything other than comparison operations.
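To make the tag point concrete, here is a minimal sketch (the module name and message shapes are invented for this illustration, not taken from the original post): matching on an atom tag is a constant-time comparison, while a binary tag is compared byte by byte.

-module(tag_demo).
-export([loop/0]).

%% Sketch: a receive loop accepting both an atom-tagged and a
%% binary-tagged request (illustrative shapes only).
loop() ->
    receive
        {get, From, Key} ->          % atom tag: constant-time match
            From ! {value, Key},
            loop();
        {<<"get">>, From, Key} ->    % binary tag: match cost grows with the tag length
            From ! {value, Key},
            loop();
        stop ->
            ok
    end.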

For further reading, see Strong's blog.

Now that we know how the various Erlang data terms are laid out in memory, how can we calculate their sizes quickly? Is there a ready-made API, so that we do not have to work it out by hand every time?

Let's take a look at the size-related functions provided by the Erlang system (a quick shell sketch of the documented ones follows the list):

    • Applicable to all data terms: erlang:external_size/1, erts_debug:size/1, erts_debug:flat_size/1

    • Applicable to binaries and bitstrings: erlang:size/1, erlang:byte_size/1, erlang:bit_size/1

    • Applicable to tuples: erlang:size/1, erlang:tuple_size/1
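Before turning to the erts_debug functions, here is a quick shell sketch of the documented ones (the example terms are chosen purely for illustration):

1> erlang:byte_size(<<"abcde">>).   % number of bytes in a binary
5
2> erlang:bit_size(<<"abcde">>).    % number of bits in a bitstring
40
3> erlang:tuple_size({a, b, c}).    % number of elements in a tuple
3

erlang:external_size/1, by contrast, returns the maximum byte size of term_to_binary(Term), i.e. the size in the external term format rather than in memory.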

Among them, the two erts_debug functions are the most important:

erts_debug:size/1 and erts_debug:flat_size/1 are both undocumented functions that can be used to calculate the space an Erlang term requires in memory. The space consumption of the various data types can be found at http://www.erlang.org/doc/efficiency_guide/advanced.html#id68912. The difference between the two is that, for a data structure containing shared subterms, erts_debug:size/1 counts the shared data only once, while erts_debug:flat_size/1 counts it every time it appears.

Here is the relevant comment from the Erlang source code:

%% size(Term)
%%  Returns the size of Term in actual heap words. Shared subterms are
%%  counted once.  Example: If A = [a,b], B = [A,A] then size(B) returns 8,
%%  while flat_size(B) returns 12.
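The same example can be reproduced in the shell; a small sketch, with the expected values taken from the source comment above:

1> A = [a, b].
[a,b]
2> B = [A, A].                % the sublist A is shared inside B
[[a,b],[a,b]]
3> erts_debug:size(B).        % the shared subterm A is counted once
8
4> erts_debug:flat_size(B).   % sharing is ignored, A is counted twice
12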

There is another example in the document: http://www.erlang.org/doc/efficiency_guide/processes.html

Roughly speaking, erts_debug:size/1 gives the amount of space an Erlang term actually occupies in memory, while erts_debug:flat_size/1 gives the size of the data that has to be copied when the term is sent to another process on the same node, or written to an ETS table.

OK, let's do a simple test first:

$ cat test_for_ets_record_flat_size.erl
-module(test_for_ets_record_flat_size).

-compile(export_all).

start() ->
    A = ets:new(a, [named_table, public]),
    D = {[{} || _ <- lists:seq(1, 100)],
         [self() || _ <- lists:seq(1, 10)],
         [{<<"1234567890">>, {}} || _ <- lists:seq(1, 1000)]},
    io:format("** data words size ~p~n", [erts_debug:flat_size(D)]),
    io:format("** before insert ~p~n", [ets:info(A, memory)]),
    ets:insert(A, D),
    io:format("** after insert ~p~n", [ets:info(A, memory)]).

Execution Result:

$ erl
Erlang/OTP 17 [erts-6.3] [source] [64-bit] [smp:8:8] [async-threads:10] [hipe] [kernel-poll:false] [dtrace]

Eshell V6.3  (abort with ^G)
1> test_for_ets_record_flat_size:start().
** data words size 10324
** before insert 305
** after insert 10633
ok

The ETS table grew by 10633 - 305 = 10328 words, while flat_size reported 10324 words, a difference of 4 words (with erts_debug:size/1 the gap is as large as 8,095 words). Why?

We know that erts_debug:size/1 counts shared data only once while erts_debug:flat_size/1 counts it repeatedly, and ETS stores a full, flat copy, so the large gap for erts_debug:size/1 is expected. But why does erts_debug:flat_size/1 still differ by 4 words?

Puzzled by this, I googled "erlang erts_debug flat_size" and found a thread on the erlang-questions mailing list:

    After looking at this more I had realized the documentation of the memory information is correct, as would be expected. Sorry for the noise. Providing the process heap size only, with an additional 1 word excluded for the register or stack storage of the top-level term, would help make things clearer. This is straight-forward for some since it makes logical sense, but I didn't know about these internal details and I wanted to be sure I was looking at the size correctly.

In other words, erts_debug:flat_size/1 only reports the term's footprint on the process heap.

Look back at the source code:

Returns the size of the term in actual heap words.

So that is the catch: erts_debug:flat_size/1 only calculates the memory an Erlang term occupies on the process heap; it does not account for the full memory footprint. Through the mailing-list thread above, I also found an open-source project on GitHub: erlang_term.

Extract the key piece of code:

byte_size_term(Term, WordSize) ->
    DataSize = if
        is_binary(Term) ->
            BinarySize = erlang:byte_size(Term),
            if
                BinarySize > 64 ->
                    BinarySize;
                true ->
                    % in the heap size
                    0
            end;
        true ->
            0
    end,
    % stack/register size + heap size + data size
    (1 + erts_debug:flat_size(Term)) * WordSize + DataSize.

As can be seen from the code above, the total memory footprint of an Erlang term is the sum of its process-heap footprint (computed by erts_debug:flat_size/1), one word of stack/register storage for the top-level term, and the data of any large (more than 64 bytes, reference-counted) binaries, which lives outside the process heap and is shared.
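For instance, with a binary larger than 64 bytes the off-heap part starts to matter. Here is a minimal sketch applying the same formula by hand (the module name and the 100-byte example binary are chosen for illustration; this is not code from the erlang_term project):

-module(term_bytes_demo).
-export([demo/0]).

%% Sketch: total bytes = (1 stack/register word + heap words) * word size,
%% plus the off-heap payload of a large (> 64 bytes, reference-counted) binary.
demo() ->
    WordSize = erlang:system_info(wordsize),
    Big = binary:copy(<<"x">>, 100),            % 100 bytes -> stored off-heap
    OffHeap = case erlang:byte_size(Big) > 64 of
                  true  -> erlang:byte_size(Big);
                  false -> 0
              end,
    Total = (1 + erts_debug:flat_size(Big)) * WordSize + OffHeap,
    io:format("heap words ~p, off-heap bytes ~p, total bytes ~p~n",
              [erts_debug:flat_size(Big), OffHeap, Total]).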

OK, on to the next test:

$ cat test_for_ets_record.erl
-module(test_for_ets_record).

-compile(export_all).

start() ->
    A = ets:new(a, [named_table, public]),
    D = {[{} || _ <- lists:seq(1, 100)],
         [self() || _ <- lists:seq(1, 10)],
         [{<<"1234567890">>, {}} || _ <- lists:seq(1, 1000)]},
    io:format("** data words size ~p~n", [erlang_term:byte_size(D) / 8]),
    io:format("** before insert ~p~n", [ets:info(A, memory)]),
    ets:insert(A, D),
    io:format("** after insert ~p~n", [ets:info(A, memory)]).

Test results:

$ erl -pa ./ebin -pa ./
Erlang/OTP 17 [erts-6.3] [source] [64-bit] [smp:8:8] [async-threads:10] [hipe] [kernel-poll:false] [dtrace]

Eshell V6.3  (abort with ^G)
1> test_for_ets_record:start().
** data words size 10325.0
** before insert 305
** after insert 10633
ok

Why?? Why is there still a 3-word difference (10328 - 10325)? Well, the only option left is to open an issue and ask the author.

So the erlang_term module can help with sizing a cache, but the real situation inside the Erlang VM, with its many memory allocators, is more complex.

To figure out where those extra 3 words go in the ETS table, we would need a deeper understanding of how ETS manages memory. That will have to wait for now (to be continued).

Summary:

1. erts_debug:flat_size/1 only calculates the size of an Erlang term on the process heap;

2. the erlang_term library is a handy shortcut for estimating a term's full byte size.
