[Erlang0068] Erlang dict

Last Update:2018-12-06 Source: Internet

Author: User


dict is a dictionary implemented with a dynamic hash table. Its interface is consistent with orddict, and the implementation resembles the way a dynamic array grows. Compared with proplists and orddict, dict handles larger data volumes, so you can switch from orddict to dict when the data grows. dict is built on dynamic hashing; the theoretical basis is the paper "The Design and Implementation of Dynamic Hashing for Sets and Tables in Icon". Paper address: http://www.2007.cccg.ca /~ Morin/teaching/5408/refs/a99133

Arrays are easy to address but costly to insert into and delete from; linked lists are costly to address but easy to insert into and delete from. For a hash table, the time to insert or delete depends on the lookup time. A hash table establishes a fixed functional relationship between a data item and its storage location, which is what makes lookups efficient. In linear lists and trees, the position of an item in the structure has no fixed relationship to its value, so items are found by comparison and the search cost depends on the number of comparisons.

Wikipedia's description of a hash table does not distinguish between these terms; there, slot and bucket name the same thing:

"The hash function is used to transform the key into the index (the hash) of an array element (the slot or bucket) where the corresponding value is to be sought."

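In the implementation of dict, however, segment, slot, and bucket are three distinct concepts of decreasing scope: a segment is a fixed-size tuple of buckets, a slot is the number that selects one bucket, and a bucket is the list of key-value pairs stored there. Their relationship is visible in fetch: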

    fetch(Key, D) ->
        Slot = get_slot(D, Key),
        Bkt = get_bucket(D, Slot),
        try fetch_val(Key, Bkt)
        catch
            badarg -> erlang:error(badarg, [Key, D])
        end.

    %% get_slot(Hashdb, Key) -> Slot.
    %%  Get the slot.  First hash on the new range, if we hit a bucket
    %%  which has not been split use the unsplit buddy bucket.

    get_slot(T, Key) ->
        H = erlang:phash(Key, T#dict.maxn),
        if
            H > T#dict.n -> H - T#dict.bso;
            true -> H
        end.

    %% get_bucket(Hashdb, Slot) -> Bucket.

    get_bucket(T, Slot) -> get_bucket_s(T#dict.segs, Slot).
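To make the slot calculation concrete, here is a minimal standalone sketch of the same decision, written outside the dict record. The module name slot_sketch and the functions slot/4 and all_active/3 are hypothetical. erlang:phash2/2 plus 1 (to get a 1-based value) stands in for the erlang:phash/2 call shown above, so the slot numbers will not match those of the real dict.

```erlang
-module(slot_sketch).
-export([slot/4, all_active/3]).

%% slot(Key, N, MaxN, Bso) -> Slot.
%%  N    = number of active slots
%%  MaxN = range the key is hashed onto
%%  Bso  = buddy slot offset
%% Assumption: phash2/2 returns 0..MaxN-1, so 1 is added to get 1..MaxN.
slot(Key, N, MaxN, Bso) ->
    H = erlang:phash2(Key, MaxN) + 1,
    if
        %% The hash points at a slot that is not active yet:
        %% fall back to its unsplit buddy.
        H > N -> H - Bso;
        true -> H
    end.

%% Check that the integer keys 1..1000 all map to an active slot.
all_active(N, MaxN, Bso) ->
    lists:all(fun(Key) ->
                      S = slot(Key, N, MaxN, Bso),
                      S >= 1 andalso S =< N
              end, lists:seq(1, 1000)).
```

For example, with N = 20, MaxN = 32, and Bso = 16, a hash value of 27 lies beyond the active range and is redirected to slot 11, the buddy that has not been split yet.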

The segment size is fixed, so as the data grows only the top-level segments tuple has to be resized. The last element of the segments tuple is an empty segment kept for later expansion. The segments tuple doubles in size each time it grows, so resizing does not seriously hurt performance. Note that the interface exposed by dict reveals nothing about where the data is actually stored.

store/3, append/3, append_list/3, update/4, and update_counter/3 all check whether expansion is required; erase/2 and filter/2 check whether the table should contract. (update/3 only modifies an existing key, so it cannot trigger expansion.) Because dict resizes dynamically as the data volume changes, it balances memory consumption against access efficiency.
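Because the interface matches orddict and the resizing stays hidden behind it, code can be written against either module. The following is a small sketch, with a hypothetical module name dict_swap_demo, that takes the module as a parameter. It uses only new/0, update_counter/3, and to_list/1, which both dict and orddict export.

```erlang
-module(dict_swap_demo).
-export([word_count/2, same_result/0]).

%% Count occurrences of each word using the dictionary module Mod,
%% which is expected to be either dict or orddict.
word_count(Mod, Words) ->
    Counts = lists:foldl(fun(W, Acc) -> Mod:update_counter(W, 1, Acc) end,
                         Mod:new(), Words),
    %% dict:to_list/1 returns pairs in no particular order, so sort them.
    lists:sort(Mod:to_list(Counts)).

%% Both modules should give the same answer for the same input.
same_result() ->
    Words = [a, b, a, c, b, a],
    word_count(orddict, Words) =:= word_count(dict, Words).
```

Switching from orddict to dict when the data grows is then a matter of passing a different module name.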

    %% Note: mk_seg/1 must be changed too if seg_size is changed.
    -define(seg_size, 16).
    -define(max_seg, 32).
    -define(expand_load, 5).
    -define(contract_load, 3).
    -define(exp_size, (?seg_size * ?expand_load)).
    -define(con_size, (?seg_size * ?contract_load)).

    %% Define a hashtable.  The default values are the standard ones.
    -record(dict,
            {size=0,                 % Number of elements
             n=?seg_size,            % Number of active slots
             %% Maximum slots: the largest number of buckets the table
             %% currently allows. Expansion uses it to decide whether a
             %% new segment of buckets must be added. Initial value: 16.
             maxn=?seg_size,
             bso=?seg_size div 2,    % Buddy slot offset
             exp_size=?exp_size,     % Initial expansion threshold: 16*5 = 80
             con_size=?con_size,     % Initial contraction threshold: 16*3 = 48
             empty :: tuple(),       % Empty segment
             segs :: tuple()         % Segments, where all data is stored
            }).
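The threshold arithmetic can be sketched on its own. The initial values 80 and 48 come from the definitions above; the rest is an assumption on my part, not something the article states: that both thresholds are recomputed from the number of active slots whenever it changes, and that the table never contracts below seg_size slots. The module name load_sketch and its functions are hypothetical.

```erlang
-module(load_sketch).
-export([thresholds/1, check/2]).

-define(seg_size, 16).
-define(expand_load, 5).
-define(contract_load, 3).

%% thresholds(N) -> {ExpSize, ConSize} for a table with N active slots.
thresholds(N) -> {N * ?expand_load, N * ?contract_load}.

%% check(Size, N) -> expand | contract | keep.
%% Assumption: expand when Size exceeds the expansion threshold, contract
%% when Size drops below the contraction threshold and more than
%% seg_size slots are active.
check(Size, N) ->
    {Exp, Con} = thresholds(N),
    if
        Size > Exp -> expand;
        Size < Con, N > ?seg_size -> contract;
        true -> keep
    end.
```

Keeping the contraction load (3) below the expansion load (5) leaves a gap between the two thresholds, so a table whose size hovers around one threshold does not expand and contract on every update.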

When a dict is created, the empty field is initialized with an empty segment, which then serves as the template for new segments.

    new() ->
        Empty = mk_seg(?seg_size),
        #dict{empty=Empty,segs={Empty}}.

    mk_seg(16) -> {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]}.
    %% The value 16 is also empirical, arrived at through testing.

Key-value format:

    -define(kv(K,V), [K|V]).    % Key-Value pair format

dict stores each key-value pair as a cons cell [K|V], which is an improper list whenever V is not itself a list. Look at the implementation of append_bkt below: my guess is that this representation was chosen so that the bag can be treated as a single value.

    Eshell V5.9.1  (abort with ^G)
    1> dict:new().
    {dict,0,16,16,8,80,48,
          {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
          {{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]}}}
    2> dict:store(k,v,v(1)).
    {dict,1,16,16,8,80,48,
          {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
          {{[],[],[],[],[],[],[],[],[],[],[],[[k|v]],[],[],[],[]}}}
    3> dict:store(k2,v2,v(2)).
    {dict,2,16,16,8,80,48,
          {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
          {{[],[],
            [[k2|v2]],
            [],[],[],[],[],[],[],[],
            [[k|v]],
            [],[],[],[]}}}
    4>

    %% append_bkt(Key, Val, Bucket) -> {NewBucket,PutCount}.

    append_bkt(Key, Val, [?kv(Key,Bag)|Bkt]) -> {[?kv(Key,Bag ++ [Val])|Bkt],0};
    append_bkt(Key, Val, [Other|Bkt0]) ->
        {Bkt1,Ic} = append_bkt(Key, Val, Bkt0),
        {[Other|Bkt1],Ic};
    append_bkt(Key, Val, []) -> {[?kv(Key,[Val])],1}.

    %% app_list_bkt(Key, L, Bucket) -> {NewBucket,PutCount}.

    app_list_bkt(Key, L, [?kv(Key,Bag)|Bkt]) -> {[?kv(Key,Bag ++ L)|Bkt],0};
    app_list_bkt(Key, L, [Other|Bkt0]) ->
        {Bkt1,Ic} = app_list_bkt(Key, L, Bkt0),
        {[Other|Bkt1],Ic};
    app_list_bkt(Key, L, []) -> {[?kv(Key,L)],1}.
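From the caller's side, these bucket functions sit behind dict:append/3 and dict:append_list/3. A short usage sketch follows; the module name bag_demo and the keys and values are made up for illustration.

```erlang
-module(bag_demo).
-export([demo/0]).

%% Build up a bag (a list of values) under a single key.
demo() ->
    D0 = dict:new(),
    D1 = dict:append(fruit, apple, D0),              % new key: bag is [apple]
    D2 = dict:append(fruit, pear, D1),               % existing key: value added at the end
    D3 = dict:append_list(fruit, [plum, fig], D2),   % a whole list added at the end
    dict:fetch(fruit, D3).
```

Calling bag_demo:demo() returns [apple,pear,plum,fig]. The first append takes the empty-bucket clause and counts as one new element; the later calls find the key and extend its bag with ++.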

When should you use gb_trees over dicts? Well, it's not a clear decision. As the benchmark module I have written will show, gb_trees and dicts have somewhat similar performances in many respects. However, the benchmark demonstrates that dicts have the best read speeds while the gb_trees tend to be a little quicker on other operations. You can judge based on your own needs which one would be the best.

Oh and also note that while dicts have a fold function, gb_trees don't: they instead have an iterator function, which returns a bit of the tree on which you can call gb_trees:next(Iterator) to get the following values in order. What this means is that you need to write your own recursive functions on top of gb_trees rather than use a generic fold. On the other hand, gb_trees let you have quick access to the smallest and largest elements of the structure with gb_trees:smallest/1 and gb_trees:largest/1.

Link: http://learnyousomeerlang.com/a-short-visit-to-common-data-structures
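The difference described in the quote can be sketched as follows: summing the values of a dict with dict:fold/3, and doing the same over a gb_tree with a hand-written recursion on gb_trees:iterator/1 and gb_trees:next/1. The module name fold_demo and the sample data are made up for illustration.

```erlang
-module(fold_demo).
-export([dict_sum/1, tree_sum/1, demo/0]).

%% dict has a generic fold.
dict_sum(D) ->
    dict:fold(fun(_Key, Val, Acc) -> Val + Acc end, 0, D).

%% For gb_trees, walk the iterator by hand.
tree_sum(T) ->
    tree_sum(gb_trees:next(gb_trees:iterator(T)), 0).

tree_sum(none, Acc) -> Acc;
tree_sum({_Key, Val, Iter}, Acc) ->
    tree_sum(gb_trees:next(Iter), Acc + Val).

demo() ->
    Pairs = [{a, 1}, {b, 2}, {c, 3}],       % already sorted by key
    D = dict:from_list(Pairs),
    T = gb_trees:from_orddict(Pairs),
    {dict_sum(D), tree_sum(T), gb_trees:smallest(T), gb_trees:largest(T)}.
```

fold_demo:demo() returns {6,6,{a,1},{c,3}}: both sums agree, and the tree additionally gives direct access to its smallest and largest entries.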

For more information, see the following articles:

[1] Erlang dictionary example: http://abel-perez.com/erlang-dictionary-example
[2] Working with dictionaries in Erlang: http://www.techrepublic.com/article/working-with-dictionaries-in-erlang/6342630
[3] Wikipedia, Hash table: http://en.wikipedia.org/wiki/Hash_table

Good night!
