| Data Structure | Application scenarios | Example |
|---|---|---|
| Hash table | All key-value pairs must fit in memory. A lookup completes in constant time on average. | Extract the IP address that accessed Baidu most often from a log; count the number of distinct phone numbers |
| Heap | Inserting and adjusting take O(log n) time, where n is the number of heap elements; reading the top element takes only constant time. | Find the top K items in massive data; find the median of a massive data stream |
| Bitmap | Usually records whether each integer occurs, for fast lookup, duplicate detection, and deletion of elements. | Count the number of distinct phone numbers; count the repeated integers among 250 million integers |
| Double bucket (two-level bucketing) | Two-stage addressing to save memory; usually used to find the K-th largest value, the median, or duplicate and unique numbers. | Find the median of 250 million integers; find the K-th value in massive data |
| Inverted index | Indexes word-to-document or property-to-object mappings to support reverse lookup. | Keyword-based search; auto-completion in a search box |
| External sort | Uses hard disk space to sort massive data. | Given a 1 GB file with one word per line and 1 MB of memory, return the 100 most frequent words |
| Prefix tree (trie) | Builds a prefix tree over all words in the set. | Find popular query strings; find words with a high repetition rate |
| MapReduce | In distributed processing, the data is partitioned and handed to different machines, and the partial results are then merged (reduced). | Massive log analysis; data mining; intelligent recommendation systems |
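
The first three rows can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not a reference implementation: the function names `top_k_frequent` and `count_distinct` and the sample data are hypothetical, the input is assumed to fit in memory, and the bitmap version assumes non-negative integers no larger than `max_value`.

```python
import heapq
from collections import Counter


def top_k_frequent(items, k):
    """Return the k most frequent items as (item, count) pairs.

    Hash table: Counter tallies occurrences with average O(1) updates.
    Heap: a min-heap of size k keeps the k largest counts seen so far,
    so each of the m distinct items costs at most O(log k).
    """
    counts = Counter(items)
    heap = []  # min-heap of (count, item); heap[0] is the weakest candidate
    for item, count in counts.items():
        if len(heap) < k:
            heapq.heappush(heap, (count, item))
        elif (count, item) > heap[0]:
            # Replace the current weakest candidate with a stronger one.
            heapq.heapreplace(heap, (count, item))
    return [(item, count) for count, item in sorted(heap, reverse=True)]


def count_distinct(numbers, max_value):
    """Count distinct integers in [0, max_value] using one bit per value."""
    bitmap = bytearray((max_value >> 3) + 1)  # 8 values per byte
    distinct = 0
    for n in numbers:
        byte_index, mask = n >> 3, 1 << (n & 7)
        if not bitmap[byte_index] & mask:  # first time this value is seen
            bitmap[byte_index] |= mask
            distinct += 1
    return distinct


# Hypothetical sample data standing in for an access log.
log_ips = ["10.0.0.1", "10.0.0.2", "10.0.0.1",
           "10.0.0.3", "10.0.0.1", "10.0.0.2"]
most_frequent = top_k_frequent(log_ips, 2)
distinct_values = count_distinct([5, 1, 5, 9, 1], max_value=9)
```

For data that does not fit in memory, the same counting step would typically be applied per partition (for example after hashing keys into smaller files), which is where the external sort and MapReduce rows come in.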