Programming Challenges: Querying

Source: Internet
Author: User

Problem Description

The existing data is as follows:

K1-1-A-X0001=/common/gom/r00/xml/gom0101.xml
K3-2-B-W4565=/common/gom/r00/xml/gom0404.xml
K4-1-B-K0090=/common/gom/r00/xml/gom0403.xml
K2-3-A-W0004=/common/gom/r00/xml/gom0103.xml
......

The first column is the ID, which has no duplicates, and the second column is the address. There are 100,000 records in total.
Design data structures and algorithms that provide a query service. For example, the input k4-1-b-k0090 should return the corresponding address /common/gom/r00/xml/gom0403.xml.

Goal

Queries should be as fast as possible and use as little memory as possible.

Requirements

Do not use a relational database; store the data in the file system. Any programming language may be used.

Program and Code

I put all the code on GitHub:
https://github.com/longjingjun/Programming_Challenge_Query
The project can be opened in Eclipse.

Scenario 1

Store the data in a Java properties file and read it with Java's Properties class.
-Filebuilder.java: Generates the 100,000 records
-Finder.java: Implements the query
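As a rough illustration of this scenario, the following is a minimal sketch of a Properties-based lookup. It is not the code from the repository: the class name PropertiesLookupSketch and the methods load and find are hypothetical, and upper-casing the query is an assumption drawn from the example input k4-1-b-k0090. A StringReader stands in for the property file so that the sketch runs without any files.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Locale;
import java.util.Properties;

// Hypothetical sketch; the names here are not taken from the original project.
class PropertiesLookupSketch {

    /** Loads ID=address pairs from a Reader (in practice, a FileReader over the property file). */
    static Properties load(Reader source) throws IOException {
        Properties props = new Properties();
        props.load(source);
        return props;
    }

    /**
     * Returns the address for an ID, or null if the ID is unknown.
     * Assumption: stored IDs are upper-case, so the query is upper-cased first.
     */
    static String find(Properties props, String id) {
        return props.getProperty(id.toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) throws IOException {
        String sample =
              "K1-1-A-X0001=/common/gom/r00/xml/gom0101.xml\n"
            + "K4-1-B-K0090=/common/gom/r00/xml/gom0403.xml\n";
        Properties props = load(new StringReader(sample));
        System.out.println(find(props, "k4-1-b-k0090"));
    }
}
```

Properties keeps every entry in an in-memory hash table, so each lookup is fast once the file is loaded, but the whole data set must be read and held in memory first.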

Scenario 2

Store the data in MongoDB and query it through the MongoDB interface.
-Mongodbbuilder.java: Inserts the 100,000 records into MongoDB
-Mongodbfinder.java: Implements the query
Note: Running this code requires MongoDB to be installed. The installation package can be downloaded here:
http://www.mongodb.org/

Scenario 3

Use a random access file. The first row holds the total number of records, followed by the records stored in ascending order. Queries use the binary search algorithm.
-Ramdomfilebuilder.java: Generates the 100,000 records according to this design and saves them to the file
-Ramdomfinder.java: Implements the query
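The layout described above can be sketched roughly as follows. This is not the repository code: the class name RandomFileSketch, the field widths KEY_LEN and VAL_LEN, and the choice to store the record count as a 4-byte integer header (rather than a text row) are all assumptions made for illustration. IDs are assumed to be upper-case ASCII, so the query is upper-cased before the search.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; names and field widths are assumptions, not the original design.
class RandomFileSketch {
    static final int KEY_LEN = 16;                    // assumed fixed width of the ID field, in bytes
    static final int VAL_LEN = 64;                    // assumed fixed width of the address field, in bytes
    static final int RECORD_LEN = KEY_LEN + VAL_LEN;
    static final int HEADER_LEN = 4;                  // record count stored as one int

    /** Pads text with trailing spaces to exactly len bytes. */
    static byte[] pad(String text, int len) {
        byte[] src = text.getBytes(StandardCharsets.US_ASCII);
        if (src.length > len) throw new IllegalArgumentException("field too long: " + text);
        byte[] out = new byte[len];
        Arrays.fill(out, (byte) ' ');
        System.arraycopy(src, 0, out, 0, src.length);
        return out;
    }

    /** Writes the record count, then every record in ascending ID order. */
    static void build(File file, Map<String, String> data) throws IOException {
        TreeMap<String, String> sorted = new TreeMap<>(data);
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.setLength(0);
            raf.writeInt(sorted.size());
            for (Map.Entry<String, String> e : sorted.entrySet()) {
                raf.write(pad(e.getKey(), KEY_LEN));
                raf.write(pad(e.getValue(), VAL_LEN));
            }
        }
    }

    /** Binary search over the fixed-width records; returns null if the ID is absent. */
    static String find(File file, String id) throws IOException {
        String target = id.toUpperCase(Locale.ROOT);
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            int lo = 0;
            int hi = raf.readInt() - 1;
            byte[] key = new byte[KEY_LEN];
            byte[] val = new byte[VAL_LEN];
            while (lo <= hi) {
                int mid = (lo + hi) >>> 1;
                // Fixed-width records let us jump straight to record number mid.
                raf.seek(HEADER_LEN + (long) mid * RECORD_LEN);
                raf.readFully(key);
                int cmp = new String(key, StandardCharsets.US_ASCII).trim().compareTo(target);
                if (cmp == 0) {
                    raf.readFully(val);
                    return new String(val, StandardCharsets.US_ASCII).trim();
                }
                if (cmp < 0) lo = mid + 1; else hi = mid - 1;
            }
            return null;
        }
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> data = new TreeMap<>();
        data.put("K1-1-A-X0001", "/common/gom/r00/xml/gom0101.xml");
        data.put("K4-1-B-K0090", "/common/gom/r00/xml/gom0403.xml");
        File file = File.createTempFile("lookup", ".dat");
        file.deleteOnExit();
        build(file, data);
        System.out.println(find(file, "k4-1-b-k0090"));
    }
}
```

Because only one key buffer and one value buffer are held in memory, the memory cost stays constant regardless of the number of records, and a lookup over 100,000 records needs at most about 17 seeks.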

Scenario 4

Use a B-tree to implement the queries. I have no code for this scenario.

Tests and conclusions

With 100,000 records, the test results of the first three methods are as follows:

Scenario        Round   Test results (ms)
                        1     2     3     4     5     6     7     8
Normal prop-1   1       274   265   248   289   295   265   278   218
                2       273   275   270   273   265   286   280   210
                3       265   262   292   270   274   280   247   203
Normal prop-2   1       133   136   135   132   133   139   156   138
                2       145   148   136   135   141   134   135   137
                3       134   136   136   139   129   142   143   139
Mongodb         1       122
                2       127
                3       115
Random File     1       1
                2       2
                3       1

Note on Mongodb: as the data increases, the query time grows.

Another round of testing was conducted specifically for the B-tree implementation and the random file scheme, with 1 million records. The test results are as follows:

1 million records
              1     4     7     10
B-tree        121   31    93    53
              125   29    105   52
              119   30    96    49
Random file   1     3     2     2
              2     2     2     1
              1     2     2     1

Based on the test results, the fastest solution is the random access file. This scheme has a limitation, however: every row must occupy a fixed number of bytes, so choosing the row size is an important design decision.
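One possible way to approach the row-size question is to scan the data once and take the longest ID and the longest address as the field widths. The sketch below illustrates that idea only; the class RecordSizeSketch and its methods are hypothetical, and the header length passed to fileLength is an assumed parameter rather than part of the original design.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of sizing fixed-width rows from the data itself.
class RecordSizeSketch {

    /** Smallest row width in bytes that fits every ID and every address without truncation. */
    static int recordLength(Map<String, String> data) {
        int maxKey = 0;
        int maxVal = 0;
        for (Map.Entry<String, String> e : data.entrySet()) {
            maxKey = Math.max(maxKey, e.getKey().getBytes(StandardCharsets.UTF_8).length);
            maxVal = Math.max(maxVal, e.getValue().getBytes(StandardCharsets.UTF_8).length);
        }
        return maxKey + maxVal;
    }

    /** Total file size in bytes for a given row width, record count, and header size. */
    static long fileLength(int recordLength, long count, int headerLength) {
        return headerLength + recordLength * count;
    }

    public static void main(String[] args) {
        Map<String, String> data = new LinkedHashMap<>();
        data.put("K1-1-A-X0001", "/common/gom/r00/xml/gom0101.xml");
        data.put("K4-1-B-K0090", "/common/gom/r00/xml/gom0403.xml");
        int rowBytes = recordLength(data);
        System.out.println("row width in bytes: " + rowBytes);
        System.out.println("file size for 100,000 rows: " + fileLength(rowBytes, 100_000L, 4));
    }
}
```

A width taken from the current data leaves no room for longer values later, so in practice some headroom would probably be added; the trade-off is that every extra byte of padding is paid once per row.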
