Question:
There are 10 million data entries. The data carrier may be a file or a database. The data comes in different types (regions and industries).
Retrieve data of foreign trade enterprises in Hangzhou
Assessment indicators
1. Program abstraction capability
2. Extracting the data using abstract thinking
3. Comparing the advantages and disadvantages of the implementation methods
Answer:
Isn't this information retrieval? Build an index, then optimize the search.
You can also build the index by hand. This is not tied to any programming language, and not tied to object orientation either.
You don't need to rank the results the way a search engine does... Isn't the point simply that the data set is large?
For the index you can use an inverted index, a hash table, or a binary search structure.
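As a rough illustration of the inverted-index idea, here is a minimal Python sketch. The record layout (id, region, industry) and the sample rows are assumptions made up for the example, not part of the question.

```python
from collections import defaultdict

# Hypothetical sample records: (enterprise_id, region, industry)
RECORDS = [
    (1, "Hangzhou", "foreign trade"),
    (2, "Hangzhou", "software"),
    (3, "Shanghai", "foreign trade"),
    (4, "Hangzhou", "foreign trade"),
]

def build_inverted_index(records):
    """Map each (field, value) pair to the set of record ids that carry it."""
    index = defaultdict(set)
    for rec_id, region, industry in records:
        index[("region", region)].add(rec_id)
        index[("industry", industry)].add(rec_id)
    return index

def query(index, region, industry):
    """Intersect the posting sets of the two attribute values."""
    by_region = index.get(("region", region), set())
    by_industry = index.get(("industry", industry), set())
    return sorted(by_region & by_industry)

index = build_inverted_index(RECORDS)
hits = query(index, "Hangzhou", "foreign trade")
print(hits)
```

The index is built once; each query then costs two hash lookups plus a set intersection instead of a scan over every record.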
Personally, I think this interview question tests two things: data encapsulation and fast queries over a large data set. For encapsulation, define business classes according to the specific business logic. Because of the large-volume queries, base classes are also needed. I think the main points lie here, and it would be perfect if you could also cover the database design. The business classes can include an enterprise data class, representing each enterprise record as an entity. The add, delete, update, and query operations of the DAO layer are implemented with iBATIS. Because the data is complicated and the volume is large, Hibernate is ruled out. If the performance requirements are strict, start from the database design: split the tables for storage, query across the split tables, and build index tables. By the way, how do I insert a line break when replying...
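To make the business class and table-splitting idea a little more concrete, here is a small Python sketch. It is only an in-memory stand-in: it does not use iBATIS or a real database, and the class names, field names, and sample enterprises are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Enterprise:
    """Business class: one enterprise record (field names are assumed)."""
    enterprise_id: int
    name: str
    region: str
    industry: str

class ShardedEnterpriseDao:
    """In-memory stand-in for a DAO over tables split by region."""

    def __init__(self):
        # One "table" per region; each table is indexed by industry.
        self._shards = defaultdict(lambda: defaultdict(list))

    def add(self, enterprise):
        self._shards[enterprise.region][enterprise.industry].append(enterprise)

    def find(self, region, industry):
        # Only the shard for the requested region is touched.
        shard = self._shards.get(region)
        if shard is None:
            return []
        return list(shard.get(industry, []))

dao = ShardedEnterpriseDao()
dao.add(Enterprise(1, "Example Export Co.", "Hangzhou", "foreign trade"))
dao.add(Enterprise(2, "Example Software Co.", "Hangzhou", "software"))
dao.add(Enterprise(3, "Example Trading Co.", "Shanghai", "foreign trade"))
found = dao.find("Hangzhou", "foreign trade")
print([e.name for e in found])
```

In a real system the per-region dictionaries would correspond to separate tables or partitions, and the per-industry lists to an index on the industry column.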
Yes
1. Create an index
2. Sorting
3. Comparison
4. ...
Your problem is a non-linear boundary judgment problem.
I suggest you read up on data mining and related material.
Use ADO with a three-tier architecture, plus an entity layer and an abstract factory. That should be abstract enough.
As for the data volume and the performance, those are surely not a big problem for the original poster. The focus is on understanding object-oriented concepts.
As for object orientation itself, that depends on the original poster's own views.
This question is just like finding one person among 1.2 billion Chinese people. The region and industry mentioned in the question are that person's characteristics. There are many ways to find someone. You could check people one by one, but how long would that take?!
So we find another way. For example, we start from the person's place of origin, which greatly narrows the scope. The person may no longer be there, but you can pick up related information and then continue searching, following the clues until they converge on a single point.
That may seem like a digression, but my point is simple: you find a thing by its characteristics (you could also call them attributes) and by its connections with other things.
Back to the question: the data carrier is not fixed, which suggests the examiner does not care about the specific technology you use. They just want to see what your first reaction to the problem is. There is no need to go into a specific implementation, I think.
As I understand it, if the data is stored in files, I would classify the data into several large categories and then into smaller categories within them. Then, for Hangzhou in the south, I would first find the southern region, then the Yangtze River Delta, and so on... Before storing concrete objects in an object-oriented design, we need to classify them, just as you would look something up in API documentation.
I've written so much that I no longer know what I said -.- Whether or not it has any value, I got carried away.
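The large-category-then-small-category idea could be sketched as a nested tree of categories. This is only an illustration in Python; the category labels ("south", "river delta") and the record names are placeholders, and the reserved `_records` key is an assumption of this sketch.

```python
def insert(tree, path, record):
    """Walk (and create) one nested dict per category level, then store the record."""
    node = tree
    for part in path:
        node = node.setdefault(part, {})
    # "_records" is a reserved key in this sketch; no category may use that name.
    node.setdefault("_records", []).append(record)

def find(tree, path):
    """Follow the category path; return the records stored at its end."""
    node = tree
    for part in path:
        node = node.get(part)
        if node is None:
            return []
    return list(node.get("_records", []))

tree = {}
insert(tree, ("south", "river delta", "Hangzhou"), "enterprise A")
insert(tree, ("south", "river delta", "Hangzhou"), "enterprise B")
insert(tree, ("north", "plain", "Beijing"), "enterprise C")
result = find(tree, ("south", "river delta", "Hangzhou"))
print(result)
```

With files as the carrier, each level of the tree could map to a directory and each leaf to a data file.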
To put it bluntly,
According to the question, the analysis is as follows:
The data carrier may be a file or a database: this means you do not need to consider the storage format of the data, only its logical structure. So there is no need to think about database setup, indexes, views, and the like;
The data has different types (regions and industries): this indicates that the actual data sources are complex and varied, and that there may be complex associations between the data;
The question itself asks for the result of a search query. The core of the question is how, with a data volume of 10 million entries, to choose a suitable description of the data, organize it into a logical structure that reflects the associations between the data, and provide fast search.
The details are as follows:
1. Data layer: the raw, unprocessed data. My idea is to group the data by some rule, such as the first letter of the region and of the industry. This greatly reduces the number of records that have to be searched.
2. Data logic layer: uses data objects to assemble data from the data layer. This reduces the dependency on the data layer and dynamically establishes logical associations between pieces of data. For example, create a company object that has an operation direction attribute and an operation status attribute. In this way the company's operation direction data, operation status data, and description data are associated. Naturally, objects can also contain other objects. Encapsulation in the data logic layer greatly reduces the confusion of data management.
3. Data access and control layer: added alongside the data layer and the data logic layer, it provides the interfaces through which the data logic layer accesses the data layer.
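The three layers above might look roughly like this in Python. This is a sketch under assumptions: the row layout, the class names (`OperationInfo`, `Company`, `CompanyAccess`), and the sample companies are all invented for illustration.

```python
from dataclasses import dataclass

# Data layer: raw, unprocessed rows (region, industry, name, status).
RAW_ROWS = [
    ("Hangzhou", "foreign trade", "Example Export Co.", "active"),
    ("Hangzhou", "software", "Example Software Co.", "active"),
    ("Harbin", "foreign trade", "Example Northern Trading", "closed"),
    ("Shanghai", "foreign trade", "Example Trading Co.", "active"),
]

def group_by_initial(rows):
    """Group raw rows by the first letter of the region."""
    groups = {}
    for row in rows:
        groups.setdefault(row[0][0].upper(), []).append(row)
    return groups

# Data logic layer: objects assembled from raw rows.
@dataclass
class OperationInfo:
    direction: str  # operation direction, i.e. the industry
    status: str     # operation status

@dataclass
class Company:
    name: str
    region: str
    operation: OperationInfo  # an object containing another object

# Data access and control layer: the only code that touches the raw groups.
class CompanyAccess:
    def __init__(self, rows):
        self._groups = group_by_initial(rows)

    def find(self, region, direction):
        # Only the group sharing the region's first letter is scanned.
        candidates = self._groups.get(region[0].upper(), [])
        return [
            Company(name, reg, OperationInfo(ind, status))
            for reg, ind, name, status in candidates
            if reg == region and ind == direction
        ]

access = CompanyAccess(RAW_ROWS)
companies = access.find("Hangzhou", "foreign trade")
print([c.name for c in companies])
```

Here the "H" group still holds the Harbin row, so the filter inside `find` is what separates Hangzhou from other regions with the same initial.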
To improve search efficiency, you can optimize the algorithms in the data access and control layer and combine them with the logical organization of the data layer. For example, use grouping in the data layer together with indexes and keyword hash-table mapping, and use binary search in the access layer.
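One possible reading of that combination, sketched in Python: a hash table (a dict) groups the records by region, and within each group a binary search over the sorted industries finds the matching range. The record layout and the sample names are assumptions for the example.

```python
from bisect import bisect_left, bisect_right

# Hypothetical sample records: (region, industry, name)
RECORDS = [
    ("Hangzhou", "foreign trade", "enterprise A"),
    ("Hangzhou", "software", "enterprise B"),
    ("Shanghai", "foreign trade", "enterprise C"),
    ("Hangzhou", "foreign trade", "enterprise D"),
]

def build_groups(records):
    """Hash by region, then keep each group's rows sorted by industry."""
    buckets = {}
    for region, industry, name in records:
        buckets.setdefault(region, []).append((industry, name))
    groups = {}
    for region, rows in buckets.items():
        rows.sort()
        # Parallel lists: sorted industries and the names in the same order.
        groups[region] = ([ind for ind, _ in rows], [name for _, name in rows])
    return groups

def lookup(groups, region, industry):
    """One hash lookup for the region, then binary search for the industry."""
    if region not in groups:
        return []
    industries, names = groups[region]
    lo = bisect_left(industries, industry)
    hi = bisect_right(industries, industry)
    return names[lo:hi]

groups = build_groups(RECORDS)
matches = lookup(groups, "Hangzhou", "foreign trade")
print(matches)
```

Building the groups costs one sort per region up front; after that each query is a dictionary lookup plus two binary searches over that region's rows.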
That's all nonsense.