1, the purpose of the experiment
Answer the question before the experiment-using different partition granularity for the same data, but creating global spatial indexes, Q: Are their global spatial indexes consistent?
How can it be considered the same? is the tree structure of r-tree consistent? is the number of spatial index entries consistent with the size? Strict consistency, of course, refers to the number of index entries, index size, R-tree tree structure is exactly the same. However, the analysis shows that the number of index entries and the R-tree tree structure are identical, but the index size is slightly different. But we still think that their global indexes are consistent because the R-tree tree structure is the key to determining whether spatial indexes are consistent.
Based on the above prerequisites, consider a question: when executing a spatial query, the first step in executing a plan is to scan the global Spatial index, in which case the query efficiency is the same for a table with different partition granularity? Experiment before I would have thought it was the same. Because, although the granularity of partitions is different, the execution logic is exactly the same--finding the matching ROWID based on the Global Spatial index, and returning records based on rowID. The ROWID is no different, that is, the same disk device access under the same ROWID loss is the same. But what about the real situation?
2 Experimental data
The experimental data are 2,531 counties and districts in the country, the total factor is 46982394. Organized by county, city, province, regional partition, and non-zoning, respectively. Creates a global spatial index on all tables. First, we will introduce the relevant information by province partition, by region partition.
By province partition, each partition record number and blocks:
Partition by region, number of partition records and blocks:
3 Experimental methods
At the 1:500, 1:2000, 1:10,000, 1:25,000, 1:50,000, 1:100,000 scale, randomly selects 3 sample ranges from Nationwide, as the query scope of spatial query. In this paper, the spatial query operation of 6*3 sample range with 3 experimental subjects is carried out, and the time of each query is recorded.
The algorithm uniformly uses the algorithm that best fits the global Spatial index: Part_query3.
4 Experimental results
The results of the experiment are as follows:
Average of 3 samples per scale:
A line chart that plots the response time of different partition granularity at different scales.
5 Experimental conclusions
- Although the Global Spatial index is consistent, different data organization methods affect query efficiency. It can be seen that too fragmented data (by county partitioning) will result in performance degradation, and moderate dispersion has little or no impact on performance.
- In the case of the global index, the efficiency of the partition is the highest (9 hits 18 times), followed by the region partition (18 hits 6 times), by the provincial partition and second (18 hits 3 times).
- by province partition, partition by region, non-partition in the Global Spatial index of query efficiency is not large , whether the production of partitions, the production of what kind of partition, should be combined with business scenarios to choose the appropriate organizational mode.
(not to be continued ...) )
Oracle Spatial Partitioning Application study four: Comparison of the efficiency of different partition granularity + global Spatial indexes