DBMS (Database management System) often provides multiple access methods. Access method is a fast access to data in the database technology, generally used in three kinds of access methods: The first is the index method, the main time B + Tree Index method, the second is the cluster (clustering) method; The third is the hash method. The following is a summary of each of the characteristics of the method.
1. Index access method
The index access method is to determine which attribute columns of the corresponding relationship are indexed, which attribute columns are indexed, which indexes are designed as unique indexes, and so on according to the application requirements. The Mentoring programme has the following points:
1) If a (or group) attribute is frequently present in the query condition, you can index (or combine) the attribute (or group).
2) If an attribute is often used as a parameter to a clustered function such as the maximum and minimum values, then the index on this property may be considered.
3) If a (or group of) attribute is often present in the join condition of the join operation, consider establishing an index (or composite index) on this (group) attribute.
The higher the number of indexes, the better the system has to pay for maintaining the index, so if a relationship is updated very frequently, it is not easy to build too many indexes.
2. Cluster access method
To increase the query speed of a property (or group of properties), the tuples with the same value on these properties (called the clustered code, cluster key) can be stored on contiguous physical blocks, called clusters.
For example, there are 500 students in the information system, and in extreme cases 500 students have a tuple that is distributed on 500 different physical blocks, so that even if the index is indexed to avoid a full scan, it will take 500 IO operations to query all students in the information system, but if the student information in all information systems is stored centrally, Then, each reading of a physical block can obtain more than one student's information, can greatly reduce the number of IO operations, thus improving query efficiency. The scheme for designing candidate clusters is:
1) Clustering can be established for relationships that are often connected together.
2) If a set of attributes of a relationship is often present in an equality comparison condition, clustering can be established on that single relationship.
3) If the value repetition rate on one (group) attribute of a relationship is high, clustering can be established on that single relationship. That is, corresponding to each cluster code value of the average tuple number can not be too small, if too little, the effect of clustering is not obvious (the decrease in IO operations is not obvious).
In the candidate clusters, the unnecessary relationships are excluded. For example:
1) Eliminate the relationship of frequent full table scan from the cluster
2) Remove the update operation from the cluster far more than the connection operation relationship
3) Different clusters may contain the same relationship, a relationship can be in a cluster, but not in multiple clusters at the same time.
Clustering can only improve the performance of certain applications, and clustering is expensive to build and maintain. For example, establishing a cluster on an existing relationship causes the relationship tuple to move its physical location, invalidate the original index on the relationship, and must be rebuilt; When a tuple's cluster value changes, the storage location of the tuple changes. Therefore, access or connection via a clustered code is the primary application of the relationship, and other access unrelated to the cluster code is small or minor and can be clustered. In particular, the use of clustering is advantageous when the SQL statement contains an order by, a group by, a union, a distinct, and other clauses or phrases related to the cluster code.
3. Hash access method
Some DBSM provide the hash access method, the rule of choosing the hash access method is as follows:
A hash access method can be used if the attribute of a relationship mainly appears in the equivalent join condition or the equality comparison selection condition, and satisfies one of the following conditions:
1) The size of a relationship can be predictable and constant
2) If the relationship size can be changed, and the DBMS provides a dynamic hash access method.
Of the above three methods, the most commonly used are indexes and clusters, and we can create the corresponding indexes and clusters for the created tables.
DBSM common methods of data access