Database interview questions

Last Update:2018-12-04 Source: Internet

Author: User

1. What are the normal forms of relational database design?

First normal form (1NF): every column of a relational table holds an atomic (indivisible) data item. 1NF guarantees that each column value is a single value, not a set or a repeating group.

Second normal form (2NF): the table must already be in 1NF. In addition, every row must be uniquely identifiable, usually by one or more attributes that serve as the primary key. (Stated formally: in a relation R, every non-prime attribute is fully functionally dependent on each candidate key of R, so no non-prime attribute depends on only part of a composite key.)

Third normal form (3NF): the table must already be in 2NF. A non-key column must not depend on another non-key column. (Stated formally: no non-prime attribute is transitively dependent on a candidate key of R.)

BCNF (Boyce-Codd normal form): the table must be in 1NF, and no attribute, prime or non-prime, is transitively dependent on a candidate key of R. Equivalently, for every non-trivial functional dependency X -> Y on R, X is a superkey of R.

Fourth normal form (4NF): let R be a relation schema and D a set of multivalued dependencies on R. R is in 4NF if, for every non-trivial multivalued dependency X ->> Y in D, X is a superkey of R.
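To make the definitions more concrete, here is a minimal Python sketch that checks whether a functional dependency holds in a small set of rows and uses it to spot a partial dependency, which is what the second normal form rules out. The enrollment table, its column names and the helper fd_holds are hypothetical and exist only for this illustration. Checking sample rows can show that a dependency is violated, but it cannot prove that it holds for the schema in general.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts; lhs and rhs are tuples of column names.
    The dependency holds when rows that agree on lhs also agree on rhs.
    """
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical table whose candidate key is (student_id, course_id).
enrollment = [
    {"student_id": 1, "course_id": "DB", "student_name": "Ann", "grade": 90},
    {"student_id": 1, "course_id": "OS", "student_name": "Ann", "grade": 85},
    {"student_id": 2, "course_id": "DB", "student_name": "Bob", "grade": 70},
]

# student_name is determined by only part of the key: a partial dependency,
# so this table would not be in second normal form.
partial = fd_holds(enrollment, ("student_id",), ("student_name",))

# grade needs the whole key: student_id alone does not determine it.
grade_by_student = fd_holds(enrollment, ("student_id",), ("grade",))
grade_by_key = fd_holds(enrollment, ("student_id", "course_id"), ("grade",))
```

Moving student_name into a separate table keyed by student_id would remove the partial dependency.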

2. What is the difference between stored procedures and functions?

A stored procedure is a named, user-defined collection of SQL statements that performs a task involving specific tables or other database objects; it is executed by calling it. A function usually receives parameters and returns a value of some type, and generally does not operate on a specific user table.

The Oracle syntax is as follows:

Definition of a stored procedure:

CREATE [OR REPLACE] PROCEDURE [schema_name.]procedure_name
[(parameter_name [IN | OUT | IN OUT] data_type, ...)]
{IS | AS}
[declarations]
<PL/SQL block>;

Definition of a function:

CREATE [OR REPLACE] FUNCTION [schema_name.]function_name
[(parameter_name [IN] data_type, ...)]
RETURN data_type
{IS | AS}
[declarations]
<PL/SQL block>;

3. What is a database transaction?

Answer: a database transaction is a series of operations performed as a single logical unit of work. Either all of the operations take effect or none of them do; the unit cannot be split. A transaction must satisfy the ACID properties.

A, atomicity: the operations in a transaction are either all executed or not executed at all.

C, consistency: a transaction takes the database from one consistent state to another and never leaves the data violating its integrity rules. (Example: in a bank transfer, the sum of the two balances is the same before and after.)

I, isolation: transactions that run concurrently must not interfere with one another.

D, durability: once a transaction has committed successfully, its changes are permanent, even if the system fails afterwards.
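The bank transfer example can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the account table, the account names and the transfer helper are hypothetical. The point is that both updates are committed together, or neither is.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move amount from src to dst atomically; roll back on any failure."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM account WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

ok = transfer(conn, "A", "B", 30)       # both updates are committed
failed = transfer(conn, "A", "B", 500)  # would overdraw A, so it is rolled back

balances = dict(conn.execute("SELECT name, balance FROM account"))
```

After the failed transfer the debit from A has been undone, so the total across both accounts is unchanged.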

4. Basic index questions

An index is a structure that sorts the values of one or more columns of a database table so that specific rows can be located quickly. The purpose of an index is to speed up searching and sorting. It comes at a cost: first, it takes extra storage space; second, inserts and updates take longer, because the index has to be updated as well.

Advantages of creating an index:

1. A unique index guarantees the uniqueness of each row in the table.
2. It can greatly speed up data retrieval, which is the main reason for creating an index.
3. It can speed up joins between tables, which is especially useful when enforcing referential integrity.
4. It can significantly reduce the time spent on grouping and sorting when a query uses grouping and sorting clauses.
5. It gives the query optimizer more efficient access paths to choose from, which improves system performance.

Why not create an index on every column, then? Because indexes also have disadvantages:

1. Creating and maintaining indexes takes time, and that time grows with the amount of data.
2. Indexes occupy physical space in addition to the space used by the table itself; a clustered index needs even more.
3. Every insert, delete and update on the table must also maintain the indexes, which slows down data modification.

Indexes are created on particular columns, so you have to decide which columns should be indexed and which should not.

When to create an index. In general, create an index on:

1. Columns that are searched frequently, to speed up the search.
2. Primary key columns, to enforce uniqueness and to organize the arrangement of the data in the table.
3. Columns that are often used in joins, mostly foreign keys, to speed up the join.
4. Columns that are often searched by range: because the index is sorted, the requested range is contiguous.
5. Columns that are often sorted: the query can reuse the order of the index and spend less time sorting.
6. Columns that often appear in the WHERE clause, to speed up evaluation of the condition.

When not to create an index. In general, do not index columns with the following characteristics:

1. Columns that are rarely used or referenced in queries. Because they are rarely used, an index does not make queries any faster, but it does slow down maintenance and increase the space required.
2. Columns with very few distinct values, such as a gender column in a personnel table. Each value matches a large proportion of the rows in the table, so an index does not noticeably speed up the search.
3. Columns defined with the text, image or bit data types. Such columns either hold very large values or have very few possible values, and neither suits an index.
4. Columns in tables where modification performance matters far more than retrieval performance. The two pull in opposite directions: adding indexes improves retrieval but slows modification, and removing indexes speeds up modification but slows retrieval. So when modifications far outnumber searches, do not create an index.
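One way to see the effect of an index is to ask the database for its query plan before and after the index is created. The sketch below uses SQLite through Python's sqlite3 module purely as an illustration; the student table and the index name idx_student_name are hypothetical, and other database systems print their plans in a different format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, gender TEXT)"
)
conn.executemany(
    "INSERT INTO student (name, gender) VALUES (?, ?)",
    [(f"student{i}", "F" if i % 2 else "M") for i in range(1000)],
)


def plan(sql):
    """Return the human-readable steps of SQLite's query plan for sql."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


query = "SELECT * FROM student WHERE name = 'student42'"

plan_without_index = plan(query)  # no usable index: the table is scanned
conn.execute("CREATE INDEX idx_student_name ON student (name)")
plan_with_index = plan(query)     # the plan now searches via the new index
```

Printing the two plans shows a scan of the table in the first case and a search that names idx_student_name in the second.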

5. Differences between clustered and non-clustered indexes

Answer: in a clustered index, the order of the index is the physical storage order of the data; in a non-clustered index, the index order is unrelated to the physical storage order. For this reason, a table can have at most one clustered index.

Clustered indexes: a clustered index determines the physical order of the data, much like a phone book sorted by surname. The index can still contain several columns (a composite index), just as a phone book is organized by last name and then first name. Clustered indexes are particularly effective for columns that are frequently searched by range: once the index has located the row with the first value, the rows with the following values are physically adjacent.

Non-clustered indexes: a non-clustered index is similar to the index at the back of a textbook. The data is stored in one place and the index in another, and each index entry has a pointer to the location of the data. The index entries are stored in index key order, while the rows in the table are stored in some other order.

For example, in SQL Server an index is organized as a B-tree. In a clustered index the leaf nodes are the data nodes themselves. In a non-clustered index the leaf nodes are still index nodes, each holding a pointer to the corresponding data block.
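The difference can be modelled in a few lines of Python. This is a simplified sketch, not how any real engine lays out pages: a sorted list stands in for the clustered table, a separate sorted list of (key, position) pairs stands in for a non-clustered index, and all names and sample rows are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Clustered index: the rows themselves are kept in key order (here, by id).
rows = [
    (1, "Zhao", 72),
    (2, "Wang", 91),
    (3, "Li", 85),
    (4, "Chen", 64),
]
ids = [r[0] for r in rows]


def range_by_id(low, high):
    """Range scan on the clustering key: the matching rows are adjacent."""
    return rows[bisect_left(ids, low):bisect_right(ids, high)]


# Non-clustered index on name: sorted (key, row position) pairs stored
# separately from the rows; each entry points back into rows.
name_index = sorted((r[1], pos) for pos, r in enumerate(rows))
names = [k for k, _ in name_index]


def find_by_name(name):
    """Look up the key in the index, then follow the pointer to the row."""
    i = bisect_left(names, name)
    if i < len(names) and names[i] == name:
        return rows[name_index[i][1]]
    return None
```

A range query on id reads one contiguous slice of rows, while a lookup by name takes an extra hop from the index entry to the row.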

6. Roles

A database role is a named set of permissions related to database operations. You can create a role for a group of users who need the same permissions; managing permissions through roles simplifies the authorization process.

1. Create a role:

CREATE ROLE R1;

2. Use the GRANT statement to give role R1 the SELECT, UPDATE and INSERT permissions on the student table:

GRANT SELECT, UPDATE, INSERT
ON TABLE student
TO R1;

3. Grant the role to Wang Ping, Zhang Ming and Zhao Ling, so that they hold all the permissions of role R1:

GRANT R1
TO Wang Ping, Zhang Ming, Zhao Ling;

4. The three permissions can also be taken back from Wang Ping in a single step by revoking R1:

REVOKE R1
FROM Wang Ping;

5. Modify the permissions of the role:

GRANT DELETE
ON TABLE student
TO R1;

This adds the DELETE permission on the student table to the permissions R1 already holds.

REVOKE SELECT
ON TABLE student
FROM R1;

This removes the SELECT permission on the student table from R1.
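The same sequence of grants and revokes can be modelled in plain Python to show why roles simplify authorization: a change to the role reaches every user who holds it. This is a toy model under assumptions, not how any database implements roles, and all function names are hypothetical.

```python
from collections import defaultdict

role_privileges = defaultdict(set)  # role -> {(privilege, table)}
user_roles = defaultdict(set)       # user -> {role}


def grant_privileges(role, table, *privileges):
    role_privileges[role].update((p, table) for p in privileges)


def revoke_privileges(role, table, *privileges):
    role_privileges[role].difference_update((p, table) for p in privileges)


def grant_role(role, *users):
    for user in users:
        user_roles[user].add(role)


def revoke_role(role, *users):
    for user in users:
        user_roles[user].discard(role)


def allowed(user, privilege, table):
    """A user holds a privilege if any of the user's roles holds it."""
    return any(
        (privilege, table) in role_privileges[r] for r in user_roles[user]
    )


# Replay the statements from this section.
grant_privileges("R1", "student", "select", "update", "insert")
grant_role("R1", "Wang Ping", "Zhang Ming", "Zhao Ling")
revoke_role("R1", "Wang Ping")
grant_privileges("R1", "student", "delete")
revoke_privileges("R1", "student", "select")
```

After these calls Zhang Ming and Zhao Ling can update, insert and delete but no longer select, and Wang Ping has none of the permissions.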

7. Name of the student with the second highest score

Write queries for the following:

1. The highest score for each subject.
2. The name with the highest Java score.
3. The name with the second highest Java score.

In the queries below, table stands for the name of the score table, and Kemu is the subject column.

1. Highest score for each subject:
SELECT Kemu, MAX(score) FROM table GROUP BY Kemu;

Analysis: group the rows by subject, then take the highest score within each group.

2. Name with the highest Java score:
SELECT name FROM table WHERE Kemu = 'java' AND
score = (SELECT MAX(score) FROM table WHERE Kemu = 'java');

Analysis: the query filters on two conditions: the subject is Java, and the score equals the highest Java score.

3. Name with the second highest Java score:
SELECT name FROM table WHERE Kemu = 'java'
ORDER BY score DESC LIMIT 1, 1;

Analysis: keep only the rows with Kemu = 'java', sort them by score in descending order, then use LIMIT 1, 1 (MySQL syntax: skip one row, counting from offset 0, and return one row) to take the second row only. If several students share the top score, this returns one of them, so compare against distinct scores when ties matter. This question tests grouping, sorting and top-N selection and is relatively difficult. When you meet such a problem, split it into smaller problems, solve them one by one, and then combine the pieces to solve the whole.
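The three queries can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: the table name scores and the sample rows are hypothetical, and SQLite is used only because it needs no server. LIMIT 1 OFFSET 1 means the same as MySQL's LIMIT 1, 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, kemu TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?)",
    [
        ("Ann", "java", 95),
        ("Bob", "java", 88),
        ("Cid", "java", 70),
        ("Ann", "sql", 60),
        ("Bob", "sql", 75),
    ],
)

# 1. Highest score for each subject.
top_per_subject = dict(
    conn.execute("SELECT kemu, MAX(score) FROM scores GROUP BY kemu")
)

# 2. Name(s) with the highest Java score.
java_top = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM scores WHERE kemu = 'java' AND score = "
        "(SELECT MAX(score) FROM scores WHERE kemu = 'java')"
    )
]

# 3. Name with the second highest Java score: sort descending, skip one row.
java_second = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM scores WHERE kemu = 'java' "
        "ORDER BY score DESC LIMIT 1 OFFSET 1"
    )
]
```

With this sample data there are no ties, so skipping one row is enough to reach the second highest score.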

8. Dense and sparse indexes

I. Dense indexes

If the records are sorted, we can build a dense index on them. A dense index is a sequence of storage blocks that hold only the keys of the records and pointers to the records themselves; a pointer is the address of a record or of a storage block. The entries in the dense index file are kept in the same key order as the data file. Since we assume that a search key and a pointer together take much less space than the record itself, the index file needs far fewer storage blocks than the data file. The advantage of the index is especially clear when the data file does not fit in memory but the index file does: with the index, any record with a given key value can be found with only one I/O operation.

Figure 1 shows a dense index (left) built on an ordered file (right). The first index block holds the pointers to the first four records, the second index block holds the pointers to the next four records, and so on.

The dense index supports finding the record for a given key value K. We first look for K in the index blocks; when K is found, we follow the pointer that goes with K to the record in the data file. It might seem that we have to examine every block of the index file, or half of the blocks on average, before finding K. However, index-based search is more efficient than it looks, for the following reasons:

1. The number of index blocks is usually much smaller than the number of data blocks.
2. Because the keys are sorted, we can use binary search to find K. If there are n index blocks, we only need to examine about log2 n of them.
3. The index file may be small enough to stay permanently in main-memory buffers. In that case a lookup of K involves only main-memory accesses and no I/O operations.

II. Sparse indexes

A sparse index stores only one key-pointer pair for each storage block of the data file. It uses less space than a dense index, but it takes somewhat more time to find a given value. A sparse index can be used only when the data file is sorted by the search key, whereas a dense index can be applied to any search key. As shown in Figure 2, the sparse index holds one key-pointer pair per storage block, and the key is the key of the first record in each data block.

Figure 2: a sparse index on an ordered file.

As in Figure 1, we assume that the data file is sorted and that its key values are consecutive multiples of 10 up to some large number. We also continue to assume that each storage block can hold four key-pointer pairs. The first index block therefore holds the first key values of the first four data blocks: 10, 30, 50 and 70. Following the same key pattern, the second index block holds the first key values of the fifth to eighth data blocks: 90, 110, 130 and 150. The figure also shows the third index block, which holds the first key values of the ninth to twelfth data blocks.

To find the record with key value K using a sparse index, we look in the index for the largest key value that is less than or equal to K. Because the index file is sorted by key, we can use binary search to locate this index entry and then follow its pointer to the data block. We must then search that data block for the record with key value K. The data block must of course contain enough format information to identify the records and their contents. Any of the techniques from Sections 2.5 and 2.7 can be used.
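The two lookup procedures can be sketched in Python with in-memory lists standing in for storage blocks. The layout is an assumption chosen to match the key pattern in the text (keys are consecutive multiples of 10, two records per data block, so the blocks start at 10, 30, 50 and so on); the function names are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Sorted data file: two records per block, keys 10, 20, ..., 160.
blocks = [[(k, f"record-{k}") for k in (b, b + 10)] for b in range(10, 170, 20)]

# Dense index: one (key, block number, slot) entry for every record.
dense = [
    (k, b, s) for b, blk in enumerate(blocks) for s, (k, _) in enumerate(blk)
]
dense_keys = [k for k, _, _ in dense]

# Sparse index: one (first key, block number) entry per data block.
sparse = [(blk[0][0], b) for b, blk in enumerate(blocks)]
sparse_keys = [k for k, _ in sparse]


def dense_lookup(key):
    """Binary-search the index, then follow the pointer to the record."""
    i = bisect_left(dense_keys, key)
    if i < len(dense_keys) and dense_keys[i] == key:
        _, b, s = dense[i]
        return blocks[b][s][1]
    return None


def sparse_lookup(key):
    """Find the largest index key <= key, then scan that one data block."""
    i = bisect_right(sparse_keys, key) - 1
    if i < 0:
        return None
    for k, record in blocks[sparse[i][1]]:
        if k == key:
            return record
    return None
```

The sparse index has half as many entries here, at the price of scanning one data block on every lookup.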

10. B-tree indexes and hash indexes

Because of the way a hash index is structured, equality lookups are very efficient: the entry can be located in a single step, whereas a B-tree lookup has to walk from the root node through the branch nodes down to a leaf. For such lookups a hash index can therefore be considerably faster than a B-tree index.

This raises an obvious question: if hash indexes are so much faster, why use B-tree indexes at all? Everything has two sides, and hash indexes are no exception. Although they are highly efficient, their structure also brings a number of restrictions and drawbacks.

(1) A hash index supports only "=", "IN" and "<=>" comparisons; it does not support range queries.

A hash index compares the hash values computed from the keys, so it can be used only for equality filtering. The order of the hash values produced by the hash function is not guaranteed to match the order of the original keys, so the index cannot serve range filters.

(2) A hash index cannot be used to avoid sorting.

A hash index stores hash values, and the order of the hash values is not necessarily the order of the keys before hashing. The database therefore cannot use the index to avoid any sort operation.

(3) A hash index cannot be used with only part of a composite index key.

For a composite index, the hash value is computed over all the index columns combined, not for each column separately. A query that supplies only the first column or the first few columns of the composite index therefore cannot use the hash index.

(4) A hash index can never avoid reading the table rows.

A hash index stores the hash value of the index key together with the corresponding row pointer in a hash table. Because different index keys can have the same hash value, finding the entries that match a hash value is not enough to answer the query from the index alone. The database still has to access the actual rows in the table and compare them to obtain the correct result.

(5) When many hash values are equal, a hash index is not necessarily faster than a B-tree index.

If a hash index is created on a column with low selectivity, a large number of row pointers are stored under the same hash value. Locating a particular record then becomes expensive and causes many wasted accesses to the table data, which lowers overall performance.

A hash index applies a hash function to the key and uses the resulting hash value to locate the key's entry in a hash table; if the key exists, the corresponding value is returned. Choosing a good hash function is therefore very important. A good hash function distributes the hash values evenly and so reduces collisions, and only with fewer collisions does the lookup time of the hash table go down.

A B-tree is based entirely on key comparison. Like a binary search tree, it keeps the data in sorted order, but each node holds many keys. A lookup works much like a binary search and is fast in practice; because the cost grows only logarithmically with the number of keys, growth in data volume has very little impact.
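The contrast between the two structures can be sketched in Python. This is an illustration under assumptions, not a model of any particular storage engine: a dict stands in for the hash index, a sorted list of (key, row position) pairs stands in for the leaf level of a B-tree, and the sample rows and function names are hypothetical.

```python
from bisect import bisect_left, bisect_right

rows = [("Ann", 95), ("Bob", 88), ("Cid", 70), ("Dee", 88)]

# Hash index on score: key -> list of row positions (equal keys share a bucket).
hash_index = {}
for pos, (_, score) in enumerate(rows):
    hash_index.setdefault(score, []).append(pos)

# Ordered index on score: sorted (key, row position) pairs, standing in for
# the leaf level of a B-tree.
ordered_index = sorted((score, pos) for pos, (_, score) in enumerate(rows))
ordered_keys = [k for k, _ in ordered_index]


def hash_equal(score):
    """Equality lookup: one probe of the hash table."""
    return [rows[p][0] for p in hash_index.get(score, [])]


def ordered_range(low, high):
    """Range lookup: binary search for the bounds, then read entries in order."""
    lo = bisect_left(ordered_keys, low)
    hi = bisect_right(ordered_keys, high)
    return [rows[p][0] for _, p in ordered_index[lo:hi]]
```

The hash index answers hash_equal(88) directly but has no way to answer ordered_range(80, 90) short of checking every key, while the ordered index returns the range already sorted by score.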

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.
