The prerequisites are as follows:
A table containing four fields named test has 1 million records.
The first column is ID, primary key, auto-increment.
The second column is col1, Which is random among Mike, Bob, Jack, Alice, Cathy, Ann, Betty, Cindy, Mary, and Jane.
The third column is col2, which is a random five-character letter with a letter limited to a-e.
The third column is col3, which is a random integer between 1 and 20.
(1) Select count (*) from test group by col1 order by count (*); record execution time
(2) locate all records starting with AB in the second column and record the execution time
After the index on col1 is created in the first question, the query time is halved. In contrast, the index discovery time based on col2 increased from 0.6s to 7 s.
My environment is a Java language and a MySQL database, connected using JDBC.
The following is the code for adding the index code and querying the second question:
Public static void dropcol1index () {<br/> try {<br/> statement ST = Conn. createstatement (); <br/> string SQL = "drop index test_col1 on test;"; <br/> st.exe cute (SQL); <br/> St. close (); <br/> system. out. println ("col1index has been deleted"); <br/> system. out. println (); <br/>}catch (exception e) {<br/> E. printstacktrace (); <br/>}< br/> Public static void query2 () {<br/> try {<br/> system. out <br/>. println ("START query2: 'select * from test where col2>/'AB/' and col2 </'ac /';'"); <br/> long begintime = system. currenttimemillis (); <br/> statement ST = Conn. createstatement (); <br/> string SQL = "select * from test where col2>/'AB/' and col2 </'ac /';"; <br/> resultset rs = st.exe cutequery (SQL); <br/> long endtime = system. currenttimemillis (); <br/> // system. out. println ("Result:"); <br/> // while (RS. next () {<br/> // system. out. print (RS. getint (1) + ""); <br/> // system. out. print (RS. getstring (2) + ""); <br/> // system. out. print (RS. getstring (3) + ""); <br/> // system. out. println (RS. getint (4); <br/>/}< br/> system. out. println ("query2 time:" + (double) (endtime-begintime) <br/>/1000 + "S"); <br/> system. out. println (); <br/> St. close (); <br/>}catch (exception e) {<br/> E. printstacktrace (); <br/>}< br/>}
I got some advice from someone else and solved the problem.
I tried to insert regular data into col2 because the total number of rows is 1 million, and the data ranges from AAAAA to eeeee, each type of data can be inserted into 320 rows in order, so that the overall col2 data is sorted by lexicographically. Similarly, it takes about seconds to query data after an index is created. Finally, I suddenly realized.
We can see relevant information on the Internet and learn that when col2 is a random value, the index created on col2 is a non-clustered index (non-clustered index is like searching the words in the Chinese Dictionary Based on the index of the radicals). When col2 is stored in a certain order, the index created on col2 is a clustered index (the clustered index is like searching for words in the Chinese Dictionary Based on the index of Chinese pinyin ). In practical applications, clustered indexes can improve the query speed efficiently, but not clustered indexes.
The specific reason may be that when you query clustered data, you can locate the data in the disk through the index, you can read a large number of compliant information from this position in sequence. It's like you can find a large number of words with the pronunciation of a Chinese pinyin index. However, if the data is not clustered, the results of non-clustered index queries are distributed in various locations of the disk. This information can be obtained only when the head is moved frequently, it is as if you are searching for a word with a radicals as the index, You Need To frequently flip the book page to find all the words that meet the requirements.