To make this query more efficient, add an index to one of the join columns and rerun the explain statement:
mysql> ALTER TABLE t2 ADD INDEX (i2);
mysql> EXPLAIN SELECT t1.i1, t2.i2 FROM t1, t2 WHERE t1.i1 = t2.i2\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: t1
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 1000
Extra:
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: t2
type: ref
possible_keys: i2
key: i2
key_len: 5
ref: sampdb.t1.i1
rows: 10
Extra: Using where; Using index
We can see a performance improvement. The output of the T1 does not change (indicating that a full table scan is required), but the optimizer handles T2 differently:
· The type change from all to ref means that the reference value (from the T1 value) can be used to perform an index lookup to locate a qualified data row in the T2.
· The reference value is given in the reference (ref) field: Sampdb.t1.i1.
· The row value is lowered from 1000 to 10, showing that the optimizer believes that for each row in the T1, it only needs to check 10 rows in the T2 (this is a pessimistic estimate.) In fact, only one row in the T2 matches the data row in the T1. We'll see how to help the optimizer improve this estimate later. All estimates for the combination of data rows make 1000x10=10000. It's 1 million better than the previous estimate of no index.
Is it valuable to index T1? In fact, for this particular join operation, it is necessary to scan a table, so it is not necessary to index the T1. If you want to see the effect, you can index T1.I1 and run explain again:
mysql> ALTER TABLE t1 ADD INDEX (i1);
mysql> EXPLAIN SELECT t1.i1, t2.i2 FROM t1, t2 WHERE t1.i1 = t2.i2\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: t1
type: index
possible_keys: i1
key: i1
key_len: 5
ref: NULL
rows: 1000
Extra: Using index
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: t2
type: ref
possible_keys: i2
key: i2
key_len: 5
ref: sampdb.t1.i1
rows: 10
Extra: Using where; Using index
The above output is similar to the output of the previous explain, but the addition of indexes has some changes to the output of T1. The type is changed from NULL to index, and the additional (Extra) is changed from NULL to the using index. These changes indicate that although the value of the index still needs to perform a full table scan, the optimizer can read the value directly from the index file, depending on the need to use the data file. You can see this kind of result from the MyISAM table, in which case the optimizer knows that it can get all the information it needs only by asking for the index file. This is also true for InnoDB and BDB tables, where the optimizer can use the information in the index alone without searching for rows of data.