Writing efficient SQL requires the following knowledge
Knowledge about the physical organization of the content being queried
What the database can do, for example: If you don't know the Jump Scan index and its purpose, then you might look at the pattern and say, "Index lost."
SQL all the intricacies of knowledge
A true understanding of the goal-what is the problem?
Access method
Full Scan,
Various types of indexed access
Accessed directly by hashing or Rowin. A goto statement in a similar 3PL language.
Full scan
Full scan is an effective way to read a large number of ORACLE data rows, because the database uses multiple blocks of reads, it is necessary to remember that full scan is not evil, in many cases, full scan is the quickest way to get results.
The relationship between full sweep and high water level
ROWID Access
ROWID is the physical address of the data, and a ROWID contains a line of information about the file, the block, and the rows in the block. rowID is probably the quickest way to get a particular row. However, using ROWID to get thousands of rows is not the best approach. In general, ROWID is used to access tables that have been scanned by some sort index, and you can also use ROWID directly, for example:
SELECT *
From Wwc_people_people where ROWID in (select row_id from wwc_fast_people where staring LIKE:BV and rownum < 101);
In addition, you can also use between and, for example:
SELECT * FROM t where ROWID between:a1 AND:A2
Index Scan
Is the most common way to access data
B* tree structure: The lowest-level block in the book (which becomes a leaf node) contains each index key value and a ROWID that points to the row being indexed (in other words, through the index key value, find the ROWID, in the ROWID to find the real row), interestingly, the index of the leaf node is actually a doubly linked list, In this way, it is easy to perform an interval scan, looking directly at the front or rear of the node.
There are two frequently occurring primary index scans: index unique scan and Index range scan, index hop scan , index full scan , fast full scan , Index connection
Index Unique Scan (Index-unique scans), the optimizer knows that the index column is unique in the index and returns at most one row from the index lookup.
index range scanning , for index range scanning, the optimizer knows that it may return 0 rows, 1 rows, or more rows. For example, select Max (empno) from EMP; For this query, Oracle does not scan the full table once and then finds the maximum, but instead starts reading the last (highest) value directly from the back of the index. (because the index is normally stored from low to high)
index hop scan , typically, in order to use an index, define the columns in front of the index to be referenced in the query, for example: Create TABLE T (A, B, C, D, E, F, g); CREATE index T_IDX on t (A, B, c);
Usually we can use SELECT * from t where a =: A; SELECT * FROM t where a =: A and B =: b; However, the following query cannot use the index:
SELECT * from t where B =: b and c =: c; (since there is no starting from a)
Before Oracle 8i has been used, we can use the hint Select/*+ index (t t_idx) */* from t; However, from Oracle 9i,
The concept of index hop scanning is implemented in Oracle 9i and is used in the following conditions:
- Referencing columns in the index in predicates (for example, B and C above)
- The optimizer knows that there are few discontinuous values in the columns preceding the index
index full Scan , that is, first through the index structure to find the first leaf block, and then because all the leaf block composed of a doubly linked list, next, you can use this list of pointer relations, to find the next block.
Index fast full scan, which is significantly different from index full scan, has the following characteristics:
- It reads every block in the index structure, including all the internal branching blocks
- It uses multiple blocks of reads, just like a full scan of a table.
- It does not retrieve data in sorted order
index connections , which are the index paths selected for a query when there are multiple indexes in the table, which should contain all the columns selected in the query (no access to the table)
Connection Concepts
Nested loops Join , the primary connection method of the database.
Select Ename, dname from EMP, dept where emp.deptno = Dept.deptno
Execution Plan display: NESTED LOOPS, the above statement is executed by the following pseudo-code:
For x in (SELECT * from EMP)
Loop
Index lookup ROWID for X.deptno--find the row ID for each DEPTNO
SELECT * FROM dept where rowid = that rowid
Output joined record
End Loop
To perform the steps:
1) Read the first block of the EMP table.
2) Remove the first line and use its DEPTNO value
3) read at most two or three index blocks that find the key value in the index.
4) Read the reference block of the Dept table.
Outer joins with nested loops, execution plan display: NESTED LOOPS (OUTER)
Hash joins , execution plan shows hash join
For example:
Select T1.object_name, t2.object_name from T T1, T T2
where t1.object_id = t2.object_id and t1.owner= ' Wmsys ';
Execution Plan display:
Under optimal conditions, Oracle will take the smaller of the two tables, in this case, take the result set of the t1.owner= ' Wmsys ' after the full scan of T1-and create an in-memory hash of this result, an array at the time of the hash, and the rows that go with the hash key are stored in the item in that index. As a result, Oracle now has a memory structure-array that can be accessed quickly, and if you now get a object_id value from T2, Oracle can hash it and retrieve the matching rows in the T1 in memory. Because this hash list is in private memory, accessing this data structure does not incur latch activity in general logical I/O. Once Oracle hashes the smaller table into memory, it scans the larger table (T2 in this example) and then, for each row, it hashes the retrieved object_id, finds the matching rows in the hash list, and returns the connection image. The customer application needs to wait for Oracle to fully scan the T1 and hash before getting the first line, and then the other rows get faster and faster.
In the case of an outer join, the execution plan shows the hash join (OUTER) first, and the outer join IS hashed first against the primary table and in memory.
Typically, a hash join excels at two small result sets.
Sort Merge Connection , execution plan display, Sort-merge Join
Sort a merge connection, sort the first input set, sort the 2nd input set, and then merge the results.
Typically, a sort-merge connection is useful in a non-equal join operation: that is, when the join condition is not an equation but a range comparison (for example, <, >=)
For example:
Select A.ename, B.ename, A.hiredate, b.hiredate from EMP A, EMP b
where A.hiredate <= b.hiredate and A.empno <> b.empno;
Cartesian connection, two tables have no connection condition of the full connection, such as the first table has 10 rows, the 2nd table has 10 rows, after connecting, 10*10=100 line
The anti-join (anti-join) anti-join is used to return rows of data that are not in another data source from one table, for example:
SELECT * FROM dept where deptno not in (select Deptno from EMP)
To test the not in problem:
CREATE TABLE T1
As SELECT * from All_objects where rownum <= 10000;
CREATE TABLE T2
As SELECT * from All_objects where rownum <= 9950;
Create index T2_idx on T2 (object_id);
1) Not in
Select COUNT (*) from Scott.t1 Rbo
where object_id not in (select object_id from Scott.t2);
2) Not EXISTS
Select COUNT (*) from Scott.t1 Rbo where NOT EXISTS (select null from scott.t2 where t2.object_id = rbo.object_id);
3) External Connection
Select COUNT (*) from T1, t2 rbo where t1.object_id = rbo.object_id (+) and rbo.object_id are NULL
In CBO mode:
The not exist method is the most time-consuming and logical I/O
Center Outer Connection
The not in method is best
In RBO mode:
The not in method is the worst
Center Outer Connection
Not exist best
In conclusion,
1) First, the outer connection method realizes the anti-connection, because in the Rbo and the CBO, the performance can
2) Under the CBO, you can use not in, so use the hint statement to force the CBO optimization plan when using not.
In addition, when implementing the anti-connection, it is important to note that NULL is present in the reference data source, if the data source field does not have a NOT null limit, you should use the IS not NULL condition to limit:
Really understand SQL
With players as
(SELECT ' P ' | | ROWNUM username
From All_objects
where RowNum <= 8),
Weeks as
(Select RowNum Week
From All_objects
where rownum <= 7)
Select Week,
Max (Decode (RN, 1, username, null)) U1,
Max (Decode (RN, 2, username, null)) U2,
Max (Decode (RN, 3, username, null)) U3,
Max (Decode (RN, 4, username, null)) U4,
Max (Decode (RN, 5, username, null)) U5,
Max (Decode (RN, 6, username, null)) U6,
Max (Decode (RN, 7, username, null)) U7,
Max (Decode (RN, 8, username, null)) U8
From (select Username,
Week
row_number () over (partition by week ORDER by Rnd) RN
From (select Username, week, dbms_random.random Rnd from players, weeks)
)
Group BY week;
Analytic functions, such as the upper blue section
Previous function, over (area, blank for all areas) alias
08 Efficient SQL