Oracle DISTINCT: a method of removing duplicate records

Source: Internet
Author: User

DISTINCT returns each value only once.

No matter how many times a value occurs in the table, it appears only once in the result.

SELECT DISTINCT column_1, column_2 FROM table_name ORDER BY column_1

DISTINCT is best used together with ORDER BY, which in some cases can improve efficiency.

SQL>
SQL> CREATE TABLE Employees (
  2  au_id CHAR (3) NOT NULL,
  3  au_fname VARCHAR () NOT NULL,
  4  au_lname VARCHAR () NOT NULL,
  5  phone VARCHAR () NULL,
  6  address VARCHAR (m) NULL,
  7  city VARCHAR () NULL,
  8  state CHAR (2) NULL,
  9  zip CHAR (5) NULL
 10  );

Table created.

SQL>
SQL> INSERT INTO Employees VALUES ('A01', 'S', 'B', '111-111-1111', 'St', 'Boston', 'NY', '11111');

1 row created.

SQL> INSERT INTO Employees VALUES ('A02', 'W', 'H', '222-222-2222', '2922 Rd', 'Boston', 'CO', '22222');

1 row created.

SQL> INSERT INTO Employees VALUES ('A03', 'h', 'h', '333-333-3333', '3800 Ave, #14F', 'San Francisco', 'CA', '33333');

1 row created.

SQL> INSERT INTO Employees VALUES ('A04', 'K', 'H', '444-444-4444', '3800 Ave, #14F', 'San Francisco', 'CA', '44444');

1 row created.

SQL> INSERT INTO Employees VALUES ('A05', 'C', 'K', '555-555-5555', '114 St', 'New York', 'NY', '55555');

1 row created.

SQL> INSERT INTO Employees VALUES ('A06', ', ', 'K', '666-666-666', '390 Mall', 'Palo Alto', 'CA', '66666');

1 row created.

SQL> INSERT INTO Employees VALUES ('A07', 'P', 'O', '777-777-7777', '1442 St', 'Sarasota', 'FL', '77777');

1 row created.

SQL>
SQL>
SQL> SELECT DISTINCT State
  2  FROM Employees;
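
The transcript shows no output for this query, so here is a small sketch of what to expect. It assumes the seven rows inserted above and adds an ORDER BY so that the order of the result is deterministic; without it, Oracle may return the distinct values in any order.

```sql
-- Distinct states, sorted. The repeated CA and NY rows collapse to one row each.
SELECT DISTINCT State
FROM Employees
ORDER BY State;

-- With the sample rows above, four values are expected: CA, CO, FL, NY.
```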


The following statements, run against a second sample table named Employee, show how duplicates are filtered out.

SQL> SELECT * FROM Employee
  2  /

ID   FIRST_NAME LAST_NAME  START_DAT END_DATE      SALARY CITY       DESCRIPTION
---- ---------- ---------- --------- --------- ---------- ---------- ---------------
     Jason      Martin     25-JUL-96 25-JUL-06    1234.56 Toronto    Programmer
     Alison     Mathews    21-MAR-76 21-FEB-86    6661.78 Vancouver  Tester
     James      Smith      12-DEC-78 15-MAR-90    6544.78 Vancouver  Tester
     Celia      Rice       24-OCT-82 21-APR-99    2344.78 Vancouver  Manager
     Robert     Black      15-JAN-84 08-AUG-98    2334.78 Vancouver  Tester
     Linda      Green      30-JUL-87 04-JAN-96    4322.78 New York   Tester
     David      Larry      31-DEC-90 12-FEB-98    7897.78 New York   Manager
     James      Cat        17-SEP-96 15-APR-02    1232.78 Vancouver  Tester

8 rows selected.

SQL>
SQL>
SQL> SELECT Description FROM Employee
  2  /

DESCRIPTION
---------------
Programmer
Tester
Tester
Manager
Tester
Tester
Manager
Tester

8 rows selected.

SQL> SELECT DISTINCT Description FROM Employee
  2  /

DESCRIPTION
---------------
Programmer
Manager
Tester
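
Two closely related forms can be sketched against the same Employee table: COUNT(DISTINCT ...) gives the number of different values, and GROUP BY gives the same list of values together with how often each occurs. These queries are illustrations, not part of the original transcript. Based on the eight rows shown above, the first query should return 3.

```sql
-- Number of different descriptions (3 for the rows shown above).
SELECT COUNT(DISTINCT Description) AS description_count
FROM Employee;

-- Same distinct values as SELECT DISTINCT, plus the number of rows for each.
SELECT Description, COUNT(*) AS row_count
FROM Employee
GROUP BY Description;
```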

To apply DISTINCT to several columns, separate the column names with commas, as follows:

SQL> SELECT DISTINCT City, State
  2  FROM Employees;
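
With more than one column, DISTINCT applies to the combination of values, not to each column separately. As a sketch, assuming the Employees rows inserted earlier: Boston appears twice in the result because it is paired with two different states, while the two identical San Francisco / CA rows collapse into one, giving six rows in total.

```sql
-- Distinct (City, State) pairs, sorted for a stable order.
SELECT DISTINCT City, State
FROM Employees
ORDER BY City, State;

-- An equivalent formulation using GROUP BY.
SELECT City, State
FROM Employees
GROUP BY City, State
ORDER BY City, State;
```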


A test in which DISTINCT improves query efficiency

As soon as the DISTINCT keyword is added, Oracle has to sort on all of the columns that follow it. I have often found that developers who do not understand SQL very well put DISTINCT in front of a select list of more than 20 columns, which makes the query practically impossible to execute or even produces ORA-7445 errors. For that reason I have repeatedly stressed the performance impact of DISTINCT to developers.
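
One way to observe this extra step is to look at the execution plan. The sketch below uses the Employees table from the earlier example. EXPLAIN PLAN and DBMS_XPLAN.DISPLAY are standard Oracle features, but the exact operation shown (for example SORT UNIQUE or HASH UNIQUE) depends on the Oracle version and on the optimizer's choices.

```sql
-- Store the plan for a DISTINCT query in the plan table.
EXPLAIN PLAN FOR
SELECT DISTINCT City, State
FROM Employees;

-- Display the stored plan; look for a UNIQUE step above the table access.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```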

Unexpectedly, a developer who was testing a large SQL statement told me that with DISTINCT the query finished in about 4 minutes, whereas without DISTINCT it ran for more than 10 minutes and still returned no results.

My first thought was that the DISTINCT might be inside a subquery: adding it would shrink the result set of the first step and so improve query performance. But when I looked at the SQL, I found that the DISTINCT was, surprisingly, in the outermost query.

The original SQL is too long and involves too many tables to explain clearly, so a simulated example is used here. Because of the small amount of data and the simplicity of the SQL, the example shows no significant difference in execution time. Instead, the logical reads of the two cases are compared to illustrate the problem.

First set up the simulation environment:

SQL> CREATE TABLE T1 AS SELECT * FROM DBA_OBJECTS
  2  WHERE OWNER = 'SYS'
  3  AND OBJECT_TYPE NOT LIKE '%BODY'
  4  AND OBJECT_TYPE NOT LIKE 'JAVA%';
Table created.
SQL> CREATE TABLE T2 AS SELECT * FROM DBA_SEGMENTS WHERE OWNER = 'SYS';
Table created.
SQL> CREATE TABLE T3 AS SELECT * FROM DBA_INDEXES WHERE OWNER = 'SYS';
Table created.
SQL> ALTER TABLE T1 ADD CONSTRAINT PK_T1 PRIMARY KEY (OBJECT_NAME);
Table altered.
SQL> CREATE INDEX IND_T2_SEGNAME ON T2 (SEGMENT_NAME);
Index created.
SQL> CREATE INDEX IND_T3_TABNAME ON T3 (TABLE_NAME);
Index created.
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'T1', METHOD_OPT => 'FOR ALL INDEXED COLUMNS SIZE', CASCADE => TRUE)
PL/SQL procedure successfully completed.
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'T2', METHOD_OPT => 'FOR ALL INDEXED COLUMNS SIZE', CASCADE => TRUE)
PL/SQL procedure successfully completed.
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'T3', METHOD_OPT => 'FOR ALL INDEXED COLUMNS SIZE', CASCADE => TRUE)
PL/SQL procedure successfully completed.
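
With the tables in place, logical reads can be compared in SQL*Plus using AUTOTRACE. The query below is only a hypothetical stand-in, not the author's original SQL, which the source does not include; the select list and join conditions are assumptions chosen to match the indexes created above. The "consistent gets" figure in the statistics output is the logical-read count to compare between the two runs.

```sql
-- Show only execution statistics, not the rows themselves.
SET AUTOTRACE TRACEONLY STATISTICS

-- Hypothetical test query without DISTINCT (an assumption, not the original SQL).
SELECT T1.object_name, T1.object_type
FROM T1, T2, T3
WHERE T1.object_name = T2.segment_name
AND T1.object_name = T3.table_name;

-- The same hypothetical query with DISTINCT at the outermost level.
SELECT DISTINCT T1.object_name, T1.object_type
FROM T1, T2, T3
WHERE T1.object_name = T2.segment_name
AND T1.object_name = T3.table_name;

-- Switch statistics reporting off again.
SET AUTOTRACE OFF
```

Whether the DISTINCT version reads fewer blocks depends on the plans the optimizer picks for each statement, so the two plans are worth comparing alongside the statistics.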
