This article describes several methods to quickly find duplicate records in Oracle databases.
As an Oracle database developer, you often need to create unique indexes on one or more columns of a table to provide direct, fast access to rows. However, the following error is often raised when the index is created:
ORA-01452: cannot CREATE UNIQUE INDEX; duplicate keys found
Oracle reports that it cannot create the unique index because the table contains duplicate records. The index can be created only after the duplicate records have been found and deleted. The following uses a table named table_name as an example to describe three different methods of finding duplicate records in a database table.
[B]1. Use GROUP BY to find duplicate rows in a table[/B]
Duplicate rows are easy to identify with the GROUP BY and HAVING clauses of a SELECT statement. Group the rows by the column on which you want to create the unique index and count the rows in each group. Any group that contains more than one row holds duplicates. The command is as follows:
SQL> SELECT column FROM table_name
     GROUP BY column
     HAVING COUNT(column) > 1;
This query is simple and fast, and it is the most common way to find duplicates in Oracle databases.
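When the unique index is to cover more than one column, the same idea extends to a composite key. The following is a minimal sketch, assuming two placeholder key columns named column1 and column2 (the names and the dup_count alias are illustrative, not taken from a real schema). It also returns the size of each duplicate group, which helps when deciding how much cleanup is needed.

```sql
-- Sketch only: column1 and column2 stand in for the columns that
-- will make up the composite unique index on table_name.
SELECT column1, column2, COUNT(*) AS dup_count
FROM table_name
GROUP BY column1, column2
HAVING COUNT(*) > 1
ORDER BY dup_count DESC;
```

Note that COUNT(*) also counts groups whose key columns are all NULL. Oracle does not treat such rows as duplicates in a unique index, so those groups can be ignored if they appear in the result.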
[B]2. Use the ROWID pseudocolumn in a self-join query[/B]
In an Oracle database, each table has a ROWID pseudocolumn, which uniquely identifies a row and provides fast access to a specific row. Comparing each row's rowid with the MAX or MIN rowid of the rows that share the same key values makes it easy to identify duplicate rows.
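1) Use the MAX function to find duplicate rows
SQL> SELECT column1, column2, column3 FROM table_name a
     WHERE a.rowid < (SELECT MAX(b.rowid) FROM table_name b
     WHERE b.column1 = a.column1 AND b.column2 = a.column2
     AND b.column3 = a.column3 AND ...);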
2) Use the MIN function to find duplicate rows
SQL> SELECT column1, column2, column3 FROM table_name a
     WHERE a.rowid > (SELECT MIN(b.rowid) FROM table_name b
     WHERE b.column1 = a.column1 AND b.column2 = a.column2
     AND b.column3 = a.column3 AND ...);
However, when the table is large (for example, more than 500,000 rows), this method becomes unacceptably slow, because the correlated subquery is evaluated against the same table for every row.
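Once the rows returned by the MIN query have been reviewed, the same correlated subquery can drive the cleanup that a unique index requires. The statement below is one possible sketch, not a prescribed procedure: it assumes a two-column key (column1, column2 are placeholders) and assumes that keeping the row with the smallest rowid in each group is acceptable. It has the same performance limits on large tables as the query above.

```sql
-- Sketch only: deletes every duplicate except the row with the
-- smallest rowid in each (column1, column2) group.
-- Run the matching SELECT first and check the result before deleting.
DELETE FROM table_name a
WHERE a.rowid > (SELECT MIN(b.rowid)
                 FROM table_name b
                 WHERE b.column1 = a.column1
                   AND b.column2 = a.column2);

-- Inspect the affected row count, then COMMIT to keep the change
-- or ROLLBACK to undo it.
```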
[B]3. Find duplicate rows by defining an integrity constraint[/B]
An integrity constraint is a rule that restricts the values of one or more columns in a base table. A UNIQUE constraint designates a column or a set of columns as a unique key; to satisfy the constraint, no two rows in the table may have the same value for that key. When a constraint is enabled, the EXCEPTIONS INTO clause can be used to store the rowids of the rows that violate it in a separate table (EXCEPTIONS). This table must be created before the clause is used. Joining the EXCEPTIONS table to table_name on rowid then returns the duplicate records in table_name. The specific steps are as follows:
1) Create a table named EXCEPTIONS to hold the duplicate records.
SQL> CREATE TABLE exceptions (row_id ROWID,
     owner VARCHAR2(30),
     table_name VARCHAR2(30),
     constraint VARCHAR2(30));
2) Define a UNIQUE constraint on table_name. If the key columns contain duplicate values, the statement fails with ORA-02299: cannot validate (schema.constraint_name) - duplicate keys found, and the rowids of the offending rows are stored in the EXCEPTIONS table.
SQL> ALTER TABLE table_name
     ADD CONSTRAINT unq_column
     UNIQUE (column1, column2, ...)
     EXCEPTIONS INTO exceptions;
3) Join table_name to EXCEPTIONS on the rowid pseudocolumn. Rows whose rowid appears in EXCEPTIONS are the duplicate records in table_name.
SQL> SELECT column1, column2, ...
     FROM table_name a, exceptions b
     WHERE a.rowid = b.row_id;
This method is efficient and records every row involved in a duplicate, but it takes more steps than the other two.
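Because the EXCEPTIONS table keeps its rows until they are removed, it may hold entries from earlier runs or from other tables. A hedged variant of the join in step 3 filters on the table and constraint names. This sketch assumes two placeholder key columns and assumes the names were created unquoted, in which case Oracle stores them in uppercase.

```sql
-- Sketch only: restricts the result to violations recorded for the
-- unq_column constraint on table_name.
-- Assumption: unquoted identifiers, hence the uppercase literals.
SELECT a.column1, a.column2
FROM table_name a, exceptions b
WHERE a.rowid = b.row_id
  AND b.table_name = 'TABLE_NAME'
  AND b.constraint = 'UNQ_COLUMN';
```

After the duplicates have been removed, clearing the EXCEPTIONS table and re-running the ALTER TABLE statement from step 2 confirms that the constraint can now be enabled.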