轉自:CSDN 平時工作中可能會遇到當試圖對庫表中的某一列或幾列建立唯一索引時,系統提示 ORA-01452 :不能建立唯一索引,發現重複記錄。 下面總結一下幾種尋找和重複資料刪除記錄的方法(以表CZ為例): 表CZ的結構如下: SQL> desc cz Name Null? Type ----------------------------------------- -------- ------------------ C1 NUMBER(10) C10 NUMBER(5) C20 VARCHAR2(3) 重複資料刪除記錄的方法原理: (1).在Oracle中,每一條記錄都有一個rowid,rowid在整個資料庫中是唯一的,rowid確定了每條記錄是在Oracle中的哪一個資料檔案、塊、行上。 (2).在重複的記錄中,可能所有列的內容都相同,但rowid不會相同,所以只要確定出重複記錄中那些具有最大rowid的就可以了,其餘全部刪除。 重複記錄判斷的標準是: C1,C10和C20這三列的值都相同才算是重複記錄。 經查看錶CZ總共有16條記錄: SQL>set pagesize 100 SQL>select * from cz; C1 C10 C20 ---------- ---------- --- 1 2 dsf 1 2 dsf 1 2 dsf 1 2 dsf 2 3 che 1 2 dsf 1 2 dsf 1 2 dsf 1 2 dsf 2 3 che 2 3 che 2 3 che 2 3 che 3 4 dff 3 4 dff 3 4 dff 4 5 err 5 3 dar 6 1 wee 7 2 zxc 20 rows selected.
1.尋找重複記錄的幾種方法: (1).SQL>select * from cz group by c1,c10,c20 having count(*) >1; C1 C10 C20 ---------- ---------- --- 1 2 dsf 2 3 che 3 4 dff (2).SQL>select distinct * from cz; C1 C10 C20 ---------- ---------- --- 1 2 dsf 2 3 che 3 4 dff (3).SQL>select * from cz a where rowid=(select max(rowid) from cz where c1=a.c1 and c10=a.c10 and c20=a.c20); C1 C10 C20 ---------- ---------- --- 1 2 dsf 2 3 che 3 4 dff 2.重複資料刪除記錄的幾種方法: (1).適用於有大量重複記錄的情況(在C1,C10和C20列上建有索引的時候,用以下語句效率會很高): SQL>delete cz where (c1,c10,c20) in (select c1,c10,c20 from cz group by c1,c10,c20 having count(*)>1) and rowid not in (select min(rowid) from cz group by c1,c10,c20 having count(*)>1); SQL>delete cz where rowid not in(select min(rowid) from cz group by c1,c10,c20); (2).適用於有少量重複記錄的情況(注意,對於有大量重複記錄的情況,用以下語句效率會很低): SQL>delete from cz a where a.rowid!=(select max(rowid) from cz b where a.c1=b.c1 and a.c10=b.c10 and a.c20=b.c20); SQL>delete from cz a where a.rowid<(select max(rowid) from cz b where a.c1=b.c1 and a.c10=b.c10 and a.c20=b.c20); SQL>delete from cz a where rowid <(select max(rowid) from cz where c1=a.c1 and c10=a.c10 and c20=a.c20); (3).適用於有少量重複記錄的情況(暫存資料表法): SQL>create table test as select distinct * from cz; (建一個暫存資料表test用來存放重複的記錄) SQL>truncate table cz; (清空cz表的資料,但保留cz表的結構) SQL>insert into cz select * from test; (再將暫存資料表test裡的內容反插回來) (4).適用於有大量重複記錄的情況(Exception into 子句法): 採用alter table 命令中的 Exception into 子句也可以確定出庫表中重複的記錄。這種方法稍微麻煩一些,為了使用“excepeion into ”子句,必須首先建立 EXCEPTIONS 表。建立該表的 SQL 指令檔為 utlexcpt.sql 。對於win2000系統和 UNIX 系統, Oracle 存放該檔案的位置稍有不同,在win2000系統下,該指令檔存放在$ORACLE_HOME/Ora90/rdbms/admin 目錄下;而對於 UNIX 系統,該指令檔存放在$ORACLE_HOME/rdbms/admin 目錄下。 具體步驟如下: SQL>@?/rdbms/admin/utlexcpt.sql Table created. SQL>desc exceptions Name Null? Type ----------------------------------------- -------- -------------- ROW_ID ROWID OWNER VARCHAR2(30) TABLE_NAME VARCHAR2(30) CONSTRAINT VARCHAR2(30) SQL>alter table cz add constraint cz_unique unique(c1,c10,c20) exceptions into exceptions; * ERROR at line 1: ORA-02299: cannot validate (TEST.CZ_UNIQUE) - duplicate keys found SQL>create table dups as select * from cz where rowid in (select row_id from exceptions); Table created. SQL>select * from dups; C1 C10 C20 ---------- ---------- --- 1 2 dsf 1 2 dsf 1 2 dsf 1 2 dsf 2 3 che 1 2 dsf 1 2 dsf 1 2 dsf 1 2 dsf 2 3 che 2 3 che 2 3 che 2 3 che 3 4 dff 3 4 dff 3 4 dff 16 rows selected. SQL>select row_id from exceptions; ROW_ID ------------------ AAAHD/AAIAAAADSAAA AAAHD/AAIAAAADSAAB AAAHD/AAIAAAADSAAC AAAHD/AAIAAAADSAAF AAAHD/AAIAAAADSAAH AAAHD/AAIAAAADSAAI AAAHD/AAIAAAADSAAG AAAHD/AAIAAAADSAAD AAAHD/AAIAAAADSAAE AAAHD/AAIAAAADSAAJ AAAHD/AAIAAAADSAAK AAAHD/AAIAAAADSAAL AAAHD/AAIAAAADSAAM AAAHD/AAIAAAADSAAN AAAHD/AAIAAAADSAAO AAAHD/AAIAAAADSAAP 16 rows selected. SQL>delete from cz where rowid in ( select row_id from exceptions); 16 rows deleted. SQL>insert into cz select distinct * from dups; 3 rows created. SQL>select *from cz; C1 C10 C20 ---------- ---------- --- 1 2 dsf 2 3 che 3 4 dff 4 5 err 5 3 dar 6 1 wee 7 2 zxc 7 rows selected. 從結果裡可以看到重複記錄已經刪除。 |