如何去除資料表中的重複資料

來源:互聯網
上載者:User

通常情況下,一個我們在做一個產品的時候,一開始可能由於設計考慮不周或者程式寫的不夠嚴謹,某個欄位上的值產生重複了,但是又必須去掉,這個時候就稍微麻煩了一點,直接加一個 UNIQUE KEY 肯定是不行了,因為會報錯。
 現在,我們來採用一種變通的辦法,不過可能會丟失一些資料 :)

 在這裡,我們設定一個表,其結構如下:

mysql> desc `user`;+-------+------------------+------+-----+---------+----------------+| Field | Type             | Null | Key | Default | Extra          |+-------+------------------+------+-----+---------+----------------+| id    | int(10) unsigned | NO   | PRI | NULL    | auto_increment || name  | char(10)         | NO   |     |         |                || extra | char(10)         | NO   |     |         |                |+-------+------------------+------+-----+---------+----------------+

 原來表中的資料假定有以下幾條:

mysql> SELECT * FROM `user`;+----+-------+--------+| id | name  | extra  |+----+-------+--------+|  1 | user1 | user1  ||  2 | user2 | user2  ||  3 | user3 | user3  ||  4 | user4 | user4  ||  5 | user5 | user5  ||  6 | user3 | user6  ||  7 | user6 | user7  ||  8 | user2 | user8  ||  9 | USER2 | user9  || 10 | USER6 | user10 |+----+-------+--------+

1、將原來的資料匯出

mysql>SELECT * INTO OUTFILE '/tmp/user.txt' FROM `user`;

2、清空資料表

mysql>TRUNCATE TABLE `user`;

3、建立唯一索引,並且修改 `name` 欄位的類型為 BINARY CHAR 區分大小寫

mysql> ALTER TABLE `user` MODIFY `name` CHAR(10) BINARY NOT NULL DEFAULT '';mysql> ALTER TABLE `user` ADD UNIQUE KEY ( `name` );

現在來看看新的表結構:

mysql> desc user;+-------+------------------+------+-----+---------+----------------+| Field | Type             | Null | Key | Default | Extra          |+-------+------------------+------+-----+---------+----------------+| id    | int(10) unsigned | NO   | PRI | NULL    | auto_increment || name  | char(10)         | NO   | UNI |         |                || extra | char(10)         | NO   |     |         |                |+-------+------------------+------+-----+---------+----------------+

4、把資料導回去,在這裡,有兩種選擇:新的重複記錄替換舊的記錄,只保留最新的記錄 或者是 新的記錄略過,只保留最舊的記錄

mysql> LOAD DATA INFILE '/tmp/user.txt' REPLACE INTO TABLE `user`;Query OK, 10 rows affected (0.00 sec)Records: 8  Deleted: 2  Skipped: 0  Warnings: 0mysql> SELECT * FROM USER;+----+-------+--------+| id | name  | extra  |+----+-------+--------+|  1 | user1 | user1  ||  8 | user2 | user8  ||  6 | user3 | user6  ||  4 | user4 | user4  ||  5 | user5 | user5  ||  7 | user6 | user7  ||  9 | USER2 | user9  || 10 | USER6 | user10 |+----+-------+--------+

 上面是採用 REPLACE 的方式,可以看到,匯入處理程序中刪掉了兩條資料,結果驗證確實是 新的重複記錄替換舊的記錄,只保留最新的記錄
 現在,來看看用 IGNORE 的方式:

mysql> LOAD DATA INFILE '/tmp/user.txt' IGNORE INTO TABLE `user`;Query OK, 6 rows affected (0.01 sec)Records: 8  Deleted: 0  Skipped: 2  Warnings: 0mysql> SELECT * FROM USER;+----+-------+--------+| id | name  | extra  |+----+-------+--------+|  1 | user1 | user1  ||  2 | user2 | user2  ||  3 | user3 | user3  ||  4 | user4 | user4  ||  5 | user5 | user5  ||  7 | user6 | user7  ||  9 | USER2 | user9  || 10 | USER6 | user10 |+----+-------+--------+

 看到了吧,確實是 新的記錄略過,只保留最舊的記錄

 

本文出自 “編程之路” 部落格,請務必保留此出處http://huangby.blog.51cto.com/5199904/1293273

相關文章

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.