I had a MySQL table with 5 million rows, most of them duplicates; only about 1.8 million were actually distinct. So how do you strip out the duplicates? I searched around online, but much of the code I found was too slow for a table of this size. After some experimenting I combined a few ideas into an efficient approach that removes all the duplicates from 5 million rows in about 10 minutes. It is offered here for reference.
Step 1: From the 5-million-row table data_content_152, take the id of one row for each distinct value of the sfzhm field and store those ids in the table tmp3.
CREATE TABLE tmp3 AS SELECT MIN(id) AS col1 FROM data_content_152 GROUP BY sfzhm;
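As a quick sanity check after this step, the two counts below can be compared. This is only a sketch: it assumes the table names used above, and the column aliases total_rows and kept_ids are made up for illustration.

```sql
-- Rows in the source table (about 5 million in the case described here).
SELECT COUNT(*) AS total_rows FROM data_content_152;

-- One id per distinct sfzhm value (about 1.8 million in the case described here).
SELECT COUNT(*) AS kept_ids FROM tmp3;
```

If kept_ids is far smaller than total_rows, the GROUP BY has collapsed the duplicates as intended.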
Step 2: Create a new table, res.
CREATE TABLE `res` (`id` INT(11), `sfz` CHAR(20)) ENGINE=MyISAM; -- CHAR(20) is an assumed width; size it to fit sfzhm
Step 3: Using the ids stored in tmp3, pull the matching rows from data_content_152 and insert their sfzhm values into the sfz field of res.
INSERT INTO res (sfz) SELECT sfzhm FROM data_content_152, tmp3 WHERE data_content_152.id = tmp3.col1;
At this point the deduplication is complete: the res table holds the data from data_content_152 with every duplicate removed. The original table itself is left unchanged.
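To confirm the result, a query along these lines can be run against res. It is a minimal sketch that assumes the res table and sfz column created above; the alias copies is made up for illustration.

```sql
-- Any row returned here is a value that still appears more than once in res.
SELECT sfz, COUNT(*) AS copies
FROM res
GROUP BY sfz
HAVING COUNT(*) > 1;
```

An empty result set means no duplicates remain.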
Second method:
DELETE FROM a WHERE id NOT IN (SELECT id FROM (SELECT MIN(id) AS id FROM a GROUP BY name) AS b);
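A small worked example may make this second method easier to follow. The sample rows are made up for illustration, and this sketch picks the row to keep for each name with MIN(id), which is one reasonable choice rather than the only one. The inner query is wrapped in a derived table (aliased b) because MySQL does not allow a DELETE to select directly from its own target table in a subquery.

```sql
-- Hypothetical demo data; the table a and its columns id and name match the statement above.
CREATE TABLE a (id INT PRIMARY KEY, name VARCHAR(20));
INSERT INTO a (id, name) VALUES (1, 'x'), (2, 'x'), (3, 'y'), (4, 'x'), (5, 'y');

-- Keep the lowest id for each name and delete every other row.
DELETE FROM a
WHERE id NOT IN (
    SELECT id FROM (SELECT MIN(id) AS id FROM a GROUP BY name) AS b
);

-- Remaining rows: (1, 'x') and (3, 'y').
SELECT id, name FROM a ORDER BY id;
```

Unlike the first method, this one modifies the table in place, so it is worth trying on a copy first.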