An Example of Deduplication Optimization for a XXX Mobile Phone Company

Source: Internet
Author: User
This is the SQL statement they wrote to delete duplicate data while keeping the row whose id is the minimum value: DELETE FROM jd_chapter a WHERE a.`id` IN (SELECT `id` FROM jd_chapter GROUP ..

Because the table is large (tens of millions of rows) and the statement uses two subqueries, it ran for a long time without finishing.
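The full statement is not shown above, so its exact subqueries are unknown. As a rough illustration of the shape described, the read-only query below applies a two-subquery filter of this kind. It is a sketch, not the original statement: `title` is a hypothetical column standing in for whatever defines a duplicate. Each subquery has to group the whole table, and older MySQL versions in particular may re-evaluate IN subqueries for every outer row, which would be consistent with the slow execution on a table this size.

```sql
-- Count the rows a two-subquery filter would pick for deletion.
-- `title` is a hypothetical grouping column, not taken from the source.
SELECT COUNT(*)
FROM jd_chapter AS a
WHERE a.title IN (
          -- subquery 1: values that occur more than once
          SELECT title
          FROM jd_chapter
          GROUP BY title
          HAVING COUNT(*) > 1
      )
  AND a.id NOT IN (
          -- subquery 2: the smallest id in each duplicate group (the row to keep)
          SELECT MIN(id)
          FROM jd_chapter
          GROUP BY title
          HAVING COUNT(*) > 1
      );
```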

-------------------------- Approach ----------------------------

The idea is to join against a temporary table instead. The steps are as follows:

1. Export the duplicate data on the Slave database first.

  • ... INTO OUTFILE '/tmp/jd_chapter.sql' FIELDS TERMINATED BY ',';

2. Copy the exported file to the /tmp/ directory on the Master database.

3. Create a temporary table with a primary key on the Master database:

  • Query OK, 0 rows affected (0.07 sec)

4. On the Master database, import the data into the temporary table with LOAD DATA:

  • LOAD DATA INFILE '/tmp/jd_chapter.sql' INTO TABLE tmp FIELDS TERMINATED BY ',';

5. On the Master database, delete the smallest id from the temporary table:

  • DELETE FROM tmp LIMIT 1;

6. Join against the temporary table to delete the duplicate data from the jd_chapter table.
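The steps above can be sketched end to end in MySQL as follows. This is not the author's original SQL, most of which is missing from this copy. It assumes that duplicates are defined by a hypothetical column named `title`, that `tmp` holds a single `id` column, and that `id` is an unsigned integer. It also handles step 5 differently: rather than removing a row from `tmp` afterwards, the export query already skips the smallest id in each duplicate group, so `tmp` contains only ids that should be deleted.

```sql
-- Step 1 (on the Slave): export the ids of the rows to delete.
-- `title` is a hypothetical column; replace it with the real duplicate key.
SELECT c.id
FROM jd_chapter AS c
JOIN (
    SELECT title, MIN(id) AS keep_id      -- smallest id per duplicate group
    FROM jd_chapter
    GROUP BY title
    HAVING COUNT(*) > 1
) AS d ON d.title = c.title
WHERE c.id > d.keep_id                    -- every copy except the one to keep
INTO OUTFILE '/tmp/jd_chapter.sql' FIELDS TERMINATED BY ',';

-- Step 2: copy /tmp/jd_chapter.sql to /tmp/ on the Master (outside MySQL).

-- Step 3 (on the Master): one-column table with a primary key to join on.
-- The column type is an assumption; it should match jd_chapter.id.
CREATE TABLE tmp (
    id INT UNSIGNED NOT NULL,
    PRIMARY KEY (id)
);

-- Step 4 (on the Master): load the exported ids.
LOAD DATA INFILE '/tmp/jd_chapter.sql' INTO TABLE tmp FIELDS TERMINATED BY ',';

-- Step 6 (on the Master): delete by joining on the primary key,
-- with no subquery against jd_chapter itself.
DELETE c
FROM jd_chapter AS c
JOIN tmp AS t ON t.id = c.id;

-- Clean up the helper table once the delete has been verified.
DROP TABLE tmp;
```

Both INTO OUTFILE and LOAD DATA INFILE are subject to the server's secure_file_priv setting, so /tmp/ may need to be replaced with a directory the server allows.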

This article is from the "hechun's technical column" blog; please be sure to keep this source.
