Take PostgreSQL 9.2 official document as an example:
Create table weather (
City varchar (80 ),
Temp_lo int, -- low temperature
Temp_hi int, -- high temperature
Prcp real, -- precipitation
Date
);
Insert into weather VALUES
('San Francisco ', 46, 50, 0.25, '2017-11-27 '),
('San Francisco ', 43, 57, 0, '2017-11-29 '),
('Hayward ', 37, 54, NULL, '2017-11-29 '),
('Hayward ', 37, 54, NULL, '2017-11-29'); --- duplicated row
Here are the three methods:
Method 1: Replacement
-- Remove duplicate rows and store them to the new table weather_temp.
Select distinct city, temp_lo, temp_hi, prcp, date
INTO weather_temp
FROM weather;
-- Delete the original table
Drop table weather;
-- Rename the new table as weather
Alter table weather_temp rename to weather;
Or
-- Create a weather_temp table that is the same as weather.
Create table weather_temp (LIKE weather including constraints );
-- Fill in weather_temp with data that removes duplicate rows
Insert into weather_temp select distinct * FROM weather;
-- Delete the original table
Drop table weather;
-- Rename it weather.
Alter table weather_temp rename to weather;
Easy to understand, there are a lot of destructive operations such as DROP, and when the data volume is large, it takes a lot of time and space. Not recommended.
Method 2: Add a field
-- Add a new field with the serial type
Alter table weather add column id SERIAL;
-- Delete duplicate rows
Delete from weather WHERE id
Not in (
SELECT max (id)
FROM weather
Group by city, temp_lo, temp_hi, prcp, date
);
-- Delete the added Field
Alter table weather drop column id;
You need to add a field. "I don't know how Postgres handles adding fields. Do you want to append fields directly to the original table or copy the original table to form a new table ?」, If it is an append to the original table, it may be caused by the addition of new fields (generally block: 8 k). If it is a copy, it is a sin. Not good.
Third: System field [view System Columns]
Delete from weather
WHERE ctid
Not in (
SELECT max (ctid)
FROM weather
Group by city, temp_lo, temp_hi, prcp, date
);
Highly targeted [exclusive to ipvs], but simple.
PostgreSQL details: click here
PostgreSQL: click here
Recommended reading:
How to delete duplicate records in PostgreSQL
Configuration and installation of LAPP (Linux + Apache + PostgreSQL + PHP) Environment in Ubuntu