This post compares data-import (load file) methods for MySQL and PostgreSQL, measuring how long each takes to load the same data set. All tests ran on a virtual machine, so the timings are rough and useful only for relative comparison; do not treat them as absolute benchmarks.
MySQL tools tested: LOAD DATA INFILE, mysqlimport, and a Python script.
PostgreSQL tools tested: COPY, the psycopg2 driver, and pgloader.
Test table structure:
mysql> desc t1;
+----------+-----------+------+-----+-------------------+-------+
| Field    | Type      | Null | Key | Default           | Extra |
+----------+-----------+------+-----+-------------------+-------+
| id       | int(11)   | NO   | PRI | NULL              |       |
| rank     | int(11)   | NO   |     | NULL              |       |
| log_time | timestamp | YES  |     | CURRENT_TIMESTAMP |       |
+----------+-----------+------+-----+-------------------+-------+
3 rows in set (0.00 sec)

mysql> select count(*) from t1;
+----------+
| count(*) |
+----------+
|  1000000 |
+----------+
1 row in set (6.80 sec)
MySQL LOAD DATA INFILE (time: 24 seconds):

mysql> load data infile '/tmp/t1.csv' into table t1 fields terminated by ',' enclosed by '"' lines terminated by '\r\n';
Query OK, 1000000 rows affected (24.21 sec)
Records: 1000000  Deleted: 0  Skipped: 0  Warnings: 0
MySQL Python script (time: 23 seconds).

MySQL's bundled mysqlimport (time: 23 seconds):

[root@mysql56-master ~]# time mysqlimport t_girl '/tmp/t1.csv' --fields-terminated-by=',' --fields-enclosed-by='"' --lines-terminated-by='\r\n' --use-threads=2 -uroot -proot
t_girl.t1: Records: 1000000  Deleted: 0  Skipped: 0  Warnings: 0

real    0m23.664s
user    0m0.016s
sys     0m0.037s
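The Python script that loads the CSV into MySQL in roughly 23 seconds is mentioned above but its source is not shown. The sketch below is a plausible, minimal reconstruction using batched multi-row INSERTs through any DB-API driver; the driver, credentials, and function names are assumptions, not taken from the article.

```python
# Hypothetical sketch of the MySQL Python loader; the original script
# is not shown in the article, and the driver/credentials below are
# assumptions.
import csv
import time


def load_csv_batched(conn, path, table, batch_size=10000):
    """Load a CSV into `table` using batched executemany() INSERTs."""
    sql = ("INSERT INTO {0} (id, `rank`, log_time) "
           "VALUES (%s, %s, %s)".format(table))
    cur = conn.cursor()
    batch = []
    with open(path) as f:
        for row in csv.reader(f):
            batch.append(row)
            if len(batch) >= batch_size:
                cur.executemany(sql, batch)
                batch = []
        if batch:
            cur.executemany(sql, batch)  # flush the final partial batch
    conn.commit()


def timed_load(conn, path, table):
    """Return wall-clock seconds spent loading the file."""
    start = time.time()
    load_csv_batched(conn, path, table)
    return time.time() - start

# Usage (assumed driver and credentials):
#   import pymysql
#   conn = pymysql.connect(host="127.0.0.1", user="root",
#                          password="root", database="t_girl")
#   print("Running %.3f Seconds." % timed_load(conn, "/tmp/t1.csv", "t1"))
```

Batching keeps the per-statement round-trip overhead low, which is why a plain Python loader can stay close to mysqlimport's time.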
PostgreSQL's built-in COPY (time: 7 seconds):

t_girl=# copy t1 from '/tmp/t1.csv' with delimiter ',';
COPY 1000000
Time: 7700.332 ms
psycopg2 driver copy_to method (time: 6 seconds):

[root@postgresql-instance scripts]# python load_data.py
Running 5.969 Seconds.
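load_data.py itself is not shown in the article. A minimal sketch of what such a script likely does is below, streaming the CSV through psycopg2's COPY support; note that for loading a file into a table the psycopg2 call is copy_from (copy_to goes the other direction, table to file). Connection details are assumptions.

```python
# Hypothetical sketch of load_data.py; the original script is not shown
# in the article, and the credentials in the usage note are assumptions.
import time


def copy_csv(conn, path, table, sep=","):
    """Stream a CSV file into `table` via the COPY protocol (copy_from)."""
    with open(path) as f:
        cur = conn.cursor()
        cur.copy_from(f, table, sep=sep)
    conn.commit()

# Usage (psycopg2 is the driver named in the article):
#   import psycopg2
#   conn = psycopg2.connect("dbname=t_girl user=t_girl host=127.0.0.1 port=5432")
#   start = time.time()
#   copy_csv(conn, "/tmp/t1.csv", "t1")
#   print("Running %.3f Seconds." % (time.time() - start))
```

Because copy_from uses the same COPY protocol as the server-side command, its time lands in the same ballpark as the native COPY above.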
pgloader importing the CSV (time: 33 seconds):

[root@postgresql-instance ytt]# pgloader commands.load
       table name     read   imported   errors     time
           ytt.t1  1000000    1000000        0  33.514s
-----------------  -------  ---------  -------  -------
Total import time  1000000    1000000        0  33.514s
pgloader pulling the data directly from MySQL (time: 51 seconds):

[root@postgresql-instance ytt]# pgloader commands.mysql
       table name     read   imported   errors     time
  fetch meta data        2          2        0   0.138s
-----------------  -------  ---------  -------  -------
               t1  1000000    1000000        0  51.136s
-----------------  -------  ---------  -------  -------
Total import time  1000000    1000000        0  51.274s
The contents of commands.load and commands.mysql are attached below.

commands.load:

LOAD CSV
     FROM '/tmp/ytt.csv' WITH ENCODING UTF-8
          (id, rank, log_time)
     INTO postgresql://t_girl:t_girl@127.0.0.1:5432/t_girl?ytt.t1
     WITH skip header = 0,
          fields optionally enclosed by '"',
          fields escaped by backslash-quote,
          fields terminated by ','
      SET work_mem to '32MB', maintenance_work_mem to '64MB';

commands.mysql:

LOAD DATABASE
     FROM mysql://python_user:python_user@192.168.1.131:3306/t_girl?t1
     INTO postgresql://t_girl:t_girl@127.0.0.1:5432/t_girl?ytt.t1
     WITH data only
      SET maintenance_work_mem to '64MB', work_mem to '3MB',
          search_path to 'ytt';

pgloader manual: http://pgloader.io/howto/pgloader.1.html