I benchmarked the various LOAD FILE approaches for MySQL and PostgreSQL on a virtual machine and recorded the times. Since everything ran in a VM, the timings are for reference only; don't take them too seriously.
MySQL tools:
1. The bundled mysqlimport utility.
2. The command-line statement load data infile ...
3. A Python script written with the mysql-connector-python driver.
PostgreSQL tools:
1. The third-party pgloader tool.
2. The command-line statement copy ... from ...
3. A Python script written with psycopg2.
Test table structure:
mysql> desc t1;
+----------+-----------+------+-----+-------------------+-------+
| Field    | Type      | Null | Key | Default           | Extra |
+----------+-----------+------+-----+-------------------+-------+
| id       | int(11)   | NO   | PRI | NULL              |       |
| rank     | int(11)   | NO   |     | NULL              |       |
| log_time | timestamp | YES  |     | CURRENT_TIMESTAMP |       |
+----------+-----------+------+-----+-------------------+-------+
3 rows in set (0.00 sec)

mysql> select count(*) from t1;
+----------+
| count(*) |
+----------+
|  1000000 |
+----------+
1 row in set (6.80 sec)
Test CSV file:
t1.csv
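The post doesn't show how t1.csv was generated. As a rough sketch (not from the original post), a 1,000,000-row file matching t1's (id, rank, log_time) columns, quoted and line-terminated the way the load commands below expect, could be produced like this; the rank range and timestamp value are made up:

# generate_t1_csv.py -- hypothetical helper, not part of the original benchmark
import csv
import random
from datetime import datetime

now = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
with open('/tmp/t1.csv', 'w', newline='') as f:
    # quote every field and end lines with \r\n, matching
    # enclosed by '"' / lines terminated by '\r\n' in the loaders below
    writer = csv.writer(f, quoting=csv.QUOTE_ALL, lineterminator='\r\n')
    for i in range(1, 1000001):
        writer.writerow([i, random.randint(1, 1000), now])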
MySQL's own loader: (24 seconds)
mysql> load data infile '/tmp/t1.csv' into table t1 fields terminated by ',' enclosed by '"' lines terminated by '\r\n';
Query OK, 1000000 rows affected (24.21 sec)
Records: 1000000  Deleted: 0  Skipped: 0  Warnings: 0
MySQL Python script: (23 seconds)
>>>
Running 23.289 Seconds
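The script itself isn't reproduced above. A minimal sketch of what a mysql-connector-python loader could look like, assuming it simply issues the same LOAD DATA INFILE statement and times it (connection parameters are placeholders, not the original ones):

# load_mysql.py -- hypothetical sketch, not the original script
import time
import mysql.connector

conn = mysql.connector.connect(user='root', password='root', database='t_girl')
cur = conn.cursor()

start = time.time()
# same server-side LOAD DATA INFILE as in the previous test
cur.execute(
    "load data infile '/tmp/t1.csv' into table t1 "
    "fields terminated by ',' enclosed by '\"' lines terminated by '\\r\\n'"
)
conn.commit()
print("Running %.3f Seconds" % (time.time() - start))

cur.close()
conn.close()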
MySQL's bundled mysqlimport: (23 seconds)
[[email protected] ~]# time mysqlimport t_girl '/tmp/t1.csv' --fields-terminated-by=',' --fields-enclosed-by='"' --lines-terminated-by='\r\n' --use-threads=2 -uroot -proot
t_girl.t1: Records: 1000000  Deleted: 0  Skipped: 0  Warnings: 0

real    0m23.664s
user    0m0.016s
sys     0m0.037s
PostgreSQL's own COPY: (7 seconds)
t_girl=# copy t1 from '/tmp/t1.csv' with delimiter ',';
COPY 1000000
Time: 7700.332 ms
Psycopg2 driver's copy_to method: (6 seconds)
[[email protected] scripts]# python load_data.py
Running 5.969 Seconds.
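The load_data.py script isn't reproduced above either. Note that in psycopg2, copy_to exports a table to a file; loading a file into a table is done with copy_from, which is what this rough sketch assumes (connection details, including the password, are placeholders):

# load_data.py -- hypothetical sketch, not the original script
import time
import psycopg2

conn = psycopg2.connect(dbname='t_girl', user='t_girl',
                        password='t_girl', host='127.0.0.1', port=5432)
cur = conn.cursor()

start = time.time()
with open('/tmp/t1.csv') as f:
    # stream the comma-delimited file into t1
    cur.copy_from(f, 't1', sep=',')
conn.commit()
print("Running %.3f Seconds." % (time.time() - start))

cur.close()
conn.close()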
Pgloader importing the CSV: (33 seconds)
[[email protected] ytt]# pgloader commands.load
             table name       read   imported     errors       time
                  ytt.t1    1000000    1000000          0    33.514s
------------------------------ --------- --------- --------- --------------
------------------------------ --------- --------- --------- --------------
      Total import time     1000000    1000000          0    33.514s
Pgloader pulling data directly from MySQL: (51 seconds)
[[email protected] ytt]# pgloader commands.mysql
             table name       read   imported     errors       time
         fetch meta data          2          2          0     0.138s
------------------------------ --------- --------- --------- --------------
                      t1    1000000    1000000          0    51.136s
------------------------------ --------- --------- --------- --------------
------------------------------ --------- --------- --------- --------------
------------------------------ --------- --------- --------- --------------
      Total import time     1000000    1000000          0    51.274s
The commands.load and commands.mysql files are attached below.
commands.load:

LOAD CSV
   FROM '/tmp/ytt.csv' WITH ENCODING UTF-8
        (
           id, rank, log_time
        )
   INTO postgresql://t_girl:[email protected]:5432/t_girl?ytt.t1
   WITH skip header = 0,
        fields optionally enclosed by '"',
        fields escaped by backslash-quote,
        fields terminated by ','
    SET work_mem to '32 MB', maintenance_work_mem to '64 MB';

commands.mysql:

LOAD DATABASE
   FROM mysql://python_user:[email protected]:3306/t_girl?t1
   INTO postgresql://t_girl:[email protected]:5432/t_girl?ytt.t1
   with data only
    SET maintenance_work_mem to '64MB', work_mem to '3MB',
        search_path to 'ytt';

The pgloader manual is available at http://pgloader.io/howto/pgloader.1.html